Reconstructed energy is written in logscale while true energy in linear scale #139

HealthyPear · 2021-05-16T16:26:08Z

This happens because we write to file the direct output of the energy regressor, whose scale is decided by the model's target value, log10_true_energy by default.

This creates 2 problems:

1. Classifier features related to reconstructed energy then need to be written as

   log10_reco_energy: reco_energy # Averaged-estimated energy of the shower

   which is horrible.

2. The benchmarking code is not flexible enough, so results can seem wrong only because cuts are applied in the wrong scale.
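To make the second problem concrete, here is a minimal sketch of how an energy cut misbehaves when the stored "reconstructed energy" is actually a base-10 logarithm. The event values, the 1 TeV threshold and the `passes_cut` helper are all hypothetical and only illustrate the scale mismatch; they are not taken from the benchmarking code.

```python
import math

# Hypothetical true energies in TeV (illustrative values only).
true_energy = [0.5, 2.0, 20.0]

# What a regressor trained on log10_true_energy would write to file if its
# prediction were perfect: the logarithm, not the energy itself.
reco_energy_as_written = [math.log10(e) for e in true_energy]


def passes_cut(values, e_min):
    """Flag each value that is at or above the threshold e_min."""
    return [v >= e_min for v in values]


cut_tev = 1.0  # hypothetical cut, meant to be applied in linear scale

# Cut applied directly to the stored (log-scale) values: wrong scale.
naive = passes_cut(reco_energy_as_written, cut_tev)

# Cut applied after converting the stored values back to linear scale.
fixed = passes_cut([10 ** v for v in reco_energy_as_written], cut_tev)

# Reference: the same cut applied to the true energies.
expected = passes_cut(true_energy, cut_tev)

print(naive, fixed, expected)
```

In this toy case the 2 TeV event is rejected by the naive cut even though it lies above the threshold, which is the kind of result that looks wrong only because the scales differ.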


kosack · 2021-05-17T09:06:06Z

I think the solution is to allow a transformation to normalize/re-scale the predicted variable. E.g. the predicted value should always be "energy" (not log10_energy), but you should have an option

transform: np.log10
inverse_transform: lambda p: 10**p

And then during training you call the transform so all computations are in log10_energy, and after predict you call the inverse transform to go back to energy. You could also include the scaling to TeV there, unless it is simply assumed that energies are in TeV.

So the sequence of steps is:

input training data → transform → train
input testing data → predict → inverse_transform → prediction

The same could even be used for the input data to the training, if you really want to be general, i.e. you could allow a column name + transform + inverse_transform for all variables (e.g. intensity → log10(intensity) → training).
However, I guess the user-defined features solve that problem, so it's probably only needed for the input/output parameter.
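The transform → train and predict → inverse_transform sequence described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `TransformedTarget` and `LinearFit1D` are hypothetical names, the toy line fit stands in for the real energy regressor, and the data are made up. scikit-learn's `TransformedTargetRegressor` (with its `func` and `inverse_func` parameters) offers the same pattern for scikit-learn estimators.

```python
import math


class LinearFit1D:
    """Toy least-squares line y = a*x + b (stand-in for the real regressor)."""

    def fit(self, x, y):
        n = len(x)
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        self.a = sxy / sxx
        self.b = mean_y - self.a * mean_x
        return self

    def predict(self, x):
        return [self.a * xi + self.b for xi in x]


class TransformedTarget:
    """Hypothetical wrapper: train on transform(target), return linear scale."""

    def __init__(self, model, transform, inverse_transform):
        self.model = model
        self.transform = transform
        self.inverse_transform = inverse_transform

    def fit(self, x, energy):
        # input training data -> transform -> train
        self.model.fit(x, [self.transform(e) for e in energy])
        return self

    def predict(self, x):
        # input testing data -> predict -> inverse_transform -> prediction
        return [self.inverse_transform(p) for p in self.model.predict(x)]


# Made-up training set: feature is log10(intensity), target is energy in TeV.
log10_intensity = [2.0, 3.0, 4.0]
true_energy = [1.0, 10.0, 100.0]

estimator = TransformedTarget(
    LinearFit1D(),
    transform=math.log10,
    inverse_transform=lambda p: 10 ** p,
).fit(log10_intensity, true_energy)

# The prediction comes back in linear scale (TeV), never as log10(energy).
reco_energy = estimator.predict([2.5])
print(reco_energy)
```

With this design the config only ever refers to "energy", and the choice of training scale stays an internal detail of the estimator.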

HealthyPear · 2021-05-17T20:24:40Z

Just a small clarification: I found this problem only now because in the previous AdaBoost config the target was true_energy and not log10_true_energy, so the estimated value was always in linear scale. (I am not sure whether this was also one of the reasons the resolution was bad before.)

HealthyPear added the "wrong behaviour" label (The code works but produces clearly wrong results) May 16, 2021

HealthyPear added this to Needs triage in Bugs and wrong behaviours via automation May 16, 2021

This was referenced May 17, 2021

Comparison with CTA-MARS: Energy estimation #92

Closed

Ensure that estimated energy is always recorded in linear scale #141

Merged

HealthyPear closed this as completed in #141 May 19, 2021

Bugs and wrong behaviours automation moved this from Needs triage to Closed May 19, 2021
