Comparison with CTA-MARS: Energy estimation #92

Closed
12 tasks done
HealthyPear opened this issue Jan 22, 2021 · 0 comments
Labels
enhancement New feature or request summary A summary of issues related to the same subject

HealthyPear commented Jan 22, 2021

This issue is part of a project described in issue #24.

The following is a running list of the differences found between the two pipelines during the comparison.
Not all of these features are critical to recovering the missing performance, but all of them should be implemented (as similarly as possible) so that they can be used optionally when comparing different algorithms.


  • Add the possibility to use a Random Forest

Currently, protopipe uses an Adaptive Boost Regressor based on a Decision Tree, while CTAMARS uses a Random Forest regressor.
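A minimal sketch of how the regressor could be made selectable, assuming scikit-learn is used for both models. The function name `make_energy_regressor`, the `method` strings and the default hyperparameters are hypothetical and are not taken from protopipe's configuration.

```python
from sklearn.ensemble import AdaBoostRegressor, RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor


def make_energy_regressor(method="AdaBoostRegressor", n_estimators=50,
                          max_depth=None, random_state=0):
    """Return an unfitted regressor chosen by name (hypothetical helper)."""
    if method == "RandomForestRegressor":
        # CTAMARS-like choice: a plain Random Forest regressor
        return RandomForestRegressor(
            n_estimators=n_estimators,
            max_depth=max_depth,
            random_state=random_state,
        )
    if method == "AdaBoostRegressor":
        # Current choice described above: AdaBoost on top of a Decision Tree.
        # The base tree is passed positionally so the call does not depend on
        # the keyword name used by a given scikit-learn version.
        return AdaBoostRegressor(
            DecisionTreeRegressor(max_depth=max_depth),
            n_estimators=n_estimators,
            random_state=random_state,
        )
    raise ValueError(f"Unknown regression method: {method!r}")
```

With something like this, switching between the two models would only require changing one string in the configuration.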

  • Add missing parameters

    • Concentration (fraction of the total Intensity which is contained in the two brightest pixels of the cleaned image) - Add concentration #132

This is not defined in the same way in ctapipe. It is not clear whether we should add it with this definition; it is better to wait and see if the difference in definition affects the overall performance of the pipeline.

  • Leakage1 (fraction of total Intensity which is contained in the outermost pixels of the camera)
  • log10(Width*Length/Size)
  • square of distance from Image c.o.g. to the reconstructed event direction on the camera (dir_x, dir_y)
  • atan2(cog_y - dir_y, cog_x - dir_x)

The last 3 features may require a small enhancement in how features are read from the configuration file and from the scripts that produce DL2 data (see #90).
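The parameters listed above can be sketched in plain Python as follows. This only illustrates the definitions as written in this issue; all function names are hypothetical, and the `is_outermost` mask (which pixels count as the outermost ones of the camera) is an assumed input that would have to come from the camera geometry.

```python
import math


def concentration(pixel_intensities):
    """Fraction of the total intensity in the two brightest cleaned pixels."""
    total = sum(pixel_intensities)
    two_brightest = sorted(pixel_intensities, reverse=True)[:2]
    return sum(two_brightest) / total


def leakage1(pixel_intensities, is_outermost):
    """Fraction of the total intensity in the outermost camera pixels."""
    total = sum(pixel_intensities)
    edge = sum(q for q, outer in zip(pixel_intensities, is_outermost) if outer)
    return edge / total


def log10_wl_over_size(width, length, size):
    """log10(Width * Length / Size)."""
    return math.log10(width * length / size)


def squared_cog_distance(cog_x, cog_y, dir_x, dir_y):
    """Square of the distance from the image c.o.g. to the reconstructed direction."""
    return (cog_x - dir_x) ** 2 + (cog_y - dir_y) ** 2


def cog_direction_angle(cog_x, cog_y, dir_x, dir_y):
    """atan2(cog_y - dir_y, cog_x - dir_x), in radians."""
    return math.atan2(cog_y - dir_y, cog_x - dir_x)
```

The last three functions need both image parameters and the reconstructed direction as input, which is why they cannot be treated as simple per-image columns.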

  • Get RMS from each Random Forest regressor.

Right now we get only the estimate, but we should also get a measure of the variance of the individual tree estimates.
This is also used in the weighting of the gammaness estimate for the DL2-candidate event.
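One possible way to get such a spread, sketched with scikit-learn's `RandomForestRegressor`, whose fitted trees are exposed through the `estimators_` attribute. The helper name `predict_with_rms` and the toy data are made up for illustration, and "RMS" is taken here to mean the standard deviation of the per-tree predictions around their mean, which is an assumption about the CTAMARS definition.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def predict_with_rms(forest, X):
    """Return the mean and the spread of the per-tree predictions (hypothetical helper)."""
    # One row of predictions per tree of the fitted forest
    per_tree = np.stack([tree.predict(X) for tree in forest.estimators_])
    # Assumption: RMS = standard deviation of the tree estimates around their mean
    return per_tree.mean(axis=0), per_tree.std(axis=0)


# Toy data standing in for image parameters (X) and log10(E) (y)
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 4.0, size=(200, 2))
y = 0.5 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(0.0, 0.05, size=200)

forest = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)
log10_e, log10_e_rms = predict_with_rms(forest, X[:5])
```

The mean of the per-tree predictions should match what `forest.predict` returns, so only the second output is new information.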

  • Even though we can get the RMS out of the trees, we are missing a detail

From the wiki page of the CTAMARS analysis:

"the RFs provide estimates for log10 E and its RMS, but the average is done after converting those to linear energy scale"

Issue #139 now breaks this behaviour, because there the operation is performed on the base-10 logarithm scale!

This issue will also be useful for #93.
Right now we weight with Intensity, while CTAMARS uses 1/RMS^2, where RMS comes from each Random Forest regressor.
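A minimal sketch of a 1/RMS^2-weighted average done on the linear energy scale. The function name `combine_energy` is hypothetical. The conversion of the RMS of log10 E to a linear RMS through first-order error propagation, sigma_E = ln(10) * E * sigma_log10E, is an assumption, since the wiki excerpt does not say how CTAMARS converts it.

```python
import math


def combine_energy(log10_energies, log10_rms):
    """Weighted mean of per-telescope energies on the linear scale (hypothetical helper)."""
    # Convert each per-telescope estimate from log10(E) to E
    energies = [10.0 ** x for x in log10_energies]
    # Assumption: first-order propagation of the log10 RMS to the linear scale
    sigmas = [math.log(10.0) * e * s for e, s in zip(energies, log10_rms)]
    # CTAMARS-like weights: 1 / RMS^2
    weights = [1.0 / s ** 2 for s in sigmas]
    weight_sum = sum(weights)
    mean_energy = sum(w * e for w, e in zip(weights, energies)) / weight_sum
    # Uncertainty of the weighted mean for independent estimates
    mean_rms = math.sqrt(1.0 / weight_sum)
    return mean_energy, mean_rms


# Example: three telescopes with different log10(E) estimates and spreads
energy, rms = combine_energy([0.0, 0.1, -0.05], [0.05, 0.10, 0.08])
```

Averaging the log10 values directly would correspond to a geometric mean of the energies, which is in general not the same number as the linear average computed here.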

  • Recover missing performance below a few hundred GeV

As shown here, we have lost sensitivity at low energies (mainly due to the necessary changes between 0.3.0 and 0.4.0).
Currently, it is not clear whether the previous 2 points will solve this or whether something will need to be fixed or added in DL1 and/or DL2a.

  • Modify usage/training (configurable, to check)

CTAMARS uses the whole gamma-1 and samples to train the classification model, whereas protopipe splits the original TRAINING data into train/test sub-samples.
This allows intermediate benchmarking before the models are applied to the rest of the analysis data sample (DL2 production takes more time, and it can be convenient to study the models without producing DL2 data every time).
For energy estimation this could be less of a problem than for classification: the energy estimation benchmarks can be applied (as of 0.4.0) to the gamma-2 sample, which is used to train the classification model.
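To make the usage configurable, the split could be controlled by a single fraction, where 1.0 reproduces the CTAMARS-like case of training on the whole sample. This is only a sketch: the function name and the `train_fraction` parameter are hypothetical and are not existing protopipe options.

```python
import random


def split_training_sample(event_ids, train_fraction=0.8, seed=0):
    """Split event identifiers into train/test lists (hypothetical helper)."""
    if not 0.0 < train_fraction <= 1.0:
        raise ValueError("train_fraction must be in (0, 1]")
    shuffled = list(event_ids)
    # Seeded shuffle so that the split is reproducible
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    # With train_fraction=1.0 the test list is empty (whole sample used for training)
    return shuffled[:n_train], shuffled[n_train:]


train_ids, test_ids = split_training_sample(range(10), train_fraction=0.8)
all_ids, no_test = split_training_sample(range(10), train_fraction=1.0)
```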

@HealthyPear HealthyPear added enhancement New feature or request summary A summary of issues related to the same subject labels Jan 22, 2021
@HealthyPear HealthyPear added this to the v0.5.0 milestone Jan 22, 2021
@HealthyPear HealthyPear added this to To do in Pipeline features and enhancements via automation Jan 22, 2021
@HealthyPear HealthyPear moved this from To do to Summary issues in Pipeline features and enhancements Jan 22, 2021
@HealthyPear HealthyPear removed this from the v0.5.0 milestone Dec 9, 2021
Pipeline features and enhancements automation moved this from Summary issues to Done Feb 15, 2022