Gh diagnostics #58
Merged
Pull request overview
This PR adds support scaffolding for TMVA-style gamma/hadron classification features and related preprocessing/tests, alongside a small change to classification efficiency bookkeeping.
Changes:
- Adds TMVA-style feature selection/configuration paths for classification training.
- Extends classification extra columns and clipping ranges for TMVA-derived variables.
- Adds tests for binning behavior and optional `SizeSecondMax` handling.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `src/eventdisplay_ml/features.py` | Adds TMVA-style classification feature lists and clip interval updates. |
| `src/eventdisplay_ml/data_processing.py` | Adds TMVA-style training data loading path and extra classification-derived columns. |
| `src/eventdisplay_ml/config.py` | Reads `tmva_style` from classification model parameters during training configuration. |
| `src/eventdisplay_ml/evaluate.py` | Changes stored signal/background count columns in efficiency output. |
| `tests/test_data_processing.py` | Adds tests for zenith and energy interpolation binning. |
| `tests/test_classification_apply_interpolation.py` | Adds test for missing `SizeSecondMax` behavior in standard classification. |
| `docs/changes/58.feature.md` | Adds changelog fragment for TMVA-style classification support. |
Comments suppressed due to low confidence (2)
src/eventdisplay_ml/data_processing.py:810
- The TMVA-style flag is only handled while loading training data; the application path still loads `features.features(...)` and runs the standard telescope flattening path. A model trained with this branch will therefore be applied with a different feature set, with TMVA-only inputs such as `SizeSecondMax`, `DispAbsSumWeigth`, or raw `Xcore`/`Ycore` missing or filled as NaN at inference.
tmva_style = model_configs.get("tmva_style", False)
if tmva_style and analysis_type == "classification":
    _logger.info("Using TMVA-style features for classification")
    branch_list = features_module.features_tmva_style(analysis_type, training=True)
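The mismatch described in this comment can be illustrated with a small sketch. It assumes a single selection helper shared by the training and application paths; `select_features`, `missing_at_inference`, and both feature lists are hypothetical stand-ins, not the project's actual API or its real feature lists.

```python
# Hypothetical stand-ins for the project's feature lists; the real lists
# live in src/eventdisplay_ml/features.py and are not reproduced here.
STANDARD_FEATURES = ["MSCW", "MSCL", "EChi2S", "EmissionHeightChi2"]
TMVA_STYLE_FEATURES = ["MSCW", "MSCL", "SizeSecondMax", "DispAbsSumWeigth"]


def select_features(model_configs, analysis_type):
    """Return the feature list for training or application (assumed helper)."""
    tmva_style = model_configs.get("tmva_style", False)
    if tmva_style and analysis_type == "classification":
        return list(TMVA_STYLE_FEATURES)
    return list(STANDARD_FEATURES)


def missing_at_inference(trained_features, available_columns):
    """List trained features that the application frame does not provide."""
    available = set(available_columns)
    return [name for name in trained_features if name not in available]


# Training honours the flag ...
trained = select_features({"tmva_style": True}, "classification")
# ... while an application path that ignores it falls back to the standard list.
applied = select_features({}, "classification")
print(missing_at_inference(trained, applied))
```

Routing both paths through one helper that reads the same `tmva_style` flag would make the reported list empty; whether the project should do this through the model configuration or stored model metadata is left open here.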
src/eventdisplay_ml/data_processing.py:885
- Building the TMVA-style training frame directly from ROOT branches bypasses `extra_columns()` and `apply_clip_intervals()`. As a result, chi-square and `SizeSecondMax` inputs are not clipped/log-transformed and `Xcore`/`Ycore` are not converted to `Core_Distance`, so the model is trained on raw variables instead of the TMVA-style preprocessed features defined elsewhere.
# For TMVA-style classification, skip telescope flattening (use event-level features only)
if tmva_style and analysis_type == "classification":
    _logger.info("Converting to pandas (no telescope flattening for TMVA style)")
    # Build DataFrame directly to stay compatible across awkward versions
    # (some versions do not provide ak.to_pandas).
    df_flat = pd.DataFrame({name: _to_numpy_1d(df[name]) for name in df.fields})
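The preprocessing that this comment says is skipped could look roughly like the sketch below. It is not the project's `extra_columns()` or `apply_clip_intervals()`: the function name `preprocess`, the clip bounds, the choice of log-transformed columns, and the definition of `Core_Distance` as the Euclidean norm of `Xcore` and `Ycore` are all assumptions made for illustration. Columns are held as NumPy arrays in a dict to keep the example small.

```python
import numpy as np

# Illustrative clip bounds and log-transformed columns (assumptions, not the
# project's actual configuration).
CLIP_INTERVALS = {"EChi2S": (1e-6, 1e4), "SizeSecondMax": (1e-6, 1e5)}
LOG_COLUMNS = ("EChi2S", "SizeSecondMax")


def preprocess(columns):
    """Clip, log10-transform, and derive Core_Distance from raw columns."""
    out = {name: np.asarray(values, dtype=float) for name, values in columns.items()}

    # Clip first so that log10 never sees zero or negative values.
    for name, (low, high) in CLIP_INTERVALS.items():
        if name in out:
            out[name] = np.clip(out[name], low, high)

    for name in LOG_COLUMNS:
        if name in out:
            out[name] = np.log10(out[name])

    # Assumed definition: distance of the core position from the origin.
    if "Xcore" in out and "Ycore" in out:
        out["Core_Distance"] = np.hypot(out.pop("Xcore"), out.pop("Ycore"))
    return out


raw = {
    "SizeSecondMax": [0.0, 1000.0, 1e9],
    "Xcore": [3.0, 0.0, -6.0],
    "Ycore": [4.0, 0.0, 8.0],
}
processed = preprocess(raw)
print(sorted(processed))
```

Applying one such function to the frame on both the training and the application side keeps the model inputs consistent, regardless of whether telescope flattening is performed.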
Comment on lines +139 to +140
    "SizeSecondMax",
}
Comment on lines +242 to +243
    "EChi2S": (energy_min, 10000.0),  # TMVA: -6 to 4 (before log10)
    "EmissionHeightChi2": (1e-6, 10000.0),  # TMVA: -11 to 4 (before log10)
    "tgrad_x": (-50.0, 50.0),
    "MSCW": (-2.0, 2.0),
    "MSCL": (-2.0, 5.0),
    "SizeSecondMax": (1e-6, 100000.0),  # TMVA: 0 to 5 (after log10)
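As a quick check of how the clip bounds relate to the log10 ranges quoted in the inline comments, the following sketch converts a clip interval into the range its values would span after `log10`. The helper `log10_range` is hypothetical; only the literal bounds are taken from the snippet, and `EChi2S` is omitted because the value of `energy_min` is not shown.

```python
import math

# Literal bounds copied from the snippet above; EChi2S is left out because
# its lower bound (energy_min) is not shown.
CLIP_INTERVALS = {
    "EmissionHeightChi2": (1e-6, 10000.0),
    "SizeSecondMax": (1e-6, 100000.0),
}


def log10_range(interval):
    """Return the (low, high) range spanned by log10 of a clipped value."""
    low, high = interval
    if low <= 0:
        raise ValueError("log10 needs a strictly positive lower bound")
    return math.log10(low), math.log10(high)


for name, interval in CLIP_INTERVALS.items():
    low, high = log10_range(interval)
    print(f"{name}: log10 range {low:.1f} to {high:.1f}")
```

With these bounds the transformed values span roughly -6 to 4 for `EmissionHeightChi2` and -6 to 5 for `SizeSecondMax`, which can be compared against the TMVA ranges noted in the comments.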
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
No description provided.