Gh diagnostics#58

Merged
GernotMaier merged 14 commits into main from gh-diagnostics on May 14, 2026

Conversation

@GernotMaier
Member

No description provided.

@GernotMaier GernotMaier self-assigned this May 14, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR adds scaffolding for TMVA-style gamma/hadron classification features, together with related preprocessing and tests, and makes a small change to classification efficiency bookkeeping.

Changes:

  • Adds TMVA-style feature selection/configuration paths for classification training.
  • Extends classification extra columns and clipping ranges for TMVA-derived variables.
  • Adds tests for binning behavior and optional SizeSecondMax handling.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/eventdisplay_ml/features.py | Adds TMVA-style classification feature lists and clip interval updates. |
| src/eventdisplay_ml/data_processing.py | Adds TMVA-style training data loading path and extra classification-derived columns. |
| src/eventdisplay_ml/config.py | Reads tmva_style from classification model parameters during training configuration. |
| src/eventdisplay_ml/evaluate.py | Changes stored signal/background count columns in efficiency output. |
| tests/test_data_processing.py | Adds tests for zenith and energy interpolation binning. |
| tests/test_classification_apply_interpolation.py | Adds test for missing SizeSecondMax behavior in standard classification. |
| docs/changes/58.feature.md | Adds changelog fragment for TMVA-style classification support. |

Comments suppressed due to low confidence (2)

src/eventdisplay_ml/data_processing.py:810

  • The TMVA-style flag is only handled while loading training data; the application path still loads features.features(...) and runs the standard telescope flattening path. A model trained with this branch will therefore be applied with a different feature set, with TMVA-only inputs such as SizeSecondMax, DispAbsSumWeigth, or raw Xcore/Ycore missing or filled as NaN at inference.
    tmva_style = model_configs.get("tmva_style", False)
    if tmva_style and analysis_type == "classification":
        _logger.info("Using TMVA-style features for classification")
        branch_list = features_module.features_tmva_style(analysis_type, training=True)
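
One way to avoid the training/application mismatch described above is to route both paths through a single feature selector keyed on the same flag. The sketch below is only illustrative: `select_features`, `STANDARD_FEATURES`, and `TMVA_STYLE_FEATURES` are hypothetical names with made-up contents, not the functions or lists in `features.py`, and it assumes the `tmva_style` flag is stored with the trained model and read back at inference.

```python
# Hypothetical feature lists; the real ones live in features.py and are not shown here.
STANDARD_FEATURES = ["MSCW", "MSCL", "EChi2S"]
TMVA_STYLE_FEATURES = ["MSCW", "MSCL", "EChi2S", "SizeSecondMax"]


def select_features(model_configs, analysis_type):
    """Return the feature list from one rule shared by training and application."""
    tmva_style = model_configs.get("tmva_style", False)
    if tmva_style and analysis_type == "classification":
        return list(TMVA_STYLE_FEATURES)
    return list(STANDARD_FEATURES)


train_cfg = {"tmva_style": True}
# Assumption: the flag is persisted with the model and restored when applying it.
apply_cfg = dict(train_cfg)

train_features = select_features(train_cfg, "classification")
apply_features = select_features(apply_cfg, "classification")
```

Because both calls go through the same function, a model trained with `tmva_style` enabled is applied with the same inputs instead of falling back to the standard list.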

src/eventdisplay_ml/data_processing.py:885

  • Building the TMVA-style training frame directly from ROOT branches bypasses extra_columns() and apply_clip_intervals(). As a result, chi-square and SizeSecondMax inputs are not clipped/log-transformed and Xcore/Ycore are not converted to Core_Distance, so the model is trained on raw variables instead of the TMVA-style preprocessed features defined elsewhere.
                # For TMVA-style classification, skip telescope flattening (use event-level features only)
                if tmva_style and analysis_type == "classification":
                    _logger.info("Converting to pandas (no telescope flattening for TMVA style)")
                    # Build DataFrame directly to stay compatible across awkward versions
                    # (some versions do not provide ak.to_pandas).
                    df_flat = pd.DataFrame({name: _to_numpy_1d(df[name]) for name in df.fields})
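
The comment above notes that the directly built frame never has Xcore/Ycore converted to Core_Distance. A minimal sketch of that derived-column step follows, using plain Python lists in place of DataFrame columns. `add_core_distance` is a hypothetical helper, and defining Core_Distance as the distance of the core position from the origin is an assumption; the project's `extra_columns()` may compute it differently.

```python
import math


def add_core_distance(columns):
    """Replace raw Xcore/Ycore columns with a derived Core_Distance column.

    Assumption: Core_Distance = sqrt(Xcore**2 + Ycore**2).
    """
    # Copy every column except the raw core coordinates.
    out = {k: list(v) for k, v in columns.items() if k not in ("Xcore", "Ycore")}
    out["Core_Distance"] = [
        math.hypot(x, y) for x, y in zip(columns["Xcore"], columns["Ycore"])
    ]
    return out


events = {"Xcore": [3.0, -6.0], "Ycore": [4.0, 8.0], "MSCW": [0.1, -0.3]}
processed = add_core_distance(events)
```

Running the same helper on the event-level frame in both the training and the application path would keep the model inputs consistent.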

Comment on lines +139 to +140
"SizeSecondMax",
}
Comment on lines +242 to +243
"EChi2S": (energy_min, 10000.0), # TMVA: -6 to 4 (before log10)
"EmissionHeightChi2": (1e-6, 10000.0), # TMVA: -11 to 4 (before log10)
"tgrad_x": (-50.0, 50.0),
"MSCW": (-2.0, 2.0),
"MSCL": (-2.0, 5.0),
"SizeSecondMax": (1e-6, 100000.0), # TMVA: 0 to 5 (after log10)
Comment thread src/eventdisplay_ml/data_processing.py Outdated
Comment thread docs/changes/58.feature.md Outdated
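
The clip intervals quoted in the review above carry comments about ranges before or after log10. The sketch below shows how such an interval bounds a variable once it is clipped and then log-transformed. It assumes clipping happens first and log10 second; `clip_and_log10` is a hypothetical helper, not the project's `apply_clip_intervals()`. With the quoted interval (1e-6, 100000.0), the transformed SizeSecondMax falls between -6 and 5.

```python
import math

# Interval copied from the snippet quoted in the review.
CLIP_INTERVALS = {"SizeSecondMax": (1e-6, 100000.0)}


def clip_and_log10(values, lower, upper):
    """Clip each value to [lower, upper], then take log10 (lower must be > 0)."""
    return [math.log10(min(max(v, lower), upper)) for v in values]


lower, upper = CLIP_INTERVALS["SizeSecondMax"]
# 0.0 is raised to the lower bound, 1e7 is lowered to the upper bound.
transformed = clip_and_log10([0.0, 250.0, 1e7], lower, upper)
```

A strictly positive lower bound is what keeps log10 defined for zero or negative raw values.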
GernotMaier and others added 2 commits May 14, 2026 11:14
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@GernotMaier GernotMaier marked this pull request as ready for review May 14, 2026 10:21
@GernotMaier GernotMaier merged commit e9fddd4 into main May 14, 2026
3 checks passed
@GernotMaier GernotMaier deleted the gh-diagnostics branch May 14, 2026 10:21
2 participants