Train-time data-augmentation for parameterised learning #68

Open
GilesStrong opened this issue Jun 8, 2020 · 0 comments
Labels
enhancement New feature or request low priority Not urgent and won't degrade with time

Comments

@GilesStrong
Owner

Overview

Parameterised learning is useful in HEP, for example when a single classifier should learn multiple signal hypotheses (e.g. a heavy Higgs boson with several possible masses); see Baldi et al., 2016.

In this example, each signal event has a parameterised input equal to its true resonant mass, while background events are assigned resonant masses at random. Once the model is trained, the parameterised input for the entire dataset can be set to a particular resonant mass in order to perform inference for a given hypothesis. This last step is already possible with the ParametrisedPrediction class.
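As a rough illustration of the scheme described above, here is a minimal NumPy sketch. It does not use the ParametrisedPrediction class or any other library API; the array layout, the mass points, and the `set_hypothesis` helper are all hypothetical and chosen only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 6 events with 2 ordinary features (all names and values are hypothetical).
features = rng.normal(size=(6, 2))
is_signal = np.array([True, True, True, False, False, False])
# The true resonant mass is only known for signal events.
true_mass = np.array([300.0, 500.0, 700.0, np.nan, np.nan, np.nan])
mass_points = np.array([300.0, 500.0, 700.0])

# Training: signal keeps its true mass, background gets a randomly drawn mass point.
random_mass = rng.choice(mass_points, size=len(is_signal))
mass_feature = np.where(is_signal, true_mass, random_mass)
train_inputs = np.column_stack([features, mass_feature])


def set_hypothesis(inputs, mass, column=-1):
    """Return a copy of `inputs` with the parameterised column fixed to `mass`."""
    out = inputs.copy()
    out[:, column] = mass
    return out


# Inference: evaluate every event under the 500 GeV hypothesis.
test_inputs = set_hypothesis(train_inputs, 500.0)
```

The helper returns a copy, so the same dataset can be evaluated under each mass hypothesis in turn.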

Data augmentation for parameterised learning

Currently, the random assignment of parameterised-feature values for background (in the example above) is performed once, when preparing the data for training. It may instead be useful to repeat this random assignment during training, which could provide some of the benefits of train-time data augmentation.
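A minimal sketch of the idea, assuming the inputs are held in a NumPy array and the background rows are identified by a boolean mask. The `resample_param` function, the column index, and the mass points are hypothetical names for illustration, not existing code:

```python
import numpy as np


def resample_param(inputs, mask, column, values, rng):
    """Return a copy of `inputs` with `column` redrawn from `values` where `mask` is True."""
    out = inputs.copy()
    out[mask, column] = rng.choice(values, size=int(mask.sum()))
    return out


rng = np.random.default_rng(42)
inputs = np.zeros((8, 3))
inputs[:4, 2] = [300.0, 500.0, 700.0, 500.0]  # signal rows carry their true masses
is_background = np.array([False] * 4 + [True] * 4)
mass_points = np.array([300.0, 500.0, 700.0])

for epoch in range(3):
    # Background masses are redrawn every epoch; signal rows are left untouched.
    epoch_inputs = resample_param(inputs, is_background, 2, mass_points, rng)
    # ... train for one epoch on epoch_inputs ...
```

Each epoch then sees the same background events paired with a fresh set of resonant masses, rather than the single assignment made during data preparation.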

Implementation

To avoid conflicts with HEPAugFoldYielder, and because it should only be applied during training, this secondary form of augmentation should probably be implemented as a callback. It also needs to account for the possibility that multiple parameterisation features are used, and that only a subset of the data needs to be changed.
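One possible shape for such a callback is sketched below. This is only a sketch under assumptions: the `ParamResampler` class, its `on_epoch_begin` hook, and the convention that a target of 0 marks background are all hypothetical, and do not correspond to the existing callback interface.

```python
import numpy as np


class ParamResampler:
    """Hypothetical train-time callback; not part of any existing library API."""

    def __init__(self, param_values, seed=None):
        # Maps column index -> allowed values for that parameterised feature.
        self.param_values = {c: np.asarray(v, dtype=float) for c, v in param_values.items()}
        self.rng = np.random.default_rng(seed)

    def on_epoch_begin(self, inputs, targets, training=True):
        # Leave validation and inference data untouched.
        if not training:
            return inputs
        out = inputs.copy()
        mask = targets == 0  # assumption: only background rows (target 0) are resampled
        n_rows = int(mask.sum())
        for column, values in self.param_values.items():
            out[mask, column] = self.rng.choice(values, size=n_rows)
        return out


# Two parameterised features: a mass in column 2 and a second parameter in column 3.
callback = ParamResampler({2: [300.0, 500.0, 700.0], 3: [0.5, 1.0]}, seed=1)
x = np.zeros((6, 4))
y = np.array([1, 1, 0, 0, 0, 1])
x_train = callback.on_epoch_begin(x, y, training=True)
x_valid = callback.on_epoch_begin(x, y, training=False)
```

Passing a dictionary of columns covers the multi-feature case, and the mask restricts the change to a subset of the data. Each feature is drawn independently here; whether correlated parameters should be drawn jointly is left open.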

@GilesStrong GilesStrong added enhancement New feature or request low priority Not urgent and won't degrade with time labels Jun 8, 2020