Change HEPAugFoldYielder to callback? #73

GilesStrong · 2020-06-08T11:27:52Z

Current status

HEPAugFoldYielder applies train-time and test-time data augmentations to HEP data (phi rotations, transverse & longitudinal flips). This is performed when loading the data because, originally, this was the last point at which the feature names for the data were known. Later changes to LUMIN now mean that the model has a list of named features and how they map to the input features. This means that the data augmentation could instead be performed by a callback during training (similar to the suggestion of issue #68).

Discussion

It seems a bit strange that the choice of whether or not to augment the data is made by changing how the data is loaded from file. Specifying the choice as a callback makes a bit more sense (to me). This also avoids complications once additional forms of augmentation are added, which may otherwise require their own FoldYielder classes, and we would then have to account for all possible combinations of different types of augmentation.

Depending on the choices made in issue #50, this may reduce the efficiency of augmentation, but it's possible that augmenting the data in place on device may actually be more efficient, since it could be done multithreaded. This would perhaps avoid the need to augment as a pandas.DataFrame, and maybe pre-cached rotation matrices could be used, in some part, to speed things up. Since the data is already on device, this would actually be quicker than loading from disc, augmenting, and then loading to device; this is known to cause particular slow-down when working on GPU.
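To make the in-place idea concrete, here is a minimal sketch of a phi rotation and a flip applied directly to a batch array rather than to a pandas.DataFrame. It uses NumPy for illustration only; the same element-wise operations would be written against device tensors in practice. The function names `rotate_phi` and `flip`, and the assumption that the batch stores Cartesian momenta in known columns (`px_idx`, `py_idx`), are hypothetical and not part of LUMIN.

```python
import numpy as np


def rotate_phi(batch, px_idx, py_idx, angles):
    """Rotate (px, py) of every listed object by a per-event angle, in place.

    batch:  float array of shape (n_events, n_features)
    px_idx: list of column indices holding px (hypothetical layout)
    py_idx: list of column indices holding py, same order as px_idx
    angles: array of shape (n_events,), rotation angle in radians
    """
    cos, sin = np.cos(angles)[:, None], np.sin(angles)[:, None]
    # Copy the originals so the second assignment does not see rotated px
    px, py = batch[:, px_idx].copy(), batch[:, py_idx].copy()
    batch[:, px_idx] = px * cos - py * sin
    batch[:, py_idx] = px * sin + py * cos
    return batch


def flip(batch, idx, mask):
    """Negate the listed columns for events where mask is True, in place."""
    sign = np.where(mask, -1.0, 1.0)[:, None]
    batch[:, idx] *= sign
    return batch
```

Passing the `py` columns to `flip` would give a transverse flip and passing the `pz` columns a longitudinal one. The rotation preserves each object's transverse momentum, which is a quick sanity check for any implementation.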

Possible change

The callback would need to mimic the behaviour of HEPAugFoldYielder, i.e. provide random augmentation during training, and a choice between a fixed set of transformations and random ones during testing. It would need to be passed as a callback during both training and prediction.
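As a rough illustration of that behaviour, the sketch below shows one possible shape for such a callback. Everything here is an assumption: the class name `HEPAugCallback`, the `on_batch_begin` hook and its signature, the column-index arguments, and the particular fixed test-time set (evenly spaced rotations combined with both flips) are all hypothetical, and LUMIN's real callback interface may look quite different. NumPy stands in for on-device tensors.

```python
import itertools

import numpy as np


class HEPAugCallback:
    """Hypothetical augmentation callback; not LUMIN's actual API."""

    def __init__(self, px_idx, py_idx, pz_idx, random_test=False, n_test_rot=4, seed=None):
        self.px_idx, self.py_idx, self.pz_idx = px_idx, py_idx, pz_idx
        self.random_test = random_test  # if True, testing also uses random augmentation
        self.rng = np.random.default_rng(seed)
        # Assumed fixed test-time set: every (rotation, y-flip, z-flip) combination
        angles = np.linspace(0, 2 * np.pi, n_test_rot, endpoint=False)
        self.test_set = list(itertools.product(angles, [False, True], [False, True]))

    def _apply(self, x, angles, flip_y, flip_z):
        cos, sin = np.cos(angles)[:, None], np.sin(angles)[:, None]
        px, py = x[:, self.px_idx].copy(), x[:, self.py_idx].copy()
        x[:, self.px_idx] = px * cos - py * sin  # phi rotation
        x[:, self.py_idx] = px * sin + py * cos
        x[:, self.py_idx] *= np.where(flip_y, -1.0, 1.0)[:, None]  # transverse flip
        x[:, self.pz_idx] *= np.where(flip_z, -1.0, 1.0)[:, None]  # longitudinal flip
        return x

    def on_batch_begin(self, x, training, aug_idx=0):
        """Augment batch x in place; aug_idx selects the fixed transform at test time."""
        n = len(x)
        if training or self.random_test:
            angles = self.rng.uniform(0, 2 * np.pi, n)
            flip_y = self.rng.random(n) < 0.5
            flip_z = self.rng.random(n) < 0.5
        else:
            angle, fy, fz = self.test_set[aug_idx]
            angles = np.full(n, angle)
            flip_y, flip_z = np.full(n, fy), np.full(n, fz)
        return self._apply(x, angles, flip_y, flip_z)
```

At prediction time the caller would loop `aug_idx` over `range(len(cb.test_set))` and average the predictions, which is the usual test-time augmentation pattern. With this ordering, `aug_idx=0` is the identity transform.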

Additionally, tests should be done to compare the speed and memory usage of the callback to HEPAugFoldYielder.
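One way such a comparison could be set up is sketched below, using only the standard library. The helper name `profile` is hypothetical. Note that `tracemalloc` only sees host-side allocations made through Python's allocator (NumPy arrays included), so GPU memory would need a separate, framework-specific measurement, and tracing adds overhead, so timings are only meaningful relative to each other.

```python
import time
import tracemalloc


def profile(fn, *args, repeats=5, **kwargs):
    """Return (best wall time in seconds, peak traced bytes) over `repeats` calls."""
    best, peak = float("inf"), 0
    for _ in range(repeats):
        tracemalloc.start()
        start = time.perf_counter()
        fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        _, call_peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        best, peak = min(best, elapsed), max(peak, call_peak)
    return best, peak
```

The same helper could wrap one pass over a fold with HEPAugFoldYielder and one pass with the callback, so that both are measured under identical conditions.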

If successful, this would deprecate HEPAugFoldYielder.

GilesStrong mentioned this issue Jun 8, 2020

Make HEPAugFoldYielder work with pT eta phi coordinates #44

Open
