
Implementing pytorch-lightning #94

Merged: 16 commits into graphnet-team:main from pytorch-lightning on Dec 9, 2021

Conversation

@asogaard (Collaborator) commented on Dec 7, 2021

This PR introduces pytorch-lightning (PyL) to handle the main training loop. This brings a number of quality-of-life benefits and means that we can reduce maintenance overhead quite substantially, since the custom trainer classes and callbacks can be replaced more or less wholesale by PyL's built-in classes. Custom callbacks (e.g. the PiecewiseLinearLR scheduler) have been reimplemented using PyL's classes and have been confirmed to behave exactly like the original classes they replace. A new example script, examples/test_model_training_pytorch_lightning.py, shows how to train a model in the PyL framework.
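For orientation, here is a minimal sketch of what training in the PyL framework looks like. The model, architecture, and data below are hypothetical placeholders, not the classes from this repository or its example script:

```python
# Minimal pytorch-lightning training sketch. `ToyModel` and the random
# dataset are hypothetical stand-ins, not the repository's actual code.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl


class ToyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(16, 1)  # placeholder architecture

    def forward(self, x):
        return self.net(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)


# The Trainer replaces the custom training loop: device placement,
# checkpointing, and logging are handled by PyL.
data = TensorDataset(torch.randn(256, 16), torch.randn(256, 1))
loader = DataLoader(data, batch_size=32)

trainer = pl.Trainer(max_epochs=10)
trainer.fit(ToyModel(), loader)
```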

However, in the interest of ensuring full reproducibility of the paper results, I have agreed with @RasmusOrsoe to keep all of the old (i.e. non-PyL) model and training code in the src/gnn_reco/legacy/ module for the time being. After the low-energy performance paper is completed, we will tag the repo version and start deprecating these in favour of the functionality in the main branch. The example script examples/test_model_training_sqlite.py shows how to perform energy regression using this legacy code. The two example scripts have been tested and found to yield similar losses during training.

This PR addresses the problem with energy reconstruction pointed out in #88.
Closes #76

Tagging @BozianuLeon and @kaareendrup as this might be of interest to you.

@asogaard added the bug (Something isn't working) and feature (New feature or request) labels on Dec 7, 2021
@asogaard self-assigned this on Dec 7, 2021
@RasmusOrsoe (Collaborator) left a comment:

Nice! Good to see that PyL doesn't change the usage much.

One question: in the original implementation of the PiecewiseLinearLR scheduler, it is possible to specify the initial LR (lower bound), the maximal LR, and the ending LR, i.e. three points in total. Is this still possible in the PyL implementation?

Rasmus

@asogaard (Collaborator, Author) commented on Dec 9, 2021

Yes, @RasmusOrsoe! I have implemented a generic piecewise linear learning rate scheduler here, i.e. one where you can specify an arbitrary number of "milestones" (epoch/step counts) and a corresponding learning rate multiplier for each. The scheduler then interpolates linearly between these milestones.

A special case is then having a starting LR multiplier, an end LR multiplier, and a "peak" somewhere in between. I have included an example of this, which should mirror the original implementation exactly, here.
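To illustrate the idea, here is a minimal sketch of such a scheduler using plain PyTorch's LambdaLR together with numpy.interp. This is not the implementation merged in this PR, and the milestone/multiplier values are made up; the three-milestone configuration at the end corresponds to the start/peak/end special case discussed above:

```python
# Sketch of a generic piecewise linear LR scheduler (illustrative,
# not the implementation merged in this PR).
import numpy as np
import torch
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(16, 1)  # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Milestones are step counts; factors are the LR multipliers at those
# steps. Between milestones, numpy.interp interpolates linearly;
# outside the range, the first/last factor is held constant. Three
# milestones reproduce the original start/peak/end behaviour.
milestones = [0, 1_000, 10_000]  # start, peak, end (illustrative)
factors = [1e-2, 1.0, 1e-2]      # warm up to peak LR, then decay

scheduler = LambdaLR(
    optimizer,
    lr_lambda=lambda step: np.interp(step, milestones, factors),
)

# The multiplier as a function of step:
for step in (0, 500, 1_000, 5_000, 10_000):
    print(step, float(np.interp(step, milestones, factors)))
# -> 0.01, 0.505, 1.0, ~0.56, 0.01
```

In a LightningModule, such a scheduler would be returned from configure_optimizers with interval "step", so that PyL steps it once per batch rather than once per epoch.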

@RasmusOrsoe RasmusOrsoe self-requested a review December 9, 2021 11:58
@RasmusOrsoe (Collaborator) left a comment:

Cool! Thanks!

@asogaard asogaard merged commit e3a7ec6 into graphnet-team:main Dec 9, 2021
@asogaard asogaard deleted the pytorch-lightning branch December 9, 2021 12:01
@asogaard (Collaborator, Author) commented on Dec 9, 2021

Tagging @kaareendrup cf. our chat earlier today.

Labels: bug (Something isn't working), feature (New feature or request)

Merging this pull request may close: Use pytorch-lightning for managing training