
Adding additional learning rate scheduler#296

Merged
wiederm merged 18 commits into main from dev-training-scheduler
Oct 29, 2024

Conversation

@wiederm
Member

@wiederm wiederm commented Oct 23, 2024

Pull Request Summary

This PR adds a number of additional learning rate schedulers and the infrastructure needed to pass control parameters to them. Additionally, this PR adds a loss scaling scheduler that dynamically controls the weight of each loss component as a function of the current epoch index.

Learning rate scheduling

The default learning rate scheduler uses a step function, reducing the learning rate whenever a monitored property has not improved for a given number of optimization steps. An alternative is the CosineAnnealing learning rate scheduler (and its variants with warmup/restarts), which anneals the learning rate from a starting value to a target value over a specified number of epochs.
Other provided LR schedulers are the OneCycle and Cyclic learning rate schedulers; see the PyTorch documentation for their exact behavior.
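For reference, the annealing behavior can be sketched in plain Python. This reimplements the closed-form schedule that `torch.optim.lr_scheduler.CosineAnnealingLR` follows; the concrete values (`lr_start`, `lr_min`, `T_max`) are illustrative, not the defaults used in this PR:

```python
import math

def cosine_annealing_lr(epoch, lr_start=1e-3, lr_min=1e-5, T_max=50):
    """Anneal the learning rate from lr_start to lr_min over T_max epochs,
    following the cosine-annealing formula used by PyTorch's
    CosineAnnealingLR (without restarts)."""
    return lr_min + 0.5 * (lr_start - lr_min) * (1 + math.cos(math.pi * epoch / T_max))

# learning rate per epoch: starts at 1e-3, decays smoothly to 1e-5
lrs = [cosine_annealing_lr(e) for e in range(51)]
```

The warmup/restart variants periodically reset `epoch` to zero, which is what produces the sawtooth-like curves in the figure.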

[figure: learning rate schedules]

Loss component scaling

To prioritize different learning tasks in multi-objective training runs, this PR introduces an optional linear scaling of each loss component, activated using the keywords target_weights and mixing_step for each component name. The component is then scaled from its initial weight to the target_weight value using mixing_step as the step size (note that the sign of mixing_step has to match the sign of the slope).

In the training run shown below, the force component loss weight is scaled from an initial weight of 0.8 down to 0.2 using a -0.1 step size, after which training continues with the target weight:
[figure: force-loss weight schedule over the training run]
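The linear mixing described above can be sketched as follows. This is a hypothetical illustration of the mechanism, not the PR's actual implementation; the parameter names echo the keywords from the description:

```python
def scheduled_weight(epoch, weight, target_weight, mixing_step):
    """Linearly move `weight` toward `target_weight` by `mixing_step` per
    epoch, then hold the target weight for the rest of the training run.
    Hypothetical sketch of the loss-component scaling in this PR."""
    # the step's sign must match the sign of the slope (see text above)
    step = abs(mixing_step) if target_weight > weight else -abs(mixing_step)
    candidate = weight + epoch * step
    # clamp once the target is reached; training then continues with it
    return min(candidate, target_weight) if step > 0 else max(candidate, target_weight)

# force-loss weight annealed from 0.8 to 0.2 with a -0.1 step per epoch
weights = [scheduled_weight(e, 0.8, 0.2, -0.1) for e in range(10)]
```

With these values the weight decreases by 0.1 each epoch until it reaches 0.2 at epoch 6, and stays there afterwards.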

Key changes

Notable points that this PR has either accomplished or will accomplish.

Associated Issue(s)

Pull Request Checklist

  • Issue(s) raised/addressed and linked
  • Includes appropriate unit test(s)
  • Appropriate docstring(s) added/updated
  • Appropriate .rst doc file(s) added/updated
  • PR is ready for review

- adding a scheduler for the loss components (will allow us to change the scaling of the components as a function of epoch number)
@wiederm wiederm self-assigned this Oct 23, 2024
@wiederm wiederm added the enhancement New feature or request label Oct 24, 2024
@codecov-commenter

codecov-commenter commented Oct 24, 2024

Codecov Report

Attention: Patch coverage is 93.64162% with 11 lines in your changes missing coverage. Please review.

Project coverage is 85.35%. Comparing base (45be449) to head (87c4d3c).
Report is 19 commits behind head on main.

Additional details and impacted files

@wiederm wiederm merged commit 710d1e4 into main Oct 29, 2024
@wiederm wiederm deleted the dev-training-scheduler branch October 29, 2024 09:08
