
Use snapshot to prevent warmup from affecting training and refactor warmup #68

Open
michaelmckinsey1 wants to merge 5 commits into LBANN:main from michaelmckinsey1:fix-warmup
Collaborator

michaelmckinsey1 commented May 7, 2026

  • Fixes an issue where warmup affected training: a snapshot is now used to restore the model to its pre-warmup state.
  • Fixes an issue where validation was not being warmed up properly.
  • Refactors logic shared by the warmup and training bodies.

With this configuration, the runtime of the first epoch should now be similar to that of subsequent epochs.
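As a rough illustration of the snapshot-and-restore idea described above, here is a minimal sketch in plain Python. It is not the code in this PR: `ToyModel`, `run_steps`, and `warmup` are hypothetical names, the "model" is a single float, and the sketch assumes the model exposes `state_dict()` / `load_state_dict()` style accessors. A real trainer would likely also need to snapshot optimizer state and any other state that warmup steps mutate.

```python
import copy


class ToyModel:
    """Hypothetical stand-in for a real model: one weight changed by training."""

    def __init__(self):
        self.weight = 0.0

    def state_dict(self):
        return {"weight": self.weight}

    def load_state_dict(self, state):
        self.weight = state["weight"]


def run_steps(model, batches, train):
    """Loop body shared by warmup, training, and validation (hypothetical)."""
    total = 0.0
    for x in batches:
        total += model.weight * x        # stand-in for the forward pass
        if train:
            model.weight += 0.1 * x      # stand-in for the optimizer step
    return total


def warmup(model, train_batches, val_batches):
    """Exercise both the training and validation paths, then undo side effects."""
    snapshot = copy.deepcopy(model.state_dict())  # state before warmup
    try:
        run_steps(model, train_batches, train=True)
        run_steps(model, val_batches, train=False)
    finally:
        model.load_state_dict(snapshot)           # restore pre-warmup state


model = ToyModel()
before = model.state_dict()
warmup(model, [1.0, 2.0], [3.0])
after = model.state_dict()
```

In this sketch, warmup and training call the same `run_steps` function, so both code paths get exercised, while the `finally` block ensures the model is restored even if a warmup step raises.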

michaelmckinsey1 self-assigned this May 7, 2026
michaelmckinsey1 changed the title from "Use snapshot to prevent warmup from affecting training" to "Use snapshot to prevent warmup from affecting training and refactor warmup" May 7, 2026
Development

Successfully merging this pull request may close these issues.

Refactor common logic in warmup and train bodies in trainer.py