Inverse Square Root LR Schedule #657

mansheej · 2023-10-09T16:05:55Z

Adds the Inverse Square Root LR Scheduler.

This scheduler is meant to easily enable continual learning. It consists of three components:

1. A linear LR warmup.
2. A decay phase in which the LR decays as the inverse square root of the number of steps, approaching a constant value in the limit of infinitely many steps.
3. An optional linear cooldown.

The image below shows two examples of the LR schedule, each with a 10-step warmup and run for 100 steps total: one with no cooldown (orange) and one with a 20-step cooldown (blue).
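The three components above can be sketched as a single function that returns the multiplier applied to the peak LR. This is a minimal sketch, not the code in llmfoundry/optim/scheduler.py: the function name, the parameter names (`t_warmup`, `t_max`, `t_cooldown`, `t_scale`, `alpha_f_decay`, `alpha_f_cooldown`) and the exact form of the decay term are assumptions chosen for illustration.

```python
import math


def inv_sqrt_lr_multiplier(step: int,
                           t_warmup: int,
                           t_max: int,
                           t_cooldown: int = 0,
                           t_scale: float = 1.0,
                           alpha_f_decay: float = 0.0,
                           alpha_f_cooldown: float = 0.0) -> float:
    """Hypothetical LR multiplier (fraction of the peak LR) at `step`."""

    def decay(t: float) -> float:
        # Inverse square root decay toward alpha_f_decay as t -> infinity.
        # tau equals 1 at the end of warmup, so the decay starts at 1.0.
        tau = max((t - t_warmup + t_scale) / t_scale, 1.0)
        return alpha_f_decay + (1.0 - alpha_f_decay) / math.sqrt(tau)

    # 1. Linear warmup from 0 to the peak LR.
    if step < t_warmup:
        return step / t_warmup

    # 3. Optional linear cooldown over the last t_cooldown steps.
    cooldown_start = t_max - t_cooldown
    if t_cooldown > 0 and step > cooldown_start:
        start = decay(cooldown_start)
        frac = min((step - cooldown_start) / t_cooldown, 1.0)
        return start + frac * (alpha_f_cooldown - start)

    # 2. Inverse square root decay.
    return decay(step)


# The two example schedules from the description:
# 10-step warmup, 100 steps total, with and without a 20-step cooldown.
no_cooldown = [
    inv_sqrt_lr_multiplier(s, t_warmup=10, t_max=100) for s in range(101)
]
with_cooldown = [
    inv_sqrt_lr_multiplier(s, t_warmup=10, t_max=100, t_cooldown=20)
    for s in range(101)
]
```

Under these assumptions the two curves coincide until the cooldown starts at step 80, after which the cooldown variant falls linearly to `alpha_f_cooldown` while the other keeps decaying as an inverse square root.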

dakinggg

Will review in full as well, but is there any reason this should be in LLM foundry? It seems generically useful and I would probably prefer it is in composer directly.

mansheej · 2023-10-09T19:01:05Z

> Will review in full as well, but is there any reason this should be in LLM foundry? It seems generically useful and I would probably prefer it is in composer directly.

This implementation and hyperparameters are a little different from how people typically do it. I wanted to first get it into LLM foundry, run some more experiments and have other people use it so we can work out some kinks and learn about some best practices, and then eventually upstream it into composer with the best practices documented.

dakinggg

Could you please add some unit tests testing this schedule produces what you expect?

llmfoundry/optim/scheduler.py

mansheej · 2023-10-09T20:33:47Z

> Could you please add some unit tests testing this schedule produces what you expect?

Do you have any suggestions for reasonable unit tests for the LR scheduler?

dakinggg · 2023-10-09T20:38:28Z

I'd suggest:

1. Test that the "build" function can create the scheduler successfully.
2. Set up a couple of simple schedules with known expected values, and test that your scheduler code produces exactly those schedules.
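One way to write the second kind of test is a table of hand-computed values compared against the scheduler's output. The sketch below uses a stand-in `lr_multiplier` function (hypothetical, covering warmup and inverse square root decay only) rather than the scheduler class from this PR, whose constructor and call signature are not shown in this thread.

```python
import math


def lr_multiplier(step: int, t_warmup: int) -> float:
    """Stand-in schedule (an assumption, not the PR's code)."""
    if step < t_warmup:
        return step / t_warmup  # linear warmup
    return 1.0 / math.sqrt(step - t_warmup + 1)  # inverse square root decay


# Hand-computed (step, expected multiplier) pairs for a 10-step warmup.
EXPECTED = [(0, 0.0), (5, 0.5), (10, 1.0), (13, 0.5), (18, 1.0 / 3.0),
            (34, 0.2)]


def check_known_values() -> list:
    """Return the (step, expected, actual) triples that do not match."""
    mismatches = []
    for step, expected in EXPECTED:
        actual = lr_multiplier(step, t_warmup=10)
        if not math.isclose(actual, expected, rel_tol=1e-9, abs_tol=1e-12):
            mismatches.append((step, expected, actual))
    return mismatches
```

In a pytest suite, the table would typically become a `pytest.mark.parametrize` argument list, with the real scheduler substituted for the stand-in.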

b-chu

Add some unit tests to test functionality

llmfoundry/optim/scheduler.py

Co-authored-by: Brian <23239305+b-chu@users.noreply.github.com>

mansheej · 2023-10-10T01:52:34Z

Added unit tests and tried to address all the comments. The Code Quality Checks are failing, but I can't tell why from the error message.

dakinggg · 2023-10-10T01:53:22Z

@mansheej try running `pre-commit run --all-files` locally

…nto inv-sqrt-lr-sched

llmfoundry/optim/scheduler.py

tests/test_scheduler.py

Co-authored-by: Brian <23239305+b-chu@users.noreply.github.com>

dakinggg

Basically LGTM, left a few comments on simplifying the tests.

tests/test_scheduler.py

mansheej · 2023-10-11T03:31:06Z

Addressed most of the comments. The rest can perhaps be filed for improvement/addressed when we upstream to Composer?

dakinggg

I am ok with the level of test coverage here. Before merging, could you please make the PR description more complete?

In this case, I think an example schedule produced by this code (e.g. a wandb graph) and a one-sentence description of the gist of the schedule would suffice. Thanks!

b-chu

Two small changes, thanks for the tests!

llmfoundry/optim/scheduler.py

tests/test_scheduler.py

mansheej and others added 8 commits September 8, 2023 00:52

Implement inverse square root with warmup scheduler v0

f2142e8

Merge branch 'mosaicml:main' into inv-sqrt-lr-sched

7858cbd

Merge branch 'mosaicml:main' into inv-sqrt-lr-sched

7ebb1c0

Merge branch 'mosaicml:main' into inv-sqrt-lr-sched

df6abcb

quick

92407a1

Merge branch 'mosaicml:main' into inv-sqrt-lr-sched

a5904be

Merge branch 'mosaicml:main' into inv-sqrt-lr-sched

90338b4

inverse square root LR Schedule

54007d3

mansheej requested review from sashaDoubov, codestar12, b-chu and dakinggg October 9, 2023 16:05

dakinggg reviewed Oct 9, 2023

llmfoundry/optim/scheduler.py (resolved)

llmfoundry/optim/scheduler.py (resolved)

scheduler

1f9ac7a

b-chu requested changes Oct 9, 2023

b-chu reviewed Oct 9, 2023

llmfoundry/optim/scheduler.py (outdated, resolved)

mansheej and others added 5 commits October 10, 2023 00:12

unit tests

0496953

Update llmfoundry/optim/scheduler.py

e99997e

Co-authored-by: Brian <23239305+b-chu@users.noreply.github.com>

Update llmfoundry/optim/scheduler.py

0752dd0

Co-authored-by: Brian <23239305+b-chu@users.noreply.github.com>

fixes for PR conversations

de7873e

merge some commits

166230f

codestar12 added 3 commits October 10, 2023 10:26

format

b8e37fc

Merge branch 'main' into inv-sqrt-lr-sched

81d0972

fix type hint

b8f717d

Merge branch 'inv-sqrt-lr-sched' of github.com:mansheej/llm-foundry i…

26accc4

…nto inv-sqrt-lr-sched

codestar12 enabled auto-merge (squash) October 10, 2023 15:35

codestar12 requested a review from b-chu October 10, 2023 15:35

Merge branch 'main' into inv-sqrt-lr-sched

71bb52a

b-chu requested changes Oct 10, 2023

codestar12 and others added 4 commits October 10, 2023 15:13

Update llmfoundry/optim/scheduler.py

540f9e5

Co-authored-by: Brian <23239305+b-chu@users.noreply.github.com>

Update llmfoundry/optim/scheduler.py

70cbef2

Co-authored-by: Brian <23239305+b-chu@users.noreply.github.com>

Update llmfoundry/optim/scheduler.py

7a17279

Co-authored-by: Brian <23239305+b-chu@users.noreply.github.com>

more fixes

d7d49f9

dakinggg reviewed Oct 11, 2023

tests/test_scheduler.py (outdated, resolved)

tests/test_scheduler.py (outdated, resolved)

tests/test_scheduler.py (outdated, resolved)

more fixes

574276c

mansheej requested review from hanlint, b-chu and dakinggg October 11, 2023 03:31

dakinggg approved these changes Oct 11, 2023

more test

c890161

b-chu approved these changes Oct 11, 2023

llmfoundry/optim/scheduler.py (outdated, resolved)

tests/test_scheduler.py (outdated, resolved)

mansheej added 3 commits October 11, 2023 15:00

fixes

608de1a

fix pyright errors

91b7db3

fix more errors

a11bcab

codestar12 merged commit 6c98276 into mosaicml:main Oct 11, 2023
11 checks passed
