To add SequentialLR to PyTorch Core Schedulers #64037
Conversation
💊 CI failures summary (Dr. CI), as of commit b42bb36: 💚 Looks good so far! There are no failures yet.
Force-pushed from a5532e4 to c1227cb.
Force-pushed from c1227cb to c8563bf.
Codecov Report

@@            Coverage Diff             @@
##           master   #64037      +/-   ##
==========================================
+ Coverage   61.79%   66.65%   +4.86%
==========================================
  Files         710      710
  Lines       92394    92420      +26
==========================================
+ Hits        57092    61605    +4513
+ Misses      35302    30815    -4487
LGTM. I left some nits on the comments for your consideration but it's up to you.
I only have a question about your example in the description of this PR. Here is what you have at the time of writing:
scheduler1 = ConstantLR(optimizer, factor=0.1, total_iters=2)
scheduler2 = ExponentialLR(optimizer, gamma=0.9)
scheduler = SequentialLR(optimizer, schedulers=[scheduler1, scheduler2], milestones=[5])
for epoch in range(100):
    train(...)
    validate(...)
    scheduler.step()
The example has a funny warmup total_iters=2 and milestones=[5] value mismatch (I believe it's just a typo), which raises an interesting question about how this scheduler works. My understanding is that for the first 2 epochs the warmup will use 10% of the LR, for 3 additional epochs it will be 100% of its value, and then from epoch 5 to 100 the ExponentialLR scheduler will pick up. Is my understanding correct? If yes, I think that's the expected behaviour if someone misconfigures the scheduler.
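That reading can be checked with a small stand-alone sketch (plain Python, not PyTorch code; a base LR of 1.0 is assumed) that reproduces the LR this mismatched configuration would yield per epoch:

```python
BASE_LR = 1.0  # assumed base learning rate for illustration

def lr_at(epoch, factor=0.1, total_iters=2, milestone=5, gamma=0.9):
    """LR used at a given epoch under ConstantLR(factor=0.1, total_iters=2)
    followed by ExponentialLR(gamma=0.9) with the switch at milestones=[5]."""
    if epoch < milestone:
        # ConstantLR: scaled LR for the first total_iters epochs,
        # then back to the base LR until the milestone is reached.
        return BASE_LR * factor if epoch < total_iters else BASE_LR
    # ExponentialLR restarts from the base LR at the milestone.
    return BASE_LR * gamma ** (epoch - milestone)

schedule = [lr_at(e) for e in range(8)]
# epochs 0-1: 10% of the LR; epochs 2-4: full LR; epoch 5 onward: exponential decay
```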
test/test_optim.py
Outdated
@@ -1255,6 +1255,29 @@ def test_reduce_lr_on_plateau8(self):
            threshold=0.1, patience=5, cooldown=5)
        self._test_reduce_lr_on_plateau(scheduler, targets, metrics, epochs)

    def test_sequentiallr1(self):
        epochs = 10
Why only 10 epochs here? I understand that the target below takes 19 possible values.
Oh, thanks for pointing it out. I indeed changed this test several times; epochs = 10 stayed from the previous version.
test/test_optim.py
Outdated
        self._test(scheduler, targets, epochs)

    def test_sequentiallr2(self):
        epochs = 10
Same here. Also perhaps a test with more than 2 schedulers?
Test added with 3 schedulers :)
Thanks for the PR!
Following @datumbox's comment and for my own understanding, what happens if we have milestones=[3] but use it with a ConstantLR of size 5? Do we restart from the original LR when we switch schedulers, or do we pick the smaller LR?
                "got schedulers at index {} and {} to be different".format(0, scheduler_idx)
            )
        self._schedulers = schedulers
        self._milestones = milestones
Should we add a check that len(milestones) == len(schedulers) - 1?
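The suggested check could be sketched like this (hypothetical helper name and error message; the actual constructor may phrase it differently):

```python
def check_milestones(schedulers, milestones):
    """Validate that N schedulers come with exactly N - 1 switch points."""
    if len(milestones) != len(schedulers) - 1:
        raise ValueError(
            "Expected len(milestones) == len(schedulers) - 1, but got "
            "{} milestones for {} schedulers".format(len(milestones), len(schedulers))
        )
```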
That's a really good idea! Thanks for pointing it out.
@iramazanli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Force-pushed from c8563bf to 223e2e3.
Your explanation of the expected behavior is correct, and it's not necessarily a typo, because theoretically it could be configured that way without any error. Given that any kind of schedulers can be added to SequentialLR, I preferred to let all the schedulers behave in their default mode. Please let me know if you have feedback about it.
The way it behaves: for the first 3 epochs it behaves like ConstantLR, and then when we move to the new scheduler, it just behaves as if it started from scratch at the 4th epoch.
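Concretely, that restart can be illustrated with a small numeric sketch (assumptions: base LR of 1.0, a ConstantLR-like factor of 0.5 with total_iters=5, the switch at milestone 3, and an ExponentialLR-like gamma of 0.9 — all illustrative values, not from the PR):

```python
BASE_LR = 1.0  # assumed base learning rate

def lr(epoch, factor=0.5, milestone=3, gamma=0.9):
    # Before the milestone we are still inside ConstantLR's total_iters=5
    # window, so the reduced LR applies.
    if epoch < milestone:
        return BASE_LR * factor
    # After the switch, the second scheduler restarts from BASE_LR
    # (its zeroth epoch), not from the 0.5 * BASE_LR the previous
    # scheduler was still producing.
    return BASE_LR * gamma ** (epoch - milestone)
```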
Force-pushed from 223e2e3 to 43e72d5.
@iramazanli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Force-pushed from 43e72d5 to b42bb36.
@iramazanli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@iramazanli merged this pull request in 2b41bf4.
Partially resolves pytorch/vision#4281
In this PR we propose a new scheduler, SequentialLR, which enables a list of different schedulers to be called in different periods of the training process.
The main motivation for this scheduler is the recently gained popularity of a warm-up phase during training. It has been shown that taking small steps in the initial stages of training can help the convergence procedure go faster.
With the help of SequentialLR we mainly enable calling a small constant (or linearly increasing) learning rate followed by the actual target learning rate scheduler.
The code snippet shown earlier will call ConstantLR in the first 5 epochs and will follow up with ExponentialLR in the following epochs. This scheduler can be used to call any group of schedulers one after another. The main consideration to keep in mind is that every time we switch to a new scheduler, we assume that the new scheduler starts from the beginning, i.e. the zeroth epoch.
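The starts-from-the-beginning behaviour can be sketched in plain Python (a hypothetical mock, not the actual PyTorch implementation): each schedule is a function of its local epoch, and the container switches between schedules at the milestones.

```python
class SequentialSchedule:
    """Toy stand-in for SequentialLR: schedules see only their local epoch."""

    def __init__(self, schedules, milestones):
        # N schedules need exactly N - 1 switch points.
        assert len(milestones) == len(schedules) - 1
        self.schedules = schedules
        self.milestones = milestones

    def lr(self, epoch):
        start = 0  # global epoch at which the active schedule began
        for schedule, boundary in zip(self.schedules, self.milestones):
            if epoch < boundary:
                return schedule(epoch - start)
            start = boundary
        # Past the last milestone: the final schedule, restarted at `start`.
        return self.schedules[-1](epoch - start)

# Mirror of the example discussed above: 10% LR for 2 epochs, full LR until
# the milestone at 5, then exponential decay restarting from the full LR.
warmup = lambda e: 0.1 if e < 2 else 1.0  # ConstantLR-like (factor=0.1, total_iters=2)
decay = lambda e: 0.9 ** e                # ExponentialLR-like (gamma=0.9)
sched = SequentialSchedule([warmup, decay], milestones=[5])
```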
We also add ChainedScheduler to the optim.rst and lr_scheduler.pyi files here.