To add SequentialLR to PyTorch Core Schedulers #64037

Closed
iramazanli wants to merge 1 commit into master from the composite_schedulers branch

Conversation

@iramazanli (Contributor) commented Aug 26, 2021

Partially resolves pytorch/vision#4281

In this PR we propose a new scheduler, SequentialLR, which enables a list of different schedulers to be called in different periods of the training process.

The main motivation for this scheduler is the recently gained popularity of a warm-up phase at the beginning of training. It has been shown that taking small steps in the initial stages of training can speed up convergence.

With the help of SequentialLR we can apply a small constant (or linearly increasing) learning rate first, followed by the actual target learning rate scheduler.

scheduler1 = ConstantLR(optimizer, factor=0.1, total_iters=2)
scheduler2 = ExponentialLR(optimizer, gamma=0.9)
scheduler = SequentialLR(optimizer, schedulers=[scheduler1, scheduler2], milestones=[5])

for epoch in range(100):
    train(...)
    validate(...)
    scheduler.step()

This code snippet calls ConstantLR for the first 5 epochs and follows up with ExponentialLR for the remaining epochs.

This scheduler can be used to call any group of schedulers one after another. The main consideration to keep in mind is that every time we switch to a new scheduler, we assume that the new scheduler starts from the beginning, i.e. its zeroth epoch.
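
To make the switching rule concrete, here is a rough sketch of the bookkeeping involved. The class name SequentialLRSketch and its internals are purely illustrative and are not the code added in this PR; in particular, the exact handover of the learning rate at each milestone is glossed over here.

import bisect

class SequentialLRSketch:
    def __init__(self, optimizer, schedulers, milestones):
        self.optimizer = optimizer
        self._schedulers = schedulers
        self._milestones = milestones
        self._epoch = 0

    def step(self):
        # Advance the global epoch counter and forward the call to whichever
        # scheduler owns the current epoch, so each sub-scheduler only advances
        # while it is active and picks up from its own initial state when it
        # takes over.
        self._epoch += 1
        idx = bisect.bisect_right(self._milestones, self._epoch)
        self._schedulers[idx].step()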

We also add the Chained Scheduler to the optim.rst and lr_scheduler.pyi files here.

@facebook-github-bot (Contributor) commented Aug 26, 2021

💊 CI failures summary and remediations

As of commit b42bb36 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

@iramazanli changed the title from "To add Composite Scheduler" to "To add SequentialLR to PyTorch Core Schedulers" on Sep 2, 2021
@iramazanli force-pushed the composite_schedulers branch 20 times, most recently from a5532e4 to c1227cb, on September 8, 2021 18:00
@codecov bot commented Sep 8, 2021

Codecov Report

Merging #64037 (c8563bf) into master (9cc44aa) will increase coverage by 4.86%.
The diff coverage is 92.85%.

❗ Current head c8563bf differs from pull request most recent head b42bb36. Consider uploading reports for the commit b42bb36 to get more accurate results

@@            Coverage Diff             @@
##           master   #64037      +/-   ##
==========================================
+ Coverage   61.79%   66.65%   +4.86%     
==========================================
  Files         710      710              
  Lines       92394    92420      +26     
==========================================
+ Hits        57092    61605    +4513     
+ Misses      35302    30815    -4487     

@datumbox (Contributor) left a comment:

LGTM. I left some nits on the comments for your consideration but it's up to you.


I only have a question about your example on the description of this PR. Here is what you have at the time of writing:

scheduler1 = ConstantLR(optimizer, factor=0.1, total_iters=2)
scheduler2 = ExponentialLR(optimizer, gamma=0.9)
scheduler = SequentialLR(optimizer, schedulers=[scheduler1, scheduler2], milestones=[5])

for epoch in range(100):
    train(...)
    validate(...)
    scheduler.step()

The example has a funny warmup total_iters=2 and milestones=[5] value mismatch (I believe it's just a typo), which raises an interesting question about how this scheduler works. My understanding is that for the first 2 epochs the warmup will use 10% of the LR, for 3 additional epochs it will be 100% of its value, and then from epoch 5 to 100 the ExponentialLR scheduler will pick up. Is my understanding correct? If yes, I think that's the expected behaviour if someone misconfigures the scheduler.

@@ -1255,6 +1255,29 @@ def test_reduce_lr_on_plateau8(self):
threshold=0.1, patience=5, cooldown=5)
self._test_reduce_lr_on_plateau(scheduler, targets, metrics, epochs)

def test_sequentiallr1(self):
epochs = 10
Contributor:
Why only 10 epochs here? I understand that the target below takes 19 possible values.

Contributor (Author):
Oh, thanks for pointing it out. I indeed changed this test several times; epochs = 10 stayed from the previous version.

self._test(scheduler, targets, epochs)

def test_sequentiallr2(self):
epochs = 10
Contributor:
Same here. Also perhaps a test with more than 2 schedulers?

Contributor (Author):
test added with 3 schedulers :)
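
For reference, a configuration with three schedulers along these lines could look like the sketch below. This is a usage sketch rather than the added test; the milestone and factor values are illustrative, and it assumes a PyTorch build that includes ConstantLR, LinearLR, ExponentialLR and SequentialLR.

import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import ConstantLR, LinearLR, ExponentialLR, SequentialLR

optimizer = SGD([torch.zeros(1, requires_grad=True)], lr=0.1)

# Constant warmup for 2 epochs, a linear ramp for 3 more, then exponential decay.
scheduler = SequentialLR(
    optimizer,
    schedulers=[
        ConstantLR(optimizer, factor=0.1, total_iters=2),
        LinearLR(optimizer, start_factor=0.1, total_iters=3),
        ExponentialLR(optimizer, gamma=0.9),
    ],
    milestones=[2, 5],
)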

@fmassa (Member) left a comment:

Thanks for the PR!

Following @datumbox comment and for my own understanding, what happens if we have milestones = 3 but use it with ConstantLR of size 5? Do we restart from the original LR when we switch schedulers or do we pick the smaller LR?

"got schedulers at index {} and {} to be different".format(0, scheduler_idx)
)
self._schedulers = schedulers
self._milestones = milestones
Member:
Should we add a check that len(milestones) == len(schedulers) - 1?

Contributor (Author):
that's a really good idea! thanks for pointing it out
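
For illustration, the suggested guard could look something like the hypothetical helper below; the name, placement and error message are not taken from the final PR code.

def check_milestones(schedulers, milestones):
    # SequentialLR needs exactly one milestone per scheduler boundary,
    # i.e. len(milestones) == len(schedulers) - 1.
    if len(milestones) != len(schedulers) - 1:
        raise ValueError(
            "Expected len(milestones) == len(schedulers) - 1, but got {} "
            "milestones for {} schedulers".format(len(milestones), len(schedulers))
        )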

@facebook-github-bot (Contributor):
@iramazanli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@iramazanli (Contributor, Author):
LGTM. I left some nits on the comments for your consideration but it's up to you.

I only have a question about your example on the description of this PR. The example has a funny warmup total_iters=2 and milestones=[5] value mismatch (I believe it's just a typo), which raises an interesting question about how this scheduler works. My understanding is that for the first 2 epochs the warmup will use 10% of the LR, for 3 additional epochs it will be 100% of its value, and then from epoch 5 to 100 the ExponentialLR scheduler will pick up. Is my understanding correct? If yes, I think that's the expected behaviour if someone misconfigures the scheduler.

Your explanation of the expected behavior is correct, and it is not necessarily a typo, because theoretically it could be configured that way without any error.

Given that any kind of scheduler can be added to SequentialLR, I preferred to let all the schedulers behave in their default mode. Please let me know if you have feedback about it.
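
For anyone who wants to see the resulting schedule for the exact configuration in the PR description, a quick empirical check is to print the learning rate in effect at each epoch. The snippet below is a standalone sketch, assuming a PyTorch build that includes SequentialLR; the dummy parameter and base LR of 0.1 are arbitrary.

import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import ConstantLR, ExponentialLR, SequentialLR

optimizer = SGD([torch.zeros(1, requires_grad=True)], lr=0.1)
scheduler1 = ConstantLR(optimizer, factor=0.1, total_iters=2)
scheduler2 = ExponentialLR(optimizer, gamma=0.9)
scheduler = SequentialLR(optimizer, schedulers=[scheduler1, scheduler2], milestones=[5])

for epoch in range(10):
    # Print the LR currently set on the optimizer, then advance one epoch.
    print(epoch, optimizer.param_groups[0]["lr"])
    optimizer.step()
    scheduler.step()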

@iramazanli (Contributor, Author):
Following @datumbox comment and for my own understanding, what happens if we have milestones = 3 but use it with ConstantLR of size 5? Do we restart from the original LR when we switch schedulers or do we pick the smaller LR?

The way it behaves is: for 3 epochs it behaves like ConstantLR, then when we move to the new scheduler it just behaves as if it started from scratch at the 4th epoch.
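
The same kind of one-off script can be used to observe this cut-off behaviour: below, ConstantLR is configured for 5 iterations but only governs the first 3 epochs, after which ExponentialLR takes over. Again a standalone sketch with illustrative values, assuming a PyTorch build that includes SequentialLR.

import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import ConstantLR, ExponentialLR, SequentialLR

optimizer = SGD([torch.zeros(1, requires_grad=True)], lr=0.1)
scheduler = SequentialLR(
    optimizer,
    schedulers=[
        ConstantLR(optimizer, factor=0.1, total_iters=5),
        ExponentialLR(optimizer, gamma=0.9),
    ],
    milestones=[3],
)

for epoch in range(8):
    print(epoch, optimizer.param_groups[0]["lr"])
    optimizer.step()
    scheduler.step()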

@facebook-github-bot (Contributor):
@iramazanli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor):

@iramazanli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

1 similar comment

@facebook-github-bot (Contributor):
@iramazanli merged this pull request in 2b41bf4.
