
Warm restart policy is available now #6130

Closed
AutuanLiu wants to merge 32 commits

Conversation

AutuanLiu

• Please tell me if I violated any rules.

@ssnl
Collaborator

ssnl commented Mar 30, 2018

Can you also add tests for this in test_optim.py?

@ezyang
Contributor

ezyang commented Mar 30, 2018

@pytorchbot test this please

@AutuanLiu
Author

ok

@ezyang
Contributor

ezyang commented Apr 1, 2018

@pytorchbot test this please

@ezyang
Contributor

ezyang commented Apr 1, 2018

@pytorchbot test this please

@ezyang
Contributor

ezyang commented Apr 2, 2018

@pytorchbot test this please

    (1 + math.cos(math.pi * self.last_epoch / self.T_max)) / 2
    if self.restart and self.last_epoch == self.T_max:
        self.last_epoch = 0
        self.T_max *= self.T_mult
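
For context, a minimal sketch of how that fragment would sit inside the patched get_lr; the surrounding lines are reconstructed from the stock CosineAnnealingLR formula and are an assumption, not the PR's exact diff:

    def get_lr(self):
        # Standard cosine annealing from base_lr down to eta_min.
        lrs = [self.eta_min + (base_lr - self.eta_min) *
               (1 + math.cos(math.pi * self.last_epoch / self.T_max)) / 2
               for base_lr in self.base_lrs]
        # Proposed warm restart: once a cycle ends, reset the epoch counter
        # and stretch the next cycle by T_mult.
        if self.restart and self.last_epoch == self.T_max:
            self.last_epoch = 0
            self.T_max *= self.T_mult
        return lrs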

* Actually, the restart argument is redundant, because T_max will equal the number of training epochs when the warm restart policy is not used.
* If we want to apply the warm restart policy, we need to set T_max < the number of training epochs.

Args:
    optimizer (Optimizer): Wrapped optimizer.
    T_max (int): Maximum number of iterations.
    eta_min (float): Minimum learning rate. Default: 0.
    T_mult (int): Multiplicative factor of T_max. Default: 2.
    restart (bool): If True, the warm restart policy will be used.
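
To make the proposed semantics concrete, a usage sketch of the (never merged) API; the dummy parameter, T_max=10, and the 100-epoch loop are illustrative assumptions:

    import torch

    # Hypothetical usage of the proposed restart flag.
    param = torch.nn.Parameter(torch.zeros(1))   # stand-in for model parameters
    optimizer = torch.optim.SGD([param], lr=0.05)
    # First cycle lasts 10 epochs; each later cycle is T_mult times longer: 10, 20, 40, ...
    scheduler = CosineAnnealingLR(optimizer, T_max=10, eta_min=0, T_mult=2, restart=True)
    for epoch in range(100):
        optimizer.step()    # training step would go here
        scheduler.step()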


    single_targets = [eta_min + (0.05 - eta_min) * (1 + math.cos(math.pi * x / y)) / 2
                      for x, y in zip(T_cur, T_i)]
    targets = [single_targets, list(map(lambda x: x * epochs, single_targets))]
    scheduler = CosineAnnealingLR(self.opt, T_max=T_max, eta_min=eta_min, T_mult=T_mult, restart=True)


@ssnl
Collaborator

ssnl commented Apr 2, 2018

@pytorchbot add to whitelist

@ezyang
Contributor

ezyang commented Apr 16, 2018

@ssnl Do you think this is OK to merge now?

@apaszke
Contributor

apaszke commented Apr 16, 2018

@ezyang it's not. See our discussion above, which hasn't concluded yet.

        self.cycle += 1
    else:
        self.cycle = int(math.floor(math.log(epoch / self.T_max * (self.T_mult - 1) + 1, self.T_mult)))
        epoch -= sum(self.T_max * self.T_mult ** x for x in range(self.cycle))
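
The else branch inverts the geometric series of past cycle lengths: after c full cycles of lengths T_max, T_max * T_mult, ..., T_max * T_mult ** (c - 1), the elapsed epochs total T_max * (T_mult ** c - 1) / (T_mult - 1). Taking the largest integer c with that sum <= epoch yields the floor/log expression above, and the subtraction then rebases epoch to the start of the current cycle. A standalone sanity check (local names here, not the scheduler's attributes):

    import math

    T_max, T_mult = 10, 2   # cycle lengths 10, 20, 40, ...; cycles start at epochs 0, 10, 30
    for epoch in (0, 9, 10, 29, 30, 69):
        cycle = int(math.floor(math.log(epoch / T_max * (T_mult - 1) + 1, T_mult)))
        start = T_max * (T_mult ** cycle - 1) // (T_mult - 1)  # epochs before this cycle
        print(epoch, cycle, epoch - start)  # cycle comes out 0, 0, 1, 1, 2, 2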


@zou3519 added the awaiting response (this tag is deprecated) label Jul 10, 2018
@yf225
Contributor

yf225 commented Jul 10, 2018

@AutuanLiu Let us know if you have time to address @apaszke and @ssnl 's comments, thanks!

@AutuanLiu
Author

@yf225 I'm so sorry, but I don't have time to address and work through these comments.

@loshchil

Sorry for not providing the actual working code, but our implementation of restarts for TensorFlow might be useful as a reference:
https://github.com/tensorflow/tensorflow/blob/25c197e02393bd44f50079945409009dd4d434f8/tensorflow/python/training/learning_rate_decay.py#L514

@AlexMRuch

Looking forward to seeing this function!

@zdevito removed their request for review February 13, 2019 01:23
@gchanan removed their request for review February 28, 2019 16:28
@danieltudosiu

danieltudosiu commented Mar 12, 2020

I took a shot at porting the TF one, but I am not sure it is correct since I just started using PyTorch. Could you please give me your two cents?

import math
import warnings

from torch.optim.lr_scheduler import _LRScheduler


class CosineDecayRestarts(_LRScheduler):
    def __init__(
        self,
        optimizer,
        first_decay_steps,
        t_mul=2.0,
        m_mul=1.0,
        alpha=0.0,
        last_epoch=-1,
    ):
        self.first_decay_steps = first_decay_steps
        self.t_mul = t_mul
        self.m_mul = m_mul
        self.alpha = alpha

        super(CosineDecayRestarts, self).__init__(optimizer, last_epoch)

    def get_lr(self):
        if not self._get_lr_called_within_step:
            warnings.warn("To get the last learning rate computed by the scheduler, "
                          "please use `get_last_lr()`.", DeprecationWarning)

        if self.last_epoch == 0:
            return self.base_lrs

        # Decay the *initial* learning rates, as the TF reference does;
        # decaying group['lr'] in place would compound the factor every step.
        return [self._calculate_decayed_lr(base_lr) for base_lr in self.base_lrs]

    def _calculate_decayed_lr(self, base_lr):
        # Fraction of the first cycle completed so far (may exceed 1.0).
        # Uses last_epoch, consistent with the early return above;
        # _step_count is offset by one relative to it.
        completed_fraction = self.last_epoch / self.first_decay_steps

        if self.t_mul != 1.0:
            # Invert the geometric series of cycle lengths to find the
            # current restart index and the position within that cycle.
            i_restart = math.floor(
                math.log(1 - completed_fraction * (1 - self.t_mul)) / math.log(self.t_mul)
            )
            sum_r = (1.0 - self.t_mul ** i_restart) / (1.0 - self.t_mul)
            completed_fraction = (completed_fraction - sum_r) / self.t_mul ** i_restart
        else:
            # Equal-length cycles: the restart index is just the integer part.
            i_restart = math.floor(completed_fraction)
            completed_fraction -= i_restart

        # Each restart's peak is scaled by m_mul; alpha sets the floor.
        m_fac = self.m_mul ** i_restart
        cosine_decayed = 0.5 * m_fac * (1.0 + math.cos(math.pi * completed_fraction))
        decayed = (1 - self.alpha) * cosine_decayed + self.alpha

        return base_lr * decayed
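
If it helps review, a quick smoke test of the class above; the dummy parameter and step counts are arbitrary assumptions. With m_mul=0.5, the learning rate should jump back to half its previous peak once the first 10-step cycle ends:

    import torch

    param = torch.nn.Parameter(torch.zeros(1))   # dummy parameter
    opt = torch.optim.SGD([param], lr=0.1)
    sched = CosineDecayRestarts(opt, first_decay_steps=10, t_mul=2.0, m_mul=0.5)
    for step in range(30):
        opt.step()
        sched.step()
        print(step, sched.get_last_lr())         # restart visible after step 10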

@github-actions
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
Stale pull requests will automatically be closed 30 days after being marked Stale

@github-actions bot added the Stale label Mar 23, 2022
@facebook-github-bot
Contributor

Hi @AutuanLiu!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@github-actions bot removed the Stale label Mar 29, 2022
@github-actions
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions bot added the Stale label May 28, 2022
@github-actions bot closed this Jun 27, 2022