This repository has been archived by the owner on Nov 22, 2022. It is now read-only.

Inverse Sqrt Scheduler #1150

Closed

jeanm (Contributor) commented on Nov 19, 2019

Summary:
Currently, WarmupScheduler does this during the warm-up period:

    lr = base_lr * current_step / warmup_steps

This diff adds an option to decay the LR after the warm-up period:

    lr = base_lr * sqrt(warmup_steps) / sqrt(current_step)

This is similar to [Fairseq's implementation](https://github.com/pytorch/fairseq/blob/master/fairseq/optim/lr_scheduler/inverse_square_root_schedule.py).
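The two formulas above can be combined into one schedule. The following is a minimal sketch, not the code from this diff: the function name `warmup_inverse_sqrt_lr` and its signature are hypothetical, and it assumes the switch from warm-up to decay happens once `current_step` exceeds `warmup_steps`.

```python
import math


def warmup_inverse_sqrt_lr(base_lr: float, current_step: int, warmup_steps: int) -> float:
    """Hypothetical helper illustrating linear warm-up followed by inverse sqrt decay.

    This is an illustration of the formulas in the summary, not PyText's
    WarmupScheduler implementation.
    """
    if warmup_steps < 1:
        raise ValueError("warmup_steps must be at least 1")
    if current_step < 0:
        raise ValueError("current_step must be non-negative")

    if current_step <= warmup_steps:
        # Warm-up: the LR grows linearly from 0 to base_lr.
        return base_lr * current_step / warmup_steps

    # Decay: the LR shrinks proportionally to 1 / sqrt(current_step).
    return base_lr * math.sqrt(warmup_steps) / math.sqrt(current_step)


if __name__ == "__main__":
    # Print the schedule at a few steps for base_lr=0.1 and 100 warm-up steps.
    for step in (0, 50, 100, 400, 10000):
        print(step, warmup_inverse_sqrt_lr(0.1, step, 100))
```

Both branches evaluate to `base_lr` at `current_step == warmup_steps`, so the schedule is continuous at the boundary. After that point, quadrupling the step count halves the learning rate.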

Reviewed By: ccsasuke

Differential Revision: D18491650

fbshipit-source-id: 2e24bd2355759ee8e2e0c1b0e87e600be23f81fb
facebook-github-bot added the CLA Signed and fb-exported labels on Nov 19, 2019
facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D18491650

facebook-github-bot (Contributor) commented:

This pull request has been merged in c35d513.
