This repository has been archived by the owner on Nov 22, 2022. It is now read-only.

Scheduler with Warmup #1184

Closed
wants to merge 2 commits

Conversation


@AkshatSh AkshatSh commented Dec 6, 2019

Summary:
The current warmup implementations in pytext either do warmup with optional inverse-square-root decay (TODO) or use polynomial decay (TODO). However, in my experiments I have found that for large-batch training a warmup period is helpful with other schedulers as well, especially when trying to reproduce small-batch results with large batches.

This diff adds support for `SchedulerWithWarmup`. Under the hood it holds two schedulers: a `WarmupScheduler` and any other scheduler. After `warmup_steps`, it switches from warmup to the specified scheduler.

This allows combinations such as warmup with exponential decay.

Since this scheduler is built on top of the existing warmup scheduler, any new features added to that scheduler will directly apply here as well.
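
For intuition, here is a minimal sketch of the "two schedulers, switch after `warmup_steps`" behavior built from stock PyTorch schedulers. This is an analogy only, not pytext's `SchedulerWithWarmup`; the model, base learning rate, and `start_factor` are placeholders, and the warmup/decay settings mirror the sample config below.

```python
# Sketch only: approximates "warmup, then hand off to another scheduler"
# using stock PyTorch schedulers. Not pytext's SchedulerWithWarmup.
import torch
from torch import nn
from torch.optim.lr_scheduler import LinearLR, ExponentialLR, SequentialLR

model = nn.Linear(10, 2)                                  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)   # placeholder base LR

warmup_steps = 500
warmup = LinearLR(optimizer, start_factor=1e-3, total_iters=warmup_steps)
decay = ExponentialLR(optimizer, gamma=0.95)

# Run the warmup scheduler for the first 500 steps, then switch to decay.
scheduler = SequentialLR(optimizer, schedulers=[warmup, decay],
                         milestones=[warmup_steps])

for step in range(1000):
    optimizer.step()      # forward/backward omitted in this sketch
    scheduler.step()
```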

Sample Config

"SchedulerWithWarmup": {
  "warmup_scheduler": {
    "warmup_steps": 500
  },
  "scheduler": {
    "ExponentialLR": {
      "gamma": 0.95
    }
  }
}
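
As a rough illustration (not output from pytext), the sample config above implies a learning-rate curve like the following, assuming a base learning rate of 0.1, linear warmup, and per-step exponential decay after the switch:

```python
# Rough illustration of the schedule implied by the sample config
# (warmup_steps=500, gamma=0.95). The base LR of 0.1 and the per-step
# decay granularity are assumptions for illustration only.
def lr_at(step, warmup_steps=500, gamma=0.95, base_lr=0.1):
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps   # linear warmup
    return base_lr * gamma ** (step - warmup_steps)  # exponential decay

for step in (0, 250, 499, 500, 550, 600):
    print(f"step {step:4d}: lr = {lr_at(step):.6f}")
```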

Differential Revision: D18838272

@facebook-github-bot added the CLA Signed and fb-exported labels Dec 6, 2019
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D18838272

AkshatSh pushed a commit to AkshatSh/pytext that referenced this pull request Dec 6, 2019
Summary:
Pull Request resolved: facebookresearch#1184 (the commit message repeats the pull request summary above)

Differential Revision: D18838272

fbshipit-source-id: 8bdb4616987d9030e8b09c073a8ba6d753d0fd8c
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D18838272

ArmenAg and others added 2 commits December 6, 2019 00:42
Differential Revision: D18725798

fbshipit-source-id: 131cd0dc983f6a8f5d7ef0a90451238681aef821
Summary:
Pull Request resolved: facebookresearch#1184 (the commit message repeats the pull request summary above)

Differential Revision: D18838272

fbshipit-source-id: e5aea7434d0563e14f357b8647781ec1ff0b0868
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D18838272

@facebook-github-bot
Contributor

This pull request has been merged in a56c761.
