
How to scale cosine learning rate for small batchsize #7

Closed
bowenc0221 opened this issue Nov 3, 2018 · 1 comment

Comments

@bowenc0221

When training ResNet on ImageNet, GluonCV uses a batch size of 128 × 8 GPUs with a cosine learning rate schedule. Do you have any suggestions for scaling the learning rate when using a smaller batch size? I tried scaling the learning rate down directly by the batch-size ratio, but it did not work very well.

Thanks.
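For reference, the scaling the question describes is the common "linear scaling rule" (base LR proportional to batch size), usually combined with cosine annealing. Here is a minimal sketch of that combination; the reference values (`ref_lr=0.1` per 256 images) and the warmup option are illustrative defaults, not necessarily GluonCV's exact settings:

```python
import math

def scaled_cosine_lr(step, total_steps, batch_size,
                     ref_lr=0.1, ref_batch_size=256, warmup_steps=0):
    """Cosine-annealed LR with linear batch-size scaling.

    Assumes the linear scaling rule:
        base_lr = ref_lr * batch_size / ref_batch_size
    (ref_lr / ref_batch_size are hypothetical defaults for illustration.)
    """
    base_lr = ref_lr * batch_size / ref_batch_size
    if step < warmup_steps:
        # Linear warmup from ~0 up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# Example: batch size 1024 starts at 4x the LR of batch size 256.
print(scaled_cosine_lr(0, 100, 1024))   # start of schedule
print(scaled_cosine_lr(100, 100, 1024)) # end of schedule (decays to 0)
```

Small-batch training often also benefits from a short warmup phase, since the very first updates at a scaled LR can be unstable.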

@zhanghang1989
Collaborator

GluonCV uses FP16 training with a batch size of 128 per GPU. You can use a smaller batch size, but the total mini-batch size should be at least 256 to reproduce comparable performance.


2 participants