
How to scale cosine learning rate for small batchsize #7

Closed
bowenc0221 opened this issue Nov 3, 2018 · 1 comment

Comments

@bowenc0221

When training ResNet on ImageNet, GluonCV uses a batch size of 128 × 8 GPUs with a cosine learning rate schedule. Do you have any suggestions for scaling the learning rate when using a smaller batch size? I tried scaling the learning rate down directly by the batch-size ratio, but it did not work very well.

Thanks.
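For reference, the scaling the question describes is the common "linear scaling rule" (base LR proportional to batch size), usually combined with cosine annealing. Here is a minimal sketch of that combination; the reference values (`ref_lr=0.1` per 256 images) and the warmup option are illustrative defaults, not necessarily GluonCV's exact settings:

```python
import math

def scaled_cosine_lr(step, total_steps, batch_size,
                     ref_lr=0.1, ref_batch_size=256, warmup_steps=0):
    """Cosine-annealed LR with linear batch-size scaling.

    Assumes the linear scaling rule:
        base_lr = ref_lr * batch_size / ref_batch_size
    (ref_lr / ref_batch_size are hypothetical defaults for illustration.)
    """
    base_lr = ref_lr * batch_size / ref_batch_size
    if step < warmup_steps:
        # Linear warmup from ~0 up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# Example: batch size 1024 starts at 4x the LR of batch size 256.
print(scaled_cosine_lr(0, 100, 1024))   # start of schedule
print(scaled_cosine_lr(100, 100, 1024)) # end of schedule (decays to 0)
```

Small-batch training often also benefits from a short warmup phase, since the very first updates at a scaled LR can be unstable.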

@zhanghang1989
Collaborator

GluonCV uses FP16 training with a batch size of 128 per GPU. You can use a smaller batch size, but the total mini-batch size should be at least 256 to reproduce comparable performance.


2 participants