Fused AdamW_SGD optimizer issues #17
In the open-source DeCLIP model from our paper, the AdamW_SGD optimizer was used. The settings here differ slightly from those described in the paper because the test configuration was adapted from another configuration of the same model (as you can see, the training setup all targets the YFCC dataset).
We use the AdamW_SGD optimizer during training in the DeCLIP paper because we found experimentally that the language encoder tends to crash under the AdamW optimizer when training with noisy labels.
Got it! Thanks for your answers.
By the way, @zlccccc, can you share any training logs from your experiments (CLIP, DeCLIP, or others)? That would be helpful and greatly appreciated.
Hi, authors! Thanks for your awesome work!
I'm confused about the usage of the fused AdamW_SGD optimizer described in the implementation details paragraph of Appendix C of the paper.
It says you use AdamW with lr 1e-3 and weight decay 0.05 for the ViT vision encoder, and SGD with lr 0.02 and weight decay 1e-4 for the text transformer.
However, in your configuration, ViT-B/32 is also optimized by SGD rather than the fused AdamW_SGD. So which optimizer did you actually use in your experiments?
And if you did use the fused AdamW_SGD optimizer as stated in the paper, why? CLIP uses only the AdamW optimizer. Is the fused setup beneficial to CLIP?
Looking forward to your reply! 😁
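For context, a fused AdamW_SGD setup like the one described in Appendix C can be sketched in PyTorch by simply keeping two optimizers, one per encoder, and stepping both each iteration. This is a minimal illustration, not the actual DeCLIP code: `ToyCLIP` and its submodules are hypothetical stand-ins, and only the lr/weight-decay values are taken from the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the two CLIP encoders (not the real DeCLIP model).
class ToyCLIP(nn.Module):
    def __init__(self):
        super().__init__()
        self.visual = nn.Linear(8, 4)  # stands in for the ViT-B/32 vision encoder
        self.text = nn.Linear(8, 4)    # stands in for the text transformer

model = ToyCLIP()

# "Fused" AdamW_SGD: AdamW on the vision encoder, SGD on the text encoder,
# with the hyperparameters quoted from Appendix C.
opt_adamw = torch.optim.AdamW(model.visual.parameters(), lr=1e-3, weight_decay=0.05)
opt_sgd = torch.optim.SGD(model.text.parameters(), lr=0.02, weight_decay=1e-4)

# One illustrative training step with a dummy loss; both groups get updated.
w_before = model.text.weight.detach().clone()
x = torch.randn(2, 8)
loss = model.visual(x).sum() + model.text(x).sum()
loss.backward()
opt_adamw.step()
opt_sgd.step()
opt_adamw.zero_grad()
opt_sgd.zero_grad()
```

In a real training loop both `.step()` calls would run under the same learning-rate schedule, so the two optimizers behave as one fused optimizer with per-encoder update rules.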