Pre-trained results of MAE #1
Hi @lucasliunju. Yes, I did run MAE pretraining + linear-probe experiments on the Base and Large architectures, although without gradient accumulation (I still have to run those experiments, but haven't had a chance yet). ViT-B/16 reached 63% and ViT-L/16 reached 69% linear-probe accuracy, compared to the official result in the paper for ViT-L/16 of 73.5%. I do have the pretrained weights and intend to release them publicly; I just haven't had time to do so yet. I believe running with gradient accumulation will close this gap, but I'm not certain when, or if, I'll have the capacity to do those experiments.
Dear SarthakYadav, Thanks for your reply. Maybe I can help you test it. May I ask what you mean by gradient accumulation? I noticed the current batch size is 128*8. Best,
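For context, gradient accumulation usually means summing or averaging gradients over several small micro-batches before applying a single optimizer update, so a device-limited batch of 128 per step can emulate a much larger effective batch (e.g. 128*8). A minimal JAX/Optax sketch of the idea, assuming Optax's `MultiSteps` wrapper; the loss, parameters, and optimizer settings below are illustrative placeholders, not taken from this repo:

```python
import jax
import jax.numpy as jnp
import optax

accum_steps = 8                              # micro-batches per optimizer update
base_opt = optax.adamw(learning_rate=1.5e-4) # placeholder optimizer settings
opt = optax.MultiSteps(base_opt, every_k_schedule=accum_steps)

def loss_fn(params, batch):
    # Placeholder loss: mean squared error of a dummy linear model.
    preds = batch["x"] @ params["w"]
    return jnp.mean((preds - batch["y"]) ** 2)

params = {"w": jnp.zeros((16, 1))}
opt_state = opt.init(params)

@jax.jit
def train_step(params, opt_state, batch):
    loss, grads = jax.value_and_grad(loss_fn)(params, batch)
    # MultiSteps accumulates grads internally and only applies a real update
    # every `accum_steps` calls; in between, the returned updates are zeros.
    updates, opt_state = opt.update(grads, opt_state, params)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss

# Each call to train_step consumes one micro-batch; eight calls emulate an
# effective batch of 128*8 without the memory cost of a single large batch.
```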
Thank you for the awesome work! Thank you!
Thank you very much for your contribution. I think it will help the whole JAX community with MAE training.
May I ask whether this repo can reproduce the results from the MAE paper, e.g. a comparison between this repo and the official results?
Thanks for your contribution again!
Best,
Lucas