Question about ViT-augreg ("How to train?") fine-tuning transfer #60

Closed
lucasb-eyer opened this issue Nov 7, 2023 · 2 comments

@lucasb-eyer
Collaborator

We received the following question by e-mail from @alexlioralexli, but think it is of general interest:

  1. What are the details of the fine-tuning process?
  2. What commands reproduce the pre-training and fine-tuning runs from the paper?
@lucasb-eyer added the question label Nov 7, 2023
@lucasb-eyer
Collaborator Author

First answer by @andsteing

  1. We used our default transfer config, big_vision/configs/transfer.py, which uses an inception crop and a random horizontal flip in its preprocessing (a rough sketch of that preprocessing follows this list).
  2. As for pre-training, refer to these configs: big_vision/configs/vit_i21k.py and big_vision/configs/vit_i1k.py (see the module pydoc for more information).
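In case it is useful, here is a minimal sketch of that preprocessing written with plain TensorFlow image ops. It is not the verbatim big_vision pp pipeline; the output size and the crop parameters below are assumptions for illustration only.

```python
import tensorflow as tf

def inception_crop_and_flip(image, resize_to=224):
  """Sketch: inception-style random crop followed by a random horizontal flip.

  The area/aspect-ratio ranges and the 224px output size are assumptions,
  not values read out of big_vision/configs/transfer.py.
  """
  # Sample a random crop covering roughly 5%-100% of the image area with a
  # mildly distorted aspect ratio -- the classic "inception crop".
  begin, size, _ = tf.image.sample_distorted_bounding_box(
      tf.shape(image),
      bounding_boxes=tf.zeros([0, 0, 4], tf.float32),
      area_range=(0.05, 1.0),
      aspect_ratio_range=(0.75, 1.33),
      min_object_covered=0.0,
      use_image_if_no_bounding_boxes=True)
  image = tf.slice(image, begin, size)
  image = tf.image.resize(image, (resize_to, resize_to))
  # Random left-right flip.
  return tf.image.random_flip_left_right(image)
```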

@lucasb-eyer
Collaborator Author

And an addition by me, after checking the old training logs; a free-form summary:

We select these on minival (held out from train); see the selection sketch right after this list:

  1. We sweep the learning rate over 0.03, 0.01, 0.003, 0.001.
  2. We sweep the number of fine-tuning steps over 500, 2500, 10k, 20k.
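As a minimal sketch of that selection (assuming a hypothetical finetune_and_eval helper that runs one fine-tuning job and returns its minival accuracy; it is not a big_vision function):

```python
import itertools

# Sweep grid from the list above; selection is by minival accuracy.
LEARNING_RATES = (0.03, 0.01, 0.003, 0.001)
FINETUNE_STEPS = (500, 2_500, 10_000, 20_000)

def select_best(finetune_and_eval):
  """Return the (lr, steps) pair with the highest minival accuracy.

  `finetune_and_eval(lr, steps) -> float` is a hypothetical callable that
  fine-tunes once with the given hyperparameters and evaluates on minival.
  """
  return max(
      itertools.product(LEARNING_RATES, FINETUNE_STEPS),
      key=lambda hp: finetune_and_eval(*hp))
```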

A couple of probably important settings that are fixed (not swept); they should be visible in the config Andreas linked, and are summarized in a sketch after this list:

  • inception-crop and flip-lr almost everywhere, except where that would be a completely silly thing to do (see the BiT paper appendix)
  • no dropout or stochastic depth
  • no mixup, no randaugment
  • no weight decay
  • batch-size 512
  • softmax cross-entropy loss
  • SGDMomentum optimizer, momentum 0.9 in bfloat16.
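For readability, the fixed settings above can be summarized as an illustrative ml_collections-style config. The field names below are assumptions and may not match big_vision/configs/transfer.py; the authoritative values live in that file.

```python
import ml_collections

def get_fixed_transfer_settings():
  """Illustrative summary of the non-swept transfer settings listed above."""
  config = ml_collections.ConfigDict()
  config.batch_size = 512            # fixed batch size
  config.loss = 'softmax_xent'       # softmax cross-entropy loss
  config.optimizer = 'sgd_momentum'  # SGD with momentum...
  config.momentum = 0.9              # ...of 0.9, kept in bfloat16
  config.weight_decay = 0.0          # no weight decay
  config.dropout = 0.0               # no dropout
  config.stochastic_depth = 0.0      # no stochastic depth
  config.mixup = None                # no mixup
  config.randaugment = None          # no RandAugment
  # Swept separately on minival (see the lists above):
  config.lr = 0.01                   # one of {0.03, 0.01, 0.003, 0.001}
  config.total_steps = 10_000        # one of {500, 2500, 10k, 20k}
  return config
```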

@google-research locked and limited conversation to collaborators Nov 7, 2023
@lucasb-eyer converted this issue into discussion #62 Nov 7, 2023
