
Hyperparameters for training CoCa #455

Answered by iejMac
zw615 asked this question in Q&A


Here are the training parameters we used to train all CoCa models:

```shell
srun --comment openclip --cpu_bind=v --accel-bind=gn python -m training.main \
    --save-frequency 1 \
    --save-most-recent \
    --logs "/scratch" \
    --remote-sync "s3://s-laion/coca_checkpoints/coca_ViT-L-14_run1/" \
    --train-data "pipe:aws s3 cp s3://s-datasets/laion5b/laion2B-data/{000000..231349}.tar -" \
    --train-num-samples 135646078 \
    --dataset-type webdataset \
    --precision amp_bfloat16 \
    --warmup 2000 \
    --batch-size=235 \
    --epochs=97 \
    --dataset-resampled \
    --lr 1e-3 \
    --workers=12 \
    --report-to wandb \
    --model "coca_ViT-L-14" \
    --seed 0 \
    --ddp-static-graph \
…
```
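As a rough sanity check on what those flags imply, here is a small sketch of the totals. The numbers are copied from the command above; note that `--batch-size` in OpenCLIP is per GPU, and since the command is truncated the GPU count is not shown, so the global batch size is left as a function of a hypothetical `num_gpus`.

```python
# Back-of-the-envelope totals implied by the flags above.
# All constants come from the command; num_gpus is NOT in the
# source and is treated as an unknown.

samples_per_epoch = 135_646_078   # --train-num-samples
epochs = 97                       # --epochs
per_gpu_batch = 235               # --batch-size (per GPU in OpenCLIP)

# Total samples the model sees over the whole run (~13.16 billion).
total_samples_seen = samples_per_epoch * epochs
print(f"total samples seen: {total_samples_seen:,}")

def global_batch_size(num_gpus: int) -> int:
    """Global batch size for an assumed GPU count (not given in the command)."""
    return per_gpu_batch * num_gpus
```

With `--dataset-resampled`, an "epoch" here is just a fixed budget of samples drawn with replacement from the shards, not one full pass over LAION-2B.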

Answer selected by zw615