
Can you share some recordings of your experiments #39

Open
realTaki opened this issue Mar 23, 2022 · 2 comments

@realTaki

Can you share some recordings of your experiments, like graphs in neptune.ai or other logs tracking the performance/loss changes over training steps?

I would like to compare the effects of some configurations (e.g. batch size) on training convergence in depth. I think this uses a contrastive loss that depends on a similarity matrix, which may be affected by batch size and converge more slowly with a smaller batch size. Your experiments did not use large batch sizes, so they may not have achieved the best performance yet. I think I want to try something haha~

@m-bain
Owner

m-bain commented Mar 25, 2022

Hi, sure, you can see some runs for MSRVTT here:
https://app.neptune.ai/m-bain/frozen/experiments?split=tbl&dash=charts&viewId=95e7e8f0-79f1-48a4-9bd5-e1017c21309b

Yeah, a smaller batch size will take longer to converge -- and intuitively I would think it gives worse performance due to the n^2 comparisons.

However, I find that for these small datasets a small batch size does really well if you tune the learning rate accordingly, maybe since it's like more augmentation. All my best results are with batch size 8-16. I think during pretraining bigger is better, just because training is hard to converge. Let me know how you get on :)
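
For context, here is a minimal sketch (not the repo's actual loss code) of the kind of symmetric contrastive objective being discussed: the batch size B directly sets the size of the B x B video-text similarity matrix, so each query is contrasted against B-1 in-batch negatives (the n^2 comparisons mentioned above). Function names and the temperature value below are illustrative assumptions.

```python
# Sketch of a symmetric contrastive (InfoNCE-style) loss over a B x B
# similarity matrix. Illustrative only; not the repository's implementation.
import torch
import torch.nn.functional as F

def symmetric_contrastive_loss(video_emb, text_emb, temperature=0.05):
    """video_emb, text_emb: (B, D) embeddings; matching pairs share an index."""
    video_emb = F.normalize(video_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    sim = video_emb @ text_emb.t() / temperature          # (B, B) similarity matrix
    targets = torch.arange(sim.size(0), device=sim.device)  # positives on the diagonal
    loss_v2t = F.cross_entropy(sim, targets)               # video -> text direction
    loss_t2v = F.cross_entropy(sim.t(), targets)           # text -> video direction
    return 0.5 * (loss_v2t + loss_t2v)

# Example: a batch of 8 random embeddings gives only 7 in-batch negatives per query.
if __name__ == "__main__":
    v, t = torch.randn(8, 256), torch.randn(8, 256)
    print(symmetric_contrastive_loss(v, t).item())
```

With batch size 8 each query sees only 7 negatives per step, which is consistent with the point above that the learning rate may need re-tuning as the batch size changes.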

@bryant1410
Contributor

bryant1410 commented Mar 25, 2022

For the sake of sharing results, I have reproduced the pre-training on CC3M+WebVid with 1-frame batch size 512 (instead of 96) and 4-frame batch size 128 (instead of 24). On MSR-VTT (1k-A split) zero-shot I got ~2% absolute improvement in R@1, R@5, and R@10. On MSR-VTT fine-tuning (1k-A split; I can't remember the batch size, but probably 128), I got +2% in R@1, while R@5 and R@10 were essentially the same.
