
Fine-tuning parameter on SUN RGBD and Kinetics400 #33

Open
ryosuke-yamada opened this issue Oct 12, 2022 · 3 comments
ryosuke-yamada commented Oct 12, 2022

Hi.

Thank you very much for your excellent work and for sharing the repository!

I was wondering if you could provide more details on the fine-tuning hyperparameters for SUN RGB-D and Kinetics-400 (in Table 2).

I assume I should start from the ImageNet-1K pre-trained model (ImageSwin) and fine-tune it using the settings in Supplement A, is that right?
Also, does the Omnivore model in Table 2 not use a pre-trained model?

@rohitgirdhar (Contributor) commented:

Hi,
Thanks for your interest.
Yes, the baseline numbers are fine-tuned from the ImageSwin checkpoint. The video numbers in Table 2 come from the Video Swin Transformer, so you can refer to that work for the fine-tuning parameters. For SUN RGB-D fine-tuning, you can use the hyperparameters from Appendix B.
In Table 2, the Omnivore model is trained from scratch on the three datasets, so yes, it does not use any pre-trained model.

@ryosuke-yamada (Author) commented:

@rohitgirdhar
Thank you very much! That clears things up.

I have an additional question.
Do you plan to publish all of the fine-tuning config files (for Table 3)? Because the hyperparameters differ for each fine-tuning dataset, I am struggling to reproduce OMNIVORE's performance.

If possible, I would appreciate it if you would consider releasing all of the Table 3 fine-tuning configs.

@Zhangwenyao1 commented:

I'm interested in understanding the procedure for handling video data during training if the input comprises RGB-D images. Do you simply set them to zero, or is there another approach?
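For context, the Omnivore paper describes treating a single image as a one-frame video, so one plausible reading of this question is how an RGB-D image gets packed into the 5-D clip tensor the video pathway expects. The sketch below illustrates two options with NumPy; the shapes, channel layout (RGB + depth as a 4th channel), and clip length here are assumptions for illustration, not the repository's actual code:

```python
import numpy as np

# Hypothetical layout: a video clip is (C, T, H, W); an RGB-D image has
# C=4 channels (RGB + depth) and no temporal dimension.
rgbd_image = np.random.rand(4, 224, 224).astype(np.float32)

# Option 1 (closest to the paper's "image as single-frame video" idea):
# insert a temporal axis of length 1 rather than zero-padding frames.
clip_single_frame = rgbd_image[:, np.newaxis, :, :]  # shape (4, 1, 224, 224)

# Option 2 (sometimes used when the video pathway expects a fixed clip
# length T): replicate the frame T times along the temporal axis.
T = 8
clip_replicated = np.repeat(rgbd_image[:, np.newaxis], T, axis=1)  # (4, 8, 224, 224)

print(clip_single_frame.shape)
print(clip_replicated.shape)
```

Zero-filling the extra frames (the approach asked about above) would also produce a valid tensor shape, but it changes the input statistics, which is why single-frame or replicated clips are the variants sketched here.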
