About UCF101 and HMDB51 results #19
We haven't experimented with the UCF or HMDB datasets. However, if you do so, please let me know the numbers (from both ImageNet and Kinetics pretrained checkpoints). It would be great to know how TimeSformer performs on these smaller datasets. My intuition is that it might not work as well as some prior methods: our model has a large number of parameters, so these datasets might be too small for it to learn meaningful patterns. Nevertheless, it would be quite interesting to see some results in this setting.
Thanks for the detailed answers! We will let you know if we have the results.
Hi, I have tried TimeSformer on UCF101. With ImageNet pretraining, I only get 42.85% accuracy. The TimeSformer args are as follows: I know that Transformer-based models are not easy to train on small datasets, but I am not sure whether this result is normal. Am I missing some important training details? In the VidTr paper, the authors achieve 96.7% on UCF101 with a Transformer-based model. Looking for an insightful response that helps me get a reasonable result on UCF101. Many thanks!
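As a hedged illustration of the kind of small-dataset fine-tuning adjustments discussed in this thread (this is not the configuration referenced above, and the names and values below are assumptions, not the TimeSformer authors' settings), a common starting point is to load pretrained weights and apply the linear learning-rate scaling rule when shrinking the batch size:

```python
# Illustrative fine-tuning hyperparameters for a small dataset such as UCF101.
# All names and values are assumptions for the sketch, not settings from this repo.

def scaled_lr(base_lr: float, batch_size: int, base_batch: int = 256) -> float:
    """Linear LR scaling rule: scale the learning rate with batch size."""
    return base_lr * batch_size / base_batch

finetune_cfg = {
    "pretrained": "imagenet",    # start from ImageNet (or Kinetics) weights
    "num_frames": 8,             # short clips are typical for fine-tuning
    "base_lr": 0.005,            # LR assumed for a reference batch of 256
    "batch_size": 16,
    "weight_decay": 1e-4,
    "epochs": 15,                # short schedule to limit overfitting
}

# Scale the LR down to match the actual (smaller) batch size.
finetune_cfg["lr"] = scaled_lr(finetune_cfg["base_lr"],
                               finetune_cfg["batch_size"])
print(finetune_cfg["lr"])  # prints 0.0003125
```

The scaling rule matters here because small datasets usually force small batches; keeping the reference-batch learning rate unchanged is a common cause of unstable fine-tuning.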
https://aistudio.baidu.com/aistudio/projectdetail/2291410
Dear Authors,
Thanks for this great repo for reproducing the results in TimeSformer. I just want to check quickly whether you have experimented with the two smaller video classification datasets (i.e., UCF101 and HMDB51) and have any initial results.