
Training Time Required #81

Closed
kevaldoshi17 opened this issue Oct 14, 2021 · 4 comments

Comments

@kevaldoshi17

Hi,

I was trying to train the TimeSformer model from scratch on Kinetics-600, and the estimated training time was shown as ~9 days. The paper mentions a training time of roughly 440 V100 GPU-hours. My setup is 8x Titan V GPUs, so I expected the training time to be closer to 50 hours. What am I missing here?

@gberta
Contributor

gberta commented Oct 14, 2021

The numbers in the paper are reported on Kinetics-400, which is smaller than Kinetics-600. I haven't tested the code with Titan V GPUs so I can't really comment on that.

@kevaldoshi17
Author

Thanks for the quick reply. Just to confirm: Kinetics-400 on 8 V100 GPUs for 15 epochs should take around 50 hours, right?

@gberta
Contributor

gberta commented Oct 15, 2021

It should take around 55 hours, yes. Note that training will be significantly slower if you don't assign enough CPU processes for data loading. To the best of my knowledge, this shouldn't be a problem unless you are using SLURM.
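As a sanity check, the arithmetic behind that estimate can be sketched as a simple GPU-hours-to-wall-clock conversion (the 440 V100 GPU-hours figure comes from the paper; perfect scaling across GPUs is an idealizing assumption):

```python
def wall_clock_hours(gpu_hours: float, num_gpus: int) -> float:
    """Ideal wall-clock time, assuming perfect scaling across GPUs."""
    return gpu_hours / num_gpus

# 440 V100 GPU-hours spread over the 8-GPU setup discussed in this thread.
print(wall_clock_hours(440, 8))  # → 55.0
```

In practice, data-loading bottlenecks and imperfect scaling push this upward, which is consistent with the ~55-hour figure quoted above.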

@kevaldoshi17
Author

Yes, I was using SLURM and didn't set enough CPU processes. Thanks for the help!
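For context on the SLURM issue: a SLURM job only sees the CPUs it requested (e.g. via `--cpus-per-task`), so spawning more data-loading workers than that oversubscribes the allocation. A minimal stdlib sketch for checking the visible CPU count (Linux only; the per-GPU heuristic and `num_gpus` value are illustrative assumptions, not from this thread):

```python
import os

# Under SLURM, os.sched_getaffinity reports only the CPUs granted
# to this job, not the total CPUs on the node (Linux only).
available_cpus = len(os.sched_getaffinity(0))

# Hypothetical heuristic: cap data-loading workers per GPU by the
# CPUs actually visible to the job.
num_gpus = 8  # the setup discussed in this thread
workers_per_gpu = max(1, available_cpus // num_gpus)
print(f"{available_cpus} CPUs visible -> {workers_per_gpu} workers per GPU")
```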
