What is a reasonable training time on kitti dataset? #17

Closed
g1y5x3 opened this issue Mar 17, 2022 · 6 comments

Comments

g1y5x3 commented Mar 17, 2022

Hi, thank you for sharing the implementation. I have been testing the script and referring to the paper to better understand the implementation as a whole. What is a reasonable expected training time on the 80G KITTI dataset? So far, on my machine with 64G of memory and an NVIDIA 2060 GPU, it takes around 25 minutes per epoch. I have tried different batch_size values, but that didn't seem to improve the training time either.

I understand that training on 50G worth of data would never be quick, but 25 minutes per epoch for 10000 epochs seems really out of the ordinary.

Thanks for your help!

Member

nubertj commented Mar 20, 2022

Hi @g1y5x3 ,
It is hard to say what the right duration is, as it depends heavily on the chosen learning rate.
But I certainly never trained it for more than 200 epochs (you will also see that the loss more or less converges after that many epochs).
Best,
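
As a rough check on the figures mentioned in this thread (about 25 minutes per epoch, 200 epochs versus the 10,000-epoch limit), the total wall-clock time can be estimated as below. This is only a back-of-the-envelope sketch: the helper name `total_training_time` is made up for illustration, and it assumes the per-epoch time stays constant over the whole run.

```python
from typing import Tuple


def total_training_time(minutes_per_epoch: float, epochs: int) -> Tuple[float, float]:
    """Return (hours, days) of wall-clock time, assuming a constant epoch time."""
    total_minutes = minutes_per_epoch * epochs
    hours = total_minutes / 60
    return hours, hours / 24


# Figures taken from the thread: 25 min/epoch, 200 vs. 10,000 epochs.
for epochs in (200, 10_000):
    hours, days = total_training_time(25, epochs)
    print(f"{epochs:>6} epochs: {hours:8.1f} h = {days:6.1f} days")
```

At that rate, 200 epochs is roughly three and a half days on the hardware described, while the full 10,000 epochs would take close to half a year.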

nubertj closed this as completed Mar 20, 2022
Author

g1y5x3 commented Mar 21, 2022

Thank you for the clarification. So the 10,000 epochs set in deployer/trainer.py was just a generic configuration, not necessarily the one used to reproduce the results, right?

Author

g1y5x3 commented Mar 21, 2022

Actually, would you mind sharing the training parameters that you used in the paper?

Member

nubertj commented Mar 21, 2022

Yeah, I usually deployed it on servers and wanted it to run as long as possible rather than being killed by an internal epoch limit.
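
For anyone who would rather not watch the loss by hand, one common alternative is to keep the large epoch limit as a safety cap and stop once the loss has stopped improving. The sketch below is not taken from the DeLORA code: `train_until_converged`, `run_epoch`, `patience` and `min_delta` are hypothetical names, and the toy loss curve merely stands in for a real training epoch.

```python
from typing import Callable, List


def train_until_converged(
    run_epoch: Callable[[int], float],
    max_epochs: int = 10_000,   # generous cap, as in the thread
    patience: int = 10,         # epochs to wait without improvement
    min_delta: float = 1e-4,    # smallest drop that counts as improvement
) -> List[float]:
    """Run epochs until the loss stalls for `patience` epochs or the cap is hit."""
    losses: List[float] = []
    best = float("inf")
    stale = 0
    for epoch in range(max_epochs):
        loss = run_epoch(epoch)  # hypothetical: trains one epoch, returns its loss
        losses.append(loss)
        if best - loss > min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break
    return losses


# Toy stand-in for real training: the loss decays towards 0.1.
history = train_until_converged(lambda epoch: 0.1 + 0.9 * 0.9 ** epoch)
print(f"stopped after {len(history)} epochs, final loss {history[-1]:.4f}")
```

Suitable values for `patience` and `min_delta` depend on the learning rate and on how noisy the loss is, so they would need tuning for an actual run.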

Member

nubertj commented Mar 21, 2022

Which parameters do you mean? Model parameters (i.e. weights)?

Member

nubertj commented Mar 21, 2022

There is an example checkpoint for kitti provided here: https://github.com/leggedrobotics/DeLORA/tree/main/checkpoints
