Training is stuck for pretraining #3
Comments
Hi, thank you for the notice. |
I just re-trained for a few epochs and did not see any workers getting stuck. Have you modified anything in the config? Are you running with slidr_minkunet.yaml? |
Thank you! I just found that the images are corrupted! Thank you for the notice! Really helpful! |
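For anyone hitting the same hang, a quick scan for damaged image files can rule this cause out before debugging the dataloader. The following is only a rough sketch: it assumes the images are JPEG files with a `.jpg` extension, and the helper names `is_complete_jpeg` and `find_suspect_jpegs` are made up for illustration. It only checks the start and end markers, so it catches truncated files but not every kind of corruption, and a valid file with trailing bytes would be flagged as well.

```python
from pathlib import Path

JPEG_START = b"\xff\xd8"  # SOI marker: first two bytes of a JPEG stream
JPEG_END = b"\xff\xd9"    # EOI marker: last two bytes of a complete JPEG stream


def is_complete_jpeg(data: bytes) -> bool:
    """Heuristic: True if the bytes begin with SOI and end with EOI."""
    return data.startswith(JPEG_START) and data.endswith(JPEG_END)


def find_suspect_jpegs(root):
    """Return the sorted paths of .jpg files under root that fail the marker check."""
    return sorted(
        path
        for path in Path(root).rglob("*.jpg")
        if not is_complete_jpeg(path.read_bytes())
    )
```

Calling `find_suspect_jpegs` on the dataset root (whatever path you use locally) returns the files worth re-downloading or inspecting by hand.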
Hi, I'm sorry for the late reply. |
Thanks! It would be great if you could send me the code by email!
On Jul 24, 2022, at 2:39 AM, CSautier ***@***.***> wrote:
Hi, I'm sorry for the late reply.
Glad I could be of help. As for the Minkowski SR-UNet, we modified PointRCNN to use our Minkowski-UNet as its backbone and loaded the pre-trained weights into it. We also modified the KITTI dataloader to keep only 1 in n training samples, so that only a fraction of the training data is used. This can be done by subsampling self.kitti_infos in kitti_dataset.py of OpenPCDet.
We did not provide the code as it did not belong to us, and that would have added an entire fork just for this one experiment, but I can send it to you by email if you need it.
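As a rough illustration of the subsampling step mentioned above (not the authors' actual code), here is a minimal sketch. It assumes `self.kitti_infos` is a plain Python list of per-frame records; the function `subsample_infos` and the sample data are hypothetical and not part of OpenPCDet.

```python
def subsample_infos(infos, n):
    """Keep every n-th entry of a list of per-frame info records."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return infos[::n]


# Hypothetical stand-in for self.kitti_infos: ten fake frame records.
kitti_infos = [{"frame_id": f"{i:06d}"} for i in range(10)]

# Keep 1 in 4 training samples.
subset = subsample_infos(kitti_infos, 4)
print([info["frame_id"] for info in subset])  # ['000000', '000004', '000008']
```

In a dataset class, the equivalent would presumably be a single assignment such as `self.kitti_infos = self.kitti_infos[::n]`, applied to the training split only once the infos have been loaded.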
|
I sent you the files by email and am closing this issue. Let me know by email if you need help using them. |
I wonder whether there is a bug around this line: https://github.com/valeoai/SLidR/blob/main/pretrain/dataloader_nuscenes_spconv.py#L321? It seems this line should not be commented out, because your object detection code base uses 4 input features. |
I found an error around line 200 in transforms.py. For some samples, the loop never exits, which stalls training. In those cases, len_indexes = sum_indexes = 0.
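One generic way to guard against this kind of hang is to cap the number of attempts and fall back when no valid sample is found. This is only a sketch of the general pattern, not the code in transforms.py: `sample_until_valid`, `draw`, `is_valid`, and `max_attempts` are all hypothetical names.

```python
import random


def sample_until_valid(draw, is_valid, max_attempts=100):
    """Call draw() until is_valid accepts the result, giving up after max_attempts.

    Returns the first accepted candidate, or None if every attempt was rejected.
    """
    for _ in range(max_attempts):
        candidate = draw()
        if is_valid(candidate):
            return candidate
    return None


# A condition that can never be met: random() is always below 1.0.
rng = random.Random(0)
crop = sample_until_valid(rng.random, lambda x: x > 2.0, max_attempts=10)
print(crop)  # None, instead of looping forever
```

The caller then decides what to do with `None`, for example skipping the augmentation for that sample or using the unmodified input.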