Doubts on Noisy Student #16

Closed
anuj-sharma-19 opened this issue Jan 4, 2022 · 4 comments

Comments

@anuj-sharma-19

anuj-sharma-19 commented Jan 4, 2022

Hi,

First of all, thank you for the great work of putting different semi-supervised methods on lidar point clouds into a single repo.

I have a question about the Noisy Student training. The Noisy Student config https://github.com/PointsCoder/ONCE_Benchmark/blob/master/tools/cfgs/once_models/semi_learning_models/noisy_student_second_large.yaml does not seem to add dropout (DP_RATIO) to the model, but the Noisy Student paper suggests adding it. Am I missing something?

Also, the Noisy Student training seems to run for only 1 cycle instead of the 3 cycles used in the paper. Could you please let me know whether the multi-cycle experiment lowered performance compared to a single cycle?

Comparing the Noisy Student and Pseudo Labels configs, the only difference appears to be that the random augmentations random_world_flip and random_world_rotation are not applied to the Student model in Pseudo Labels. Could you please confirm whether that is the only difference between them?

Looking forward to your reply.

Thank You !!
Anuj

@PointsCoder
Owner

@anuj-sharma-19 Thanks for your attention to our work!

  1. We turned off dropout when generating labels, which is the same as inference. We haven't run experiments with dropout added. Please let us know if you get a better result with it.

  2. We kept 3 cycles to maintain a fair comparison with other approaches. We didn't explore the effect of different numbers of training cycles.

  3. There are actually 2 differences. The first, as you mentioned, is that Pseudo Labels uses NO augmentation, which leads to a much worse result. The second is that in Noisy Student we replace the teacher with the new student after the first round of student-teacher semi-supervised training, which can be done by simply changing the checkpoint that is loaded.
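
To make the first difference in point 3 concrete, here is a minimal pure-Python sketch of the two augmentations named in the question. It is not the repo's implementation: the function bodies, the 0.5 flip probability, the rotation range, and the `student_input` helper are all assumptions made for illustration.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in the LiDAR frame


def random_world_flip(points: List[Point], rng: random.Random) -> List[Point]:
    """With probability 0.5 (assumed), mirror the scene across the x axis."""
    if rng.random() < 0.5:
        return [(x, -y, z) for x, y, z in points]
    return list(points)


def random_world_rotation(
    points: List[Point], rng: random.Random, max_angle: float = math.pi / 4
) -> List[Point]:
    """Rotate the scene about the z axis by a random angle.

    The range [-max_angle, max_angle] is an illustrative assumption.
    """
    angle = rng.uniform(-max_angle, max_angle)
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def student_input(points: List[Point], rng: random.Random, noisy: bool) -> List[Point]:
    """Hypothetical helper: Noisy Student perturbs the student's input,
    while plain Pseudo Labels feeds the points through unchanged."""
    if not noisy:
        return list(points)
    return random_world_rotation(random_world_flip(points, rng), rng)
```

In a real pipeline the pseudo-label boxes would have to go through the same flip and rotation as the points; that bookkeeping is left out here to keep the sketch short.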

@anuj-sharma-19
Author

anuj-sharma-19 commented Jan 6, 2022

Hi,

Thanks for your quick response and the clarifications!!

  1. So, to confirm: in the code, the Teacher does not use dropout, i.e. it runs in inference mode, and the Student does not use dropout either.
  2. Noisy Student is trained for 3 cycles, with the Teacher being replaced by the Student from the previous cycle. Are the other methods, i.e. Mean Teacher, SESS, and 3D-IoU-Match, also trained for 3 cycles in the same way, or is it just Noisy Student? If not, does it mean that Noisy Student is effectively trained for 3 x 150 epochs, whereas the rest are trained for only 150 epochs?

Thanks !!
Anuj

@PointsCoder
Owner

@anuj-sharma-19

  1. The logic for switching between training and eval mode is in semi_train.py; you can check how dropout is handled there:
    [image]

  2. Noisy Student is different from the other methods: we actually train it for 150x2 epochs, replacing the teacher with the student once. The rest are trained for 150 epochs.
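
The schedule in point 2 can be sketched roughly as follows. This is only an illustration of the bookkeeping, not the repo's training script: the `Round` dataclass, the `noisy_student_schedule` function, and the checkpoint file names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Round:
    index: int         # 1-based round number
    teacher_ckpt: str  # checkpoint used to generate pseudo labels
    student_ckpt: str  # checkpoint the student writes at the end of the round
    epochs: int        # student training epochs in this round


def noisy_student_schedule(
    pretrained_ckpt: str, num_rounds: int, epochs_per_round: int
) -> List[Round]:
    """Build the list of rounds; each new student becomes the next teacher."""
    rounds = []
    teacher = pretrained_ckpt
    for i in range(1, num_rounds + 1):
        student = f"student_round{i}.pth"  # hypothetical file name
        rounds.append(Round(i, teacher, student, epochs_per_round))
        teacher = student  # swap: load this checkpoint as the teacher next round
    return rounds


# Two rounds of 150 epochs, i.e. the teacher is replaced once.
schedule = noisy_student_schedule("pretrained.pth", num_rounds=2, epochs_per_round=150)
total_epochs = sum(r.epochs for r in schedule)
for r in schedule:
    print(r)
print("total student epochs:", total_epochs)
```

With these settings the student sees 300 epochs in total, compared with 150 for a method trained in a single round.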

@anuj-sharma-19
Author

Okay, thanks a lot for the clarifications. 👍
