Baseline training schedule #16

Open
L-Reichardt opened this issue Oct 12, 2022 · 10 comments
@L-Reichardt commented Oct 12, 2022

First of all, thank you for uploading this very comprehensive and capable code. It surpasses the quality of what others upload on GitHub in the 3D semantic segmentation space.

Has there been any additional training schedule for the baseline model?
Models like Cylinder3D used unpublished training schedules and only hinted at some methods in GitHub issues or in their paper. Most likely they used additional test-time augmentation, instance augmentation, an unreported augmentation schedule, model depth/channel tuning, LaserMix, ensembling, etc. Potentially every trick in the book.

Is that the same case with this code? Or has the baseline only been trained with the details present in this code? I want to train your code, but would rather not invest resources if I cannot reach the reported 67.4 test result of the baseline.

I've read Issue #13, but do the answers in that issue also apply to the baseline?

@L-Reichardt commented Oct 12, 2022

Additional comments:

  • In the requirements, the package "pyquaternion" is missing.
  • Just for information: I tested different spconv 2.2.x versions (additional speedup over spconv 2.1.x); however, I get a "CUDA illegal memory access" error on an A100-80GB, while it works fine on an RTX 3060. The 2.2.x versions are fine with other codebases on the A100. I am now using spconv 2.1.x with success; a version guard along these lines is sketched below.
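
For reference, a minimal sketch of such a guard, assuming a pip-installed spconv wheel; the distribution names (e.g. spconv-cu113) depend on your CUDA build and are assumptions, not repo requirements:

```python
# Illustrative only: fail fast if an spconv 2.2.x build is installed, since
# those raised "CUDA illegal memory access" on an A100-80GB in this setup.
from importlib.metadata import PackageNotFoundError, version

# Common CUDA wheel names; adjust to match your actual install.
CANDIDATES = ("spconv", "spconv-cu111", "spconv-cu113", "spconv-cu114")

for dist in CANDIDATES:
    try:
        v = version(dist)
    except PackageNotFoundError:
        continue
    assert v.startswith("2.1."), (
        f"{dist}=={v}: 2.2.x caused illegal memory access here; pin 2.1.x."
    )
    print(f"{dist}=={v} looks OK")
    break
```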

@bobpop1 commented Oct 12, 2022

> (quoting @L-Reichardt's original comment above)

As reported in the Cylinder3D paper, they do not use depth/channel tuning, LaserMix, etc. How do you know that they used the mentioned tricks?

@L-Reichardt
Copy link
Author

L-Reichardt commented Oct 12, 2022

@bobpop1
I should have written that statement differently. Most likely they used those tricks, based on the following information.

In the pre-CVPR paper (August 2020) they mention test-time augmentation.
In the CVPR paper (November 2020) they state in the caption of Table 1: "Note that all results are obtained from the literature, where post-processing, flip & rotation test ensemble, etc. are also applied." There is no further detail on this, and augmentation is no longer explicitly stated.

In the pre-CVPR paper they reach 65.2 mIoU on validation and 61.8 mIoU on test (−3.4 mIoU).
In the CVPR paper the behaviour is flipped: 65.9 mIoU on validation, 67.8 mIoU on test (+1.9 mIoU).

The public code matches the pre-CVPR paper (no pointwise refinement, no extra post-processing methods). The model starts to overfit around epoch 15; however, the authors trained for 40 epochs in the CVPR paper, indicating they changed the training regimen to counter this.

There are common authors with PKVD, and since it uses point distillation, they very likely used the (non-public) CVPR code of Cylinder3D. In an issue about overfitting, its author gave suggestions on additional augmentation. A sketch of what such a flip & rotation test ensemble generally looks like follows below.
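
A minimal sketch of the general flip & rotation test-ensemble technique referenced above — an assumption about the standard approach, not code from Cylinder3D; the model interface (per-point logits from an (N, 3+) cloud) is hypothetical:

```python
import math

import torch


@torch.no_grad()
def flip_rotation_tta(model, points: torch.Tensor) -> torch.Tensor:
    """Average logits over 4 yaw rotations x 4 xy-flip states (16 passes).

    points: (N, 3+) tensor with x, y, z in columns 0..2.
    Returns (N, num_classes) averaged logits.
    """
    logits_sum = None
    for k in range(4):  # yaw rotations of 0, 90, 180, 270 degrees
        theta = k * math.pi / 2
        c, s = math.cos(theta), math.sin(theta)
        rot = points.new_tensor([[c, -s], [s, c]])
        for fx, fy in ((1, 1), (-1, 1), (1, -1), (-1, -1)):
            aug = points.clone()
            aug[:, :2] = aug[:, :2] @ rot.T
            aug[:, 0] *= fx
            aug[:, 1] *= fy
            out = model(aug)  # hypothetical: (N, num_classes) logits
            logits_sum = out if logits_sum is None else logits_sum + out
    return logits_sum / 16
```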

@ldkong1205

@L-Reichardt, thank you for collecting and sharing this information! It is indeed helpful.

@yanx27 (owner) commented Oct 13, 2022

Hi @L-Reichardt, thanks for your encouraging comments and useful feedback. This codebase currently releases the baseline training schedule for the validation set. For the online test set, one needs to slightly modify the codebase, and instance augmentation is not supported in the current codebase (the detailed training schedule is shown in our supplementary). Moreover, the SemanticKITTI benchmark is a very challenging online competition; I think the best results of all previous works were gained by carefully tuning the models. Directly training this codebase for 64 epochs on the online test set may not achieve the best results on the benchmark, but it will be in a very reasonable range.

@L-Reichardt commented Oct 13, 2022

@yanx27 Thank you for your answer. I am currently training your baseline model using the default configs from this repo. I am aware I will not reach the test results from the paper yet, as I am not tuning on the validation set.

The topic of augmentation is still somewhat unclear to me.

In the paper (Supplementary A) I read: rotate, scale.
In your code I find: rotate, flip, scale, translate + instance augmentation (Issue #13).

Are all augmentations from the code used? Or just rotate + scale + instance? Are any augmentations missing from this list? (A sketch of the geometric ones is below.)
Could you please go into more detail about "one needs to slightly modify the codebase"?
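
For concreteness, a minimal sketch of the four geometric augmentations named above (rotate, flip, scale, translate); the value ranges are illustrative assumptions, not the repo's actual configuration:

```python
import math

import numpy as np


def augment_cloud(points: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Rotate / flip / scale / translate a point cloud (N, 3+), xyz in cols 0..2."""
    out = points.copy()
    xyz = out[:, :3]
    # Random yaw rotation around the z-axis.
    theta = rng.uniform(0.0, 2.0 * math.pi)
    c, s = math.cos(theta), math.sin(theta)
    xyz[:, :2] = xyz[:, :2] @ np.array([[c, -s], [s, c]]).T
    # Random flips along the x and y axes.
    if rng.random() < 0.5:
        xyz[:, 0] *= -1.0
    if rng.random() < 0.5:
        xyz[:, 1] *= -1.0
    # Global scaling and translation jitter (ranges assumed, not from the repo).
    xyz *= rng.uniform(0.95, 1.05)
    xyz += rng.normal(0.0, 0.2, size=3)
    return out
```

Usage would be e.g. `augment_cloud(cloud, np.random.default_rng(0))` once per training sample.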

  • Lambda factor for the Lovász loss? Was this always 0.1? (A sketch of where it would enter is below.)
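
A minimal sketch of where such a lambda factor typically enters, assuming the common cross-entropy + weighted Lovász-softmax pattern; the lovasz_softmax callable and the ignore index are placeholders, not confirmed repo details:

```python
import torch.nn.functional as F


def combined_loss(logits, labels, lovasz_softmax, lam=0.1, ignore=255):
    """logits: (B, C, N) per-point scores; labels: (B, N) int64 class ids.

    lovasz_softmax is an external callable (e.g. the common reference
    implementation); lam=0.1 mirrors the value asked about above.
    """
    ce = F.cross_entropy(logits, labels, ignore_index=ignore)
    lv = lovasz_softmax(F.softmax(logits, dim=1), labels, ignore=ignore)
    return ce + lam * lv
```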

@bobpop1 commented Oct 14, 2022

> (quoting @L-Reichardt's reply above)

Thank you for your reply.

@bobpop1 commented Oct 20, 2022

> (quoting @L-Reichardt's original comment above)

Hi, thanks for the information. Did you reach the reported 72.9 test result from the paper? I only achieve nearly 69 mIoU, which is much lower than the reported scores. I think there are also other tricks for this code.

@L-Reichardt

Hello @bobpop1, I've retrained the baseline using the authors' script and reached ~64 validation mIoU, so something is still missing. Adding extra augmentations gets me to 67 mIoU.

I did not test the rest of their model and have not put the baseline on the KITTI test server yet.

@iris0329

Hi @L-Reichardt, could you share with us what kind of extra augmentations you used for 67 mIoU?
