About training with DN-Deformable-DETR-R50 #120
Yes, it is possible. In our original DN-DETR repo, DN-Deformable-DETR can get …
Thanks, but I wonder why the result of DN-Deformable-DETR in paper [1] is 43.4 and why "Deformable" brings such a large improvement. DN-DETR-R50 gets only 38.5 in paper [2], and I have also verified that.
[1] DN-DETR: Accelerate DETR Training by Introducing Query DeNoising
DN-DETR-R50 gets only 38.5 because detection models without multi-scale features converge slowly and do poorly on small objects. Multi-scale features are powerful, and deformable attention is well suited to fusing multi-scale features. Together these make the model even stronger.
Thanks very much for your answer!
I just want to know how long you trained your model with "dataloader.train.total_batch_size = 4 and train.max_iter = 360000"? Appreciated!!
Training DN-Deformable-DETR-R50 (12 epochs) took me about 55 hours on one Tesla A100.
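As a rough sanity check, the quoted schedule (total_batch_size = 4, max_iter = 360000) does work out to about 12 epochs. This sketch assumes the standard COCO train2017 split of roughly 118,287 images, which the thread itself does not state:

```python
# Rough sanity check of the single-GPU schedule quoted above.
# Assumption: COCO train2017 contains ~118,287 training images.
COCO_TRAIN_IMAGES = 118_287

def epochs_for_schedule(total_batch_size: int, max_iter: int,
                        dataset_size: int = COCO_TRAIN_IMAGES) -> float:
    """Approximate number of epochs covered by a fixed-iteration schedule."""
    iters_per_epoch = dataset_size / total_batch_size
    return max_iter / iters_per_epoch

# total_batch_size=4, max_iter=360000 -> about 12.17 epochs on COCO
print(round(epochs_for_schedule(4, 360_000), 2))
```

So the 360000-iteration setting at batch size 4 is just the 12-epoch schedule rescaled from a larger batch size.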
I use one V100 32G with the same settings as you, and detrex shows more than 4 days 🤣
Since I only have one Tesla A100 GPU, I trained DN-Deformable-DETR-R50 for 12 epochs with dataloader.train.total_batch_size = 4 and train.max_iter = 360000. At iteration 334999 the AP and AP50 are 46.5559 and 64.1131, and the result seems to be better.
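For anyone reproducing this single-GPU setup: detrex builds on detectron2's LazyConfig system, so config values like these can be overridden on the command line. A sketch follows; the exact config-file path is an assumption and may differ in your checkout:

```shell
# Hypothetical invocation; check projects/dn_deformable_detr/configs/
# in your detrex checkout for the actual config filename.
python tools/train_net.py \
    --config-file projects/dn_deformable_detr/configs/dn_deformable_detr_r50_12ep.py \
    --num-gpus 1 \
    dataloader.train.total_batch_size=4 \
    train.max_iter=360000
```

Note that shrinking the batch size without touching the learning rate or LR-decay milestones changes the effective schedule, which may explain result differences between runs.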
Why is the detection accuracy so high?