About training with DN-Deformable-DETR-R50 #120

Closed
Mr-Bigworth opened this issue Oct 29, 2022 · 8 comments

@Mr-Bigworth

When I train DN-Deformable-DETR-R50 for 12 epochs, because I only have one Tesla A100 GPU, I set dataloader.train.total_batch_size = 4 and train.max_iter = 360000. The resulting AP and AP50 are 46.5559 and 64.1131 at iteration 334999, which seems better than expected.
Why is the detection accuracy so high?
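
A minimal sketch of how such overrides can be applied programmatically, assuming detectron2's LazyConfig API; the exact config path inside detrex is an assumption and may differ in your checkout:

```python
from detectron2.config import LazyConfig

# Load the stock 12-epoch DN-Deformable-DETR config (path is an assumption).
cfg = LazyConfig.load("projects/dn_deformable_detr/configs/dn_deformable_detr_r50_12ep.py")

cfg.dataloader.train.total_batch_size = 4  # fits a single A100
cfg.train.max_iter = 360000                # 4x the iterations to compensate for 1/4 batch size
```

detrex's train_net.py follows detectron2's LazyConfig convention, so the same overrides can typically also be passed as key=value arguments on the command line.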

@FengLi-ust
Collaborator

Yes, it is possible. In our original DN-DETR repo, DN-Deformable-DETR can get 46.1 in 12 epochs.
One reason for your better performance is that training with a small batch size accelerates convergence in the early stage; I have verified this before.
In addition, the detrex implementation is better than our original DN-DETR repo. (:
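
For context, 360k iterations at batch size 4 processes roughly the same number of images as a 90k-iteration schedule at batch size 16, i.e. about 12 COCO epochs. A quick back-of-the-envelope check (the COCO train2017 size and the 90k/16 default are assumptions based on the usual detrex setup):

```python
# Rough iteration-count equivalence across batch sizes (all numbers assumed).
coco_train_images = 118_287          # COCO train2017
epochs = 12

iters_bs16 = epochs * coco_train_images / 16   # ~88,715 -> rounded to the 90k default
iters_bs4 = epochs * coco_train_images / 4     # ~354,861 -> consistent with max_iter=360000
print(round(iters_bs16), round(iters_bs4))
```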

@FengLi-ust FengLi-ust self-assigned this Oct 30, 2022
@Mr-Bigworth
Author

Yes, it is possible. In our original DN-DETR repo, DN-Deformable-DETR can get 46.1 in 12 epochs. One reason for your better performance is that training with a small batch size accelerates convergence in the early stage; I have verified this before. In addition, the detrex implementation is better than our original DN-DETR repo. (:

Thanks, but I wonder why the result of DN-Deformable-DETR in paper [1] is 43.4, and why 'Deformable' can bring such a large improvement. DN-DETR-R50 gets only 38.5 in paper [2], and I have verified that as well.

@Mr-Bigworth
Author

[1] DN-DETR: Accelerate DETR Training by Introducing Query DeNoising
[2] Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment

@FengLi-ust
Collaborator

FengLi-ust commented Oct 30, 2022

The 43.4 result is our initial implementation, without deformable attention in the decoder and some other optimizations. Our released model performs better because it combines Deformable DETR and DN-DETR more effectively.

DN-DETR-R50 gets only 38.5 because detection models without multi-scale features converge slowly and cannot do well on small objects. Multi-scale features are strong, and deformable attention is also strong at fusing multi-scale features.

All of this together makes the model even stronger.
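
To make the multi-scale point concrete, here is an illustrative sketch (not detrex code; shapes and names are assumptions) of the standard Deformable DETR idea of flattening several feature levels into one token sequence, which deformable attention can then sample sparsely across scales:

```python
import torch

# Four backbone/neck levels at strides 8/16/32/64, all projected to 256 channels.
feats = [torch.randn(1, 256, 100, 134),
         torch.randn(1, 256, 50, 67),
         torch.randn(1, 256, 25, 34),
         torch.randn(1, 256, 13, 17)]

# Learnable per-level embedding so tokens remember which scale they came from.
level_embed = torch.nn.Parameter(torch.randn(len(feats), 256))

tokens = []
for lvl, f in enumerate(feats):
    t = f.flatten(2).transpose(1, 2)      # (B, C, H, W) -> (B, H*W, C)
    tokens.append(t + level_embed[lvl])
src = torch.cat(tokens, dim=1)            # one sequence over all scales
print(src.shape)                          # (1, 17821, 256)

# Deformable attention then samples only a handful of points per level around
# each query's reference point, which is what makes multi-scale fusion cheap.
```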

@Mr-Bigworth
Author

The 43.4 result is our initial implementation, without deformable attention in the decoder and some other optimizations. Our released model performs better because it combines Deformable DETR and DN-DETR more effectively.

DN-DETR-R50 gets only 38.5 because detection models without multi-scale features converge slowly and cannot do well on small objects. Multi-scale features are strong, and deformable attention is also strong at fusing multi-scale features.

All of this together makes the model even stronger.

Thanks very much for your answer!

@hotcore

hotcore commented Nov 19, 2022

Man, I just want to know how long it took you to train your model with "dataloader.train.total_batch_size = 4 and train.max_iter = 360000". Appreciate it!!

@Mr-Bigworth
Author

Man, I just want to know how long it took you to train your model with "dataloader.train.total_batch_size = 4 and train.max_iter = 360000". Appreciate it!!

Training DN-Deformable-DETR-R50 (12 epochs) cost me about 55 hours on one Tesla A100.

@hotcore

hotcore commented Nov 20, 2022

Man, I just want to know how long it took you to train your model with "dataloader.train.total_batch_size = 4 and train.max_iter = 360000". Appreciate it!!

Training DN-Deformable-DETR-R50 (12 epochs) cost me about 55 hours on one Tesla A100.

I used one V100 32G with the same settings as yours, and detrex shows more than 4 days 🤣
Anyway, thanks for your reply!!
