Reproduce DINO-R50 and get a higher result (49.9) with batch_size=1 and nGPU=8 #150

Closed
FelixCaae opened this issue Nov 30, 2022 · 3 comments
Assignees

Comments

@FelixCaae
Copy link
Contributor

I reproduced DINO with dino_r50_4scale_12ep.py, but set batch_size=1. I used max_iter=90000 × 2 and dropped the learning rate at the 165000th iteration. I then got a result higher than the one this repo reports. Since this result (49.9) is clearly better than the current one (49.2), is there perhaps something wrong with my setting? Or is this simply a better training setting than the default (batch_size=2)?

[11/30 21:53:00 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 4.95 seconds.
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.499
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.674
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.546
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.326
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.531
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.645
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.380
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.659
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.731
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.573
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.772
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.883
[11/30 21:53:00 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
|   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 49.890 | 67.436 | 54.627 | 32.601 | 53.056 | 64.499 |
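For clarity, the deviation from the default 12-epoch recipe can be written as a LazyConfig-style override. The field names below (`dataloader.train.total_batch_size`, `train.max_iter`, `lr_multiplier`) follow the general detectron2/detrex convention and are illustrative only; check them against the actual config in this repo before using:

```python
# Illustrative sketch of the changed settings, on top of dino_r50_4scale_12ep.py.
# Field names are assumptions based on the detectron2 LazyConfig convention,
# not verified against this repository.
dataloader.train.total_batch_size = 8        # batch_size=1 per GPU x 8 GPUs
train.max_iter = 180000                      # default 90000 x 2
# drop the learning rate at the 165000th iteration instead of the default milestone
lr_multiplier.scheduler.milestones = [165000, 180000]
```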
@rentainhe
Copy link
Collaborator

rentainhe commented Nov 30, 2022


Wow, nice results~ Would you like to share your training log with us? And would you be willing to provide your checkpoint and config by opening a new pull request? New contributors are very welcome : )

@HaoZhang534
Copy link
Collaborator

@FelixCaae Your result is normal. You used a smaller total batch size of 8 with more training iterations, which leads to better performance in the early stage of training. However, if you continue training until convergence, the result should be no higher than with a total batch size of 16. We have actually observed the same phenomenon when training other models.
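The arithmetic behind this comment can be checked quickly: both schedules process the same total number of images, but the smaller batch takes twice as many optimizer steps, which tends to help early in training. (The COCO 2017 train set size below is a known constant; the batch/iteration numbers come from the thread.)

```python
# Both schedules see the same number of training images overall,
# but the batch_size=8 run performs twice as many optimizer steps.
coco_train_images = 118_287  # COCO 2017 train split size

default_images = 16 * 90_000    # total_batch_size=16, 90k iters
modified_images = 8 * 180_000   # total_batch_size=8, 180k iters (this issue)

assert default_images == modified_images == 1_440_000
print(default_images / coco_train_images)  # ~12.2 epochs in both cases
```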

@rentainhe
Copy link
Collaborator

As there has been no further activity, I am closing this issue~ Feel free to reopen it if necessary, or open a new issue if you run into other problems.
