Reproduce DINO-R50 and get a higher result (49.9) with batch_size=1 and nGPU=8 #150

Closed
FelixCaae opened this issue Nov 30, 2022 · 3 comments
Assignees

Comments

@FelixCaae
Copy link
Contributor

I reproduced DINO with dino_r50_4scale_12ep.py, but set batch_size=1. I used max_iter=90000 × 2 and dropped the learning rate at the 165000th iteration. I then got a result higher than the one this repo reports. Since this result (49.9) is clearly better than the current one (49.2), is there perhaps something wrong with my setting? Or is this simply a better training setting than the default (batch_size=2)?

[11/30 21:53:00 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 4.95 seconds.
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.499
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.674
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.546
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.326
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.531
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.645
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.380
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.659
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.731
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.573
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.772
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.883
[11/30 21:53:00 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
|   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 49.890 | 67.436 | 54.627 | 32.601 | 53.056 | 64.499 |
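For clarity, the deviation from the default 12-epoch recipe can be written as a LazyConfig-style override. The field names below (`dataloader.train.total_batch_size`, `train.max_iter`, `lr_multiplier`) follow the general detectron2/detrex convention and are illustrative only; check them against the actual config in this repo before using:

```python
# Illustrative sketch of the changed settings, on top of dino_r50_4scale_12ep.py.
# Field names are assumptions based on the detectron2 LazyConfig convention,
# not verified against this repository.
dataloader.train.total_batch_size = 8        # batch_size=1 per GPU x 8 GPUs
train.max_iter = 180000                      # default 90000 x 2
# drop the learning rate at the 165000th iteration instead of the default milestone
lr_multiplier.scheduler.milestones = [165000, 180000]
```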
@rentainhe
Copy link
Collaborator

rentainhe commented Nov 30, 2022


Wow, nice results~ Would you like to share your training log with us? And would you be willing to provide your checkpoint and config by opening a new pull request? New contributors are very welcome : )

@HaoZhang534
Copy link
Collaborator

@FelixCaae Your result is normal. You used a smaller total batch size of 8 with more training iterations, which leads to better performance in the early stage of training. However, if you continue training until convergence, the result should be no higher than with a total batch size of 16. We have actually observed the same phenomenon when training other models.
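The arithmetic behind this comment can be checked quickly: both schedules process the same total number of images, but the smaller batch takes twice as many optimizer steps, which tends to help early in training. (The COCO 2017 train set size below is a known constant; the batch/iteration numbers come from the thread.)

```python
# Both schedules see the same number of training images overall,
# but the batch_size=8 run performs twice as many optimizer steps.
coco_train_images = 118_287  # COCO 2017 train split size

default_images = 16 * 90_000    # total_batch_size=16, 90k iters
modified_images = 8 * 180_000   # total_batch_size=8, 180k iters (this issue)

assert default_images == modified_images == 1_440_000
print(default_images / coco_train_images)  # ~12.2 epochs in both cases
```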

@rentainhe
Copy link
Collaborator

As there has been no further activity, I am closing this issue~ Feel free to reopen it if necessary, or open a new issue if you run into other problems.
