
[URGENT] Eval results are much lower than what's reported #10

Closed
encounter1997 opened this issue Oct 3, 2021 · 2 comments
Labels
bug (Something isn't working), good first issue (Good for newcomers)

Comments

@encounter1997

Hi, thanks for the excellent work!

I followed the instructions in the README to evaluate the models provided in your repo. However, the APs I got for yolos_ti.pth, yolos_s_200_pre.pth, yolos_s_300_pre.pth, yolos_s_dWr.pth, and yolos_base.pth are 28.7, 12.5, 12.7, 13.2, and 13.8, respectively. While yolos_ti.pth matches the performance in your paper and log, the other four models are significantly lower than expected.
Any idea why this would happen? Thanks in advance!

For example, when evaluating the base model, I ran

python  -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --coco_path ../data/coco --batch_size 2 --backbone_name base --eval --eval_size 800 --init_pe_size 800 1344 --mid_pe_size 800 1344 --resume ../trained_weights/yolos/yolos_base.pth

and expected to obtain 42.0 AP, as reported in your paper and log. However, the result is only 13.8 AP.

The complete evaluation output is shown below.

*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
| distributed init (rank 0): env://
| distributed init (rank 2): env://
| distributed init (rank 3): env://
| distributed init (rank 1): env://
| distributed init (rank 6): env://
| distributed init (rank 5): env://
| distributed init (rank 7): env://
| distributed init (rank 4): env://
Namespace(backbone_name='base', batch_size=2, bbox_loss_coef=5, clip_max_norm=0.1, coco_panoptic_path=None, coco_path='../data/coco', dataset_file='coco', decay_rate=0.1, det_token_num=100, device='cuda', dice_loss_coef=1, dist_backend='nccl', dist_url='env://', distributed=True, eos_coef=0.1, epochs=150, eval=True, eval_size=800, giou_loss_coef=2, gpu=0, init_pe_size=[800, 1344], lr=0.0001, lr_backbone=1e-05, lr_drop=100, mid_pe_size=[800, 1344], min_lr=1e-07, num_workers=2, output_dir='', pre_trained='', rank=0, remove_difficult=False, resume='../trained_weights/yolos/yolos_base.pth', sched='warmupcos', seed=42, set_cost_bbox=5, set_cost_class=1, set_cost_giou=2, start_epoch=0, use_checkpoint=False, warmup_epochs=0, warmup_lr=1e-06, weight_decay=0.0001, world_size=8)
Has mid pe
number of params: 127798368
loading annotations into memory...
Done (t=23.52s)
creating index...
index created!
800
loading annotations into memory...
Done (t=3.00s)
creating index...
index created!
Test:  [  0/313]  eta: 0:39:39  class_error: 29.21  loss: 2.1542 (2.1542)  loss_bbox: 0.4245 (0.4245)  loss_ce: 0.7761 (0.7761)  loss_giou: 0.9535 (0.9535)  cardinality_error_unscaled: 5.3750 (5.3750)  class_error_unscaled: 29.2100 (29.2100)  loss_bbox_unscaled: 0.0849 (0.0849)  loss_ce_unscaled: 0.7761 (0.7761)  loss_giou_unscaled: 0.4768 (0.4768)  time: 7.6030  data: 0.5298  max mem: 3963
Test:  [256/313]  eta: 0:00:26  class_error: 17.22  loss: 2.5668 (2.6435)  loss_bbox: 0.5639 (0.5792)  loss_ce: 0.8598 (0.8386)  loss_giou: 1.1904 (1.2257)  cardinality_error_unscaled: 3.8750 (4.2398)  class_error_unscaled: 28.7817 (28.6160)  loss_bbox_unscaled: 0.1128 (0.1158)  loss_ce_unscaled: 0.8598 (0.8386)  loss_giou_unscaled: 0.5952 (0.6129)  time: 0.4406  data: 0.0137  max mem: 10417
Test:  [312/313]  eta: 0:00:00  class_error: 16.29  loss: 2.8745 (2.6626)  loss_bbox: 0.5974 (0.5833)  loss_ce: 0.8791 (0.8461)  loss_giou: 1.3012 (1.2332)  cardinality_error_unscaled: 3.8750 (4.2370)  class_error_unscaled: 26.2946 (28.7748)  loss_bbox_unscaled: 0.1195 (0.1167)  loss_ce_unscaled: 0.8791 (0.8461)  loss_giou_unscaled: 0.6506 (0.6166)  time: 0.4251  data: 0.0134  max mem: 10417
Test: Total time: 0:02:25 (0.4663 s / it)
Averaged stats: class_error: 16.29  loss: 2.8745 (2.6626)  loss_bbox: 0.5974 (0.5833)  loss_ce: 0.8791 (0.8461)  loss_giou: 1.3012 (1.2332)  cardinality_error_unscaled: 3.8750 (4.2370)  class_error_unscaled: 26.2946 (28.7748)  loss_bbox_unscaled: 0.1195 (0.1167)  loss_ce_unscaled: 0.8791 (0.8461)  loss_giou_unscaled: 0.6506 (0.6166)
Accumulating evaluation results...
DONE (t=15.78s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.13810
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.26766
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.11832
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.05146
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.13066
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.23324
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.18115
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.29001
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.31740
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.12520
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.31154
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.49446

encounter1997 added the bug label Oct 3, 2021
@Yuxin-CV
Member

Yuxin-CV commented Oct 3, 2021

Hi~@encounter1997, thanks for your interest in YOLOS and thanks for pointing out this issue :)

The YOLOS codebase is built upon DETR's codebase, so there is a "bug" inherited from DETR: during evaluation you need to set num_GPU and batchsize_per_GPU to the same values used during training, e.g., num_GPU = 8 & batchsize_per_GPU = 1 for YOLOS-Small & YOLOS-Base.

It seems that you set batchsize_per_GPU = 2 during evaluation, which causes the AP degradation (see the sketch below for why padded batching makes evaluation sensitive to batch size).

Try

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --coco_path /path/to/coco --batch_size 1 --backbone_name small --eval --eval_size 800 --init_pe_size 512 864 --mid_pe_size 512 864 --resume /path/to/YOLOS-Small

to reproduce YOLOS-Small AP, which should be 36.1.
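For reference, the batch-size sensitivity most likely comes from DETR-style padded batching: images in a batch are zero-padded up to the largest height and width in that batch, so the padded input size the model sees (and the positional-embedding interpolation that follows) depends on which images happen to be batched together. Below is a minimal sketch of that collate behaviour in PyTorch, assuming the standard DETR nested-tensor logic; pad_to_batch is a hypothetical helper for illustration, not the repo's actual function.

import torch

def pad_to_batch(images):
    # images: list of 3xHxW tensors, each with its own H and W
    # (hypothetical helper mirroring DETR-style collate, not the YOLOS repo's code)
    max_h = max(img.shape[1] for img in images)
    max_w = max(img.shape[2] for img in images)
    batch = torch.zeros(len(images), 3, max_h, max_w)
    mask = torch.ones(len(images), max_h, max_w, dtype=torch.bool)  # True marks padded pixels
    for i, img in enumerate(images):
        _, h, w = img.shape
        batch[i, :, :h, :w].copy_(img)
        mask[i, :h, :w] = False
    return batch, mask

# With batch_size=1 every image keeps its own padded size; with batch_size=2 the
# smaller image of each pair is padded up to the larger one, so at evaluation time
# the model sees input sizes it was not validated with, which shifts the detections.

Keeping num_GPU and batchsize_per_GPU identical to the training setup keeps these padded shapes consistent with the ones the released checkpoints were evaluated with.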

Yuxin-CV added the good first issue label Oct 3, 2021
Yuxin-CV added commits that referenced this issue Oct 3, 2021
@encounter1997
Author

Thanks for your timely reply! I followed your advice and the problem was solved~
