Lower results when evaluating released BEVDet checkpoint #41

Closed
Divadi opened this issue Jul 13, 2022 · 11 comments

Comments

@Divadi
Contributor

Divadi commented Jul 13, 2022

Hello, I tried to evaluate the released BEVDet checkpoint as-is on my setup, but I get

mAP: 0.2751
mATE: 0.7179
mASE: 0.2738
mAOE: 0.5512
mAVE: 0.8747
mAAE: 0.2205
NDS: 0.3737
Eval time: 107.4s

Per-class results:
Object Class            AP      ATE     ASE     AOE     AVE     AAE
car                     0.441   0.631   0.167   0.131   1.037   0.254
truck                   0.197   0.757   0.225   0.125   0.828   0.227
bus                     0.283   0.680   0.185   0.139   1.895   0.350
trailer                 0.132   1.053   0.224   0.463   0.547   0.068
construction_vehicle    0.066   0.795   0.484   1.174   0.095   0.358
pedestrian              0.301   0.788   0.305   1.320   0.848   0.412
motorcycle              0.235   0.704   0.262   0.612   1.437   0.090
bicycle                 0.182   0.607   0.265   0.875   0.310   0.006
traffic_cone            0.445   0.616   0.333   nan     nan     nan
barrier                 0.468   0.547   0.287   0.122   nan     nan

which is lower than the expected 30.8/40.4 mAP/NDS.
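
For reference, the NDS line in the log can be recomputed from the other summary metrics. The sketch below uses the standard nuScenes detection score definition, NDS = (5 * mAP + sum over the five TP errors of (1 - min(1, error))) / 10; the function name `nds` is only illustrative and is not part of the BEVDet or nuScenes code.

```python
def nds(m_ap, m_ate, m_ase, m_aoe, m_ave, m_aae):
    """nuScenes detection score from mAP and the five mean TP errors."""
    tp_errors = [m_ate, m_ase, m_aoe, m_ave, m_aae]
    # Each error is clipped to 1 and turned into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    # mAP carries weight 5, each TP score weight 1, normalised by 10.
    return (5.0 * m_ap + sum(tp_scores)) / 10.0


# Summary metrics from the evaluation log above.
score = nds(0.2751, 0.7179, 0.2738, 0.5512, 0.8747, 0.2205)
print(f"NDS = {score:.4f}")  # NDS = 0.3737
```

Because every term enters the score linearly, a shortfall of about 0.03 in mAP alone accounts for roughly 0.015 NDS; the rest of the gap comes from the TP error terms.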

I am using A6000 GPUs, torch 1.10.1, cudatoolkit 11.3. Do you know what might be the issue?

My numbers exactly match those reported by @BoLang615 in #15, but I believe I am using the latest version. I would appreciate any pointers.

Thank you!

@HuangJunJie2017
Owner

@Divadi Did you train this with 4 GPUs, a total batch size of 8x4=32, and lr=1e-4?

@Divadi
Contributor Author

Divadi commented Jul 13, 2022

This is without re-training; I just loaded & evaluated the released checkpoint.

I'm running a separate training job with 4 GPUs, a total batch size of 16x4=64, and the original lr, but it has not completed yet.

@HuangJunJie2017
Owner

@Divadi This seems to be a common problem; it requires checking the numerical consistency of the intermediate results. I have stored some intermediate results for the first sample in a pickle file (python3.7).
[image]

check.zip

@Divadi
Contributor Author

Divadi commented Jul 13, 2022

Hmm...

When I load your pkl and compare it with mine:

>>> a = pickle.load(open("check.pkl", 'rb')); b = pickle.load(open("check_divadi.pkl", 'rb'))
>>> a.keys()
dict_keys(['points', 'pred_bboxes', 'out_dir', 'file_name', 'bbox_pts', 'img_metas'])
>>> a['file_name']
'n015-2018-07-11-11-54-16+0800__LIDAR_TOP__1531281629949213'
>>> b['img_metas'][0]['pts_filename']
'datasets/nuscenes/samples/LIDAR_TOP/n015-2018-07-11-11-54-16+0800__LIDAR_TOP__1531281439800013.pcd.bin'

The first file path itself is different, and the predictions are different as well. Is what you sent me the first sample as loaded by the pipeline?
Also, for reference: I saw that nuscenes_converter does not differ from the one in mmdetection3d's pre-coordinate-change version, so I just used the pkl files generated by that version.
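
The key-by-key comparison described above can be automated. The following is a minimal sketch, assuming both dumps are plain dicts whose values are lists, NumPy arrays, CPU tensors, or simple Python objects; `load_dump` and `compare_dumps` are hypothetical helpers, not functions from the BEVDet repository.

```python
import pickle

import numpy as np


def load_dump(path):
    """Load one pickled dict of intermediate results."""
    with open(path, "rb") as f:
        return pickle.load(f)


def compare_dumps(ref, other, atol=1e-5):
    """Return {key: status} describing how two result dicts differ."""
    report = {}
    for key in sorted(set(ref) | set(other)):
        if key not in ref or key not in other:
            report[key] = "missing in one dump"
            continue
        try:
            a = np.asarray(ref[key], dtype=np.float64)
            b = np.asarray(other[key], dtype=np.float64)
        except (TypeError, ValueError):
            # Non-numeric values (file names, metadata): plain equality.
            report[key] = "equal" if ref[key] == other[key] else "differs"
            continue
        if a.shape != b.shape:
            report[key] = f"shape mismatch {a.shape} vs {b.shape}"
        elif np.allclose(a, b, atol=atol):
            report[key] = "match"
        else:
            report[key] = f"max abs diff {np.abs(a - b).max():.3g}"
    return report
```

With the two files from this thread, the call would be `compare_dumps(load_dump("check.pkl"), load_dump("check_divadi.pkl"))`. Checking the sample identifier first, as done above, avoids comparing predictions for two different samples.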

@HuangJunJie2017
Owner

I set workers_per_gpu=0.
Here is the md5sum of my test pkl, so you can check yours as well:
efd90b7e93c43fc18e98a0cf0ec8b1c4 /nuscenes_infos_val.pkl
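
On a machine without the `md5sum` command, the same check can be done with Python's standard library. This is a small sketch; `md5_of_file` is an illustrative helper name, and the path in the usage note is an assumption about where the info file lives locally.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Checksum quoted above for nuscenes_infos_val.pkl.
EXPECTED_MD5 = "efd90b7e93c43fc18e98a0cf0ec8b1c4"
```

For example, `md5_of_file("data/nuscenes/nuscenes_infos_val.pkl") == EXPECTED_MD5` should hold if the local info file is byte-identical to the owner's.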

@HuangJunJie2017
Owner

Sorry, I made two mistakes: I used 'test.pkl' instead of 'check.pkl', and 'img_feats' instead of 'img_metas'.
Here is the corrected pkl:
check.zip.zip

@Divadi
Contributor Author

Divadi commented Jul 13, 2022

I will check the pkl & zip further when I get home.

The results from training the model myself are as follows:

mAP: 0.3050
mATE: 0.6869
mASE: 0.2754
mAOE: 0.5599
mAVE: 0.8782
mAAE: 0.2481
NDS: 0.3876
Eval time: 120.7s

Per-class results:
Object Class            AP      ATE     ASE     AOE     AVE     AAE
car                     0.503   0.542   0.160   0.109   0.929   0.228
truck                   0.209   0.721   0.224   0.172   0.813   0.228
bus                     0.300   0.731   0.188   0.093   1.747   0.440
trailer                 0.170   1.048   0.242   0.385   0.617   0.112
construction_vehicle    0.055   0.894   0.485   1.118   0.106   0.392
pedestrian              0.325   0.743   0.302   1.343   0.861   0.495
motorcycle              0.262   0.678   0.259   0.670   1.680   0.075
bicycle                 0.218   0.544   0.275   1.030   0.272   0.015
traffic_cone            0.503   0.501   0.332   nan     nan     nan
barrier                 0.506   0.468   0.288   0.119   nan     nan

@Divadi Divadi closed this as completed Jul 13, 2022
@Divadi Divadi reopened this Jul 13, 2022
@HuangJunJie2017
Owner

@Divadi mAVE and mAAE are a bit low. Some 'abnormal' examples can be found in issue #21 (I think others do not report their results when they look OK).

@HuangJunJie2017
Owner

HuangJunJie2017 commented Jul 14, 2022

Regarding the self-trained results quoted from your previous comment (mAP: 0.3050, NDS: 0.3876): maybe epoch 18 is better.

@Divadi
Contributor Author

Divadi commented Jul 16, 2022

@HuangJunJie2017
Whew... I think I found the issue. I had Pillow 9.2.0 installed, which probably made some of the operations in the image transforms (loading.py) behave slightly differently from your Pillow 8.4.0. As a consequence, the difference between your loaded images and mine looked like this:
[image]
After downgrading to Pillow 8.4.0, the difference is nil:
[image]
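
Since the discrepancy came down to a library version, a quick guard before evaluation can catch it early. The sketch below only compares the installed Pillow version with the one reported to work in this thread (8.4.0); `installed_version` and `check_pin` are hypothetical helper names, and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pin(package, expected):
    """Describe whether the installed version matches the expected one."""
    found = installed_version(package)
    if found is None:
        return f"{package} is not installed"
    if found != expected:
        return f"{package} {found} installed, expected {expected}"
    return f"{package} {found} matches"


# 8.4.0 is the version that reproduced the released numbers in this thread.
print(check_pin("Pillow", "8.4.0"))
```

A version match does not prove the loaded images are identical, so comparing the dumped tensors as above remains the more direct check.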

Updated results:

mAP: 0.3082
mATE: 0.6648
mASE: 0.2729
mAOE: 0.5330
mAVE: 0.8287
mAAE: 0.2052
NDS: 0.4036
Eval time: 98.1s

Per-class results:
Object Class            AP      ATE     ASE     AOE     AVE     AAE
car                     0.508   0.535   0.159   0.127   0.947   0.232
truck                   0.222   0.671   0.216   0.123   0.834   0.220
bus                     0.311   0.760   0.195   0.086   1.592   0.301
trailer                 0.150   0.987   0.229   0.443   0.518   0.054
construction_vehicle    0.073   0.720   0.482   1.093   0.103   0.342
pedestrian              0.336   0.738   0.301   1.326   0.861   0.409
motorcycle              0.262   0.704   0.262   0.595   1.450   0.075
bicycle                 0.213   0.525   0.270   0.885   0.325   0.009
traffic_cone            0.506   0.518   0.331   nan     nan     nan
barrier                 0.502   0.490   0.284   0.119   nan     nan

Thank you for your help!

@HuangJunJie2017
Owner

@Divadi Nice job! Thank you so much for the information!
