
The confusion about results of 3DSSD between official and MMDet3D implementation. #612

Closed
Physu opened this issue Jun 2, 2021 · 22 comments

Comments


Physu commented Jun 2, 2021

Thanks to the developers for the extraordinary work!
I have a question about the 3DSSD evaluation results between the author's implementation and the MMDet3D implementation.
The author's released results:

| Methods | Easy AP | Moderate AP | Hard AP | Models |
| --- | --- | --- | --- | --- |
| 3DSSD | 91.71 | 83.30 | 80.44 | model |
| PointRCNN | 88.91 | 79.88 | 78.37 | model |

In MMDet3D, the result:

| Backbone | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP | Download |
| --- | --- | --- | --- | --- | --- | --- |
| PointNet2SAMSG | Car | 72e | 4.7 | | 78.39 (81.00)¹ | model \| log |

I notice "Experiment details on KITTI datasets", which shows the difference between official implementation.

1. The official implementation is based on TensorFlow 1.4, but I guess PyTorch is not the reason for the lower performance; or is there a performance gap between TensorFlow and PyTorch?
2. There is about a two-point margin (81.0 vs. 83.3) between the two implementations; can we come up with some way to close it?

I am also using a single 2080Ti to train a train+val model with configs/3DSSD/3dssd_kitti-3d-car.py. I modified `ann_file=data_root + 'kitti_infos_train.pkl'` to `ann_file=data_root + 'kitti_infos_trainval.pkl'` and kept the rest of the code as-is (a sketch of this change is shown below).
When training is finished, I will evaluate on the test set and post the results here for discussion.
Thanks again!
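For readers who want to make the same change, here is a minimal sketch of the override, assuming the usual mmdet3d KITTI config layout (the exact nesting, e.g. a RepeatDataset wrapper around the train set, depends on the mmdet3d version):

```python
# Minimal sketch (assumption: standard mmdet3d KITTI config layout).
data_root = 'data/kitti/'

data = dict(
    train=dict(
        # original: ann_file=data_root + 'kitti_infos_train.pkl'
        ann_file=data_root + 'kitti_infos_trainval.pkl'))
```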


Physu commented Jun 7, 2021

I trained 3DSSD following the config in configs/3dssd/3dssd_kitti-3d-car.py with train+val data,
and modified the batch size from 4 to 8 and the lr from 0.002 to 0.004, keeping the rest as-is (sketched below).
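A minimal sketch of those two overrides, assuming the usual mmdet3d config keys (partial dicts are merged into the base config, whose other optimizer settings stay unchanged):

```python
# Minimal sketch (assumption: standard mmdet3d config keys).
data = dict(samples_per_gpu=8)  # per-GPU batch size: 4 -> 8
optimizer = dict(lr=0.004)      # learning rate: 0.002 -> 0.004, scaled with the batch size
```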
The test results (under AP40):

| Benchmark | Easy | Moderate | Hard |
| --- | --- | --- | --- |
| Car (Detection) | 94.91 % | 91.35 % | 87.47 % |
| Car (Orientation) | 0.01 % | 0.47 % | 0.63 % |
| Car (3D Detection) | 86.06 % | 76.48 % | 69.71 % |
| Car (Bird's Eye View) | 91.65 % | 86.69 % | 81.05 % |

There exists a large margin compared with the official 3DSSD (76.48 vs. 79.55). I am confused about this: did I set something wrong? Or what can I do to close this performance gap?
Thanks

Physu changed the title from "The confusion about results of SSD3D between author and MMDet3D implementation." to "The confusion about results of 3DSSD between official and MMDet3D implementation." on Jun 7, 2021

Tai-Wang commented Jun 8, 2021

The reason for the performance difference has been explained on the README page. Among the differences, two are the most important: the different evaluation code and the different train/val split. The first can yield about a 2 mAP difference, as noted in the README, while the second at least removes the influence of false-positive predictions in samples without ground truths.

In addition, we also cross-checked the benchmark by evaluating our results with their evaluation code and their results with our evaluation code; the results are almost the same. (Actually, we only reproduced 79.26 mAP with the official code, according to the record of @encore-zhou.)

As for the difference on the test set, there is some uncertainty and there are tricks involved. Have you ever tried training a model with the official code and submitting the result to the benchmark?


Physu commented Jun 8, 2021

Thanks for your feedback! The official code is implemented in TensorFlow. I will try to train a model, submit the result to the test server, and evaluate the performance. New results will be posted here as soon as I get them.


Physu commented Jun 8, 2021

By the way, is 79.26 evaluated on val data or test data? If it was evaluated on test data, then 79.26 vs. 79.55 (the official result on test data) is an acceptable margin. My result on test data has a 3 mAP margin, which is unacceptable.

> Actually, we only reproduce the 79.26 mAP with the official code according to the record of @encore-zhou


Tai-Wang commented Jun 8, 2021

It's evaluated on their val split and with their evaluation code (compared with the reported 83.3). So I guess there is a large range of fluctuation in performance on the validation set. You can have a try first, and then let's take a closer look at whether there is a gap between our implementation and the official one.


Physu commented Jun 8, 2021

Got it, I will try to reproduce the result following the official code.


Physu commented Jun 17, 2021

I used the official implementation and configs to train models in a Docker container.
The Python packages are listed below:
tensorflow 1.4.0
tensorflow-tensorboard 0.4.0
python 3.5
cuda 9.0
numpy 1.14.5

Total training iterations: 80700
Final checkpoint: model-79893 (not model-80700)

The results (screenshots for checkpoints model-79893, model-79086, model-78279, and model-77472) are summarized in the table below:

| Benchmark | Iterations | Easy | Moderate | Hard |
| --- | --- | --- | --- | --- |
| Car (Detection) | 77472 | 89.70 % | 82.84 % | 79.97 % |
| Car (Detection) | 78279 | 89.29 % | 82.69 % | 80.06 % |
| Car (Detection) | 79086 | 91.14 % | 82.79 % | 80.02 % |
| Car (Detection) | 79893 | 89.39 % | 82.54 % | 79.83 % |

It seems the official model's evaluation results are better than MMDet3D's, but the reason needs further study.

Tai-Wang commented:

It's a little strange, because when we reproduced 3DSSD, @encore-zhou only got the following performance with the official code:
[screenshot of results]

Maybe there is some fluctuation in performance?


Physu commented Jun 18, 2021

Maybe the author has improved the code implementation? Something is causing the performance gap. I will check the 3DSSD head and hope to find something that explains this situation.
[screenshot of results]

These are new results obtained a few minutes ago.


Physu commented Jun 18, 2021

By the way, this result was trained with more epochs; you can see that the performance improves further (reaching 82.9%).

Tai-Wang commented:

Yes, it is really strange, because we reproduced the above results in Aug. 2020 (as shown in the screenshot) and there have been no updates to the official repo after April 2020. We will look into this issue soon. In the meantime, if you make any progress, please feel free to share it here.

Tai-Wang reopened this Jun 18, 2021

Physu commented Jun 18, 2021

Thanks for reopening this issue! New findings will be posted here.


Physu commented Jul 12, 2021

Environment:
PyTorch 1.5
mmdet 1.3.9
mmdet3d 0.14.0
mmcv-full 1.3.9
Ubuntu 18.04

I used the official config configs/3DSSD/3dssd_4x4_kitti-3d-car.py and modified the per-GPU batch size from 4 to 8 (because I use 2 GPUs, while the official config assumes 4 GPUs); the learning rate and epochs were kept as-is.
I trained a model with two 2080Ti GPUs on the full train data (7481 samples). Finally I got validation results on the val split (3,769 samples):
[screenshot of validation results]
Then I generated the test submission file and submitted it to the test server (a sketch of packaging the per-frame result files for upload is given at the end of this comment):
[screenshot of test-server results]
The performance is not as good as I expected, and I just don't know why. Could you please give some opinions on this performance?
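For reference, once the test-set predictions have been dumped as KITTI-format txt files (one per test frame), packaging them for upload can look like the following sketch. Folder and file names are placeholders, and the exact archive layout should be checked against the KITTI benchmark's submission instructions:

```python
# Minimal sketch: zip per-frame KITTI-format result files for submission.
# 'results/kitti_test' is a hypothetical folder holding 000000.txt ... 007517.txt.
import os
import zipfile

result_dir = 'results/kitti_test'
with zipfile.ZipFile('kitti_test_submission.zip', 'w', zipfile.ZIP_DEFLATED) as zf:
    for name in sorted(os.listdir(result_dir)):
        if name.endswith('.txt'):
            zf.write(os.path.join(result_dir, name), arcname=name)
```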


Physu commented Jul 12, 2021

I find it hard to reproduce the results on the KITTI test set, even though you may already have a good result on val.


Physu commented Jul 15, 2021

If we set the confidence threshold to something greater than 0.0 (the default, which outputs all plausible predictions), e.g. 0.2, to filter the final predictions in predictions_in_test.txt, we get:
[screenshot of results]
Note that in the configs you can define your own threshold:

```python
test_cfg=dict(
    nms_cfg=dict(type='nms', iou_thr=0.1),
    sample_mod='spec',
    score_thr=0.0,  # Attention!!!
    per_class_proposal=True,
    max_output_num=100))
```

Though there is some improvement, it is far from the 79.57 moderate AP of 3DSSD on the leaderboard. I guess good post-processing is needed, but which other techniques can improve performance is still an open question.
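As an illustration of the score-threshold filtering described above, here is a minimal offline sketch. It assumes the predictions are stored as one KITTI-format txt file per frame (the folder names here are hypothetical); in the KITTI detection format, the confidence score is the last field of each line:

```python
# Minimal sketch: filter KITTI-format prediction files by a score threshold.
# Folder names are hypothetical placeholders.
import os

SCORE_THR = 0.2
src_dir = 'predictions_in_test'
dst_dir = 'predictions_filtered'
os.makedirs(dst_dir, exist_ok=True)

for name in os.listdir(src_dir):
    if not name.endswith('.txt'):
        continue
    with open(os.path.join(src_dir, name)) as f:
        lines = f.readlines()
    # keep only detections whose confidence (last field) reaches the threshold
    kept = [l for l in lines if l.strip() and float(l.split()[-1]) >= SCORE_THR]
    with open(os.path.join(dst_dir, name), 'w') as f:
        f.writelines(kept)
```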


Wuziyi616 commented Jul 16, 2021

@Physu Have you ever tried generating a submission using the official code and submitting it to the test server to see the test-set result? Also, it seems to me that changing mmdet3d's training batch and GPUs from 4x4 to 8x2 improves the val set results a lot?

Please kindly provide more observations and I will try to look into this issue.


Physu commented Jul 16, 2021

@Wuziyi616 Thanks for your attention! Does official code mean dvlab-research/3DSSD or other methods?
Besides, in order to learn more about the evaluation procedure, I used traveller59/kitti-object-eval-python to evaluate results on the val set (i.e., get every LiDAR bin's predictions, save each in a txt file, and finally obtain 3769 txt files); a sketch of this evaluation is shown after this comment.
I find that, with no other post-processing involved, the results:
[screenshot of results]
are slightly better than the mmdet3d evaluation result (maybe it is unfair to compare this way, since the hyperparameters may change):
[screenshot of results]
If I use a confidence threshold of 0.2 to filter out false positives, the result improves further:
[screenshot of results]
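A minimal sketch of running that evaluation with traveller59/kitti-object-eval-python, following the usage shown in that repository's README (all paths below are placeholders):

```python
# Minimal sketch following the kitti-object-eval-python README; paths are placeholders.
import kitti_common as kitti
from eval import get_official_eval_result

def read_imageset_file(path):
    with open(path, 'r') as f:
        return [int(line) for line in f]

dt_annos = kitti.get_label_annos('/path/to/result_txt_folder')         # the 3769 prediction files
val_ids = read_imageset_file('/path/to/kitti/ImageSets/val.txt')
gt_annos = kitti.get_label_annos('/path/to/kitti/training/label_2', val_ids)
print(get_official_eval_result(gt_annos, dt_annos, 0))                 # 0 = Car
```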


Physu commented Jul 16, 2021

I will reproduce with the 4x4 setting, and then we can look further into the difference.

Wuziyi616 commented:

> @Wuziyi616 Thanks for your attention! Does official code mean dvlab-research/3DSSD or other methods?
> [...]

Exactly, the official code I meant is dvlab's code; I think that's the official code release for 3DSSD, isn't it? As you mentioned in this reply, you said you would like to submit test results using that code; have you done that?


Physu commented Jul 18, 2021

Thanks for your attention. My submission opportunities are running out; the results will be updated soon.


jlqzzz commented Sep 1, 2021

@Physu
Have you tried to reproduce the multi-class version of 3DSSD (that is, predicting car, pedestrian, and cyclist at the same time)?

Machine97 commented:

@Physu
Hi, have you ever tried generating a submission using the official code and submitting it to the test server to see the test-set result?
