
Inaccurate Eigen Split Evaluation #166

Closed
jahaniam opened this issue Jun 22, 2018 · 12 comments


@jahaniam

Dear Author,

Once again thanks for sharing your code.

The evaluation of the Eigen split based on raw LiDAR data is inaccurate, especially around occlusions. Besides, there is a twist in the LiDAR scans as well, which needs to be corrected (untwisted).

The best way to evaluate is with the ground truth recently provided by the official KITTI depth benchmark. They corrected the twist and also accumulated ±4 consecutive frames for each frame to generate denser ground truth.

For a few images this ground truth doesn't exist, so you can fall back to the raw LiDAR data for those.
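
In case it helps, here is a rough sketch of how those annotated depth maps could be loaded (they are 16-bit PNGs where depth = value / 256 and 0 marks pixels without ground truth), with an optional fallback hook for frames that have no annotation. The function name and the fallback hook are just placeholders, not code from this repo:

```python
import os

import numpy as np
from PIL import Image


def load_gt_depth(gt_png_path, fallback_fn=None):
    """Load a KITTI annotated depth map (16-bit PNG, depth = value / 256.0).

    Pixels stored as 0 have no ground truth. If the PNG is missing entirely,
    optionally fall back to a raw-LiDAR projection via `fallback_fn`.
    """
    if os.path.exists(gt_png_path):
        depth_png = np.array(Image.open(gt_png_path), dtype=np.uint16)
        depth = depth_png.astype(np.float32) / 256.0
        depth[depth_png == 0] = -1.0  # mark pixels without ground truth
        return depth
    if fallback_fn is not None:
        return fallback_fn()  # e.g. a projection of the raw velodyne scan
    raise FileNotFoundError(gt_png_path)
```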

Best,
Ali

@jahaniam
Author

I would also appreciate it if you could specify which flag is incorrect in your evaluation:

"We found that it uses an incorrect flag which made
the depths to be computed with respect to the LIDAR instead of the cameras."

@mrharicot
Owner

mrharicot commented Jun 22, 2018

Hi Ali,

You are correct: reprojecting the raw LiDAR point cloud is a pretty bad way of measuring accuracy.

I think people should move away from the Eigen split entirely and switch to the new depth benchmark; we had already been planning on adding those results to monodepth 1 and 2.

Finally, regarding the incorrect flag: the actual error is here, the last argument 'vel_depth' should be False.
Eigen originally computed the depths with respect to the LiDAR instead of the cameras, and we had this flag in order to reproduce their results.
The results change a little bit: some metrics go up, some go down.
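
Concretely (and just as an illustration, assuming the generate_depth_map signature from utils/evaluation_utils.py with arguments calib_dir, velo_file_name, im_shape, cam, interp, vel_depth), the difference is only in that last argument; the paths below are placeholders:

```python
from evaluation_utils import generate_depth_map  # from this repo's utils

# Placeholder paths / image size, for illustration only.
calib_dir = "path/to/kitti_raw/2011_09_26"
velo_file = "path/to/velodyne_points/data/0000000000.bin"
im_shape = (375, 1242)

# Flawed: vel_depth=True reproduces Eigen's numbers (depth w.r.t. the LiDAR).
gt_depth_lidar = generate_depth_map(calib_dir, velo_file, im_shape, 2, True, True)

# Fixed: vel_depth=False computes depth w.r.t. the camera.
gt_depth_cam = generate_depth_map(calib_dir, velo_file, im_shape, 2, True, False)
```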

I would recommend against using the fixed evaluation, as most recently published methods actually use the flawed evaluation code; just move to the new depth benchmark instead.

I hope this helps!

@jahaniam
Author

That's correct. I'm editing your code so that the Eigen-split evaluation uses the official ground truth provided by KITTI. It's not finished yet.

@mrharicot
Owner

Great! But I'd recommend moving entirely to the new data split and giving up the Eigen one altogether; otherwise this is going to be confusing.

@jahaniam
Author

But there is still a submission limit for the test set.

@mrharicot
Owner

That's true, but the goal is to prevent people from overfitting to the test set.

@YifeiAI

YifeiAI commented Jun 29, 2018

Hi, I am doing supervised depth estimation. How can I get the ground-truth depth image from your code? Is the output of the function generate_depth_map from evaluation_utils.py the ground-truth depth image or the disparity?

@mrharicot
Owner

Hi,
As the name of the function suggests, it generates a depth map :)
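
If you do need disparities instead, the usual conversion is disparity = baseline * focal / depth. A minimal sketch, where the ~0.54 m KITTI stereo baseline and the focal length are values you should really take from the calibration files:

```python
import numpy as np


def depth_to_disparity(depth, focal_px, baseline_m=0.54):
    """Convert a metric depth map (metres) to a stereo disparity map (pixels).

    disparity = baseline * focal / depth; 0.54 m is roughly KITTI's stereo
    baseline, but read both values from the calibration of your sequence.
    """
    disparity = np.zeros_like(depth, dtype=np.float32)
    valid = depth > 0
    disparity[valid] = baseline_m * focal_px / depth[valid]
    return disparity
```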

@jahaniam
Author

Evaluation in the field of single image depth reconstruction is so messy and frustrating!

FYI, just letting you and other researchers know: I've evaluated the best monodepth model on the Eigen split using the post-processed ground truth provided on the KITTI website, and I got an RMSE of about 3.7 (you reported 4.9 using your code). My guess is that some state-of-the-art papers, including DORN (CVPR 2018), are using the official ground truth (not raw LiDAR), which is how they got an RMSE of about 2 on the Eigen split!

I've also tried Yevhen Kuznietsov's method (semi-supervised) with your evaluation code (and LiDAR) and got around 4.3, whereas he reported 4.6 in his paper!

To researchers in this field: we absolutely need a consistent ground truth and evaluation code; otherwise, if you are using your own, please re-evaluate other papers with your code!
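
For reference, the numbers above also depend on the metric implementation itself. This is only a sketch of the standard error metrics (including the RMSE quoted above), computed on valid ground-truth pixels; the 80 m depth cap and the clipping are assumptions, not necessarily what each paper used:

```python
import numpy as np


def compute_errors(gt, pred, min_depth=1e-3, max_depth=80.0):
    """Standard monocular-depth metrics over valid ground-truth pixels."""
    mask = (gt > min_depth) & (gt < max_depth)
    gt, pred = gt[mask], np.clip(pred[mask], min_depth, max_depth)

    thresh = np.maximum(gt / pred, pred / gt)
    a1 = (thresh < 1.25).mean()
    a2 = (thresh < 1.25 ** 2).mean()
    a3 = (thresh < 1.25 ** 3).mean()

    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean(((gt - pred) ** 2) / gt)

    return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3
```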

@mrharicot
Owner

Ali, I very much agree with you.

This is why I released the evaluation code, with the hope for it to be used by everyone.
Zhou et al. (CVPR'17) used our evaluation code, and I believe a few CVPR'18 papers used it as well.

The results are indeed very much dependent on how you compute them.
This is why we recomputed all the results we could in the monodepth paper.
We tried to do so in the monodepth2 paper but didn't have access to all the results.

This is why I believe everyone should move to the "new" KITTI ground truth.
The evaluation will be fair, identical for everyone, and on a closed test set.

@nicolasrosa

@mrharicot The link you suggested points to the Depth Completion page. Did you mean the Depth Prediction?

http://www.cvlibs.net/datasets/kitti/eval_depth.php?benchmark=depth_prediction

@jahaniam
Author

jahaniam commented Jan 15, 2019

Yeah, depth prediction.
