Disparity to distance #37

Open
rodrigoGA opened this issue May 2, 2021 · 5 comments

Comments

@rodrigoGA

First, I want to congratulate you on the project.

I would like to know how to convert the result of the model to distance in meters.

Searching on the internet, I found the following formula:
depth = baseline * focal / disparity

disparity: the result of the model, normalized to a number between 0 and 1: (modelResult - min_result) / (max_result - min_result)
baseline: seems to depend on the training data; for the KITTI dataset I have found values of 0.54 or 0.22
focal: I think this is the focal length of the camera in pixels, but I'm not sure; I have seen a value close to 2262

I'm not sure if this is the correct way to do it, or what parameters to use for the model you trained. Below is a small sketch of how I understand that formula.
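
This is how I read it in Python; the baseline and focal values are just the numbers I found online for KITTI, not verified for this model:

import numpy as np

# Standard stereo conversion: depth = baseline * focal / disparity.
# baseline = 0.54 m and focal = 2262 px are values I found for KITTI,
# not verified for this model.
def disparity_to_depth(model_result, baseline=0.54, focal=2262.0):
    # Normalize the raw model output to [0, 1]
    disparity = (model_result - model_result.min()) / (model_result.max() - model_result.min())
    # Avoid division by zero where disparity is (close to) zero
    disparity = np.clip(disparity, 1e-6, None)
    return baseline * focal / disparity  # depth in meters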

If I export the model with a different height and width, which parameters do I have to change?

python export.py --ckpt ckpt/pydnet \
        --arch pydnet \
        --dest "./" \
        --height 192 --width 192
@FilippoAleotti
Owner

Hi,

Unfortunately, that formula doesn't hold here, since the model is not stereo (nor is it mimicking a stereo one). In particular, the model predicts an inverse depth map for the input image, so you have to align each prediction using some known 3D points of the scene to obtain a metric depth map.

@rodrigoGA
Author

Thank you very much for your prompt response.

What do you mean by:

you have to align each prediction using some known 3D points of the scene

Can your model be extended to a stereo setup?

@LulaSan

LulaSan commented May 19, 2021

I am also interested in how to obtain distance values from this network.

@rodrigoGA
Author

I will share what I have researched about inverse depth, but I am not an expert on the subject; everything here I found on Google.

1 / D = V * a + b

D = physical distance
V = inverse depth map value (the result of the model)
a and b are values fitted from known points in the image; you can do least squares to fit a and b based on those known points (see the sketch below).
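
A minimal sketch of that fit in Python, assuming you have a few pixels whose true metric distances are known (the sample values here are hypothetical):

import numpy as np

# Model's inverse-depth values at pixels with known metric distance
V = np.array([0.82, 0.55, 0.31, 0.12])  # model output at known pixels
D = np.array([1.5, 2.4, 4.1, 9.8])      # measured distances in meters

# Fit 1 / D = a * V + b by least squares
A = np.stack([V, np.ones_like(V)], axis=1)
(a, b), *_ = np.linalg.lstsq(A, 1.0 / D, rcond=None)

# Apply the fitted a and b to a whole inverse-depth map
def to_metric(inv_depth_map):
    return 1.0 / (a * inv_depth_map + b)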

Now I have doubts about whether it is necessary to calibrate in every frame. Is this necessary even if the same environment is used, with the same lighting and camera?

I have found this repository https://github.com/nianticlabs/manydepth, which seems to make the predictions consistent over time. It would be interesting to include something similar.

@mpottinger

What do you mean by:

you have to align each prediction using some known 3D points of the scene

For example, AR/SLAM frameworks provide a sparse point cloud as part of their tracking, so you have some depth points in the image as a reference and can rescale the prediction to them (see the sketch below). Or use another depth sensor. I know that sounds pointless, because why do monocular depth if you already have depth? Well, depth sensors usually have many holes in the depth map, especially on reflective surfaces, dark surfaces, etc. This might help with that.
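
A minimal sketch of that rescaling, assuming you have sparse metric depths from SLAM at known pixel coordinates (all the inputs here are hypothetical placeholders, and the shift term is assumed to be negligible):

import numpy as np

# Dense inverse depth from the model (placeholder values)
inv_depth = np.random.rand(192, 192)

# Sparse metric depths from SLAM/AR tracking at known pixel coordinates
rows = np.array([50, 80, 120])          # pixel rows of the SLAM points
cols = np.array([40, 96, 150])          # pixel columns of the SLAM points
slam_depth = np.array([2.1, 3.5, 6.0])  # metric depths in meters

# Align with a single scale factor: the median ratio between the SLAM
# inverse depths and the model's values at those pixels (assumes no shift)
scale = np.median((1.0 / slam_depth) / inv_depth[rows, cols])
metric_depth = 1.0 / np.clip(scale * inv_depth, 1e-6, None)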
