Disparity to distance #37

Open
rodrigoGA opened this issue May 2, 2021 · 5 comments

Comments

@rodrigoGA

First, I want to congratulate you on the project.

I would like to know how to convert the result of the model to distance in meters.

Searching on the internet, I found the following formula:
depth = baseline * focal / disparity

disparity: the result of the model, normalized to a number between 0 and 1: (modelResult - min_result) / (max_result - min_result)
baseline: seems to depend on the training data; for the KITTI dataset I have found values of 0.54 or 0.22
focal: I think this is the focal length of the camera in pixels, but I'm not sure; I have seen a value close to 2262

I'm not sure if this is the correct way to do it, or what parameters to use for the model you trained. Below is a small sketch of how I understand that formula.
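
This is how I read it in Python; the baseline and focal values are just the numbers I found online for KITTI, not verified for this model:

import numpy as np

# Standard stereo conversion: depth = baseline * focal / disparity.
# baseline = 0.54 m and focal = 2262 px are values I found for KITTI,
# not verified for this model.
def disparity_to_depth(model_result, baseline=0.54, focal=2262.0):
    # Normalize the raw model output to [0, 1]
    disparity = (model_result - model_result.min()) / (model_result.max() - model_result.min())
    # Avoid division by zero where disparity is (close to) zero
    disparity = np.clip(disparity, 1e-6, None)
    return baseline * focal / disparity  # depth in meters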

If I export the model with a different height and width, which parameters do I have to change?

python export.py --ckpt ckpt/pydnet \
        --arch pydnet \
        --dest "./" \
        --height 192 --width 192
@FilippoAleotti
Owner

Hi,

Unfortunately, that formula doesn't hold here, since the model is not stereo (nor is it mimicking a stereo one). In particular, the model predicts an inverse depth map for the input image, so you have to align each prediction using some known 3D points of the scene to obtain a metric depth map.

@rodrigoGA
Author

Thank you very much for your prompt response.

What do you mean by:

you have to align each prediction using some known 3D points of the scene

Can your model be extended to a stereo setup?

@LulaSan

LulaSan commented May 19, 2021

I am also interested in how to obtain distance values from this network.

@rodrigoGA
Author

I will share what I have researched about inverse depth, but I am not an expert on the subject; everything here I found on Google.

1 / D = V * a + b

D = physical distance
V = inverse depth map value (the result of the model)
a and b are values fitted from known points in the image; you can do least squares to fit a and b based on those known points (see the sketch below).
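
A minimal sketch of that fit in Python, assuming you have a few pixels whose true metric distances are known (the sample values here are hypothetical):

import numpy as np

# Model's inverse-depth values at pixels with known metric distance
V = np.array([0.82, 0.55, 0.31, 0.12])  # model output at known pixels
D = np.array([1.5, 2.4, 4.1, 9.8])      # measured distances in meters

# Fit 1 / D = a * V + b by least squares
A = np.stack([V, np.ones_like(V)], axis=1)
(a, b), *_ = np.linalg.lstsq(A, 1.0 / D, rcond=None)

# Apply the fitted a and b to a whole inverse-depth map
def to_metric(inv_depth_map):
    return 1.0 / (a * inv_depth_map + b)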

Now I have doubts about whether it is necessary to calibrate in every frame. Is this necessary even if the same environment is used, with the same lighting and camera?

I have found this repository https://github.com/nianticlabs/manydepth, which seems to make the predictions consistent over time. It would be interesting to include something similar.

@mpottinger

What do you mean by:

you have to align each prediction using some known 3D points of the scene

For example, AR/SLAM frameworks provide a sparse point cloud as part of their tracking, so you have some depth points in the image as a reference and can rescale the prediction to them (see the sketch below). Or use another depth sensor. I know that sounds pointless, because why do monocular depth if you already have depth? Well, depth sensors usually have many holes in the depth map, especially on reflective surfaces, dark surfaces, etc. This might help with that.
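
A minimal sketch of that rescaling, assuming you have sparse metric depths from SLAM at known pixel coordinates (all the inputs here are hypothetical placeholders, and the shift term is assumed to be negligible):

import numpy as np

# Dense inverse depth from the model (placeholder values)
inv_depth = np.random.rand(192, 192)

# Sparse metric depths from SLAM/AR tracking at known pixel coordinates
rows = np.array([50, 80, 120])          # pixel rows of the SLAM points
cols = np.array([40, 96, 150])          # pixel columns of the SLAM points
slam_depth = np.array([2.1, 3.5, 6.0])  # metric depths in meters

# Align with a single scale factor: the median ratio between the SLAM
# inverse depths and the model's values at those pixels (assumes no shift)
scale = np.median((1.0 / slam_depth) / inv_depth[rows, cols])
metric_depth = 1.0 / np.clip(scale * inv_depth, 1e-6, None)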
