Disparity to distance #37
Comments
Hi, unfortunately that formula doesn't hold here, since the model is not stereo (nor is it mimicking a stereo one). In particular, the model predicts an inverse depth map for the input image, so you have to align each prediction using some known 3D points of the scene to obtain a metric depth map.
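To make that alignment step concrete, here is a minimal sketch (my own illustration, not code from this repo) of fitting a scale and shift by least squares, assuming you have the model's inverse-depth prediction plus a few pixels whose metric depth is known from some other source; the function and argument names are hypothetical:

```python
import numpy as np

def align_to_metric(inv_depth_pred, known_depths, known_pixels):
    """Fit a scale s and shift t so that s * prediction + t matches
    1 / depth at pixels with known metric depth, then invert to get
    a metric depth map. Names here are hypothetical."""
    # Gather predicted inverse depths at the pixels with known 3D points.
    pred = np.array([inv_depth_pred[y, x] for (y, x) in known_pixels])
    # The model output is (relative) inverse depth, so the targets are 1/D.
    target = 1.0 / np.asarray(known_depths, dtype=np.float64)

    # Least-squares solve for [s, t] in s * pred + t = target.
    A = np.stack([pred, np.ones_like(pred)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, target, rcond=None)

    aligned = s * inv_depth_pred + t
    aligned = np.clip(aligned, 1e-6, None)  # guard against division by zero
    return 1.0 / aligned  # depth in the units of known_depths
```

Two well-spread points determine the fit, but in practice you would use as many points as possible, or a robust estimator, since outliers skew a plain least-squares fit.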
Thank you very much for your prompt response. What do you mean by "align each prediction using some known 3D points of the scene"?
Can your model be extended to stereo views?
I am also interested in how to obtain distance values from this network.
I will tell you what I have researched on inverse distance, but I am not an expert on the subject; I have found everything on Google.
As I understand it, the model output corresponds to inverse distance, i.e. output ≈ 1/D, where D = physical distance. Now I have doubts about whether it is necessary to calibrate in each frame. Is this necessary even if the same environment is used, with the same light and camera? I have found this repository, https://github.com/nianticlabs/manydepth, which seems to make the prediction consistent over time. It would be interesting to include something similar.
For example, AR/SLAM systems provide a sparse point cloud as part of their tracking, so you have some depth points in the image as a reference and can rescale to those. Or use another depth sensor. I know that sounds pointless, because why do monodepth if you already have depth? Well, depth sensors usually have many holes in the depth map, especially on reflective surfaces, dark surfaces, etc. This might help with that.
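As a sketch of that rescaling idea (names and the sparse-point source are hypothetical): if the prediction were depth up to a single scale factor, a median ratio against the SLAM points would suffice; with the scale-and-shift ambiguity of an inverse depth map, the affine fit shown earlier in the thread is the safer choice.

```python
import numpy as np

def median_rescale(depth_pred, sparse_depths, sparse_pixels):
    """Rescale a relative depth map so it agrees, in the median, with
    sparse metric depths (e.g. SLAM landmarks). Only valid when the
    prediction is depth up to a single scale factor (no shift)."""
    pred = np.array([depth_pred[y, x] for (y, x) in sparse_pixels])
    scale = np.median(np.asarray(sparse_depths) / pred)
    return scale * depth_pred
```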
First, I want to congratulate you on the project.
I would like to know how to convert the result of the model to distance in meters.
Searching on the internet, I found the following formula:
depth = baseline * focal / disparity
disparity: the result of the model, a number between 0 and 1, normalized as
(modelResult - min_result) / (max_result - min_result)
baseline: seems to depend on the training data; for the KITTI dataset I have found values of 0.54 or 0.22
focal: I think this is the focal length of the camera, but I'm not sure; I think a value close to 2262
I'm not sure if this is the correct way to do it, nor which parameters to use for the model you trained.
If I export the model with a different height and width, which parameters do I have to change?
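For reference, the recipe quoted in this question would look like the sketch below, using the baseline and focal values mentioned above purely as placeholders. As the reply at the top of the thread explains, this stereo relation does not apply to this model's output.

```python
import numpy as np

def stereo_depth_from_disparity(model_result, baseline=0.54, focal=2262.0):
    """depth = baseline * focal / disparity, with the min-max
    normalization quoted in the question. baseline and focal are the
    placeholder values mentioned above, not verified parameters."""
    disparity = (model_result - model_result.min()) / (
        model_result.max() - model_result.min()
    )
    disparity = np.clip(disparity, 1e-6, None)  # avoid division by zero
    return baseline * focal / disparity  # nominally meters
```

One practical note on the resizing question: a focal length expressed in pixels scales linearly with the image width, so exporting the model at a different resolution would change the focal value used in a formula like this.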