Can the scale ambiguity be resolved with a multi-frame approach? #59

JinraeKim opened this issue Sep 2, 2022 · 1 comment

@JinraeKim

Hi, I'm trying to implement self-supervised depth estimation for robotic applications and have just started studying this area.
Thank you in advance for the impressive work!

My concern is that if one uses only RGB images, metric scale awareness is impossible in principle (even with a multi-frame approach); the predictions would still be scale-consistent, though.
Many works therefore use "median scaling" against the ground-truth depth map (see the sketch below), but that does not make sense for deployment, as the true depth would not be accessible.
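For concreteness, the median-scaling evaluation convention I mean is usually implemented roughly like this (a minimal NumPy sketch; the function and variable names are illustrative, and the validity mask typically excludes pixels without ground truth):

```python
import numpy as np

def median_scale(pred_depth, gt_depth, valid_mask):
    """Rescale a predicted depth map so its median matches the ground truth.

    Standard evaluation trick for scale-ambiguous monocular methods. Note
    that it requires ground-truth depth, which is exactly what is missing
    at deployment time.
    """
    ratio = np.median(gt_depth[valid_mask]) / np.median(pred_depth[valid_mask])
    return pred_depth * ratio
```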

In this case, what would be the best practice for a realistic application?

@JamieWatson683
Collaborator

Hi - thanks for your interest, and sorry for the delay in getting back to you!

It's a good question, and this is an active area of research.

There are two approaches that I can think of (although there are probably many more!):

  • Use a device with an IMU and run a visual-inertial SLAM system to compute metrically scaled poses, e.g. an iPhone running ARKit.
  • In autonomous driving (or potentially robotics), where the camera is mounted at a fixed, known height above the ground plane, you have a constraint that can be used to recover real-world scale. A few papers use this method to obtain ground-truth scale when training from video on KITTI; see the sketch after this list.
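To make the second idea concrete, here is a minimal sketch of recovering a global metric scale from a known camera mounting height. This is my own illustration under assumed conventions, not code from any specific paper: the height constant, the camera-coordinate convention, and the assumption that the input points come from road pixels are all illustrative.

```python
import numpy as np

CAMERA_HEIGHT_M = 1.65  # assumed known mounting height above the road

def scale_from_ground_plane(ground_points):
    """Estimate a metric scale factor for an up-to-scale point cloud.

    ground_points: (N, 3) back-projected points in camera coordinates
    (x right, y down, z forward), taken from pixels assumed to lie on
    the road surface.
    """
    # Fit a plane through the points: its unit normal is the right
    # singular vector with the smallest singular value of the centred
    # point matrix.
    centroid = ground_points.mean(axis=0)
    _, _, vt = np.linalg.svd(ground_points - centroid)
    normal = vt[-1]
    # Distance from the camera centre (the origin) to the fitted plane,
    # i.e. the camera height in the network's arbitrary units.
    est_height = abs(normal @ centroid)
    # Multiplying the predicted depths by this factor makes them metric.
    return CAMERA_HEIGHT_M / est_height
```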

Hope this helps!
