Bad Performance on Visual Odometry Image Sequences? #72

Open
C-H-Chien opened this issue Dec 20, 2023 · 3 comments

Comments

@C-H-Chien

Hi,

I am interested in generating feature tracks across a number of frames from a visual odometry sequence, e.g., from the KITTI or EuRoC datasets. However, when I try this with the demo, the number of features is low and the feature tracks are short, e.g., 5-7 frames.

The paper says "Models tend to fail on real-world videos with panning". I do not fully understand what this means. Is it the reason why this method does not perform well on visual odometry sequences?

Thank you!

@cdoersch
Collaborator

I haven't tried working with visual odometry sequences, but there are a couple of issues. First, TAPIR tends to fail when there are large changes in scale, which are common in odometry sequences. Second, odometry sequences contain some challenging surfaces: roads have very little texture (and repetitive textures such as road markings), and trees are porous.

Because of the 'uncertainty estimate', TAPIR tends to mark tracks as occluded if it's not certain of the location; you can use the occlusion estimate directly, but the tracks are likely to be wrong.
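The trade-off above can be illustrated with a small sketch. It assumes the model emits a per-frame occlusion logit and an expected-distance (uncertainty) logit, and that a point counts as visible when the product of the two complementary sigmoids exceeds 0.5. That formula is my understanding of the demo's post-processing, not something stated in this thread, and the function names and logits below are made up for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def visible_with_uncertainty(occ_logit, dist_logit, threshold=0.5):
    """Visible only if the point is both unoccluded and confidently localized (assumed rule)."""
    return (1.0 - sigmoid(occ_logit)) * (1.0 - sigmoid(dist_logit)) > threshold


def visible_occlusion_only(occ_logit, threshold=0.5):
    """Ignore the uncertainty term and trust the occlusion estimate alone."""
    return (1.0 - sigmoid(occ_logit)) > threshold


def longest_visible_run(flags):
    """Length of the longest run of consecutive visible frames."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


# Hypothetical logits for one track over 8 frames: the point stays unoccluded
# for 7 frames, but the model becomes unsure of its location after frame 2.
occ_logits = [-4.0, -4.0, -3.0, -3.0, -2.0, -2.0, -2.0, 3.0]
dist_logits = [-4.0, -3.0, -2.0, 0.5, 1.0, 1.0, 0.0, 2.0]

combined = [visible_with_uncertainty(o, d) for o, d in zip(occ_logits, dist_logits)]
occlusion_only = [visible_occlusion_only(o) for o in occ_logits]

print(longest_visible_run(combined))        # track length with the uncertainty term
print(longest_visible_run(occlusion_only))  # track length from occlusion alone
```

Dropping the uncertainty term yields longer tracks, but the extra frames are exactly the ones where the predicted location was flagged as unreliable, so they would need to be checked before being used for pose estimation.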

Another option is to try to increase the resolution; remember that TAPIR does its initialization at 256x256. In the case of odometry sequences, points will typically remain in the same image quadrant where they started, so you might be able to improve performance by running TAPIR in a tiled fashion.

Otherwise, I think you'll just have to wait for fundamental TAP research to progress.
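The tiled approach suggested above could be set up roughly as follows. This is only a sketch of the bookkeeping, not TAPIR code: the helper names, the 2x2 grid, the 25% overlap and the example frame size are all assumptions, and the tracker call itself is left out. Each query point is assigned to the quadrant it starts in, the (padded) quadrant is cropped and resized to the 256x256 initialization resolution, and the resulting tracks are mapped back to full-frame coordinates.

```python
def make_tiles(width, height, rows=2, cols=2, overlap=0.25):
    """Return (x0, y0, x1, y1) boxes covering the frame.

    Each grid cell is padded by `overlap` times its size (clipped to the frame)
    so that points drifting slightly out of their starting cell stay in view.
    """
    tile_w, tile_h = width / cols, height / rows
    pad_x, pad_y = tile_w * overlap, tile_h * overlap
    tiles = []
    for r in range(rows):
        for c in range(cols):
            x0 = max(0.0, c * tile_w - pad_x)
            y0 = max(0.0, r * tile_h - pad_y)
            x1 = min(float(width), (c + 1) * tile_w + pad_x)
            y1 = min(float(height), (r + 1) * tile_h + pad_y)
            tiles.append((x0, y0, x1, y1))
    return tiles


def tile_index(x, y, width, height, rows=2, cols=2):
    """Index of the (unpadded) grid cell that contains the query point."""
    col = min(int(x * cols / width), cols - 1)
    row = min(int(y * rows / height), rows - 1)
    return row * cols + col


def to_tile_coords(x, y, tile, model_size=256):
    """Map full-frame pixel coordinates into the resized tile."""
    x0, y0, x1, y1 = tile
    return ((x - x0) * model_size / (x1 - x0), (y - y0) * model_size / (y1 - y0))


def to_frame_coords(u, v, tile, model_size=256):
    """Map coordinates predicted inside a resized tile back to the full frame."""
    x0, y0, x1, y1 = tile
    return (x0 + u * (x1 - x0) / model_size, y0 + v * (y1 - y0) / model_size)


# Hypothetical 1024x512 frame with one query point in the top-right quadrant.
width, height = 1024, 512
tiles = make_tiles(width, height)
idx = tile_index(900, 100, width, height)
u, v = to_tile_coords(900, 100, tiles[idx])
x_back, y_back = to_frame_coords(u, v, tiles[idx])
print(idx, tiles[idx], (u, v), (x_back, y_back))
```

Each tile would then be tracked as its own short video with its own set of query points. Points that leave their padded tile are lost under this scheme, which is why it relies on the observation that odometry points mostly stay in their starting quadrant.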

@C-H-Chien
Author

Thanks for the comments!
I have a couple of questions:

  1. To what extent do you think resizing images helps, both for estimating good feature tracks and for efficiency? (Right now my runs are quite slow, e.g. 20-30 minutes for 50 frames at 512x512 resolution, even with a GPU.)
  2. Do you think training TAPIR on odometry sequences from scratch would resolve the issue?

Thanks again!

@cdoersch
Collaborator

How many points are you tracking? If things are set up properly, then a few dozen points should take seconds, even at 512x512 resolution. I suspect you're mostly seeing JAX compilation time.

Training TAPIR on odometry sequences would certainly help. Probably fine-tuning would be more efficient than training from scratch (and probably more effective if you don't have a lot of data), but either should help. However, we haven't tried this. Out of curiosity, which data do you plan to use? We aren't aware of much odometry data with long-term tracks; what's available tends to rely on structure-from-motion, and it doesn't come with reliable occlusion estimates.
