How to train for the case of a real life object and blender-like camera setting #19

Closed
dedoogong opened this issue Feb 17, 2022 · 2 comments


@dedoogong

Hello, thanks for sharing!
I'm trying to train BARF on my custom dataset. I don't know the camera intrinsics. The camera is placed at 3 elevations (0°, 30°, 60°) on a sphere around the object, which sits on a turntable, and I took a picture every 15° of rotation.
So I have 72 pictures (pitch 0° with yaw 15°, 30°, 45°, ..., 360°; pitch 30° with yaw 15°, 30°, ..., 360°; and pitch 60° with yaw 15°, 30°, ..., 360°).
So even though the camera is fixed for each pitch, the setup is similar to a Blender-style camera movement.
BARF supports the blender and llff formats, but I failed to produce the camera pose files (.json or .npz) that those dataloaders need, so I tried using the iphone configuration instead.
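
For reference, here is a minimal sketch of the kind of transforms.json I believe the blender-style loader expects (this follows the original NeRF synthetic format; the camera_angle_x value and file paths are placeholders, and each transform_matrix should be the real 4x4 camera-to-world pose rather than the identity used here):

```python
import json
import numpy as np

# Placeholder horizontal field of view in radians; the real value is unknown here.
camera_angle_x = 0.6911

frames = []
for i in range(72):
    c2w = np.eye(4)  # placeholder: should be the camera-to-world pose of image i
    frames.append({
        "file_path": f"./train/r_{i}",     # image path without extension
        "transform_matrix": c2w.tolist(),  # 4x4 matrix as nested lists
    })

with open("transforms_train.json", "w") as f:
    json.dump({"camera_angle_x": camera_angle_x, "frames": frames}, f, indent=2)
```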

It seems to train to some degree, but it is still quite far from succeeding after 200,000 steps.

I also tried to manually initialize the camera poses spherically, but that gives worse results.

Please give me some hints on how to solve this.

Thanks! ^^

@chenhsuanlin
Owner

Hi @dedoogong, from your description I'm guessing there could be several issues:

  1. The most critical is probably the pose initialization. BARF is still a local registration method, which means that the pose initializations have to be close enough to the underlying ground-truth poses. Since your multi-view data is object-centric and captured 360˚ spherically, I don't expect BARF to be able to make the cameras automagically "wrap" around the object from the same (all-identity) pose. Since you already know the capture configuration of your 72 viewpoints, it would be more realistic to initialize from the spherical angles you described (see the first sketch after this list). I would suggest using such poses to train a NeRF first to make sure they can at least get you some reasonable results, and then switch to BARF to see if it improves.
  2. It sounds like you're turning the table and capturing the object every 15˚. If that's the case, then the background would not be in correspondence, and BARF would have a hard time using the photometric cues for pose optimization. This wouldn't work even for the original NeRF (i.e. even when ground-truth poses are given).
  3. Using camera intrinsics different from your actual sensor could have an impact, but it probably isn't the major issue. You could consider also optimizing the intrinsic parameters (e.g. focal lengths), as in NeRF-- or Self-calibrating NeRF (see the second sketch after this list).
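
To illustrate point 1, here is a minimal sketch (not BARF's own API) of building spherical pose initializations from the capture description. It assumes the object sits at the origin and that the camera-to-world matrices follow the OpenGL convention of the blender-style data; the camera radius is a guess:

```python
import numpy as np

def spherical_c2w(pitch_deg, yaw_deg, radius=4.0):
    """Camera-to-world pose looking at the origin from spherical angles.

    `pitch_deg` is the elevation above the turntable plane, `yaw_deg` is the
    turntable angle, and `radius` is the camera-to-object distance (unknown
    from the description, so 4.0 is a guess). Uses the OpenGL convention
    (camera looks down its -z axis), matching the blender-style data.
    """
    pitch, yaw = np.deg2rad(pitch_deg), np.deg2rad(yaw_deg)
    # Camera center on a sphere around the origin (y is "up" here).
    center = radius * np.array([np.cos(pitch) * np.sin(yaw),
                                np.sin(pitch),
                                np.cos(pitch) * np.cos(yaw)])
    # Orthonormal frame looking from `center` toward the origin (look-at).
    backward = center / np.linalg.norm(center)   # camera +z axis
    right = np.cross([0.0, 1.0, 0.0], backward)
    right /= np.linalg.norm(right)
    up = np.cross(backward, right)
    c2w = np.eye(4)
    c2w[:3, 0], c2w[:3, 1], c2w[:3, 2], c2w[:3, 3] = right, up, backward, center
    return c2w

# One initial pose per capture position: pitches 0/30/60, one shot every 15 degrees.
init_poses = np.stack([spherical_c2w(pitch, yaw)
                       for pitch in (0, 30, 60)
                       for yaw in range(15, 361, 15)])   # shape (72, 4, 4)
```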
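
And for point 3, a rough sketch of how a learnable focal length can be folded into ray generation, in the spirit of NeRF-- (again, not this repository's actual code; H, W, and the initial guess are placeholders):

```python
import torch

# Placeholder image size and initial focal guess (focal = W is roughly a
# 53-degree horizontal field of view).
H, W = 800, 800
focal = torch.nn.Parameter(torch.tensor(float(W)))

# Register it with the optimizer alongside the NeRF MLP and camera pose parameters:
optimizer = torch.optim.Adam([focal], lr=1e-3)

def get_ray_directions(H, W, focal):
    """Per-pixel ray directions in the camera frame (OpenGL convention)."""
    j, i = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                          torch.arange(W, dtype=torch.float32), indexing="ij")
    return torch.stack([(i - W / 2) / focal,      # x: right
                        -(j - H / 2) / focal,     # y: up
                        -torch.ones_like(i)],     # z: camera looks down -z
                       dim=-1)

# Because the rays depend on `focal`, the photometric loss backpropagates
# into it, so the intrinsics get refined together with the scene and poses.
directions = get_ray_directions(H, W, focal)      # shape (H, W, 3)
```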

Typically, if you don't see signs of BARF converging within 20k steps, it probably won't converge in the end either.
Hope these help!

@dedoogong
Author

dedoogong commented Feb 20, 2022

Hi @chenhsuanlin! Thanks so much for your kind and thoughtful reply!
I tried using the camera intrinsics (focal length) estimated by COLMAP, but it still failed.
Maybe, as you pointed out, it's just too hard for BARF to optimize from identical initial poses to all-around spherical poses from scratch.
I agree with your points (1, 3), and I will try to find better initial poses manually, even though it will require a lot of trial and error.
Thank you!
