
Training NERF using real-captured data #16

Closed
phongnhhn92 opened this issue May 30, 2020 · 34 comments

Comments

@phongnhhn92

Hello,
I have followed your example to train NeRF on my own data. I have seen that you and others have had some success with single-object scenes (the silica model). How about real scenes like the fern or orchids datasets?

I have captured a video of my office link. However, I can't get COLMAP to estimate the poses needed to train the NeRF model. Since you are more experienced with this project, can you give me some suggestions? It would be interesting to see whether this method works on real data like this.

This is the error from COLMAP:

python imgs2poses.py ./cmvs/
Need to run COLMAP
Features extracted
Features matched
Sparse map created
Finished running COLMAP, see ./cmvs/colmap_output.txt for logs
Post-colmap
Cameras 5
Images # 6
Traceback (most recent call last):
  File "imgs2poses.py", line 18, in <module>
    gen_poses(args.scenedir, args.match_type)
  File "/home/phong/data/Work/Paper3/Code/LLFF/llff/poses/pose_utils.py", line 276, in gen_poses
    save_poses(basedir, poses, pts3d, perm)
  File "/home/phong/data/Work/Paper3/Code/LLFF/llff/poses/pose_utils.py", line 66, in save_poses
    cams[ind-1] = 1
IndexError: list assignment index out of range
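
My guess is that COLMAP did not register every input frame, so a 3D point can reference an image id larger than the number of recovered poses. A minimal sketch of how I understand the overrun (the numbers and ids below are made up, not the actual pose_utils.py code):

    # Hypothetical reproduction of the failure, not the actual pose_utils.py code.
    num_registered_poses = 6        # images COLMAP managed to register
    point_image_ids = [1, 2, 8]     # made-up image ids attached to one 3D point

    cams = [0] * num_registered_poses
    for ind in point_image_ids:
        # id 8 points past the 6 recovered poses -> the IndexError in save_poses
        cams[ind - 1] = 1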
@kwea123
Owner

kwea123 commented May 30, 2020

Actually, the first data I tried was a real forward-facing scene, but due to the coronavirus the only thing I could think of to capture was my messy desk, so I didn't post it haha...
pc
It works quite well except for some flickering frames, which may be due to bad lighting in the room.

Concerning your data, the photos look good; one concern is that they may cover too wide a range for COLMAP to handle. I will take a look. If you are on a local PC, you can also try running COLMAP with the GUI instead of imgs2poses.py to see what the reconstruction looks like.

@kwea123
Owner

kwea123 commented May 30, 2020

Strange, it works perfectly using the COLMAP GUI. Are you able to run the COLMAP GUI?


I can recover the poses correctly.

Edit: although it reconstructs, the poses don't seem to be correct. So I recommend two things to try:

  1. Try to use "Dense reconstruction" to see if it produces better pose estimates
  2. Try to take photos with smaller lateral range (do not rotate camera too much)

@phongnhhn92
Author

Hi, it is weird that the imgs2poses.py script cannot estimate the poses but the COLMAP GUI can. I have tested my images using the COLMAP GUI and the sparse reconstruction works. I am trying dense reconstruction to see the difference.

By the way, how can you tell that the poses don't seem to be correct? I get a sparse reconstruction similar to yours, but I have no idea how to evaluate it. Can you clarify?

Actually, I am curious how NeRF works on large-scale scenes. For example, can we test it on large datasets such as ScanNet, Matterport, or DTU?

In fact, I have captured a new set of images with lateral movement (not much camera rotation), and this is my result. As you can see, the printer looks good but the background is not. My initial thought is that this NeRF model doesn't work that well for far objects (like hallways). I don't know if there are any quick parameter changes that would help the model train better.

gif_optim

@phongnhhn92
Author

Another issue: if I am using the COLMAP GUI, how can I compute the poses_bounds.npy file? I guess this file is necessary for both training and testing.

@kwea123
Owner

kwea123 commented May 31, 2020

For example, images 001 and 035 are rotated by almost 90 degrees relative to each other, but the reconstruction looks like there is no rotation... maybe I'm wrong, it's just a personal estimate.

Currently the constraint is on the world space; you can only have two kinds:

  1. 360 inward facing, such that the world space is a cube.
  2. Forward facing, such that the world space is a cuboid that fully lies behind a certain plane. You can see here or some other issues in the original repo for an explanation.

So complex structures like Matterport won't work, since they are more like 360 outward facing, which doesn't satisfy either constraint. DTU is a limiting case that still works (I tried one scene) since it satisfies constraint 2.

For your new data, I reckon the result is reasonable. As I mentioned above, the world space must lie fully behind a certain plane, so anything that does not lie behind that plane won't be reconstructed correctly, which explains the result on the left and right parts (it might also be due to scarce data at the edges).

To make it work on 360 outward-facing or even more complex scenes, although I think the concept still applies, it would be a lot of work:

  1. An efficient way to encode the whole space. The reason the original NeRF has the above two constraints is that we need to confine the coordinates to [-1, 1]^3 so that we can encode them and train efficiently. Training on 360 outward-facing scenes might require something like a spherical coordinate system for the encoding (see the sketch after this list).
  2. An efficient way to sample the points. Unlike 360 inward-facing and forward-facing scenes, where we know the region of interest lies near us, in complex scenes we need another way to sample the training points in 3D. Otherwise the result won't be as good, in my opinion.
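
To make point 1 concrete, this is roughly the standard NeRF positional encoding; it assumes the coordinates have already been confined to [-1, 1]^3 (the function and the number of frequencies are illustrative, not this repo's exact code):

    import numpy as np

    def positional_encoding(x, n_freqs=10):
        # gamma(x) = (x, sin(2^k * x), cos(2^k * x)) for k = 0..n_freqs-1,
        # applied element-wise to coordinates assumed to lie in [-1, 1]^3.
        out = [x]
        for k in range(n_freqs):
            out.append(np.sin(2.0 ** k * x))
            out.append(np.cos(2.0 ** k * x))
        return np.concatenate(out, axis=-1)   # shape (..., 3 + 2 * 3 * n_freqs)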

These are just some thoughts. Anyway, I think it's a totally new research topic, so there's no easy way to do it.

Finally, for poses_bounds.npy, you can use the COLMAP GUI to generate the sparse reconstruction first, then call imgs2poses.py with the same argument. It will skip the COLMAP part and only extract the bounds.
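
For reference, the generated file follows the standard LLFF layout: one row per image holding a flattened 3x5 matrix (the 3x4 camera-to-world pose plus a column with height, width, and focal length), followed by the near and far depth bounds. A quick sanity check:

    import numpy as np

    poses_bounds = np.load('poses_bounds.npy')        # shape (N_images, 17)
    poses = poses_bounds[:, :15].reshape(-1, 3, 5)    # 3x4 pose + [H, W, focal]
    bounds = poses_bounds[:, 15:]                     # near/far depth per image
    print(poses_bounds.shape, bounds.min(), bounds.max())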

@phongnhhn92
Author

Thanks for your clarification! It makes sense now.

@kwea123 kwea123 closed this as completed Jun 4, 2020
@sixftninja

sixftninja commented Jun 10, 2020

I trained a model with 32 inward-facing images.

Any advice on what might be going wrong? I used full-resolution images for COLMAP and added the --spheric argument for training. While training, some of the epochs (1, 9, 11, 13, 18, 26) did not complete. Also, I let the model train until epoch 30, but checkpoints were saved only for epochs 17, 20, 22, 23, and 25. I used the epoch 25 checkpoint to render novel poses using eval.py.

@phongnhhn92
Author

I am not sure exactly why the training fails, but I guess your training images have a complicated background. This NeRF model doesn't do well with cluttered backgrounds. Why don't you try putting the pan on a white table and moving the camera closer to it? Though I'm not sure it will work this time.

@sixftninja

Yeah, I started training again with a white background. Let's see now...

@kwea123
Owner

kwea123 commented Jun 11, 2020

@3ventHoriz0n What do you mean by "didn't complete training"? By default I only save the best 5 epochs; that's why you only have 5 checkpoints at the end. Every epoch should finish normally. You can change the number here:

save_top_k=5,)
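
That line is the save_top_k argument of PyTorch Lightning's ModelCheckpoint callback; roughly how it is set up (the monitored metric name below is illustrative, not necessarily the exact key used in train.py):

    from pytorch_lightning.callbacks import ModelCheckpoint

    # Keep only the 5 checkpoints with the best monitored value;
    # use save_top_k=-1 to keep a checkpoint for every epoch instead.
    checkpoint_callback = ModelCheckpoint(monitor='val/loss',
                                          mode='min',
                                          save_top_k=5)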

@sixftninja

sixftninja commented Jun 11, 2020

Oh, I missed that part.

Well, usually when an epoch finishes, its progress bar is replaced by the one for the next epoch, so at any given time I only see one progress bar on the screen. But for the epochs I mentioned, the progress bar got stuck midway and a new one loaded for the next epoch. So I ended up with 7 progress bars on the screen: 6 of them stuck midway and the last one for the current epoch. I don't know what that means, though.

@kwea123
Owner

kwea123 commented Jun 11, 2020

Sometimes if you accidentally perturb the terminal (like pressing a key), it interrupts the progress bar, so a new progress bar appears and the old one is left on the terminal, looking like it was stuck. It's just a visual bug; it doesn't affect training.

@sixftninja

OK, I seriously can't figure out what I'm doing wrong.

Original

Generated

The model doesn't seem to be learning anything at all.

@kwea123
Owner

kwea123 commented Jun 11, 2020

Can you share the sparse folder generated by COLMAP, the poses_bounds.npy file, and the training log files?

@sixftninja

sixftninja commented Jun 11, 2020

sparse
logs
poses_bounds

@kwea123
Owner

kwea123 commented Jun 11, 2020

This is what I see from your training log; the center image is the prediction, and I don't see anything wrong. The poses also seem correct. How did you generate that noisy image?
Screenshot from 2020-06-11 22-57-13

@sixftninja

sixftninja commented Jun 11, 2020

I ran eval.py using checkpoint epoch=28 and dataset_name llff.

Can you also tell me how you are using TensorBoard to visualize predictions?

@kwea123
Owner

kwea123 commented Jun 11, 2020

Maybe you forgot to add --spheric_poses in evaluation? It is indeed not mentioned in the README; I will add that.

@sixftninja

Yes, I did forget to add that. I'll try again.

@kwea123
Owner

kwea123 commented Jun 11, 2020

Can you share the checkpoint?

@sixftninja

epoch=28

@kwea123
Owner

kwea123 commented Jun 11, 2020

You might need to modify these two lines in order to get a good visual result:

nerf_pl/datasets/llff.py

Lines 130 to 131 in d41ae30

[0,1,0,-0.9*t],
[0,0,1,t],

They control where the virtual camera is placed. This part is actually hard-coded currently; I'm still looking for a way to let it adapt to various scenes. For your data I find that

        trans_t = lambda t : np.array([
            [1,0,0,0],
            [0,1,0,-0.6*t],
            [0,0,1,0.7*t],
            [0,0,0,1],
        ])

is good.
This is what I get after the above modification
test
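
For context, trans_t above is one building block of the usual spherical render path (as in the original NeRF code): the translation is composed with two rotations to place the virtual camera around the scene center. A rough sketch of that composition (helper names follow the original NeRF convention and are illustrative, not this repo's exact code):

    import numpy as np

    def trans_t(t):
        return np.array([[1, 0, 0, 0],
                         [0, 1, 0, -0.6 * t],   # height offset of the virtual camera
                         [0, 0, 1, 0.7 * t],    # distance offset from the poses center
                         [0, 0, 0, 1]], dtype=float)

    def rot_phi(phi):   # elevation: rotation about the x-axis
        c, s = np.cos(phi), np.sin(phi)
        return np.array([[1, 0, 0, 0],
                         [0, c, -s, 0],
                         [0, s, c, 0],
                         [0, 0, 0, 1]], dtype=float)

    def rot_theta(th):  # azimuth: rotation about the y-axis
        c, s = np.cos(th), np.sin(th)
        return np.array([[c, 0, -s, 0],
                         [0, 1, 0, 0],
                         [s, 0, c, 0],
                         [0, 0, 0, 1]], dtype=float)

    def pose_spherical(theta_deg, phi_deg, radius):
        # Camera-to-world matrix on a circle of the given radius around the center.
        c2w = trans_t(radius)
        c2w = rot_phi(np.deg2rad(phi_deg)) @ c2w
        c2w = rot_theta(np.deg2rad(theta_deg)) @ c2w
        return c2w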

@sixftninja

sixftninja commented Jun 11, 2020

Can you please explain what exactly is happening here?
Also, I uploaded all the necessary files to my Google Drive because I wanted to run the exact-mesh notebook, but when I run the cell to search for tight bounds, the runtime restarts.

@sixftninja

Adding --spheric_poses generated this GIF. It looks good except that the top part has been cut off and there's a strange cloud of white dust in one place.

@kwea123
Owner

kwea123 commented Jun 11, 2020

This is the translation with respect to the poses center; the second line controls the height offset and the third line controls the distance offset.

Yes, as I said, currently you need to tune the position manually as mentioned above, but it only affects visualization; this code has no effect on mesh extraction.

@sixftninja

OK, I will try the modification now.

@sixftninja

sixftninja commented Jun 16, 2020

For the camera above, colorless mesh extraction worked perfectly. However, when I tried to extract a colored mesh, this was the result.

I found tight bounds at x, y: -0.4, 0.3 and z: -1.25, -0.55. I tried sigma threshold values from 5 to 45 in increments of 5, and occlusion threshold values from 0.05 to 0.2 in increments of 0.05.
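
For reference, my understanding is that the sigma threshold is the density level used to extract the isosurface with marching cubes; a minimal sketch of how I picture that step (the density grid and file name are hypothetical):

    import numpy as np
    from skimage import measure

    # Hypothetical (N, N, N) grid of densities queried from the trained NeRF
    # inside the tight bounds above.
    sigma = np.load('sigma_grid.npy')
    sigma_threshold = 20.0   # one of the values I tried

    # Marching cubes extracts the surface where density crosses the threshold;
    # raising it removes floaters, lowering it keeps thin structures.
    verts, faces, normals, values = measure.marching_cubes(sigma, level=sigma_threshold)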

What do you think is going wrong?

@kwea123
Owner

kwea123 commented Jun 16, 2020

It looks like the images are rotated by 90 degrees... can you try manually rotating them by +90 (or -90) degrees and then feeding them to the program?
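
If the rotation comes from the EXIF orientation tag (common with phone photos), one way to bake it into the pixels before running COLMAP is something like this (a minimal sketch; the folder names are placeholders):

    from pathlib import Path
    from PIL import Image, ImageOps

    src, dst = Path('images_raw'), Path('images')     # placeholder folders
    dst.mkdir(exist_ok=True)
    for p in src.glob('*.jpg'):
        img = ImageOps.exif_transpose(Image.open(p))  # apply the EXIF orientation tag
        img.save(dst / p.name)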

@sixftninja

Alright, will do that. I have run into this before when reading iPhone-captured images with Pillow and OpenCV, smh...

@sixftninja

I fixed the EXIF data of the images and ran the experiment again. The resulting colored mesh is still not satisfactory. Any advice?

1. Data Folder (Includes images and LLFF output files)
2. eval.py output
3. .ply file
4. Checkpoint
5. Colored mesh video

@kwea123
Owner

kwea123 commented Jun 22, 2020

@3ventHoriz0n Sorry, I pushed a bad update to master. I reverted it just now; please re-pull the code and retry extract_mesh with the same parameters. It should give good results.

@sixftninja

Done. Here's the final result.

@dichen-cd

Hi @kwea123, thanks for your work!

They control where the virtual camera is placed. This part is actually hard-coded currently; I'm still looking for a way to let it adapt to various scenes.

I was wondering whether you have found a good way to generate render poses adaptively.
Currently I find it quite hard to set correct poses manually, so I'm using c2w matrices interpolated from the training set. It works, but the camera movement is not satisfactory (shaky, jittery speed, etc.). Do you have any suggestions?
Do you have any suggestions?

@astogxl

astogxl commented Sep 15, 2021

@kwea123, hello!
Have you read the pixelNeRF paper? I just can't understand the hard-coded part for generating render poses for the DTU dataset.
