
NeRF #1450

Closed
wants to merge 27 commits into from

Conversation

YanivHollander
Contributor

Changes

This pull request contains a draft implementation of the NeRF algorithm for view synthesis and 3D rendering.

In this PR I included an API class, CameraCalibration, which drives NeRF and is designed based on the discussion in #1384.

I took a brute-force approach and copied the relevant core algorithms from the paper: Wang et al. (2021) - https://arxiv.org/abs/2102.07064. Most algorithm parameters are hard-coded; some can be set through the CameraCalibration API.
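For readers unfamiliar with the core algorithm: the NeRF MLP takes a sinusoidal positional encoding of each 3D sample point as input. A minimal pure-Python sketch of that standard encoding (function name and frequency count are illustrative, not taken from this PR's code):

```python
import math

def positional_encoding(p, num_freqs=4):
    """Map a 3D point p to [p, sin(2^k * p), cos(2^k * p)] for k < num_freqs.

    Output length is 3 + 3 * 2 * num_freqs.
    """
    out = list(p)
    for k in range(num_freqs):
        freq = 2.0 ** k
        for coord in p:
            out.append(math.sin(freq * coord))
            out.append(math.cos(freq * coord))
    return out
```

The higher-frequency terms are what let the MLP represent fine texture detail; the paper uses separate frequency counts for positions and view directions.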

I tested the full functionality of CameraCalibration in Colab - https://colab.research.google.com/drive/1xiX1Jf572HAN_TN4p-8IPU3CZRRkfhox#scrollTo=yVT7jsvk4ViS. Please let me know if you can run this notebook, or at least copy it and modify the relevant paths to make it work with the NeRF version in this PR.

These are the next steps I foresee for this PR; please comment on each point:

  1. Parameterize the algorithm and export more control to CameraCalibration API (including, e.g., CPU/GPU control, rendering parameters, etc.)

  2. Refactor the code I took from https://arxiv.org/abs/2102.07064 to follow Kornia's standards

  3. Replace core engine components with off-the-shelf algorithms. Here I mainly refer to the rendering part, which is also the most compute-intensive. I'm thinking PyTorch3D may have some good alternatives we may want to explore

Yaniv

Type of change

  • 📚 Documentation Update
  • 🧪 Test Cases
  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • 🔬 New feature (non-breaking change which adds functionality)
  • 🚨 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📝 This change requires a documentation update

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Did you update CHANGELOG in case of a major change?

@YanivHollander
Contributor Author

@edgarriba, I shared the Colab Notebook with you. I couldn't find a way in Colab to share it with everyone. Should I upload this Notebook to this PR as well?

@edgarriba
Member

> @edgarriba, I shared the Colab Notebook with you. There is no way I could find in Colab to share with everyone. Should I upload this Notebook to this PR as well?

There should be a way to share with anyone who has the link. Or yes, add it temporarily in this PR.

@edgarriba
Member

@YanivHollander it would be interesting to see (or somehow compare) this against classical checkerboard calibration. @ducha-aiki @lferraz do you know any open dataset for that?

@YanivHollander
Contributor Author

The following link should allow anyone to open the Colab:

https://colab.research.google.com/drive/1xiX1Jf572HAN_TN4p-8IPU3CZRRkfhox?usp=sharing

Please let me know if you run into issues opening the Notebook

@ducha-aiki
Member

ducha-aiki commented Nov 9, 2021

> @YanivHollander it would be interesting to see (or somehow compare) this against classical checkerboard calibration. @ducha-aiki @lferraz do you know any open dataset for that?

https://github.com/amy-tabb/basic-camera-calibration — here is code to generate such a dataset, thanks to @amy-tabb

Also:
  • Output of the code: https://app.box.com/s/bsg2vm6o3uzqlbbgzfk6w9e27veyel0l
  • Configs used to generate the images above: https://app.box.com/s/7zbc1098797ubk8mh0q9lg3ual007f3j

@YanivHollander
Contributor Author

After more work on the NeRF algorithm, I submitted a few more modifications to this PR. Mostly I added more functions to the class and exposed more of the algorithm's parameters.

I also tried to run the NeRF implementation on one of the calibration datasets proposed by @ducha-aiki. I am still struggling to get satisfying results. For the LLFF flower dataset, the results are reasonable, but still behind what is presented in the paper. My main problem is lack of compute. I am running the algorithm on Colab with CUDA, but this resource is limited. Any suggestions will be greatly appreciated. This is especially important when trying out a deeper NeRF network instead of the tiny NeRF that currently runs by default.
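For context on why rendering dominates the compute budget: every optimization step alpha-composites predicted densities and colors along each sampled ray. A minimal pure-Python sketch of that standard compositing rule, for a single scalar color channel (names are illustrative, not this PR's code):

```python
import math

def composite_ray(densities, colors, deltas):
    """Standard NeRF compositing along one ray.

    alpha_i = 1 - exp(-sigma_i * delta_i)
    T_i     = prod_{j < i} (1 - alpha_j)   (accumulated transmittance)
    color   = sum_i T_i * alpha_i * c_i
    """
    color = 0.0
    transmittance = 1.0
    for sigma, c, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        color += transmittance * alpha * c
        transmittance *= 1.0 - alpha
    return color
```

This loop runs per ray and per sample, which is why the number of rays marched per step (exposed as a parameter below) trades quality against compute so directly.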

A few points for further discussion:

  1. Per a previous discussion on the Stoke framework, I believe this implementation of NeRF can nicely scale to work in the context of Stoke or a similar framework. I'm mainly referring to the fact that the algorithm includes a training process, which could be wrapped by Stoke to automatically tune compute based on device resources

  2. Following another discussion on NeRF shared by @edgarriba on LinkedIn, I believe this contribution can be expanded at some point to include more model implementations, such as the GAN-based NeRF published at https://github.com/MQ66/gnerf

But those extensions should be considered only after I improve the basic results. Again, I believe my main problem is compute, so any suggestions in this direction will be greatly appreciated!

@edgarriba
Member

@YanivHollander could you include the Python training script in this PR so that I can try it on my machine? Possibly also a bash script or something to easily download the dataset.

@YanivHollander
Contributor Author

I shared the location of two datasets I tested:

Link to LLFF flower dataset:
https://drive.google.com/drive/folders/1YflPSvCiInSt4Y-hAmP6ZM4745P58l6r?usp=sharing

Link to one of the calibration datasets:
https://drive.google.com/drive/folders/1skJBBP23_wO5jPuidl6F1cSbVFP_tJax?usp=sharing

The following notebook is shared with everyone:
https://colab.research.google.com/drive/1xiX1Jf572HAN_TN4p-8IPU3CZRRkfhox?usp=sharing

Will this allow you to run some tests on your machine? You should run on the nerf branch with the code in this PR (which is also committed to YanivHollander:nerf).

@YanivHollander
Contributor Author

Hi,

Over the last couple of weeks, I ran several NeRF computations for different scenes. I found a solution for my compute problem with Colab Pro that works quite neatly.

My conclusion at this point is that the current implementation of the NeRF algorithm is not quite one-size-fits-all. There are scenes that are processed well; for example, the flower scene (results attached). On the other hand, I couldn't quite get good reconstruction results with the calibration dataset. I also tried to take a few 2D shots of a mini superman figure I have, again with only limited success (results attached).

During my research into how to improve the results, I added a few features to the original algorithm, listed here:

  1. The ability to save checkpoints during training, and to resume training from a previously saved checkpoint
  2. An extension to the focal model that allows each camera to have its own focal lengths, estimated separately during training
  3. The ability to re-initialize the NeRF model after training, while keeping the focal and pose estimates. This allows better training of the model in a second pass (Wang, 2021, section 4.2)
  4. A few more algorithm parameters exposed in the API (compute device, number of rays to randomly march from the image plane, learning rate)
  5. A bunch of unit tests
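Feature 2 above boils down to each camera owning its own (fx, fy) in the pinhole projection. A minimal pure-Python sketch of that projection with per-camera focals (all names and values here are illustrative, not this PR's API):

```python
def project_pinhole(point_cam, fx, fy, cx, cy):
    """Project a 3D point in camera coordinates to pixel coordinates
    via the pinhole model: u = fx * x / z + cx, v = fy * y / z + cy."""
    x, y, z = point_cam
    return (fx * x / z + cx, fy * y / z + cy)

# Each camera index maps to its own independently estimated focal pair:
focals = {0: (500.0, 500.0), 1: (520.0, 515.0)}
u, v = project_pinhole((0.1, -0.2, 2.0), *focals[1], 320.0, 240.0)
```

During training, each (fx_i, fy_i) pair would be a learnable parameter updated by the focal optimizer, instead of one shared focal for all views.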

I think that at this point we may want to transition this PR to a regular one and start the process of integrating it into Kornia. This will mainly open it up for others to test. I also have a few other directions in mind, which I would like to try based on this version (moving to tools from PyTorch3D; implementing prior knowledge with Semantic Consistency Loss - https://www.notion.so/DietNeRF-Putting-NeRF-on-a-Diet-4aeddae95d054f1d91686f02bdb74745; etc.).

Please share your thoughts on how to proceed.

Yaniv

@YanivHollander
Contributor Author

(Attached renders for the LLFF flower scene: llff_flower_img, llff_flower_depth)

@YanivHollander
Contributor Author

(Attached superman renders for frames 16 and 21: superman_img_superman - 16 png, superman_img_superman - 21 png)

@YanivHollander YanivHollander marked this pull request as ready for review December 3, 2021 03:32
Member

@edgarriba edgarriba left a comment


High-level comments to start the integration with the library. Please check it out, @ducha-aiki @shijianjian @lferraz

@@ -0,0 +1,274 @@
from typing import Tuple

import numpy as np
Member


we should remove everything from numpy. Find the equivalent in torch, or we implement it ourselves


class TinyNerf(nn.Module):
def __init__(self, pos_in_dims, dir_in_dims, D):
"""
Member


for the final version:

  • Typing as much as possible
  • Adjust the doc-strings to match the style of the rest of the kornia components

self.rgb_layers = nn.Sequential(nn.Linear(D + dir_in_dims, D // 2), nn.ReLU())
self.fc_rgb = nn.Linear(D // 2, 3)

self.fc_density.bias.data = torch.tensor([0.1]).float()
Member


probably use register so that it is automatically cast to the correct dtype and device
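One way to read this suggestion (a sketch, assuming the goal is that constants follow the module's dtype and device; the reviewer may also have meant register_buffer for non-trainable tensors): since fc_density.bias is already a registered parameter, initialize it in place rather than replacing it with a freshly constructed tensor.

```python
import torch
from torch import nn

class DensityHead(nn.Module):
    """Hypothetical fragment mirroring the reviewed code."""

    def __init__(self, d: int) -> None:
        super().__init__()
        self.fc_density = nn.Linear(d, 1)
        # In-place init: the bias stays the same registered Parameter,
        # so module-level .to(device) / .half() keep working as expected.
        with torch.no_grad():
            self.fc_density.bias.fill_(0.1)
```

By contrast, assigning `self.fc_density.bias.data = torch.tensor([0.1]).float()` hard-codes a float32 CPU tensor regardless of where the module later lives.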



def train_one_epoch(imgs, H, W, ray_params, n_selected_rays, opt_nerf, opt_focal,
opt_pose, nerf_model, focal_net, pose_param_net, device):
Member


would be great if we could integrate with our training API:
https://kornia.readthedocs.io/en/latest/x.html

possibly we need to adjust things there but that's fine

return v / np.linalg.norm(v)


def create_spiral_poses(radii, focus_depth, n_poses=120, n_circle=2):
Member


this is a good utility to go into kornia.geometry.camera ? /cc @ducha-aiki

Contributor Author


I can open a separate PR for that
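For reference, the camera-center part of such a spiral-path utility can be sketched as follows (pure Python, names illustrative; the actual create_spiral_poses additionally builds look-at rotations toward a focus depth):

```python
import math

def spiral_positions(radii, n_poses=120, n_circle=2):
    """Camera centers on a spiral path around the scene:
    (rx * cos t, ry * sin t, -rz * sin(t / n_circle)) for t over n_circle turns."""
    rx, ry, rz = radii
    positions = []
    for i in range(n_poses):
        t = 2.0 * math.pi * n_circle * i / n_poses
        positions.append((rx * math.cos(t),
                          ry * math.sin(t),
                          -rz * math.sin(t / n_circle)))
    return positions
```

Each position would then be combined with a rotation that points the camera at the focus point to form the full 3x4 pose used for novel-view rendering.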

This could be further transformed to world coordinate later, using camera poses.
:return: (H, W, 3) torch.float32
"""
y, x = torch.meshgrid(torch.arange(H, dtype=torch.float32),
Member


you can use kornia create_meshgrid + convert_to_homogeneous
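For reference, kornia's create_meshgrid produces the per-pixel coordinate grid that the torch.meshgrid call above builds by hand. The underlying grid, in plain Python (illustrative sketch only):

```python
def pixel_grid(H, W):
    """An (H, W) grid of (x, y) pixel coordinates, matching a meshgrid
    of arange(W) by arange(H); row index = y, column index = x."""
    return [[(float(x), float(y)) for x in range(W)] for y in range(H)]
```

Appending a constant 1.0 to each (x, y) pair then gives the homogeneous pixel coordinates that the suggested convert step produces.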

from scipy.spatial.transform import Rotation as RotLib


def SO3_to_quat(R):
Member


check the conversions module because we have some of these functionalities
https://kornia.readthedocs.io/en/latest/geometry.conversions.html

Member


in case something is missing we can implement it there
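For reference, kornia.geometry.conversions provides rotation_matrix_to_quaternion. The underlying math for the well-conditioned case is sketched below (pure Python, illustrative only; a robust implementation branches on the largest diagonal element to avoid dividing by a small w):

```python
import math

def so3_to_quat_wxyz(R):
    """Rotation matrix -> unit quaternion (w, x, y, z).
    Handles only the trace > -1 branch of the standard algorithm."""
    t = R[0][0] + R[1][1] + R[2][2]
    w = math.sqrt(max(1.0 + t, 0.0)) / 2.0
    x = (R[2][1] - R[1][2]) / (4.0 * w)
    y = (R[0][2] - R[2][0]) / (4.0 * w)
    z = (R[1][0] - R[0][1]) / (4.0 * w)
    return (w, x, y, z)
```

Using the torch version would also remove the scipy Rotation dependency imported above.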

torch.backends.cudnn.benchmark = False


def mse2psnr(mse):
Member

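For reference, the helper above implements the standard conversion PSNR = -10 · log10(MSE) for images normalized to [0, 1] (pure-Python sketch of the formula):

```python
import math

def mse2psnr(mse):
    """PSNR in dB for signals normalized to [0, 1]: -10 * log10(MSE)."""
    return -10.0 * math.log10(mse)
```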

rays_o = rays_o + t[..., None] * rays_d

# Store some intermediate homogeneous results
ox_oz = rays_o[..., 0] / rays_o[..., 2]
Member


use convert_points_from_homogeneous
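The suggested kornia helper divides by the last coordinate with a numerical guard; a minimal pure-Python equivalent of that operation (illustrative; the real function works on batched torch tensors):

```python
def from_homogeneous(point, eps=1e-8):
    """(x, ..., w) -> (x / w, ...); guards against division by ~0."""
    *coords, w = point
    scale = 1.0 / (w if abs(w) > eps else eps)
    return [c * scale for c in coords]
```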

N_sam = t_vals.shape[0]

# transform rays from camera coordinate to world coordinate
ray_dir_world = torch.matmul(c2w[:3, :3].view(1, 1, 3, 3),
Member


@YanivHollander
Contributor Author

Many of the problems I am seeing with objects other than the flower scene could potentially be addressed by implementing the idea in mip-NeRF: https://jonbarron.info/mipnerf/

@edgarriba
Member

> Many of the problems I am seeing with objects other than the flower scene could potentially be addressed by implementing the idea in mip-NeRF: https://jonbarron.info/mipnerf/

Seems that the code is in JAX though. A good experiment would be to see how difficult it is to port to PyTorch.

@bsuleymanov
Contributor

@edgarriba there is a draft version here: https://github.com/bsuleymanov/mip-nerf

@YanivHollander
Contributor Author

Hi,

Just a quick update. I am backlogged by many other tasks that keep me busy, so revising the NeRF code is taking longer than expected.

I am thinking now that utilizing primitives from PyTorch3D should really be the way to go. It will allow more flexibility moving forward. I am now focusing on this, mainly trying to solve compilation issues of the package on my Mac. I will update when I have more to share.

If anyone has experience with PyTorch3D on OS X, please share.

@edgarriba
Member

> Hi,
>
> Just a quick update. I am backlogged by many other tasks that keep me busy, so revising the NeRF code is taking longer than expected.
>
> I am thinking now that utilizing primitives from PyTorch3D should really be the way to go. It will allow more flexibility moving forward. I am now focusing on this, mainly trying to solve compilation issues of the package on my Mac. I will update when I have more to share.
>
> If anyone has experience with PyTorch3D on OS X, please share.

What primitives are we talking about?

@YanivHollander
Contributor Author

The PyTorch3D package has configurable classes for ray sampling, ray marching, positional embedding, etc. It also has camera models, which we may not need since Kornia has similar data structures. What PyTorch3D doesn't have is the part that estimates camera parameters for calibration, so for that part I will keep using Wang's contribution.

More on PyTorch3D NeRF in a tutorial at: https://pytorch3d.org/tutorials/fit_simple_neural_radiance_field.

@edgarriba edgarriba marked this pull request as draft January 30, 2022 17:44
@edgarriba edgarriba added the wip 🛠️ Work in progress label Jan 30, 2022
@YanivHollander
Contributor Author

I had to deprioritize my work on NeRF due to many other tasks. On top of that, my choice to use PyTorch3D may not have been the best, since I suspect there are a few bugs in that framework that prevent me from getting results as good as when I was using the (slower, though) Wang, 2021 implementation. I plan to come back to this effort, and will take a look at the reference you sent.

@ducha-aiki
Member

Closing in favour of #1911

@ducha-aiki ducha-aiki closed this Oct 3, 2022
@YanivHollander YanivHollander deleted the nerf branch October 4, 2022 02:24