NeRF #1450
Conversation
for more information, see https://pre-commit.ci
@edgarriba, I shared the Colab Notebook with you. I couldn't find a way in Colab to share it with everyone. Should I upload the Notebook to this PR as well?
there should be a way to share to anyone with the link. Or yes, add it temporarily in this PR
@YanivHollander it would be interesting to see (or compare somehow) with classical checkerboard calibration. @ducha-aiki @lferraz do you know any open dataset for that?
The following link should allow anyone to open the Colab: https://colab.research.google.com/drive/1xiX1Jf572HAN_TN4p-8IPU3CZRRkfhox?usp=sharing Please let me know if you run into issues opening the Notebook
https://github.com/amy-tabb/basic-camera-calibration here is code to generate such a dataset, thanks to @amy-tabb. Also:
After more work on the NeRF algorithm, I submitted a few more modifications to this PR. Mostly I added more functions to the class and exposed parameters from the algorithm itself. I also tried to run the NeRF implementation on one of the calibration datasets proposed by @ducha-aiki. I am still struggling to get satisfying results. For the llff flower dataset, the results are reasonable, but still behind what is presented in the paper. My main problem is lack of compute. I am running the algorithm on Colab CUDA, but this resource is limited. Any suggestions will be greatly appreciated. This is especially important when trying out a deeper NeRF network over the tiny NeRF that currently runs by default. A few suggestion points for further discussion:
But those extensions should be considered only after I improve the basic results. Again, I stress that I believe my main problem is compute, so any suggestions in this direction will be greatly appreciated!
@YanivHollander could you include in this PR the python training script so that I can try it on my machine? Possibly a bash script or something to easily download the dataset too
I shared the location of two datasets I tested: Link to LLFF flower dataset: Link to one of the calibration datasets: The following notebook is shared with everyone: Will this allow you to run some tests on your machine? You should run on the NeRF branch with the code in this PR (which is also committed to YanivHollander:nerf)
Hi, Over the last couple of weeks, I ran several NeRF computations for different scenes. I found a solution for my compute problem with Colab Pro that works quite neatly. My conclusion at this point is that the current implementation of the NeRF algorithm is not quite one-size-fits-all. There are scenes that are processed well; for example, the flower scene (results attached). On the other hand, I couldn't quite get good reconstruction results with the calibration dataset. Also, I tried to take a few 2D shots of a mini Superman figure I have, again with some limited success (results attached). During my research into how to improve results I added a few features to the original algorithm, listed here:
I think that at this point we may want to transition this PR to a regular one and start the process of integrating into Kornia. This will mainly open it up for others to test. I also have a few other directions in mind, which I would like to test based on this version (moving to tools from PyTorch3D; implementing prior knowledge with a Semantic Consistency Loss - https://www.notion.so/DietNeRF-Putting-NeRF-on-a-Diet-4aeddae95d054f1d91686f02bdb74745; etc.). Please share your thoughts on how to proceed. Yaniv
high-level comments to start the integration with the library. Please, @ducha-aiki @shijianjian @lferraz check it out
@@ -0,0 +1,274 @@
from typing import Tuple

import numpy as np
we should remove anything from numpy. Find the equivalent in torch, or we implement it ourselves
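For most of the numpy calls in this file the torch replacement is one-to-one. A minimal sketch (the `normalize` name mirrors the helper in the diff below; this is illustrative, not the final implementation), which avoids the numpy round trip, stays on the tensor's device, and keeps autograd intact:

```python
import torch

def normalize(v: torch.Tensor) -> torch.Tensor:
    # torch equivalent of `v / np.linalg.norm(v)`
    return v / torch.linalg.norm(v)

v = torch.tensor([3.0, 4.0])
normalize(v)  # tensor([0.6000, 0.8000])
```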
class TinyNerf(nn.Module):
    def __init__(self, pos_in_dims, dir_in_dims, D):
        """
for the final version:
- Typing as much as possible
- Adjust the docstrings to the same style as the rest of the Kornia components
self.rgb_layers = nn.Sequential(nn.Linear(D + dir_in_dims, D // 2), nn.ReLU())
self.fc_rgb = nn.Linear(D // 2, 3)

self.fc_density.bias.data = torch.tensor([0.1]).float()
probably use a registered parameter/buffer so that it is automatically cast to the correct dtype and device
def train_one_epoch(imgs, H, W, ray_params, n_selected_rays, opt_nerf, opt_focal,
                    opt_pose, nerf_model, focal_net, pose_param_net, device):
would be great if we can integrate with our training api
https://kornia.readthedocs.io/en/latest/x.html
possibly we need to adjust things there but that's fine
return v / np.linalg.norm(v)
def create_spiral_poses(radii, focus_depth, n_poses=120, n_circle=2): |
this is a good utility to go into kornia.geometry.camera
? /cc @ducha-aiki
I can open a separate PR for that
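For reference, a torch-only sketch of what such a utility could compute (an LLFF-style spiral of camera-to-world poses; the exact parametrization here is an assumption, not the code in this PR):

```python
import math
import torch

def create_spiral_poses(radii: torch.Tensor, focus_depth: float,
                        n_poses: int = 120, n_circle: int = 2) -> torch.Tensor:
    """Sketch: spiral of (3, 4) camera-to-world poses around the origin."""
    poses = []
    for t in torch.linspace(0.0, 2.0 * math.pi * n_circle, n_poses):
        # camera centre moving along the spiral; `radii` is a (3,) tensor
        center = torch.stack([torch.cos(t), -torch.sin(t), -torch.sin(0.5 * t)]) * radii
        # look-at frame: z from the centre towards the focus point (0, 0, -focus_depth)
        z = center - torch.tensor([0.0, 0.0, -focus_depth])
        z = z / torch.linalg.norm(z)
        up = torch.tensor([0.0, 1.0, 0.0])
        x = torch.linalg.cross(up, z)
        x = x / torch.linalg.norm(x)
        y = torch.linalg.cross(z, x)
        poses.append(torch.stack([x, y, z, center], dim=1))  # (3, 4)
    return torch.stack(poses)  # (n_poses, 3, 4)
```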
This could be further transformed to world coordinate later, using camera poses.
:return: (H, W, 3) torch.float32
"""
y, x = torch.meshgrid(torch.arange(H, dtype=torch.float32),
you can use kornia create_meshgrid + convert_points_to_homogeneous
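A torch-only sketch of what that combination produces (per-pixel `(x, y, 1)` coordinates, ready to be back-projected with the inverse intrinsics); the `pixel_grid_homogeneous` name is illustrative:

```python
import torch

def pixel_grid_homogeneous(H: int, W: int) -> torch.Tensor:
    # Roughly what kornia.utils.create_meshgrid(H, W, normalized_coordinates=False)
    # followed by kornia.geometry.convert_points_to_homogeneous computes.
    ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                            torch.arange(W, dtype=torch.float32), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1)            # (H, W, 2) pixel coords
    ones = torch.ones_like(grid[..., :1])
    return torch.cat([grid, ones], dim=-1)          # (H, W, 3) homogeneous
```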
from scipy.spatial.transform import Rotation as RotLib
def SO3_to_quat(R): |
check the conversions module because we have some of these functionalities
https://kornia.readthedocs.io/en/latest/geometry.conversions.html
in case anything is missing we can implement it there
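If implemented from scratch, the scipy call could be replaced by a small torch converter like the sketch below (naive trace-based formula, `(w, x, y, z)` ordering; it is not robust for rotations near 180°, which is where a library version such as kornia's `rotation_matrix_to_quaternion` would be preferable):

```python
import torch

def so3_to_quat(R: torch.Tensor) -> torch.Tensor:
    # (3, 3) rotation matrix -> (w, x, y, z) quaternion; assumes trace > -1
    w = 0.5 * torch.sqrt(torch.clamp(1.0 + R[0, 0] + R[1, 1] + R[2, 2], min=1e-12))
    x = (R[2, 1] - R[1, 2]) / (4.0 * w)
    y = (R[0, 2] - R[2, 0]) / (4.0 * w)
    z = (R[1, 0] - R[0, 1]) / (4.0 * w)
    return torch.stack([w, x, y, z])
```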
torch.backends.cudnn.benchmark = False
def mse2psnr(mse): |
we have ssim_loss: https://kornia.readthedocs.io/en/latest/losses.html#kornia.losses.ssim_loss
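For context, the helper in question computes the standard PSNR for images scaled to `[0, 1]`; a minimal torch sketch (library metrics like the ssim_loss linked above could replace hand-rolled helpers like this):

```python
import torch

def mse2psnr(mse: torch.Tensor) -> torch.Tensor:
    # PSNR in dB for a peak signal value of 1.0: -10 * log10(MSE)
    return -10.0 * torch.log10(mse)

mse2psnr(torch.tensor(0.01))  # tensor(20.)
```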
rays_o = rays_o + t[..., None] * rays_d

# Store some intermediate homogeneous results
ox_oz = rays_o[..., 0] / rays_o[..., 2]
use convert_points_from_homogeneous
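That function performs the perspective divide being written out by hand above; a torch-only sketch of what it computes (the real kornia version also guards against division by near-zero last coordinates, which this sketch omits):

```python
import torch

def from_homogeneous(points: torch.Tensor) -> torch.Tensor:
    # (..., D+1) -> (..., D): divide by the last coordinate
    return points[..., :-1] / points[..., -1:]

# e.g. replaces `rays_o[..., 0] / rays_o[..., 2]` and friends in one call
from_homogeneous(torch.tensor([2.0, 4.0, 2.0]))  # tensor([1., 2.])
```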
N_sam = t_vals.shape[0]

# transform rays from camera coordinate to world coordinate
ray_dir_world = torch.matmul(c2w[:3, :3].view(1, 1, 3, 3),
Many of the problems I am seeing with objects other than the flower scene could potentially be related to, and improved by, implementing the idea in the following: https://jonbarron.info/mipnerf/
seems that the code is in jax though. A good experiment to see how difficult it is to port to PyTorch
@edgarriba there is a draft version here: https://github.com/bsuleymanov/mip-nerf
Hi, just a quick update. I am backlogged by many other tasks that keep me busy, so revising the NeRF code is taking longer than expected. I am now thinking that utilizing primitives from PyTorch3D should really be the way to go. It will allow more flexibility moving forward. I am focusing on this now, mainly trying to solve compilation issues of the package on my Mac. I will update when I have more to share. If anyone has experience with PyTorch3D on OS X, please share.
What primitives are we talking about?
The PyTorch3D package has configurable classes for ray sampling, ray marching, positional embedding, etc. It also has camera models, which we may not need since Kornia has similar data structures. What PyTorch3D doesn't have is the part that estimates camera parameters for the purpose of calibration, so for that part I will keep using Wang's contribution. More on PyTorch3D NeRF in a tutorial at: https://pytorch3d.org/tutorials/fit_simple_neural_radiance_field.
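As an example of one such primitive, the positional embedding (which PyTorch3D packages as a configurable class) can be sketched in plain torch as below; the `harmonic_embedding` name and the frequency schedule `1, 2, 4, ...` are illustrative assumptions, not PyTorch3D's exact defaults:

```python
import torch

def harmonic_embedding(x: torch.Tensor, n_freqs: int = 6) -> torch.Tensor:
    # NeRF-style positional encoding: (..., D) -> (..., D * 2 * n_freqs)
    freqs = 2.0 ** torch.arange(n_freqs, dtype=x.dtype)      # 1, 2, 4, ...
    angles = x[..., None] * freqs                            # (..., D, n_freqs)
    emb = torch.cat([angles.sin(), angles.cos()], dim=-1)    # (..., D, 2*n_freqs)
    return emb.flatten(start_dim=-2)
```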
I had to deprioritize my work on NeRF due to many other tasks. On top of that my choice to use
Closing in favour of #1911
Changes
This pull request concerns a draft version of the NeRF algorithm for view synthesis and 3D rendering.
In this PR I included an API class - CameraCalibration, which drives NeRF, and is designed based on the discussion in #1384.
I took a brute-force approach and copied the relevant core algorithms from the paper: Wang et al. (2021) - https://arxiv.org/abs/2102.07064. Most algorithm parameters are hard-coded, though some can be set through the CameraCalibration API.
I tested the full functionality of CameraCalibration in Colab - https://colab.research.google.com/drive/1xiX1Jf572HAN_TN4p-8IPU3CZRRkfhox#scrollTo=yVT7jsvk4ViS. Please let me know if you can run this notebook, or are at least able to copy it and modify the relevant paths to make it work with the NeRF version in this PR.
The next steps I foresee in this PR (please comment on each point):
Parameterize the algorithm and export more control to CameraCalibration API (including, e.g., CPU/GPU control, rendering parameters, etc.)
Refactor the code I took from https://arxiv.org/abs/2102.07064 to follow Kornia's standards
Replace core engine components with off-the-shelf algorithms. Here I mainly refer to the rendering part, which is also the most compute-intensive. I'm thinking PyTorch3D may have some good alternatives we may want to explore
Yaniv
Type of change
Checklist