Optimizing camera undistortion #2037
base: main
Need example datasets with different camera types to finish testing. This also contains a lot of junk from merging branches; will clean up.
Nice! Added some initial comments.
.vscode/settings.json
Outdated
Remove the changes in this file.
nerfstudio/cameras/camera_utils.py
Outdated
@@ -413,40 +417,226 @@ def radial_and_tangential_undistort(
distortion_params: torch.Tensor,
eps: float = 1e-3,
max_iterations: int = 10,
) -> torch.Tensor:
resolution: torch.Tensor = torch.tensor([1e-3, 1e-3]),
If you know the size, you should type it, i.e.,
resolution: Float[torch.Tensor, "2"]
max_iterations: The maximum number of iterations to perform.
resolution: The resolution (w, h of each pixel, in units of multiples of focal length)
Add doc for tolerance
nerfstudio/cameras/camera_utils.py
Outdated
@@ -413,40 +417,226 @@ def radial_and_tangential_undistort(
distortion_params: torch.Tensor,
eps: float = 1e-3,
max_iterations: int = 10,
) -> torch.Tensor:
resolution: torch.Tensor = torch.tensor([1e-3, 1e-3]),
tol: float = 0.5,
rename to tolerance
nerfstudio/cameras/camera_utils.py
Outdated
# n_samples = coords.shape[0]
# n_iters = 0
Remove commented out code, here and elsewhere
Many of the changes in this file don't seem related to this PR
Yeah would suggest to revert the things unrelated
camera_opt_to_camera = self.pose_optimizer(c)
if self.pose_optimizer is not None:
camera_opt_to_camera = self.pose_optimizer(c)
else:
camera_opt_to_camera = None
Revert this change as unrelated to this PR?
@@ -40,7 +41,8 @@ def __init__(self, cameras: Cameras, pose_optimizer: CameraOptimizer) -> None:
self.pose_optimizer = pose_optimizer
self.register_buffer("image_coords", cameras.get_image_coords(), persistent=False)

def forward(self, ray_indices: Int[Tensor, "num_rays 3"]) -> RayBundle:
@profiler.time_function
Remove profiler here
@@ -21,6 +21,7 @@
from nerfstudio.cameras.camera_optimizers import CameraOptimizer
from nerfstudio.cameras.cameras import Cameras
from nerfstudio.cameras.rays import RayBundle
from nerfstudio.utils import profiler
And remove here
nerfstudio/cameras/camera_utils.py
Outdated
Returns:
The residuals (fx, fy) and jacobians (fx_x, fx_y, fy_x, fy_y).
"""
assert distortion_params.shape[-1] == 8
The fisheye undistortion, which was tied to this function, actually uses the correct formula. So instead of creating two separate Newton functions for fisheye and perspective cameras, you can keep them in the same function as before. (The only minor issue with this function is the k4 term for perspective cameras, but that is all zeros for the current datasets anyway.)
nerfstudio/cameras/camera_utils.py
Outdated
"""Computes undistorted coords given opencv distortion parameters.
Adapted from MultiNeRF
https://github.com/google-research/multinerf/blob/b02228160d3179300c7d499dca28cb9ca3677f32/internal/camera_utils.py#L477-L509

Args:
coords: The distorted coordinates.
distortion_params: The distortion parameters [k1, k2, k3, k4, p1, p2].
eps: The epsilon for the convergence.
distortion_params: The distortion parameters. Supports 0, 1, 2, 4, 8 parameters, in
Revert here if the _compute_residual_and_jacobian is reverted.
nerfstudio/cameras/camera_utils.py
Outdated
assert distortion_params.shape[-1] in [0, 1, 2, 4, 8]

if distortion_params.shape[-1] == 0:
return coords, torch.eye(2, device=coords.device), coords

if distortion_params.shape[-1] < 8:
distortion_params = F.pad(distortion_params, (0, 8 - distortion_params.shape[-1]), "constant", 0.0)
assert distortion_params.shape[-1] == 8
Also here
Thanks @DreekFire for the updates. This is not an easy one. My main question is about the fisheye formula. Have you run any tests to verify that the fisheye formula is correct? The reason I'm asking is that the old version actually has a correct fisheye undistortion formula, obtained by combining nerfstudio/nerfstudio/cameras/camera_utils.py Line 411 in 43c399e
and nerfstudio/nerfstudio/cameras/cameras.py Lines 724 to 735 in 43c399e
And this PR seems to change the
What's the speed/quality difference on this PR?
Some tests for correctness:
And speed:
What machine are you testing on? Is there any particular sort of hardware this implementation is optimized for? E.g., would you see stronger benefits on weaker CPUs or GPUs?
@kerrj Will this conflict with any of the datamanager stuff that you are working on?
Ended up removing the resampling because it didn't appear to be worth it: once a ray was already within sub-pixel accuracy, Newton's method usually achieved very good accuracy in just one more iteration. This makes the code simpler as well. Now, the key optimizations are early stopping for undistortion and analytical formulas for pixel areas instead of numerical ones (these should also be more accurate because they correctly account for skewed pixels). The speed fluctuates throughout training, but it is generally about 10% faster at any given iteration than the main branch on puppy. Not sure why the tests are failing; something seems to be going wrong with torch.compile. It passes on both my local machine and my env on the puppy machine.
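For readers following along, the early-stopping idea can be sketched for a single ray in plain Python. This is only an illustration under assumptions, not the code in this PR: the function names, the scalar (non-batched) form, and the exact stopping rule (residual below `tolerance` times the pixel size in normalized coordinates) are hypothetical, and only k1, k2, p1, p2 are modeled.

```python
def distort(x, y, k1, k2, p1, p2):
    """OpenCV-style radial (k1, k2) and tangential (p1, p2) distortion."""
    r2 = x * x + y * y
    d = 1.0 + r2 * (k1 + k2 * r2)
    xd = x * d + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    yd = y * d + 2.0 * p2 * x * y + p1 * (r2 + 2.0 * y * y)
    return xd, yd


def jacobian(x, y, k1, k2, p1, p2):
    """Analytic partial derivatives of distort() with respect to x and y."""
    r2 = x * x + y * y
    d = 1.0 + r2 * (k1 + k2 * r2)
    d_r = k1 + 2.0 * k2 * r2  # derivative of d with respect to r^2
    d_x, d_y = 2.0 * x * d_r, 2.0 * y * d_r
    fx_x = d + d_x * x + 2.0 * p1 * y + 6.0 * p2 * x
    fx_y = d_y * x + 2.0 * p1 * x + 2.0 * p2 * y
    fy_x = d_x * y + 2.0 * p2 * y + 2.0 * p1 * x
    fy_y = d + d_y * y + 2.0 * p2 * x + 6.0 * p1 * y
    return fx_x, fx_y, fy_x, fy_y


def undistort(xd, yd, params, resolution=(1e-3, 1e-3), tolerance=0.5, max_iterations=10):
    """Newton's method that stops once the residual is below a fraction of a pixel.

    Returns (x, y, iterations_used). The stopping rule is an assumption made
    for this sketch.
    """
    x, y = xd, yd  # initial guess: the distorted coordinates themselves
    for iteration in range(max_iterations):
        fx, fy = distort(x, y, *params)
        fx, fy = fx - xd, fy - yd
        # Early stop: the residual is already within sub-pixel accuracy.
        if abs(fx) < tolerance * resolution[0] and abs(fy) < tolerance * resolution[1]:
            return x, y, iteration
        fx_x, fx_y, fy_x, fy_y = jacobian(x, y, *params)
        det = fx_x * fy_y - fx_y * fy_x
        if abs(det) < 1e-12:  # singular Jacobian, give up on this ray
            break
        # Newton update: solve J * delta = -f for the 2x2 system.
        x += (fx_y * fy - fy_y * fx) / det
        y += (fy_x * fx - fx_x * fy) / det
    return x, y, max_iterations


params = (-0.1, 0.01, 0.001, -0.001)  # illustrative k1, k2, p1, p2
xd, yd = distort(0.3, -0.2, *params)
x, y, used = undistort(xd, yd, params)
```

With mild distortion like this, the loop typically exits well before `max_iterations`, which is where the saving would come from in a batched implementation.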
Hi @liruilong940607, would you mind elaborating on why the old version has the correct fisheye undistortion? I'm a little confused as to how the math is correct in this case. The
Hi @anc2001, nerfstudio does not use all of the distortion parameters simultaneously. For perspective cameras, it uses two radial and two tangential distortion parameters, while for fisheye cameras, it uses 4 radial distortion parameters. The old formula produces results identical to the formulas implemented in OpenCV (OpenCV uses different formulas for perspective and fisheye cameras) as long as you do not try to use the third or fourth radial distortion parameters for a perspective camera, or any tangential distortion parameters for a fisheye camera.
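As a reference point for the two OpenCV models mentioned here, the following is a minimal sketch of the forward distortion formulas as given in the OpenCV documentation, restricted to the parameters nerfstudio is said to use (k1, k2, p1, p2 for perspective; k1 to k4 for fisheye). The function names are hypothetical and this is not nerfstudio's code.

```python
import math


def distort_perspective(x, y, k1, k2, p1, p2):
    """OpenCV pinhole model: polynomial in r^2 plus tangential terms."""
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    xd = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    yd = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return xd, yd


def distort_fisheye(x, y, k1, k2, k3, k4):
    """OpenCV fisheye model: polynomial in the incidence angle theta = atan(r)."""
    r = math.hypot(x, y)
    if r < 1e-12:  # on the optical axis the point is unchanged
        return x, y
    theta = math.atan(r)
    t2 = theta * theta
    theta_d = theta * (1.0 + k1 * t2 + k2 * t2**2 + k3 * t2**3 + k4 * t2**4)
    scale = theta_d / r
    return x * scale, y * scale


# With all coefficients zero, the perspective model is the identity, while
# the fisheye model still maps radius r to atan(r).
identity = distort_perspective(0.3, -0.2, 0.0, 0.0, 0.0, 0.0)
equidistant = distort_fisheye(1.0, 0.0, 0.0, 0.0, 0.0, 0.0)
```

The key structural difference is that the perspective polynomial is in the radius r, while the fisheye polynomial is in the angle theta, so the same coefficient slots mean different things in each model.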
Hi @DreekFire, thanks for the reply. At the time of my original comment I was unsure whether the current implementation of fisheye undistortion matched the math described in the OpenCV documentation. I worked it out and understand why it's correct now. Are you referring to the implementation on this branch or the current implementation on the main branch? To my understanding, the current implementation uses 4 radial distortion parameters and 2 tangential ones, where the 2 tangential parameters are just 0 when the camera is fisheye. For perspective undistortion under the OpenCV distortion specification, it will be incorrect if the 4th radial distortion parameter is included, but shouldn't it be correct up to the 3rd radial distortion parameter? Also, shouldn't nerfstudio support the full OpenCV perspective camera model (maybe minus the thin prism coefficients)?
Only runs Newton's method until sub-pixel accuracy is reached instead of all 10 iterations. Then resamples images with bilinear sampling to get pixel color at points with error.
Also uses analytical formulas for pixel areas instead of distorting coordinates of neighboring pixels to get pixel area.
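To illustrate the analytical pixel-area idea, here is a small sketch for a radial-only model (k1, k2), under the assumption that the area of a pixel's footprint in undistorted coordinates is the pixel area divided by the absolute Jacobian determinant of the distortion map. The function names and the radial-only simplification are hypothetical; the PR itself may compute the area differently.

```python
def distort_radial(x, y, k1, k2):
    """Radial-only distortion: scale (x, y) by d = 1 + k1*r^2 + k2*r^4."""
    r2 = x * x + y * y
    d = 1.0 + r2 * (k1 + k2 * r2)
    return x * d, y * d


def jacobian_det_radial(x, y, k1, k2):
    """Closed-form determinant of the Jacobian of distort_radial.

    The Jacobian is d*I + 2*d_r*[x, y]^T [x, y], whose eigenvalues are
    d and d + 2*d_r*r^2, so the determinant is their product.
    """
    r2 = x * x + y * y
    d = 1.0 + r2 * (k1 + k2 * r2)
    d_r = k1 + 2.0 * k2 * r2  # derivative of d with respect to r^2
    return d * (d + 2.0 * d_r * r2)


def undistorted_pixel_area(x, y, k1, k2, pixel_w, pixel_h):
    """Area covered in undistorted coordinates by one distorted pixel (assumed model)."""
    return pixel_w * pixel_h / abs(jacobian_det_radial(x, y, k1, k2))


def finite_difference_det(x, y, k1, k2, h=1e-6):
    """Numerical determinant from neighboring points, for comparison only."""
    x0, y0 = distort_radial(x, y, k1, k2)
    x1, y1 = distort_radial(x + h, y, k1, k2)
    x2, y2 = distort_radial(x, y + h, k1, k2)
    ax, ay = (x1 - x0) / h, (y1 - y0) / h  # change along x
    bx, by = (x2 - x0) / h, (y2 - y0) / h  # change along y
    return ax * by - ay * bx  # cross product handles skewed cells


analytic = jacobian_det_radial(0.3, -0.2, -0.1, 0.01)
numeric = finite_difference_det(0.3, -0.2, -0.1, 0.01)
```

Using the determinant (a cross product of the two edge directions) rather than the product of edge lengths is what lets the area stay correct when the distorted pixel is skewed.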