Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizing camera undistortion #2037

Draft
wants to merge 41 commits into
base: main
Choose a base branch
from

Conversation

DreekFire
Copy link
Contributor

Only runs Newton's method until sub-pixel accuracy is reached instead of all 10 iterations. Then resamples images with bilinear sampling to get pixel color at points with error.

Also uses analytical formulas for pixel areas instead of distorting coordinates of neighboring pixels to get pixel area.

@DreekFire DreekFire marked this pull request as draft June 5, 2023 12:07
@DreekFire
Copy link
Contributor Author

DreekFire commented Jun 5, 2023

Need example of datasets with different camera types to finish testing. Also contains lots of junk from merging branches, will clean up.

Copy link
Contributor

@tancik tancik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Added some initial comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the changes in this file.

@@ -413,40 +417,226 @@ def radial_and_tangential_undistort(
distortion_params: torch.Tensor,
eps: float = 1e-3,
max_iterations: int = 10,
) -> torch.Tensor:
resolution: torch.Tensor = torch.tensor([1e-3, 1e-3]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you know the size, you should type it, ie
resolution: Float[torch.Tensor, "2"]

max_iterations: The maximum number of iterations to perform.
resolution: The resolution (w, h of each pixel, in units of multiples of focal length)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add doc for tolerance

@@ -413,40 +417,226 @@ def radial_and_tangential_undistort(
distortion_params: torch.Tensor,
eps: float = 1e-3,
max_iterations: int = 10,
) -> torch.Tensor:
resolution: torch.Tensor = torch.tensor([1e-3, 1e-3]),
tol: float = 0.5,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to tolerance

Comment on lines 448 to 449
# n_samples = coords.shape[0]
# n_iters = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented out code, here and elsewhere

Copy link
Contributor

@tancik tancik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Added some initial comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of the changes in this file don't seem related to this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah would suggest to revert the things unrelated

Comment on lines 54 to 59
camera_opt_to_camera = self.pose_optimizer(c)
if self.pose_optimizer is not None:
camera_opt_to_camera = self.pose_optimizer(c)
else:
camera_opt_to_camera = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert this change as unrelated to this PR?

@@ -40,7 +41,8 @@ def __init__(self, cameras: Cameras, pose_optimizer: CameraOptimizer) -> None:
self.pose_optimizer = pose_optimizer
self.register_buffer("image_coords", cameras.get_image_coords(), persistent=False)

def forward(self, ray_indices: Int[Tensor, "num_rays 3"]) -> RayBundle:
@profiler.time_function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove profiler here

@@ -21,6 +21,7 @@
from nerfstudio.cameras.camera_optimizers import CameraOptimizer
from nerfstudio.cameras.cameras import Cameras
from nerfstudio.cameras.rays import RayBundle
from nerfstudio.utils import profiler
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And remove here


Returns:
The residuals (fx, fy) and jacobians (fx_x, fx_y, fy_x, fy_y).
"""
assert distortion_params.shape[-1] == 8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fisheye undistortion, which was tied to this function, is actually with correct formula. So instead of creating two separate newton functions for fisheye and perspective cameras, you can keep them in the same function like the old way. (The only minor issue with this function is the k4 for perspective camera but that's all zero for the current datasets anyway).

"""Computes undistorted coords given opencv distortion parameters.
Adapted from MultiNeRF
https://github.com/google-research/multinerf/blob/b02228160d3179300c7d499dca28cb9ca3677f32/internal/camera_utils.py#L477-L509

Args:
coords: The distorted coordinates.
distortion_params: The distortion parameters [k1, k2, k3, k4, p1, p2].
eps: The epsilon for the convergence.
distortion_params: The distortion parameters. Supports 0, 1, 2, 4, 8 parameters, in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert here if the _compute_residual_and_jacobian is reverted.

Comment on lines 438 to 445
assert distortion_params.shape[-1] in [0, 1, 2, 4, 8]

if distortion_params.shape[-1] == 0:
return coords, torch.eye(2, device=coords.device), coords

if distortion_params.shape[-1] < 8:
distortion_params = F.pad(distortion_params, (0, 8 - distortion_params.shape[-1]), "constant", 0.0)
assert distortion_params.shape[-1] == 8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah would suggest to revert the things unrelated

@liruilong940607
Copy link
Contributor

Thanks @DreekFire for the updates. This is not an easy one.

My main question is on the fisheye formula. Have you run any tests to verify the fisheye formula is correct?

The reason I'm asking this is that the old version actually have a correct fisheye undistortion formula by combining

def radial_and_tangential_undistort(

and

theta = torch.sqrt(torch.sum(coord_stack**2, dim=-1))
theta = torch.clip(theta, 0.0, math.pi)
sin_theta = torch.sin(theta)
directions_stack[..., 0][mask] = torch.masked_select(
coord_stack[..., 0] * sin_theta / theta, mask
).float()
directions_stack[..., 1][mask] = torch.masked_select(
coord_stack[..., 1] * sin_theta / theta, mask
).float()
directions_stack[..., 2][mask] = -torch.masked_select(torch.cos(theta), mask).float()

And this PR seems to change the radial_and_tangential_undistort function but not the second part for fisheye. I'm not sure if the formula is correct anymore.

@kerrj
Copy link
Collaborator

kerrj commented Jun 13, 2023

What's the speed/quality difference on this PR?

@DreekFire
Copy link
Contributor Author

Some tests for correctness:

  • Visually inspected results on floating-tree and poster datasets - looks good (todo: equirectangular and stereo, even though these have no undistortion, just to make sure nothing broke)
  • Examined max, mean, and min pixel_areas under new method and old method: approximately matches (expect some difference because new method uses more accurate analytical formula which accounts for parallelogram-shaped pixels)
  • Bilinear sampling: manual inspection on dummy data
  • Measured max difference between rays returned by new and old methods: within specified tolerance for perspective camera, different for fisheye camera BUT new method is more accurate for fisheye camera (from ~1e-3 mean abs error to ~1e-8 (less than floating point precision) mean abs error.

And speed:
Speed fluctuates, measured manually at two different iteration ranges:
Units in thousand rays per second, on puppy.bair
Iteration 500-1800: 133-137 new fisheye, 131-132 new perspective, 126-127 old fisheye, 125-127 old perspective
Iteration 2000-3000: 150 new fisheye, 143-145 new perspective, 135-140 old fisheye, 133-135 old perspective
Speedup not as great as before. During development, I merged a new version of nerfstudio. Before and after the merge, the new method ran at about the same speed, but the older version sped up a bit, so the difference is not as great as expected.

@kerrj
Copy link
Collaborator

kerrj commented Jun 15, 2023

what machine are you testing on? is there any particular sort of hardware this implementation is optimized for? eg would you see stronger benefits on weaker CPUs or GPUs, etc

@tancik
Copy link
Contributor

tancik commented Jun 24, 2023

@kerrj Will this conflict with any of the datamanager stuff that you are working on?

@DreekFire
Copy link
Contributor Author

Ended up removing the resampling because it didn't appear to be worth it - once a ray was already within sub-pixel accuracy, Newton's method usually achieved very good accuracy in just one more iteration. This makes the code simpler as well. Now, the key optimizations are early stopping for undistortion and analytical formulas for pixel areas instead of numerical ones (should also be more accurate because it correctly accounts for skewed pixels).

The speed fluctuates throughout training, but it is generally about 10% faster at any given iteration than the main branch on puppy.

Not sure why the tests are failing - seems to be something going wrong with torch.compile. It passes on both my local machine as well as my env on the puppy machine.

@anc2001
Copy link
Contributor

anc2001 commented Sep 12, 2023

Thanks @DreekFire for the updates. This is not an easy one.

My main question is on the fisheye formula. Have you run any tests to verify the fisheye formula is correct?

The reason I'm asking this is that the old version actually have a correct fisheye undistortion formula by combining

def radial_and_tangential_undistort(

and

theta = torch.sqrt(torch.sum(coord_stack**2, dim=-1))
theta = torch.clip(theta, 0.0, math.pi)
sin_theta = torch.sin(theta)
directions_stack[..., 0][mask] = torch.masked_select(
coord_stack[..., 0] * sin_theta / theta, mask
).float()
directions_stack[..., 1][mask] = torch.masked_select(
coord_stack[..., 1] * sin_theta / theta, mask
).float()
directions_stack[..., 2][mask] = -torch.masked_select(torch.cos(theta), mask).float()

And this PR seems to change the radial_and_tangential_undistort function but not the second part for fisheye. I'm not sure if the formula is correct anymore.

Hi @liruilong940607, would you mind elaborating on why the old version has the correct fisheye undistortion. I'm a little confused as to how the math is correct in this case. The radial_and_tangential_undistort function implements undistortion different from the OpenCV fisheye distortion specification and it's unclear to me how the second part corrects this.

@DreekFire
Copy link
Contributor Author

Hi @anc2001, nerfstudio does not use all of the distortion parameters simultaneously. For perspective cameras, it uses two radial and two tangential distortion parameters, while for fisheye cameras, it uses 4 radial distortion parameters. The old formula produces results identical to the formulas implemented in OpenCV (OpenCV uses different formulas for perspective and fisheye cameras) as long as you do not try to use the third or fourth radial distortion parameters for a perspective camera, or any tangential distortion parameters for a fisheye camera.

@anc2001
Copy link
Contributor

anc2001 commented Sep 25, 2023

Hi @DreekFire thanks for the reply. At the time of my original comment I was unsure that the current implementation of fisheye undistortion matched up with the math described in the OpenCV documentation. I worked it out and understand why it's correct now. Are you referring to the implementation on this branch or the current implementation on the main branch? To my understanding the current implementation uses 4 radial distortion parameters and 2 tangential where the 2 tangential are just 0 when the camera is fisheye. For perspective undistortion under the OpenCV distortion specification it will be incorrect if the 4th radial distortion parameter is included, but shouldn't it be correct up to the 3rd radial distortion parameter? Also shouldn't nerfstudio support the full OpenCV perspective camera model (maybe minus the thin prism coefficients)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants