Compute density compensation for screen space blurring of tiny gaussians #117

Merged
merged 1 commit into from Feb 8, 2024

Collaborator

@jb-ye jb-ye commented Feb 1, 2024

Why

Current Gaussian splatting implementations (both Inria and nerfstudio) do not adequately handle rendering tiny gaussians at a resolution substantially different from the captured one. The cause is the 0.3-pixel screen-space blurring kernel applied to the tiny splats.

A tiny splat that has been enlarged by 0.3 pixel can block more of the gaussians behind it when rendered at a lower resolution than the captured one, or look much thinner than it should when rendered at a higher resolution than the captured one. These artifacts are easy to observe by changing the rendering distance or resolution.
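To make the resolution dependence concrete, here is a small worked sketch. It assumes an isotropic splat whose projected standard deviation is `sigma_px` pixels at the capture resolution, and that the blur adds 0.3 to each diagonal entry of the 2D covariance (the $\Sigma + 0.3 I$ term in the Solution section). The function name and the `scale` parameter are illustrative and not part of gsplat.

```python
BLUR = 0.3  # added to each diagonal entry of the 2D covariance (from the PR description)


def footprint_inflation(sigma_px: float, scale: float) -> float:
    """Ratio of blurred to unblurred footprint area for an isotropic splat.

    sigma_px: projected standard deviation in pixels at the capture resolution.
    scale: render resolution divided by capture resolution (hypothetical parameter).
    """
    # Pixel-space variance at the render resolution.
    var = (sigma_px * scale) ** 2
    # For Sigma = var * I: sqrt(det(Sigma + BLUR * I) / det(Sigma)) = (var + BLUR) / var
    return (var + BLUR) / var


for scale in (0.5, 1.0, 2.0):
    print(f"scale={scale}: footprint area inflated {footprint_inflation(1.0, scale):.3f}x")
```

Under these assumptions, a splat with a one-pixel standard deviation is inflated by about 1.3x at the capture resolution, about 2.2x at half resolution, and only about 1.075x at double resolution. The fixed blur therefore changes the splat's relative size depending on the render resolution, which is consistent with the over-blocking and thinning described above.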

This issue is acknowledged by the original author of Gaussian splatting, and was addressed in another open source repo.

Addressing this issue has a significant benefit in an interactive web viewer, where one zooms in and out to inspect a splat model and sees far fewer artifacts than before. A simple quantitative experiment is to train at 1/2 resolution and evaluate at the original resolution; addressing the issue is expected to improve the metrics.

Solution

The solution is rather simple. We compute a compensation factor $\rho=\sqrt{\dfrac{\det(\Sigma)}{\det(\Sigma+ 0.3 I)}}$ for each splat, where $\Sigma$ is the splat's 2D screen-space covariance, and multiply it into the gaussian's opacity before rasterization. The same solution was also proposed in another research paper.
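A minimal sketch of the formula above for a single splat, written in plain Python rather than in the CUDA kernel. It assumes the 2D covariance is given as the three unique entries `(a, b, c)` of the symmetric matrix `[[a, b], [b, c]]`; the function name and the clamping of degenerate covariances to zero are assumptions made for illustration.

```python
import math


def compensation_factor(cov2d, blur=0.3):
    """Return rho = sqrt(det(Sigma) / det(Sigma + blur * I)) for one splat.

    cov2d: (a, b, c), the unique entries of the symmetric 2x2 matrix [[a, b], [b, c]].
    """
    a, b, c = cov2d
    det_orig = a * c - b * b                    # det(Sigma)
    det_blur = (a + blur) * (c + blur) - b * b  # det(Sigma + blur * I)
    if det_orig <= 0.0 or det_blur <= 0.0:
        # Degenerate covariance: treat the splat as contributing nothing (assumption).
        return 0.0
    return math.sqrt(det_orig / det_blur)


tiny = compensation_factor((0.01, 0.0, 0.01))     # much smaller than the blur
large = compensation_factor((100.0, 0.0, 100.0))  # much larger than the blur
print(f"tiny splat rho={tiny:.4f}, large splat rho={large:.4f}")
```

For a splat much larger than the blur, $\rho$ stays close to 1 and the opacity is nearly unchanged; for a splat much smaller than the blur, $\rho$ drops toward 0, so the enlarged footprint carries correspondingly less opacity.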

The PR modifies the output of project_gaussians to also return the compensation factor $\rho$. This change allows two modes in nerfstudio's splatfacto: the classic mode, which mimics the behavior of the official GS, and the new mode, which addresses the "aliasing-like" issue.
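The two modes can be sketched as follows. This is an illustration only: the function name `effective_opacities` and the `antialiased` flag are hypothetical and are not taken from gsplat or splatfacto, and plain Python lists stand in for tensors.

```python
def effective_opacities(opacities, compensation, antialiased):
    """Return the per-splat opacities to hand to the rasterizer.

    opacities: per-splat opacity values.
    compensation: per-splat rho values returned alongside the projection outputs.
    antialiased: False reproduces the classic behavior (rho is ignored);
                 True scales each opacity by its rho (hypothetical flag name).
    """
    if not antialiased:
        return list(opacities)
    if len(opacities) != len(compensation):
        raise ValueError("opacities and compensation must have the same length")
    return [o * rho for o, rho in zip(opacities, compensation)]


opacities = [0.8, 0.5]
compensation = [0.5, 1.0]
print(effective_opacities(opacities, compensation, antialiased=False))
print(effective_opacities(opacities, compensation, antialiased=True))
```

Keeping the compensation as a separate output, instead of folding it into the opacity inside the projection step, is what lets the caller choose between the two behaviors.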

See results here (I had to upload the video to my own forked repo's issue due to file size limit).

Collaborator

@liruilong940607 liruilong940607 left a comment

I'm fine with merging this PR, but I want to let @vye16 decide whether we need backward compatibility.


- **xys** (Tensor): x,y locations of 2D gaussian projections.
- **depths** (Tensor): z depth of gaussians.
- **radii** (Tensor): radii of 2D gaussian projections.
- **conics** (Tensor): conic parameters for 2D gaussian.
- **compensation** (Tensor): the density compensation for blurring 2D kernel
Collaborator

This extra return would break backward compatibility. Personally I'm fine with it since we are in the actively developed 0.1.x versions, but I'll let @vye16 decide from which point on we want to maintain backward compatibility.
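The compatibility concern can be illustrated with stand-in functions. Nothing below calls gsplat: `project_old` and `project_new` are hypothetical stubs that only mimic the number of return values, with the six-value tuple taken from the profiling script later in this thread and the position of `compensation` assumed to follow the docstring order quoted above.

```python
def project_old():
    """Stub with the six-value return used by existing callers."""
    return ("xys", "depths", "radii", "conics", "num_tiles_hit", "cov3d")


def project_new():
    """Stub with the extra compensation output (its position is an assumption)."""
    return ("xys", "depths", "radii", "conics", "compensation", "num_tiles_hit", "cov3d")


def old_style_caller(project_fn):
    """Caller written against the six-value return."""
    xys, depths, radii, conics, num_tiles_hit, cov3d = project_fn()
    return conics


print(old_style_caller(project_old))  # works with the old return shape

try:
    old_style_caller(project_new)
    broke = False
except ValueError as err:
    # Tuple unpacking fails because there is one more value than names.
    broke = True
    print(f"existing caller breaks: {err}")
```

Any caller that unpacks the result positionally has to be updated when a value is added, which is why the reviewers treat this as a breaking change.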

Collaborator Author

Hi @vye16, could you take a look at this PR and see whether you have any comments beyond those from @liruilong940607?

@@ -88,6 +92,7 @@ __global__ void project_gaussians_forward_kernel(
depths[idx] = p_view.z;
radii[idx] = (int)radius;
xys[idx] = center;
compensation[idx] = comp;
Collaborator

Reading from and writing to global memory is usually the most time-consuming part of a kernel (computation is usually not the bottleneck). I tested this a bit, and it slows project_gaussians down from 3000 it/s to 2800 it/s, which is not much, so I think it is fine, especially since project_gaussians is much cheaper than the rasterization stage. I'm fine with this small extra cost but wanted to point it out for future reference.
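As a quick worked calculation, the two throughput figures quoted above can be converted into a relative slowdown and an added time per call. The helper name is illustrative, and the it/s numbers are assumed to measure whole loop iterations, so they include any Python-side overhead.

```python
def slowdown(before_its: float, after_its: float) -> dict:
    """Convert two iterations-per-second figures into slowdown metrics."""
    t_before = 1.0 / before_its  # seconds per iteration before the change
    t_after = 1.0 / after_its    # seconds per iteration after the change
    return {
        "throughput_drop": (before_its - after_its) / before_its,
        "time_increase": t_after / t_before - 1.0,
        "extra_microseconds_per_iter": (t_after - t_before) * 1e6,
    }


stats = slowdown(3000.0, 2800.0)
print(f"throughput drop: {stats['throughput_drop']:.1%}")
print(f"time per iteration increase: {stats['time_increase']:.1%}")
print(f"extra time per iteration: {stats['extra_microseconds_per_iter']:.1f} us")
```

With these figures the throughput drops by about 6.7%, which corresponds to roughly 7.1% more time per call, or about 24 microseconds.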

Collaborator

The code I used to test this.

import torch
import tqdm

from gsplat import project_gaussians


def profiling(N: int = 1000000):
    """Time repeated project_gaussians calls on N random gaussians."""
    torch.manual_seed(42)
    device = torch.device("cuda:0")

    # Random gaussian parameters: centers, scales, and unit quaternions.
    means3d = torch.rand((N, 3), device=device, requires_grad=False)
    scales = torch.rand((N, 3), device=device) * 5
    quats = torch.randn((N, 4), device=device)
    quats /= torch.linalg.norm(quats, dim=-1, keepdim=True)

    # Identity camera and a small image split into 16x16 tiles.
    viewmat = projmat = torch.eye(4, device=device)
    fx = fy = 3.0
    H, W = 256, 256
    BLOCK_X = BLOCK_Y = 16
    tile_bounds = (W + BLOCK_X - 1) // BLOCK_X, (H + BLOCK_Y - 1) // BLOCK_Y, 1

    # tqdm reports the throughput in it/s.
    pbar = tqdm.trange(10000)
    for _ in pbar:
        # Unpacks six outputs; a version that also returns compensation
        # needs a seventh name here.
        xys, depths, radii, conics, num_tiles_hit, cov3d = project_gaussians(
            means3d,
            scales,
            1,
            quats,
            viewmat,
            projmat,
            fx,
            fy,
            W / 2,
            H / 2,
            H,
            W,
            tile_bounds,
        )


if __name__ == "__main__":
    profiling()

Collaborator Author

Thanks for the evaluation effort.

@liruilong940607
Collaborator

Needs to run the formatter black . gsplat/ tests/ examples/ for passing the core tests.

@hardikdava
Contributor

@jb-ye Really great improvement. Amazing work @jb-ye Can't wait to try it out.

@jb-ye
Collaborator Author

jb-ye commented Feb 5, 2024

> Needs to run the formatter black . gsplat/ tests/ examples/ for passing the core tests.

I had used a newer version of black. I have now switched to the same version of black used by devops and re-run the formatter.

Collaborator

@vye16 vye16 left a comment

Looks good to me, aside from this backward compatibility issue. How much faster is it to do this in the kernel than with the 2d covariance returned in the original projection function? If it's much faster, I'm happy to break backward compatibility.
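For reference, the alternative raised in this question could look roughly like the sketch below, which recovers the compensation outside the kernel. It rests on an assumption not confirmed in this thread: that the returned conic is the inverse of the blurred 2D covariance $\Sigma + 0.3 I$, stored as the three unique entries of a symmetric 2x2 matrix. The function name is hypothetical.

```python
import math


def compensation_from_conic(conic, blur=0.3):
    """Recover rho from a conic, assuming conic = inverse(Sigma + blur * I).

    conic: (a, b, c), the unique entries of the symmetric 2x2 matrix [[a, b], [b, c]].
    """
    a, b, c = conic
    det_conic = a * c - b * b
    if det_conic <= 0.0:
        return 0.0
    # Invert the conic to get the blurred covariance Sigma + blur * I.
    sa, sb, sc = c / det_conic, -b / det_conic, a / det_conic
    det_blur = sa * sc - sb * sb
    # Subtract the blur from the diagonal to get the original covariance Sigma.
    det_orig = (sa - blur) * (sc - blur) - sb * sb
    if det_orig <= 0.0:
        return 0.0
    return math.sqrt(det_orig / det_blur)


# Sigma = diag(0.7, 0.7) gives a blurred covariance of diag(1, 1), whose inverse is (1, 0, 1).
print(f"{compensation_from_conic((1.0, 0.0, 1.0)):.3f}")
```

This route needs an extra pass over the projection outputs, whereas the kernel already has the determinants available when it inverts the covariance; the thread does not report a timing comparison between the two.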
