Silhouette Rendering #26

Closed
Aeson-Hsu opened this issue Dec 12, 2023 · 4 comments
Labels
question Further information is requested

Comments

@Aeson-Hsu

Hi, great work! However, one question keeps bothering me: how is the silhouette obtained?
I have read the paper and checked the code, but I still can't understand the principle behind it.

In `get_depth_and_silhouette(pts_3D, w2c)`, I see that the data stored in `depth_silhouette` is `[depth_z, 1, depth_z_sq]`:

```python
import torch
import torch.nn.functional as F


def transformed_params2depthplussilhouette(params, w2c, transformed_pts):
    rendervar = {
        'means3D': transformed_pts,
        'colors_precomp': get_depth_and_silhouette(transformed_pts, w2c),
        'rotations': F.normalize(params['unnorm_rotations']),
        'opacities': torch.sigmoid(params['logit_opacities']),
        'scales': torch.exp(torch.tile(params['log_scales'], (1, 3))),
        'means2D': torch.zeros_like(params['means3D'], requires_grad=True, device="cuda") + 0
    }
    return rendervar


def get_depth_and_silhouette(pts_3D, w2c):
    """
    Function to compute depth and silhouette for each gaussian.
    These are evaluated at gaussian center.
    """
    # Depth of each gaussian center in camera frame
    pts4 = torch.cat((pts_3D, torch.ones_like(pts_3D[:, :1])), dim=-1)
    pts_in_cam = (w2c @ pts4.transpose(0, 1)).transpose(0, 1)
    depth_z = pts_in_cam[:, 2].unsqueeze(-1) # [num_gaussians, 1]
    depth_z_sq = torch.square(depth_z) # [num_gaussians, 1]

    # Depth and Silhouette
    depth_silhouette = torch.zeros((pts_3D.shape[0], 3)).cuda().float()
    depth_silhouette[:, 0] = depth_z.squeeze(-1)
    depth_silhouette[:, 1] = 1.0
    depth_silhouette[:, 2] = depth_z_sq.squeeze(-1)

    return depth_silhouette
```

```python
# Initialize Render Variables
depth_sil_rendervar = transformed_params2depthplussilhouette(params, curr_data['w2c'], transformed_pts)
```

Then, the first output of the Gaussian rasterizer is stored in `depth_sil`:

```python
# Depth & Silhouette Rendering
depth_sil, _, _, = Renderer(raster_settings=curr_data['cam'])(**depth_sil_rendervar)
```

Finally, the silhouette is read directly from the second channel (index 1) of `depth_sil`:

```python
silhouette = depth_sil[1, :, :]
```
@Nik-V9
Contributor

Nik-V9 commented Dec 14, 2023

Hi, Thanks for your interest in our work!

The Silhouette is basically the cumulative opacity along a ray. This is naturally obtained by the alpha-compositing in the rasterization of 3D Gaussians.

Instead of implementing the gradient for the alpha-compositing in CUDA, we render the Silhouette as one channel of a dummy color, where the other two channels are depth and depth squared. We essentially assign a color of 1 to each Gaussian. Let's say there are 3 Gaussians along a ray; the cumulative opacity will be given by (Opacity_Gaussian_1 * 1 + Opacity_Gaussian_2 * 1 + Opacity_Gaussian_3 * 1), where each opacity term is that Gaussian's alpha-compositing weight at the pixel, i.e. its opacity attenuated by the Gaussians in front of it. On the other hand, in the original 3D Gaussian Splatting color rendering, the following would happen with the same weights: (Opacity_Gaussian_1 * (r, g, b)_1 + Opacity_Gaussian_2 * (r, g, b)_2 + Opacity_Gaussian_3 * (r, g, b)_3).

In this fashion, for each pixel, you will directly get a cumulative opacity along the ray, which is the Silhouette. For empty space, the value will be 0 because no Gaussians exist along the ray.
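
To make the dummy-color idea concrete, here is a minimal pure-Python sketch of front-to-back alpha compositing along a single ray. It is not the CUDA rasterizer used by the repository; `composite_ray` and the toy `(alpha, depth)` values are hypothetical, and each `alpha` is assumed to be the Gaussian's opacity already evaluated at the pixel.

```python
def composite_ray(gaussians):
    """Alpha-composite the dummy color [depth, 1, depth^2] along one ray.

    `gaussians` is a list of (alpha, depth) pairs sorted near to far
    (hypothetical toy input). Returns [depth, silhouette, depth_squared].
    """
    out = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of the ray not yet blocked
    for alpha, depth in gaussians:
        # Same channel layout as depth_silhouette: [z, 1, z^2]
        color = (depth, 1.0, depth * depth)
        weight = alpha * transmittance
        for c in range(3):
            out[c] += weight * color[c]
        transmittance *= 1.0 - alpha
    return out


# Three Gaussians along a ray, each with opacity 0.5 at this pixel.
depth, silhouette, depth_sq = composite_ray([(0.5, 1.0), (0.5, 2.0), (0.5, 3.0)])

# A ray that hits no Gaussians composites to zero in every channel.
empty = composite_ray([])
```

Because the middle channel has a constant color of 1, its composited value is just the sum of the weights, which equals one minus the remaining transmittance (here `1 - 0.5**3`).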

@Aeson-Hsu
Author

Aeson-Hsu commented Dec 15, 2023

Thank you for your explanation. I now understand the principle by thinking of it in terms of α-compositing.

@lvmingzhe

Hi, thanks for your question and the author's explanation.
I would like to work out the detailed formula for the Silhouette, given as equation (5) in the SplaTAM paper.
First, equation (1) in the paper defines the influence of each Gaussian at a point $\mathbf{x} \in \mathbb{R}^{3}$ in 3D space:

$$f(\mathbf{x})=o \exp \left(-\frac{|\mathbf{x}-\boldsymbol{\mu}|^{2}}{2 r^{2}}\right)$$

The Silhouette for each pixel $\mathbf{p}=(u, v)$ is then defined by

$$S(\mathbf{p})=\sum_{i=1}^{n} f_{i}(\mathbf{p}) \prod_{j=1}^{i-1}\left(1-f_{j}(\mathbf{p})\right)$$

where $f_{i}(\mathbf{p})$ is computed as in equation (1) but with the $\boldsymbol{\mu}$ and $r$ of the splatted 2D Gaussians in pixel space:

$$\boldsymbol{\mu}^{2 \mathrm{D}}=K \frac{E_{t} \boldsymbol{\mu}}{d}$$

$$r^{2 \mathrm{D}}=\frac{f r}{d}, \quad \text { where } \quad d=\left(E_{t} \boldsymbol{\mu}\right)_{z}$$

Everything above is taken from the SplaTAM paper. However, the paper does not spell out the exact formula for $f(\mathbf{p})$. My first guess is that it looks like this:

$$f(\mathbf{p})=o \exp \left(-\frac{\left|\mathbf{p}-\boldsymbol{\mu}^{2 \mathrm{D}}\right|^{2}}{2\left(r^{2 \mathrm{D}}\right)^{2}}\right)$$

$$f(u, v)=o \exp \left(-\frac{\left(u-\mu_{u}^{2 \mathrm{D}}\right)^{2}+\left(v-\mu_{v}^{2 \mathrm{D}}\right)^{2}}{2\left(\frac{f r}{d}\right)^{2}}\right)$$

However, it seems odd to simply swap $\mathbf{x} \in \mathbb{R}^{3}$ for $\mathbf{p}=(u, v)$ in equation (1).
$S(\mathbf{p})$ is defined for a rendered pixel on the 2D image plane, so it may depend on many 3D Gaussians.
In contrast, $\mathbf{x} \in \mathbb{R}^{3}$ refers to a single Gaussian in 3D space. So I wonder whether a more accurate description would be:

$$ g_{i}(\mathbf{p}) = w(\mathbf{x}_i, p) f(\mathbf{x}_i) $$

where the function $w(\mathbf{x}_i, p)$ denotes the contribution coefficient of the Gaussian at position $\mathbf{x}_i$ to the rendering of pixel $p$, so the Silhouette would be rewritten as

$$S(\mathbf{p})=\sum_{i=1}^{n} g_{i}(\mathbf{p}) \prod_{j=1}^{i-1}\left(1-g_{j}(\mathbf{p})\right)$$

These are my preliminary thoughts on the matter, and I would be grateful for any critiques or corrections. Thanks very much!

Nik-V9 changed the title from "How to get the silhouette? What's the principle behind it?" to "Silhouette Rendering" on Dec 26, 2023
@Nik-V9
Contributor

Nik-V9 commented Dec 26, 2023

Hi @lvmingzhe, Thanks for sharing your views!

Yes, you are correct. The detailed equation would be similar to what you outlined. One small addition is that the contribution coefficient of the Gaussian is the opacity (alpha) of the Gaussian. Hence, the Silhouette is the output of the alpha-compositing process in the rasterization of 3D Gaussians.
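
As a rough illustration of the equations discussed in this thread, the following pure-Python sketch evaluates the Silhouette at a single pixel: it splats each isotropic Gaussian with $\boldsymbol{\mu}^{2\mathrm{D}} = K E_t \boldsymbol{\mu} / d$ and $r^{2\mathrm{D}} = f r / d$, evaluates $f_i(\mathbf{p})$, and alpha-composites front to back. The helper names `project_gaussian` and `silhouette_at`, the pinhole intrinsics, and the toy scene are all assumptions made for this example; the repository's CUDA rasterizer is not reproduced here.

```python
import math


def project_gaussian(mu, radius, w2c, focal, cx, cy):
    """Splat one isotropic 3D Gaussian into pixel space.

    Uses mu2D = K (E_t mu) / d and r2D = f * r / d with d = (E_t mu)_z.
    `w2c` is a 4x4 world-to-camera matrix (E_t) as nested lists; `focal`,
    `cx`, `cy` stand in for K (hypothetical pinhole intrinsics).
    Returns None when the center is not in front of the camera.
    """
    x, y, z = mu
    cam = [w2c[r][0] * x + w2c[r][1] * y + w2c[r][2] * z + w2c[r][3]
           for r in range(3)]
    d = cam[2]
    if d <= 0.0:
        return None
    center = (focal * cam[0] / d + cx, focal * cam[1] / d + cy)
    return center, focal * radius / d, d


def silhouette_at(pixel, gaussians, w2c, focal, cx, cy):
    """Evaluate S(p) = sum_i f_i(p) * prod_{j<i} (1 - f_j(p)) at one pixel.

    `gaussians` is a list of (mu, radius, opacity) tuples in world space.
    """
    splats = []
    for mu, radius, opacity in gaussians:
        projected = project_gaussian(mu, radius, w2c, focal, cx, cy)
        if projected is not None:
            center, r2d, d = projected
            splats.append((d, center, r2d, opacity))
    splats.sort(key=lambda s: s[0])  # front to back by camera depth

    u, v = pixel
    silhouette, transmittance = 0.0, 1.0
    for _, (cu, cv), r2d, opacity in splats:
        dist_sq = (u - cu) ** 2 + (v - cv) ** 2
        f_p = opacity * math.exp(-dist_sq / (2.0 * r2d ** 2))
        silhouette += f_p * transmittance
        transmittance *= 1.0 - f_p
    return silhouette


# Toy scene: identity camera pose, two Gaussians on the optical axis.
identity = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
toy = [((0.0, 0.0, 2.0), 0.1, 0.8), ((0.0, 0.0, 4.0), 0.2, 0.5)]

s_center = silhouette_at((50.0, 50.0), toy, identity, 100.0, 50.0, 50.0)
s_corner = silhouette_at((0.0, 0.0), toy, identity, 100.0, 50.0, 50.0)
```

At the principal point both Gaussians are evaluated at their centers, so the Silhouette is `0.8 + 0.5 * (1 - 0.8)`, which is about 0.9; at the image corner both contributions have decayed to essentially zero.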

3 participants