
Shared memory optimizations for Gaussian rasterization #554

Merged
matthewdcong merged 4 commits into openvdb:main from matthewdcong:smem_features_forward_pass on May 10, 2026

Conversation

@matthewdcong

Contributor
  1. Forward rasterization does not currently stage features in shared memory. As the problem size grows (more intersections per Gaussian), the cost of unconditionally loading features from global memory into shared memory is outweighed by the reuse that shared memory provides.
  2. In addition, we skip loads for Gaussians whose opacity is below the threshold a Gaussian needs to be valid in the volume rendering pass. This optimization applies to both the forward and backward passes.
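The two points above can be sketched in a simplified kernel. This is a minimal illustration, not the code from this PR: the kernel name `blendTile`, the host wrapper `blendTileHost`, the 1D tile of `kBlock` pixels, the scalar `means`, and the `1/255` value of `kAlphaThreshold` are all assumptions made for the example. Each thread stages one Gaussian per batch into shared memory, skips the mean and feature loads when the opacity alone is already below the threshold (the falloff term is at most 1, so such a Gaussian can never pass the per-pixel alpha test), and then every thread in the block reuses the staged values. Error checking and early termination on low transmittance are omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

constexpr int   kBlock          = 64;             // pixels per tile (hypothetical)
constexpr int   kChannels       = 3;              // feature channels (hypothetical)
constexpr float kAlphaThreshold = 1.0f / 255.0f;  // assumed validity threshold

// One thread block blends all Gaussians for a 1D tile of kBlock pixels.
__global__ void blendTile(const float *means, const float *opacities,
                          const float *features, int numGaussians, float *out) {
    __shared__ float sMean[kBlock];
    __shared__ float sOpacity[kBlock];
    __shared__ float sFeature[kBlock * kChannels];

    const float px = threadIdx.x + 0.5f;  // pixel center
    float transmittance = 1.0f;
    float accum[kChannels] = {0.0f};

    for (int base = 0; base < numGaussians; base += kBlock) {
        // Stage one Gaussian per thread. Opacity is always loaded;
        // mean and features are loaded only if the Gaussian can contribute.
        const int g = base + threadIdx.x;
        const float opacity = (g < numGaussians) ? opacities[g] : 0.0f;
        sOpacity[threadIdx.x] = opacity;
        if (opacity >= kAlphaThreshold) {
            sMean[threadIdx.x] = means[g];
            for (int c = 0; c < kChannels; ++c)
                sFeature[threadIdx.x * kChannels + c] = features[g * kChannels + c];
        }
        __syncthreads();  // staged data is now visible to the whole block

        // Every pixel reuses the staged batch from shared memory.
        const int batch = min(kBlock, numGaussians - base);
        for (int j = 0; j < batch; ++j) {
            const float o = sOpacity[j];
            if (o < kAlphaThreshold) continue;  // culled slot holds stale data
            const float d = px - sMean[j];
            const float alpha = fminf(0.999f, o * expf(-0.5f * d * d));
            if (alpha < kAlphaThreshold) continue;
            for (int c = 0; c < kChannels; ++c)
                accum[c] += transmittance * alpha * sFeature[j * kChannels + c];
            transmittance *= 1.0f - alpha;
        }
        __syncthreads();  // finish reading before the next batch overwrites
    }
    for (int c = 0; c < kChannels; ++c)
        out[threadIdx.x * kChannels + c] = accum[c];
}

// Host wrapper: copies inputs to the device, runs one tile, returns the pixels.
std::vector<float> blendTileHost(const std::vector<float> &means,
                                 const std::vector<float> &opacities,
                                 const std::vector<float> &features) {
    assert(means.size() == opacities.size());
    assert(features.size() == opacities.size() * kChannels);
    const int n = static_cast<int>(opacities.size());
    const size_t outBytes = kBlock * kChannels * sizeof(float);
    float *dMeans, *dOpacities, *dFeatures, *dOut;
    cudaMalloc(&dMeans, n * sizeof(float));
    cudaMalloc(&dOpacities, n * sizeof(float));
    cudaMalloc(&dFeatures, n * kChannels * sizeof(float));
    cudaMalloc(&dOut, outBytes);
    cudaMemcpy(dMeans, means.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dOpacities, opacities.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dFeatures, features.data(), n * kChannels * sizeof(float),
               cudaMemcpyHostToDevice);
    blendTile<<<1, kBlock>>>(dMeans, dOpacities, dFeatures, n, dOut);
    std::vector<float> out(kBlock * kChannels);
    cudaMemcpy(out.data(), dOut, outBytes, cudaMemcpyDeviceToHost);
    cudaFree(dMeans); cudaFree(dOpacities); cudaFree(dFeatures); cudaFree(dOut);
    return out;
}
```

Calling `blendTileHost({10.5f}, {0.5f}, {1.0f, 0.0f, 0.0f})` blends a single Gaussian centered on pixel 10. Lowering its opacity below `kAlphaThreshold` leaves the output at zero without its mean or features ever being read from global memory.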

In profiling, this reduces a single-GPU reconstruction from 17m 20s to 16m 48s, a speedup of roughly 3%.

@matthewdcong matthewdcong requested a review from a team as a code owner March 18, 2026 06:08

@harrism harrism left a comment

Contributor

One concern.

Comment thread src/fvdb/detail/ops/gsplat/GaussianRasterizeForward.cu Outdated
@matthewdcong matthewdcong force-pushed the smem_features_forward_pass branch from 9478356 to 67a63fa on May 8, 2026 22:06
Signed-off-by: Matthew Cong <mcong@nvidia.com>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
@matthewdcong matthewdcong force-pushed the smem_features_forward_pass branch from 67a63fa to cf36a1a on May 8, 2026 22:07
Signed-off-by: Matthew Cong <mcong@nvidia.com>
@matthewdcong matthewdcong merged commit d37144f into openvdb:main May 10, 2026
39 checks passed
@swahtz swahtz added this to the v0.5 milestone Jun 16, 2026

3 participants