perf: reduce register pressure in compute GSplat TileCount projection #8563

Merged
mvaligursky merged 1 commit into main from mv-tilecount-reg-pressure on Mar 31, 2026
mvaligursky (Contributor) commented Mar 31, 2026

Reduces GPU register pressure in the GSplatLocalTileCount compute shader's covariance projection by fusing the matrix chain and eliminating redundant work.

Changes:

  • Fuse the 2D covariance projection chain in computeSplatCov to avoid materializing the intermediate Vrk, J, and W matrices (each a mat3x3f). Instead, compute B = M * TT directly and derive a, b, c from dot(B_col, B_col), reducing peak matrix registers from ~27 to ~15.
  • Compute TT columns inline from the 3 Jacobian scalars (J1, J2.x, J2.y) and view matrix rows, instead of constructing full J and W matrices.
  • Return viewDepth from computeSplatCov (already computed internally as viewCenter.z) instead of redundantly recomputing viewMatrix * center in the TileCount caller.
  • Remove unused radius and radiusFactor fields from SplatCov2D struct (only used internally for culling, never read by callers).
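
The fusion described in the first two bullets can be checked numerically. The following is a minimal Python sketch, not the shader code from this PR. It assumes the usual Gaussian splat conventions: the 3D covariance is Vrk = Mᵀ·M, the Jacobian has the sparse form with columns (J1, 0, J2.x) and (0, J1, J2.y), W is the transpose of the upper-left 3x3 of the view matrix, and the 2D covariance is Tᵀ·Vrk·T with T = W·J. The function names and the row-major nested-list layout are hypothetical and chosen only for illustration.

```python
def matmul(a, b):
    """Multiply two row-major nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cov2d_unfused(M, V, j1, j2x, j2y):
    """Reference path: materialize Vrk, J and W, then chain the products."""
    vrk = matmul(transpose(M), M)               # 3x3 covariance (assumed M^T * M)
    J = [[j1, 0.0], [0.0, j1], [j2x, j2y]]      # 3x2: only two columns are non-zero
    W = transpose(V)                            # V is the upper-left 3x3 of the view matrix
    T = matmul(W, J)                            # 3x2
    cov = matmul(matmul(transpose(T), vrk), T)  # 2x2
    return cov[0][0], cov[0][1], cov[1][1]


def cov2d_fused(M, V, j1, j2x, j2y):
    """Fused path: build the two T columns from view rows, then B = M * T."""
    # Each T column is a linear combination of two view-matrix rows.
    t0 = [j1 * V[0][i] + j2x * V[2][i] for i in range(3)]
    t1 = [j1 * V[1][i] + j2y * V[2][i] for i in range(3)]
    # B has two columns of three components; Vrk, J and W are never formed.
    b0 = [dot(row, t0) for row in M]
    b1 = [dot(row, t1) for row in M]
    # T^T * M^T * M * T == B^T * B, so a, b, c are dot products of B columns.
    return dot(b0, b0), dot(b0, b1), dot(b1, b1)


if __name__ == "__main__":
    M = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [2.0, 0.0, 1.0]]
    V = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    print(cov2d_unfused(M, V, 2.0, 0.5, -0.25))
    print(cov2d_fused(M, V, 2.0, 0.5, -0.25))
```

Under these assumptions both paths produce the same a, b, c, while the fused path keeps only M (9 values) and the two B columns (6 values) live at once.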

Performance:

  • Safari GPU profiling on Apple Silicon shows ~15% reduction in per-splat ALU cost for the TileCount pass, ~38% reduction in device memory reads per invocation, and a 5-point drop in the Compute Shader Launch Limiter.

Fuse the 2D covariance matrix chain to avoid materializing Vrk, J, and W
intermediate mat3x3f matrices, reducing peak register usage. Return viewDepth
from computeSplatCov to eliminate a redundant mat4*vec4 multiply. Remove unused
radius/radiusFactor fields from SplatCov2D struct.

Made-with: Cursor
mvaligursky self-assigned this Mar 31, 2026
mvaligursky added the performance and area: graphics labels Mar 31, 2026
mvaligursky merged commit e52e5de into main Mar 31, 2026
8 checks passed
mvaligursky deleted the mv-tilecount-reg-pressure branch March 31, 2026 11:35