Merge frustum culling into compute, convert to storage buffers, fix >16M splat dispatch #8561
mvaligursky merged 2 commits into main
…16M splat dispatch

With the removal of the non-compute (global) renderer, culling no longer needs to run as a render pass and can be fully compute-based. Merge the frustum test into the interval cull compute shader, convert culling data to storage buffers, fix dispatch overflow for >16M splats, and extract a reusable calcDispatch2D WGSL chunk.

Made-with: Cursor
Pull request overview
This PR moves GSplat frustum culling fully into the compute-driven interval compaction path, replaces several GPU texture-based data feeds with storage buffers, and fixes indirect compute dispatch overflow when visible splat counts exceed the per-dimension workgroup limit.
Changes:
- Merge frustum culling into compute-gsplat-interval-cull (compute-only path) and remove the dedicated node-cull render pass + its shaders.
- Convert bounds/transform data from textures to storage buffers and pass frustum planes via uniforms.
- Fix >16M visible splat indirect dispatch by writing 2D dispatch dimensions and extracting calcDispatch2D into a shared WGSL chunk.
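To make the 2D dispatch fix concrete, here is a minimal CPU-side sketch in JavaScript of how a linear workgroup count can be split into X and Y dimensions. It is not the engine's WGSL chunk: the function bodies, the `linearWorkgroupIndex` helper, and the 256-thread workgroup size are assumptions for illustration (the workgroup size is chosen because 65535 × 256 matches the ~16.7M figure quoted in this PR).

```javascript
// Illustrative only: a CPU-side mirror of a 2D dispatch split.
// Names and the workgroup size are assumptions, not the engine's actual code.
const MAX_WORKGROUPS_PER_DIMENSION = 65535; // WebGPU default for maxComputeWorkgroupsPerDimension
const ASSUMED_WORKGROUP_SIZE = 256;         // assumption: threads per workgroup

// Largest item count a 1D dispatch can cover under these assumptions.
const maxItems1D = MAX_WORKGROUPS_PER_DIMENSION * ASSUMED_WORKGROUP_SIZE; // 16776960

// Split a linear workgroup count into (x, y) so that neither exceeds the limit
// and x * y >= workgroupCount.
function calcDispatch2D(workgroupCount, maxPerDim = MAX_WORKGROUPS_PER_DIMENSION) {
    if (workgroupCount <= maxPerDim) {
        return { x: workgroupCount, y: 1 };
    }
    const y = Math.ceil(workgroupCount / maxPerDim);
    if (y > maxPerDim) {
        throw new RangeError('workgroup count does not fit in a 2D dispatch');
    }
    const x = Math.ceil(workgroupCount / y);
    return { x, y };
}

// The shader side has to flatten the 2D workgroup id back to a linear index
// and skip any surplus workgroups where the index >= workgroupCount.
function linearWorkgroupIndex(idX, idY, dispatchX) {
    return idY * dispatchX + idX;
}
```

Because `x * y` can exceed the requested count, each workgroup needs the bounds check noted in the last comment; without it the surplus workgroups would process out-of-range items.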
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/scene/shader-lib/wgsl/chunks/gsplat/frag/gsplatNodeCulling.js | Removed obsolete WGSL fragment shader for node visibility render-pass culling. |
| src/scene/shader-lib/glsl/chunks/gsplat/frag/gsplatNodeCulling.js | Removed obsolete GLSL fragment shader for node visibility render-pass culling. |
| src/scene/shader-lib/wgsl/chunks/gsplat/compute-gsplat-write-indirect-args.js | Writes 2D indirect dispatch dimensions using shared calcDispatch2D to avoid workgroup clamping. |
| src/scene/shader-lib/wgsl/chunks/gsplat/compute-gsplat-local-copy.js | Reuses shared calcDispatch2D for chunk-sort indirect args. |
| src/scene/shader-lib/wgsl/chunks/gsplat/compute-gsplat-local-classify.js | Reuses shared calcDispatch2D for local renderer indirect dispatch args. |
| src/scene/shader-lib/wgsl/chunks/gsplat/compute-gsplat-interval-cull.js | Inlines sphere-vs-frustum test per interval and reads bounds/transforms from storage buffers. |
| src/scene/shader-lib/wgsl/chunks/common/comp/dispatch-core.js | New shared WGSL helper to compute safe 2D dispatch sizes. |
| src/scene/gsplat-unified/gsplat-node-cull-render-pass.js | Removed render-pass class that generated the node visibility texture. |
| src/scene/gsplat-unified/gsplat-manager.js | Switches culling setup/compaction API from visibility texture to frustum culler + plane computation. |
| src/scene/gsplat-unified/gsplat-interval-compaction.js | Updates bind groups/uniforms for buffer-based culling and 2D indirect dispatch sizing. |
| src/scene/gsplat-unified/gsplat-frustum-culler.js | Replaces culling textures/render pass with storage buffers and CPU frustum-plane computation. |
| examples/src/examples/gaussian-splatting/lod-streaming.controls.mjs | Expands splat budget UI max from 20 to 40. |
| // Bounds and transforms textures are needed for frustum culling. | ||
| // These index splats sequentially, so always use the full splats array. |
The comment still refers to “Bounds and transforms textures”, but culling data is now stored in storage buffers (boundsBuffer/transformsBuffer). Update the comment to avoid misleading future changes/debugging.
| // Bounds and transforms textures are needed for frustum culling. | |
| // These index splats sequentially, so always use the full splats array. | |
| // Bounds and transforms culling data (stored in bounds/transforms storage buffers) | |
| // index splats sequentially, so always update them using the full boundsGroups set. |
| /** | ||
| * Runs GPU frustum culling: updates the transforms texture and renders the | ||
| * node visibility pass, producing the bit-packed nodeVisibilityTexture. | ||
| * Prepares frustum culling data: updates the transforms texture and computes |
This JSDoc says the frustum culler “updates the transforms texture”, but the implementation now updates storage buffers via updateTransformsData. Please update wording to match the new buffer-based pipeline.
| * Prepares frustum culling data: updates the transforms texture and computes | |
| * Prepares frustum culling data: updates the GPU transform buffers and computes |
| * Runs the full interval compaction pipeline: cull+count, prefix sum, scatter. | ||
| * | ||
| * @param {Texture|null} nodeVisibilityTexture - Bit-packed visibility texture (when culling). | ||
| * @param {GSplatFrustumCuller|null} frustumCuller - Frustum culler with textures and planes (when culling). |
The dispatchCompact parameter doc says “textures and planes”, but GSplatFrustumCuller now exposes storage buffers (boundsBuffer/transformsBuffer) plus frustumPlanes. Update the description to avoid confusion about required resources.
| * @param {GSplatFrustumCuller|null} frustumCuller - Frustum culler with textures and planes (when culling). | |
| * @param {GSplatFrustumCuller|null} frustumCuller - Frustum culler providing bounds/transforms storage buffers and frustum planes (when culling). |
| if (cullingEnabled) { | ||
| cullCompute.setParameter('nodeVisibilityTexture', nodeVisibilityTexture); | ||
| cullCompute.setParameter('boundsBuffer', frustumCuller.boundsBuffer); | ||
| cullCompute.setParameter('transformsBuffer', frustumCuller.transformsBuffer); | ||
| cullCompute.setParameter('frustumPlanes[0]', frustumCuller.frustumPlanes); |
When cullingEnabled is true, this code dereferences frustumCuller without validating it. Since the method signature allows null, add a Debug.assert / early return (or tighten the signature) to prevent a runtime crash if a caller passes (null, true).
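A guard along the lines suggested here could look like the following sketch. It is standalone JavaScript rather than engine code: `assertDefined` is a hypothetical stand-in for the engine's debug assertion, and `bindCullingInputs` and its `setParameter` callback are invented for illustration; only the three parameter names come from the diff above.

```javascript
// Hypothetical stand-in for a debug assertion helper (not the engine's Debug.assert).
function assertDefined(value, message) {
    if (value === null || value === undefined) {
        throw new Error(message);
    }
}

// Hypothetical helper: binds culling inputs only when culling is enabled and
// fails loudly if the caller passes (null, true).
function bindCullingInputs(setParameter, frustumCuller, cullingEnabled) {
    if (!cullingEnabled) {
        return false; // nothing to bind, frustumCuller may legitimately be null
    }
    assertDefined(frustumCuller, 'frustumCuller is required when cullingEnabled is true');
    setParameter('boundsBuffer', frustumCuller.boundsBuffer);
    setParameter('transformsBuffer', frustumCuller.transformsBuffer);
    setParameter('frustumPlanes[0]', frustumCuller.frustumPlanes);
    return true;
}
```

Failing at the call boundary gives a clearer error than a property access on null deep inside the bind-group setup.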
| if (numMatrices > this._allocatedTransformCount) { | ||
| this.transformsBuffer?.destroy(); | ||
| this._allocatedTransformCount = numMatrices; | ||
| // 3 vec4f per matrix = 12 floats = 48 bytes | ||
| this.transformsBuffer = new StorageBuffer(this.device, numMatrices * 12 * 4, BUFFERUSAGE_COPY_DST); | ||
| } | ||
| const data = this.transformsTexture.lock(); | ||
| const data = new Float32Array(numMatrices * 12); | ||
updateTransformsData allocates a new Float32Array every call. Since this runs whenever frustum planes are updated (and can be frequent), consider caching/reusing a typed array sized to _allocatedTransformCount * 12 to reduce GC pressure and improve frame stability.
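One way to apply this suggestion is a grow-only scratch array, sketched below in standalone JavaScript. The `TransformScratch` class is hypothetical and not part of the engine; the 12 floats per matrix figure comes from the diff comment above (3 vec4f per matrix).

```javascript
// 3 vec4f per matrix = 12 floats, as stated in the diff above.
const FLOATS_PER_MATRIX = 12;

// Hypothetical grow-only scratch array: reallocates only when more matrices
// are requested than have ever been requested before.
class TransformScratch {
    constructor() {
        this._capacity = 0;               // capacity in matrices
        this._data = new Float32Array(0); // reused across calls
    }

    // Returns an array with room for at least numMatrices matrices.
    // The caller should upload only the first numMatrices * 12 floats.
    get(numMatrices) {
        if (numMatrices > this._capacity) {
            this._capacity = numMatrices;
            this._data = new Float32Array(numMatrices * FLOATS_PER_MATRIX);
        }
        return this._data;
    }
}
```

The array never shrinks, so steady-state frames allocate nothing; the trade-off is that memory stays at the high-water mark until the owner is destroyed.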
…ped arrays

- Update comments/JSDoc referencing "textures" to "storage buffers"
- Add Debug.assert for frustumCuller when cullingEnabled is true
- Cache Float32Array/Uint32Array allocations in updateBoundsData and updateTransformsData to reduce per-frame GC pressure

Made-with: Cursor
Merge frustum culling from a separate render pass into the interval compaction compute
shader, convert culling data from textures to storage buffers, and fix a dispatch overflow
for scenes with more than 16M visible splats.
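For readers unfamiliar with the test being merged into the compute shader, the following is a minimal JavaScript sketch of a sphere-vs-frustum check. It is not the shader from this PR: the plane layout (six planes as `nx, ny, nz, d` with inward-facing unit normals) and the function name are assumptions.

```javascript
// Assumed layout: 6 planes * 4 floats (nx, ny, nz, d), unit normals pointing
// into the frustum, so a point p is inside a plane when dot(n, p) + d >= 0.
function sphereInFrustum(planes, cx, cy, cz, radius) {
    for (let i = 0; i < 6; i++) {
        const o = i * 4;
        const dist = planes[o] * cx + planes[o + 1] * cy + planes[o + 2] * cz + planes[o + 3];
        if (dist < -radius) {
            return false; // sphere lies entirely behind this plane
        }
    }
    return true; // inside or intersecting (conservative)
}

// Example volume: an axis-aligned box from -1 to 1 expressed as six planes.
const boxPlanes = new Float32Array([
    1, 0, 0, 1,   -1, 0, 0, 1,
    0, 1, 0, 1,    0, -1, 0, 1,
    0, 0, 1, 1,    0, 0, -1, 1
]);
```

The test is conservative: a sphere near a frustum corner can pass all six plane checks while still being outside, which costs some extra work but never culls visible content.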
Changes:
- Merge frustum culling into compute: with the removal of the non-compute renderers using culling, culling no longer needs to run as a render pass and can be fully compute-based. This eliminates the separate GSplatNodeCullRenderPass and the intermediate bit-packed nodeVisibilityTexture. Frustum planes are computed on the CPU and passed as uniforms; the sphere-vs-frustum test runs inline per interval.
- Convert culling data (bounds and transforms) from textures to storage buffers (StorageBuffer), simplifying GPU addressing and reducing overhead.
- Fix dispatch overflow for more than 16M visible splats: the indirect dispatch arguments are now written as 2D dispatch layouts to stay within maxComputeWorkgroupsPerDimension (65535). Previously, the keygen workgroup count was packed entirely into the X dimension, causing silent clamping above ~16.7M splats.
- Extract calcDispatch2D into a reusable WGSL chunk (dispatch-core.js), replacing duplicated inline math in 3 shaders.
- Remove the GSplatNodeCullRenderPass class and its GLSL/WGSL fragment shaders.

Performance: