B5: Add GPU-driven rendering with draw indirect#42
Merged
Implement a GPU-driven rendering pipeline that replaces per-entity CPU frustum culling and draw-call submission with a single compute dispatch.

- Add gpu_cull.wgsl compute shader: per-entity frustum culling, LOD selection based on camera distance, and DrawIndexedIndirect argument generation with atomic visible-count tracking
- Add GpuDrivenPipeline struct: manages GPU buffers for command upload, frustum data, cull parameters, indirect draw output, and draw count
- Add IndirectDrawBuffer: STORAGE|INDIRECT buffer of DrawIndexedIndirect structs consumed by draw_indexed_indirect
- Add DrawCommandGpu: CPU-side struct mirroring the WGSL layout with model matrix, AABB, mesh/material IDs, and up to 4 LOD levels
- Add 8 tests covering struct layouts, shader validity, bytemuck round-trips, frustum data construction, and workgroup count rounding

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
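For reviewers who want a CPU-side mental model of what the compute shader does per entity, here is a minimal reference sketch. It is not the code from this PR: the `Entity` and `Lod` types, the world-space AABB, the plane convention (`dot(n, p) + d >= 0` means inside), and measuring distance from the AABB centre are all assumptions made for illustration. Only the five-field indirect argument layout and the "culled slot gets `index_count = 0`" behaviour come from the description.

```rust
/// The five 32-bit fields of an indexed indirect draw (20 bytes total).
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct DrawIndexedIndirectArgs {
    index_count: u32,
    instance_count: u32,
    first_index: u32,
    base_vertex: i32,
    first_instance: u32,
}

/// Hypothetical per-LOD index range (not the PR's actual layout).
#[derive(Clone, Copy)]
struct Lod {
    index_count: u32,
    first_index: u32,
    /// This level is used while the squared camera distance is below this value.
    max_dist_sq: f32,
}

/// Hypothetical entity record; assumes the AABB is already in world space.
struct Entity {
    aabb_min: [f32; 3],
    aabb_max: [f32; 3],
    /// One to four levels, nearest first. Must not be empty.
    lods: Vec<Lod>,
}

/// Plane stored as (nx, ny, nz, d); a point p is inside when dot(n, p) + d >= 0.
type Plane = [f32; 4];

/// Conservative AABB-vs-frustum test using the corner furthest along each normal.
fn aabb_visible(planes: &[Plane; 6], min: [f32; 3], max: [f32; 3]) -> bool {
    planes.iter().all(|p| {
        let mut dist = p[3];
        for i in 0..3 {
            dist += p[i] * if p[i] >= 0.0 { max[i] } else { min[i] };
        }
        dist >= 0.0
    })
}

/// First level whose threshold covers the distance, otherwise the coarsest level.
fn select_lod(lods: &[Lod], dist_sq: f32) -> Lod {
    *lods
        .iter()
        .find(|l| dist_sq < l.max_dist_sq)
        .unwrap_or(lods.last().unwrap())
}

/// Writes one argument slot per entity and returns the visible count.
/// Culled entities keep their slot, but with index_count = 0.
fn cull(
    entities: &[Entity],
    planes: &[Plane; 6],
    camera: [f32; 3],
) -> (Vec<DrawIndexedIndirectArgs>, u32) {
    let mut visible = 0u32;
    let args: Vec<DrawIndexedIndirectArgs> = entities
        .iter()
        .map(|e| {
            if !aabb_visible(planes, e.aabb_min, e.aabb_max) {
                return DrawIndexedIndirectArgs::default();
            }
            visible += 1;
            // Squared distance from the camera to the AABB centre (assumption).
            let dist_sq: f32 = (0..3)
                .map(|i| {
                    let c = 0.5 * (e.aabb_min[i] + e.aabb_max[i]) - camera[i];
                    c * c
                })
                .sum();
            let lod = select_lod(&e.lods, dist_sq);
            DrawIndexedIndirectArgs {
                index_count: lod.index_count,
                instance_count: 1,
                first_index: lod.first_index,
                base_vertex: 0,
                first_instance: 0,
            }
        })
        .collect();
    (args, visible)
}
```

On the GPU, the `visible += 1` step would correspond to the atomic visible-count increment mentioned above, and each loop iteration would be one shader invocation.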
Deploying euca-engine

| Latest commit: | e347556 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://4a350dd5.euca-engine.pages.dev |
| Branch Preview URL: | https://v4-b5-gpu-driven.euca-engine.pages.dev |
Summary

- Add a compute shader (`gpu_cull.wgsl`) that performs per-entity frustum culling and LOD selection on the GPU, outputting `DrawIndexedIndirect` arguments
- Add a `GpuDrivenPipeline` struct managing all GPU buffers (command upload, frustum data, indirect draw output, atomic draw count) and providing `cull_and_prepare()` / `draw_indirect()` methods
- Add `IndirectDrawBuffer` and `DrawCommandGpu` types with exact GPU struct layout matching (verified by 8 tests including bytemuck round-trips and size assertions)

Design

One compute dispatch replaces CPU frustum culling and draw-call submission. Each entity gets a fixed slot in the indirect buffer; culled entities get `index_count = 0`, which makes their draw effectively a no-op. LOD selection uses squared-distance thresholds (up to 4 levels per entity). The atomic `draw_count` buffer tracks visible entities for optional multi-draw-indirect-count support.

Test plan

- `draw_command_gpu_layout` -- verifies 184-byte struct size and field offsets via bytemuck
- `indirect_args_layout` -- verifies 20-byte DrawIndexedIndirect matches wgpu layout
- `frustum_data_layout_and_construction` -- 112-byte uniform, `from_frustum()` correctness
- `cull_params_layout` -- 16-byte aligned uniform buffer
- `gpu_cull_shader_is_valid_wgsl` -- shader source contains all expected constructs
- `workgroup_count_rounding` -- verifies ceil division for dispatch sizing
- `indirect_buffer_size_calculation` -- size constant matches struct size
- `draw_command_gpu_zeroed` -- bytemuck Zeroable produces all-zero fields

🤖 Generated with Claude Code
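To make the layout and dispatch-sizing checks in the test plan concrete, here is a rough standalone sketch of what such assertions might look like. The struct name, the `WORKGROUP_SIZE` of 64, and both helper functions are hypothetical and are not taken from this PR; only the 20-byte indexed-indirect layout and the ceil-division idea come from the description.

```rust
use std::mem::size_of;

/// Five 32-bit fields, so 20 bytes with no padding under repr(C).
#[repr(C)]
#[derive(Clone, Copy, Default)]
struct DrawIndexedIndirectArgs {
    index_count: u32,
    instance_count: u32,
    first_index: u32,
    base_vertex: i32,
    first_instance: u32,
}

/// Assumed workgroup size; the shader's real value may differ.
const WORKGROUP_SIZE: u32 = 64;

/// Ceil division without the overflow risk of `(n + size - 1) / size`.
fn workgroup_count(entity_count: u32) -> u32 {
    entity_count / WORKGROUP_SIZE + u32::from(entity_count % WORKGROUP_SIZE != 0)
}

/// Bytes needed for one fixed argument slot per entity.
fn indirect_buffer_size(entity_count: u32) -> u64 {
    u64::from(entity_count) * size_of::<DrawIndexedIndirectArgs>() as u64
}
```

Rounding up means the last workgroup can contain invocations past the end of the entity list, so a shader dispatched this way would typically bounds-check its invocation index against the entity count before writing.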