Implement minimal GPU culling for cameras. #12673

Closed
wants to merge 1 commit into from

Commits on Mar 23, 2024

  1. Implement minimal GPU culling for cameras.

    This commit introduces a new component, `GpuCulling`, which, when
    present on a camera, skips the CPU visibility check in favor of doing
    the frustum culling on the GPU. This accepts potentially increased CPU
    work and draw calls in exchange for cheaper culling, and it doesn't
    improve the performance of any workloads that I know of today. However,
    it opens the door to significant optimizations in the future by taking
    the necessary first step toward *GPU-driven rendering*.
    
    Enabling GPU culling for a view puts the rendering for that view into
    *indirect mode*. In indirect mode, CPU-level visibility checks are
    skipped, and all visible entities are considered potentially visible.
    Bevy's batching logic still runs as usual, but it doesn't directly
    generate mesh instance indices. Instead, it generates *instance
    handles*, which are indices into an array of real instance indices.
    Before any rendering is done, for each view, a compute shader,
    `cull.wgsl`, maps instance handles to instance indices, discarding any
    instance handles that represent meshes that are outside the visible
    frustum. Draws are then done using the *indirect draw* feature of
    `wgpu`, which instructs the GPU to read the number of actual instances
    from the output of that compute shader.
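
As a rough illustration of the cull pass described above, here is a CPU-side model in plain Rust. It is a sketch under assumptions, not the contents of `cull.wgsl`: the names `Plane`, `BoundingSphere`, `IndirectDrawArgs`, and `cull_batch` are hypothetical, bounding spheres are assumed as the culling primitive, and only a single batch is handled. The argument struct follows the conventional field order for indexed indirect draws (index count, instance count, first index, base vertex, first instance).

```rust
// Hypothetical CPU model of a GPU cull pass. None of these names come
// from Bevy or wgpu; they exist only to illustrate the data flow.

/// A frustum plane. A point `p` is inside when dot(normal, p) + d >= 0.
#[derive(Clone, Copy)]
pub struct Plane {
    pub normal: [f32; 3],
    pub d: f32,
}

/// Bounding sphere of one mesh instance (assumed culling primitive).
#[derive(Clone, Copy)]
pub struct BoundingSphere {
    pub center: [f32; 3],
    pub radius: f32,
}

/// Arguments the GPU would read for an indexed indirect draw.
#[derive(Debug, Default, PartialEq)]
pub struct IndirectDrawArgs {
    pub index_count: u32,
    pub instance_count: u32,
    pub first_index: u32,
    pub base_vertex: i32,
    pub first_instance: u32,
}

/// Returns true unless the sphere lies entirely behind some plane.
pub fn sphere_in_frustum(sphere: &BoundingSphere, frustum: &[Plane]) -> bool {
    frustum.iter().all(|plane| {
        let distance = plane.normal[0] * sphere.center[0]
            + plane.normal[1] * sphere.center[1]
            + plane.normal[2] * sphere.center[2]
            + plane.d;
        distance >= -sphere.radius
    })
}

/// Maps instance handles to instance indices for one batch.
///
/// `handle_to_instance[h]` is the instance index that handle `h` refers to.
/// Instances outside the frustum are discarded; the survivors are written
/// contiguously, and their count becomes `instance_count`.
pub fn cull_batch(
    handle_to_instance: &[u32],
    bounds: &[BoundingSphere],
    frustum: &[Plane],
    index_count: u32,
) -> (Vec<u32>, IndirectDrawArgs) {
    let mut visible = Vec::with_capacity(handle_to_instance.len());
    for &instance in handle_to_instance {
        if sphere_in_frustum(&bounds[instance as usize], frustum) {
            visible.push(instance);
        }
    }
    let args = IndirectDrawArgs {
        index_count,
        instance_count: visible.len() as u32,
        ..Default::default()
    };
    (visible, args)
}
```

In this model the CPU never learns how many instances survived: it only supplies the handles, and the draw reads `instance_count` from the culling output. A real compute shader would presumably do the compaction in parallel, for example with an atomic counter, rather than with a sequential loop.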
    
    Essentially, GPU culling works by adding a new level of indirection
    between the CPU's notion of instances (known as instance handles) and
    the GPU's notion of instances.
    
    A new `--gpu-culling` flag has been added to the `many_foxes`,
    `many_cubes`, and `3d_shapes` examples.
    
    Potential follow-ups include:
    
    * Split up `RenderMeshInstances` into CPU-driven and GPU-driven parts.
      The former, which contains fields like the transform, won't be
      initialized at all when GPU culling is enabled. Instead, the
      transform will be written directly to the GPU in `extract_meshes`,
      like `extract_skins` does for joint matrices.
    
    * Implement GPU culling for shadow maps.
    
      - Following that, we can treat all cascades as one as far as the CPU
        is concerned, simply replaying the final draw commands with
        different view uniforms, which should reduce the CPU overhead
        considerably.
    
    * Retain bins from frame to frame so that they don't have to be rebuilt.
      This is a longer-term project that will build on top of bevyengine#12453 and
      several of the tasks in bevyengine#12590, such as main-world pipeline
      specialization.
    pcwalton committed Mar 23, 2024
    8ebfd67