Feature request: WEBGPU: batch/bundle rendering #26876

Closed
vegarringdal opened this issue Sep 30, 2023 · 15 comments

@vegarringdal
Contributor

Description

I noticed Babylon.js has some tricks to speed up WebGPU rendering:
https://doc.babylonjs.com/setup/support/webGPU/webGPUOptimization/webGPUSnapshotRendering

Since WebGPU support in three.js is still at an early stage, I figured it might be a good time to plan for something like this in three.js as well.

From what I understand, Babylon.js is using GPURenderBundle.
I might be wrong, but I could see it being used in their source code, and three.js does not use it.

Solution

GPURenderBundle is still early, but this might be a good time to adapt to it:
https://developer.mozilla.org/en-US/docs/Web/API/GPURenderBundle

Everything with webgpu is early 😄

These samples show it in action:
https://webgpu.github.io/webgpu-samples/samples/animometer
https://webgpu.github.io/webgpu-samples/samples/renderBundles
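
For readers unfamiliar with the API, here is a minimal sketch of recording a render bundle once and replaying it each frame, using the standard WebGPU calls. The `drawables` record shape (`pipeline`, `bindGroup`, `vertexBuffer`, `vertexCount`) and the function names `recordBundle` and `drawFrame` are hypothetical, and this is not how three.js or Babylon.js organise their renderers.

```javascript
// Record a list of draw calls once into a GPURenderBundle.
// `drawables` is a hypothetical array of
// { pipeline, bindGroup, vertexBuffer, vertexCount } records.
function recordBundle(device, drawables, colorFormat, depthStencilFormat) {
  const encoder = device.createRenderBundleEncoder({
    colorFormats: [colorFormat],
    depthStencilFormat,
  });
  for (const d of drawables) {
    encoder.setPipeline(d.pipeline);
    encoder.setBindGroup(0, d.bindGroup);
    encoder.setVertexBuffer(0, d.vertexBuffer);
    encoder.draw(d.vertexCount);
  }
  return encoder.finish();
}

// Replay the bundle inside a render pass. The per-draw JavaScript loop
// above is not repeated; only executeBundles() is called each frame.
function drawFrame(device, passDescriptor, bundle) {
  const commandEncoder = device.createCommandEncoder();
  const pass = commandEncoder.beginRenderPass(passDescriptor);
  pass.executeBundles([bundle]);
  pass.end();
  device.queue.submit([commandEncoder.finish()]);
}
```

The bundle's `colorFormats` and `depthStencilFormat` have to be compatible with the render pass it is executed in, which is why they are passed in at record time.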

Alternatives

Do not really have any alternatives

Additional context

No response

@sunag sunag added this to the r??? milestone Oct 1, 2023
@vegarringdal vegarringdal changed the title WEBGPU: batch/bundle rendering Feature request: WEBGPU: batch/bundle rendering Oct 3, 2023
@Mugen87 Mugen87 added the WebGPU label Oct 3, 2023
@aardgoose
Contributor

It would be simple* to add the snapshot rendering feature as described in the Babylon.js reference, but as the reference notes, it is only useful with a constant scene graph, so the use cases for that feature would be limited and it should be used with care.

It might be worthwhile looking at per-object bundle use, but the issue of invalidation needs to be worked out for that case, and it wouldn't provide the same performance improvements as whole-scene bundles.

* I have actually implemented the bulk of this in an experiment to enable synchronous shader compilation with WebGPURenderer.render().

@vegarringdal
Contributor Author

My models are mostly static (large BIM models), but I use geometry groups [start/end/material index] and sometimes change the group/material when the user applies coloring, so I can end up with a lot of draw calls in some cases. I think snapshot rendering would be a great fit for this.

Maybe a mesh could have a useSnapshot flag to enable/disable it and a snapshotNeedsUpdate flag to tell three.js to update the snapshot?

Here is an old video of the app that I hope WebGPU will help with as it matures. I believe the snapshot part is needed for WebGPU.
https://www.youtube.com/watch?v=FzTHbogfr5k

This is a very small model...

@mrdoob
Owner

mrdoob commented Oct 27, 2023

How about doing a new scene type... StaticScene maybe?

We could make it so StaticScene.update() doesn't update matrices unless matrixWorldNeedsUpdate is true.
From there, implementing batching shouldn't be too hard...?

This would also deprecate the Object3D.DEFAULT_MATRIX_AUTO_UPDATE and Object3D.DEFAULT_MATRIX_WORLD_AUTO_UPDATE constants.

this.matrixAutoUpdate = Object3D.DEFAULT_MATRIX_AUTO_UPDATE;
this.matrixWorldNeedsUpdate = false;
this.matrixWorldAutoUpdate = Object3D.DEFAULT_MATRIX_WORLD_AUTO_UPDATE; // checked by the renderer
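
To make the StaticScene idea concrete, here is a minimal sketch of a traversal that recomputes world matrices only for nodes flagged with matrixWorldNeedsUpdate (and their descendants). `Node3D` and `updateCount` are hypothetical stand-ins for illustration; this is not the actual Object3D implementation, and no StaticScene class is confirmed to exist in three.js.

```javascript
// Hypothetical stand-in for Object3D: only the fields needed for the sketch.
class Node3D {
  constructor(name) {
    this.name = name;
    this.children = [];
    this.matrixWorldNeedsUpdate = false;
    this.updateCount = 0; // counts recomputations, for illustration only
  }

  add(child) {
    this.children.push(child);
    return this;
  }

  updateMatrixWorld(force = false) {
    const dirty = force || this.matrixWorldNeedsUpdate;
    if (dirty) {
      this.updateCount++; // a real implementation would recompute matrixWorld here
      this.matrixWorldNeedsUpdate = false;
    }
    // A dirty parent forces its descendants, since their world matrices depend on it.
    for (const child of this.children) child.updateMatrixWorld(dirty);
  }
}
```

With this scheme an untouched subtree costs only the traversal itself, and callers have to remember to set the flag whenever they move an object.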

/cc @gkjohnson @takahirox @donmccurdy @Mugen87 @WestLangley

@donmccurdy
Collaborator

donmccurdy commented Oct 27, 2023

I like the term 'static' over 'snapshot' as an API-agnostic way of declaring content's intent for the renderer to optimize. 👍🏻

The feature shouldn't be limited in scope to just multi-material meshes. But declaring the scene static vs. dynamic (all or nothing) feels a bit limiting too. Would it be too complex to allow something like...

// (a) mark everything within a THREE.Group as static, allowing the entire
// group to be rendered as a single batch.
group.static = true;

// (b) mark a single THREE.Mesh as static, allowing it to be batched together
// with anything else in the scene also marked as static.
mesh.static = true

Or we could go the other way around, let the user declare the scene static by default, and opt-out specific objects as dynamic. But in this case, and in (b) above, we'd still need to do the matrix updates for the entire scene...

I might lean toward (a).
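
A rough sketch of what option (a) could mean for a renderer: walk the scene, turn every group flagged `static` into one batch, and leave everything else as ordinary dynamic objects. The plain `{ name, static, children }` records and the `partitionScene` function are hypothetical and only model the bookkeeping, not any real three.js traversal.

```javascript
// Split a scene tree into batches (one per static group) and a list of dynamic leaves.
// Nodes are plain { name, static?, children? } records, a hypothetical stand-in for Object3D.
function partitionScene(root) {
  const staticBatches = [];
  const dynamicObjects = [];

  // Gather every leaf below `node` into `out`.
  function collectLeaves(node, out) {
    const children = node.children ?? [];
    if (children.length === 0) out.push(node);
    for (const child of children) collectLeaves(child, out);
    return out;
  }

  function visit(node) {
    if (node.static) {
      // Option (a): everything under a static group becomes one batch.
      staticBatches.push(collectLeaves(node, []));
      return;
    }
    const children = node.children ?? [];
    if (children.length === 0) dynamicObjects.push(node);
    for (const child of children) visit(child);
  }

  visit(root);
  return { staticBatches, dynamicObjects };
}
```

Option (b) would differ in that all leaves flagged static would be merged into a single shared batch rather than one batch per group.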

@aardgoose
Contributor

Either (a) or (b) looks practical.

Creating render bundles to freeze part of the scene render captures the combination of vertex buffer bindings, uniform buffer bindings and shaders, but doesn't prevent you from modifying the contents of the vertex/uniform buffers themselves, so there is no problem with applying matrix updates etc. You would probably want to turn off frustum culling to ensure all the objects you want are captured when freezing a scene or, with more complexity, request a thaw/refreeze when the visibility of an object changes.
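
The point about buffer contents can be sketched as follows: each frame, new matrix data is written into the uniform buffer that the bundle already references, and the same bundle is replayed without being re-recorded. The function name `renderFrame` and its parameters are hypothetical; `queue.writeBuffer`, `executeBundles` and `queue.submit` are standard WebGPU calls.

```javascript
// Per-frame update: write new matrices into the existing uniform buffer,
// then replay the bundle recorded earlier. The bundle is not re-recorded.
function renderFrame(device, passDescriptor, bundle, uniformBuffer, matrixData) {
  // Queued before the submit below, so the replayed draws read the new values.
  device.queue.writeBuffer(uniformBuffer, 0, matrixData);

  const commandEncoder = device.createCommandEncoder();
  const pass = commandEncoder.beginRenderPass(passDescriptor);
  pass.executeBundles([bundle]);
  pass.end();
  device.queue.submit([commandEncoder.finish()]);
}
```

What would invalidate the bundle is a change to which buffers, bind groups or pipelines are bound, for example swapping a material or adding an object.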

  • I have tested the prototype referenced previously with the webgpu_sprites example which worked as expected.

@gkjohnson
Collaborator

Even if this is the kind of API that's added to the WebGPURenderer, I'd still like to see a more flexible, controllable API (like BatchedMesh in #22376) that these more ergonomic use cases can be built on top of. I feel there's a lot of power to having more control which hopefully a demo I'm putting together will demonstrate. Overall my feeling is that static geometry rendering is the least-interesting use case for batching - but I admit that I'm less familiar with what this "GPURenderBundle" call is capable of compared to WEBGL_multi_draw.

@vegarringdal
Contributor Author

I really hope it will support replacing materials in groups and updating uniforms.

Having a lot of fun with merged meshes and geometry groups where I add transformation on selected

It would be very nice to be able to port this to WebGPU and batch rendering later, since I will end up with a lot of groups if the user goes crazy 😂

Small video of last week's experiments using the transform tool and adding transformations to groups:
https://github.com/mrdoob/three.js/assets/2901416/ce805727-dfb7-4023-bc24-79152aafcc5b

@donmccurdy
Collaborator

donmccurdy commented Oct 31, 2023

Overall my feeling is that static geometry rendering is the least-interesting use case for batching...

That's an interesting point. I'm not sure I understand the full applications of WebGPU render bundles either. It would be interesting to get relative performance numbers for:

  1. 1000 draw calls (naive)
  2. 1000 draw calls (merged)
  3. 1000 draw calls (render bundle)

If there's really ~zero CPU overhead for drawing 1000 draw calls as a bundle, including distinct materials and un-mergeable objects, and we can continue to update uniform buffers for materials and object transforms without invalidating the bundle ... then that's a hugely flexible feature to have, and limiting it to 'static' objects wouldn't make much sense. Why bother to carefully merge compatible things when you can just toss everything into a bundle? Have one bundle for the static objects and another for the dynamic objects.

My guess is that it's not quite that simple and there's more of a tradeoff here. But without knowing what these tradeoffs are, I'm not sure what to propose.

@vegarringdal
Contributor Author

vegarringdal commented Oct 31, 2023

@donmccurdy
My first post had some sample links btw 😁

Having a mesh option useRenderBundle to enable/disable it, and renderBundleNeedsUpdate to tell three.js to re-record the GPU encoder commands etc., might be the most flexible?
Or it could be reuseGpuState and gpuStateNeedsUpdate.
As I understand it, this feature only saves the time spent running JavaScript between draw calls.
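
One way to picture the proposed flags is a small cache that keeps one bundle per object and re-records it only on request. `useRenderBundle` and `renderBundleNeedsUpdate` are the names suggested in this comment, not an existing three.js API, and `BundleCache` and the caller-supplied `record` function are hypothetical.

```javascript
// Caches one bundle per object and re-records it only when the object asks for it.
// `record` is a caller-supplied function that builds a bundle for an object.
class BundleCache {
  constructor(record) {
    this.record = record;
    this.bundles = new WeakMap();
  }

  get(object) {
    // Objects that opt out get no bundle; the caller falls back to normal draw calls.
    if (!object.useRenderBundle) return null;

    if (object.renderBundleNeedsUpdate || !this.bundles.has(object)) {
      this.bundles.set(object, this.record(object));
      object.renderBundleNeedsUpdate = false;
    }
    return this.bundles.get(object);
  }
}
```

Using a WeakMap means a cached bundle can be garbage collected together with the object it belongs to.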

@aardgoose
Contributor

I have been playing with #26983 and the webgpu_sprites example. You only see significant gains with bundling when you have a large number of draw calls (and associated pipeline and buffer bindings). Rendering 8000 sprites (actually fewer, because a lot will be culled in the first normal render pass) gives a frame time of 50ms normally and 40ms with bundling enabled.

The sprite demo also exposes some of the problems with bundling: the sort order of rendering is frozen, as is frustum culling, although the latter could be disabled before the 'freeze'.

The demonstrations linked are rather misleading, because they don't include the overhead of a full rendering engine. There is still a lot of overhead in maintaining the bind groups (UBOs in WebGL terms), such as selecting what needs updating and then writing it, which is all still per object.

@donmccurdy
Collaborator

donmccurdy commented Nov 1, 2023

A few more use cases we may want to consider eventually:

  • rendering multiple views (AR / VR, CubeCamera, ...)
  • rendering additional 'layers' of transmission (PBR transparency) at lower cost

These uses could be done automatically, without user-facing API.

@donmccurdy
Collaborator

Short thread with some helpful comments from Brandon Jones:

https://fosstodon.org/@donmccurdy/111338067257899207

@donmccurdy
Collaborator

An even more detailed writeup from Brandon! 🎉

https://toji.dev/webgpu-best-practices/render-bundles

@RenaudRohlinger
Collaborator

Bundle Rendering is now possible with the static property.
For example:

import { Group } from 'three';

const group = new Group();
group.static = true;

// On the next render, the Group and all its children are automatically bundled and rendered in a more static way.

I think we can now close this issue.
Regarding Babylon.js's WebGPU snapshot rendering, we will get close to that concept once the next steps of the static mode are completed, with multiple bind groups and a frame uniform group system.

@mrdoob mrdoob modified the milestones: r???, r165 May 24, 2024
@prideout
Contributor

prideout commented Sep 5, 2024

Do you guys think that group.static should be added to the documentation? I just now tried a cursory search through the source code and could not find it.
