Description
The Processing API allows arbitrary GPU->CPU readback on demand via the loadPixels method. This is a natural extension of Processing's immediate-mode philosophy, but it creates significant friction with modern graphics architectures. Unlike OpenGL, which goes to significant lengths to preserve the illusion that GPU data is readily accessible, WebGPU is designed to foreground the fact that GPU operations are asynchronous relative to the CPU timeline. The API does support blocking on these operations, but the expectation in most frameworks, including Bevy, is that you won't.
Concretely, this matters because modern graphics libraries want to batch rendering together for efficiency. For example, in Bevy, all cameras (which can be considered an analog to a PGraphics instance) render at the same point in every frame. loadPixels introduces an architectural complication: the current draw state may need to be flushed and made visible to any other graphics context at any arbitrary time.
In other words, beyond any performance concerns, this highlights a potential dependency problem between PGraphics instances:
- When multiple PGraphics instances aren't dependent on each other, batching works fine and we can delay flushing all of their draw state until the end of the frame.
- When a PGraphics is used in another PGraphics, e.g. an off-screen texture sampled by something rendering to the screen, we can simply track a relative order to ensure the Bevy Camera for one runs before the other (see the sketch after this list).
- If the user calls loadPixels, we need to flush the draw state right now and make the texture visible to potentially any other CPU code.
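A minimal sketch of that ordering case, assuming a Bevy version (around 0.13/0.14) where Camera2dBundle and RenderTarget::Image(Handle<Image>) are available; the setup function, the offscreen image handle, and the scenario itself are hypothetical illustration:

```rust
use bevy::prelude::*;
use bevy::render::camera::RenderTarget;

// Hypothetical setup: one camera draws an offscreen PGraphics into an Image,
// a second camera samples that Image while drawing to the window. Ordering
// is controlled by Bevy's `Camera::order` field: lower values render first,
// so the offscreen pass finishes before the main pass reads its texture.
fn setup(mut commands: Commands, mut images: ResMut<Assets<Image>>) {
    // A real render target would also need RENDER_ATTACHMENT usage and a
    // proper size; elided here for brevity.
    let offscreen_target: Handle<Image> = images.add(Image::default());

    // Offscreen PGraphics camera: renders first.
    commands.spawn(Camera2dBundle {
        camera: Camera {
            order: -1,
            target: RenderTarget::Image(offscreen_target.clone()),
            ..default()
        },
        ..default()
    });

    // Main camera: renders second and can sample `offscreen_target`.
    commands.spawn(Camera2dBundle {
        camera: Camera { order: 0, ..default() },
        ..default()
    });
}
```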
Approach for WebGPU with Bevy
We should start by mirroring the immediate-mode API, which means mirroring the drawStart, flush, drawEnd lifecycle: set CameraOutputMode::Skip at the beginning of each frame for each surface so it renders only to its intermediate texture, and switch to CameraOutputMode::Write only when calling drawEnd.
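A rough sketch of that lifecycle, assuming a Bevy version (around 0.13/0.14) where CameraOutputMode::Write carries blend_state and clear_color fields; the PGraphicsSurface marker and the system names are hypothetical:

```rust
use bevy::prelude::*;
use bevy::render::camera::{CameraOutputMode, ClearColorConfig};

// Hypothetical marker for cameras backing a PGraphics surface.
#[derive(Component)]
struct PGraphicsSurface;

// drawStart-equivalent: at the start of each frame, every surface camera
// renders only into its intermediate texture and skips the final output write.
fn draw_start(mut cameras: Query<&mut Camera, With<PGraphicsSurface>>) {
    for mut camera in &mut cameras {
        camera.output_mode = CameraOutputMode::Skip;
    }
}

// drawEnd-equivalent: allow the camera to write its result out.
fn draw_end(mut cameras: Query<&mut Camera, With<PGraphicsSurface>>) {
    for mut camera in &mut cameras {
        camera.output_mode = CameraOutputMode::Write {
            blend_state: None,
            clear_color: ClearColorConfig::None,
        };
    }
}
```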
We can handle loadPixels in this model by doing a synchronous readback after a flush. That is acceptable, and from the user's perspective it will superficially look like OpenGL's behavior.
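A minimal sketch of that blocking readback at the wgpu level, assuming an RGBA8 texture whose row size is already 256-byte aligned and a wgpu version that still uses the ImageCopyBuffer/ImageDataLayout names; the function itself is a hypothetical helper, not part of any existing API:

```rust
// Hypothetical helper: after flushing the surface's pending draws, block on a
// GPU->CPU copy so loadPixels can return data immediately (OpenGL-style).
fn load_pixels_blocking(
    device: &wgpu::Device,
    queue: &wgpu::Queue,
    texture: &wgpu::Texture,
    width: u32,
    height: u32,
) -> Vec<u8> {
    // Assumes RGBA8 and that 4 * width is a multiple of 256 (wgpu's
    // COPY_BYTES_PER_ROW_ALIGNMENT); real code must pad rows otherwise.
    let bytes_per_row = 4 * width;
    let buffer = device.create_buffer(&wgpu::BufferDescriptor {
        label: Some("loadPixels readback"),
        size: (bytes_per_row * height) as u64,
        usage: wgpu::BufferUsages::COPY_DST | wgpu::BufferUsages::MAP_READ,
        mapped_at_creation: false,
    });

    let mut encoder =
        device.create_command_encoder(&wgpu::CommandEncoderDescriptor::default());
    encoder.copy_texture_to_buffer(
        texture.as_image_copy(),
        wgpu::ImageCopyBuffer {
            buffer: &buffer,
            layout: wgpu::ImageDataLayout {
                offset: 0,
                bytes_per_row: Some(bytes_per_row),
                rows_per_image: Some(height),
            },
        },
        wgpu::Extent3d { width, height, depth_or_array_layers: 1 },
    );
    queue.submit([encoder.finish()]);

    // Block the CPU until the copy completes; this is exactly the "flush
    // right now" cost that the batching discussion above is trying to avoid.
    let slice = buffer.slice(..);
    slice.map_async(wgpu::MapMode::Read, |r| r.unwrap());
    device.poll(wgpu::Maintain::Wait);
    let pixels = slice.get_mapped_range().to_vec();
    buffer.unmap();
    pixels
}
```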
Because this immediate-mode approach likely has some undesirable overhead (although TBD, it may be pretty minimal), we can do our own dependency/dirty tracking if necessary down the line. This would mean keeping an in-flight dependency graph of PGraphics instances and how they're being used, and only triggering flushes when their results actually need to be made visible to other graphics contexts. This isn't hard to do but will make the implementation more confusing.
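A rough sketch of what that tracking could look like, independent of any Bevy types; all of the names here are hypothetical:

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical handle for a PGraphics-backed surface.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct SurfaceId(u32);

#[derive(Default)]
struct FlushTracker {
    // Surfaces with recorded-but-unflushed draw commands this frame.
    dirty: HashSet<SurfaceId>,
    // reader -> surfaces it sampled from this frame.
    reads: HashMap<SurfaceId, HashSet<SurfaceId>>,
}

impl FlushTracker {
    fn mark_drawn(&mut self, surface: SurfaceId) {
        self.dirty.insert(surface);
    }

    fn mark_read(&mut self, reader: SurfaceId, source: SurfaceId) {
        self.reads.entry(reader).or_default().insert(source);
    }

    // Everything `target` depends on (transitively) that still has pending
    // draws must be flushed before `target` is read back or presented.
    fn surfaces_to_flush(&self, target: SurfaceId) -> Vec<SurfaceId> {
        let mut stack = vec![target];
        let mut seen = HashSet::new();
        let mut out = Vec::new();
        while let Some(s) = stack.pop() {
            if !seen.insert(s) {
                continue;
            }
            if self.dirty.contains(&s) {
                out.push(s);
            }
            if let Some(deps) = self.reads.get(&s) {
                stack.extend(deps.iter().copied());
            }
        }
        // Flush dependencies before dependents; good enough for simple
        // chains, a full topological sort would be needed for general DAGs.
        out.reverse();
        out
    }
}
```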