Add opaque region or isOpaque hint #1871

rmader · 2021-06-23T13:06:15Z

This is a follow up on #1425 / #1474

IIUC, the newly introduced GPUSwapChainAlphaMode=opaque hint can have overhead depending on the implementation, as the implementation has to e.g. fill in alpha values. As already pointed out in #1425 (comment), many OSs, including macOS[1], Android, Wayland[2], X11[3] and maybe Windows, allow clients to provide a hint of the form "please composite this assuming the alpha channel is actually all 1, with undefined results if it's not." These are usually realized either as a region (Wayland/X11) or as a single boolean for a surface (macOS).

Such an opt-in flag should make it possible to save a canvas-sized blit on many implementations by skipping any blending. Most importantly, it would move the responsibility to the client: provide content in an easy-to-optimize way and you become faster.

As this kind of hint has been well established in OS compositors and is available to native clients, I think we should have an equivalent in WebGPU as well.

cc: @kvark, @magcius

1: https://developer.apple.com/documentation/appkit/nsview/1483558-isopaque
2: https://wayland.freedesktop.org/docs/html/apa.html#protocol-spec-wl_surface -> set_opaque_region
3: https://specifications.freedesktop.org/wm-spec/wm-spec-latest.html#idm46291029692400

magcius · 2021-06-23T14:52:16Z

If the user sets GPUSwapChainAlphaMode = opaque, then we should be able to set the system compositing flag as well, assuming the surface is using the system compositor. The only difference is that the "undefined results" isn't good for the web. e.g. somebody authors web content on Windows, which has a "composite as opaque" flag. If they don't author content correctly and don't notice, that content then starts behaving differently on other platforms, which they may not be able to immediately test.

Thankfully, we can fix this by running an extra pass that clears alpha to 1.0 before present. As far as I'm aware, this should be relatively cheap on all supported targets. No blending is done, and the number of blits is the same. It's still a massive performance improvement for platforms that support the "assume opaque" flag.
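As a rough CPU-side illustration of what such an alpha-clear pass amounts to, here is a minimal sketch. It is not how a browser would do it (that work would run on the GPU); the `forceOpaque` name and the tightly packed RGBA8 layout are assumptions made only for this example. Every pixel's alpha byte is overwritten while the color bytes are left alone:

```typescript
// Hypothetical helper (not a WebGPU API): overwrite the alpha byte of every
// pixel in a tightly packed RGBA8 buffer with 255, i.e. 1.0 in unorm terms.
// Color bytes are untouched. Returns the number of pixels written.
function forceOpaque(rgba: Uint8Array): number {
  if (rgba.length % 4 !== 0) {
    throw new RangeError("expected a tightly packed RGBA8 buffer");
  }
  // Alpha is the fourth byte of each 4-byte pixel.
  for (let i = 3; i < rgba.length; i += 4) {
    rgba[i] = 255;
  }
  return rgba.length / 4;
}

const frame = new Uint8Array([
  10, 20, 30, 0,   // fully transparent pixel
  40, 50, 60, 128, // half-transparent pixel
]);
const written = forceOpaque(frame);
console.log(written, Array.from(frame)); // 2 [10, 20, 30, 255, 40, 50, 60, 255]
```

The loop performs one write per pixel across the whole surface, which is the cost that the rest of the thread weighs against the savings from skipping blending in the compositor.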

Am I missing something?

jrmuizel · 2021-06-23T15:01:21Z

Isn't the extra pass going to be an entire framebuffer's worth of memory writes? That's not particularly cheap on low-end Intel GPUs.

kvark · 2021-06-23T15:33:33Z

@rmader thank you for filing this! Issues with proper references are the best :)
I read it twice in order to understand the actual proposal, so I'm going to re-state it here.

The proposal is to add a hint API allowing the implementation (on some platforms) to reduce potential overhead when filling out the alpha with 1. It could be a boolean "it's already 1, I promise!", or some sort of a region that is promised to have alpha of 1.

The general portability concern would apply, like @magcius noted. We aren't going to be checking if the hint is correct, so we'd end up with non-portable behavior if the hint is wrong.

@jrmuizel

Isn't the extra pass going to be an entire framebuffer's worth of memory writes? That's not particularly cheap on low-end Intel GPUs.

In the last lengthy discussion, #1425 (comment) confirms that the performance overhead is not a big concern, based on the investigation in https://bugs.chromium.org/p/chromium/issues/detail?id=1045643#c11:

Experiment 2: https://chromium-review.googlesource.com/2287369
Clear the alpha channel at the end of the frame. Attempted to do this both against multisampled renderbuffer and resolved texture; same result.
Had rendering artifacts around the pinball in this example.
Result: 99%

rmader · 2021-06-23T15:54:51Z

@kvark: thanks :) and yes, your recap sounds right to me.

Concerning the alpha clear: if that is really super cheap and 100% correct on different architectures (we may even care about software implementations?), well, then we are good. I find the evidence for that rather small so far and wonder why OS compositors AFAIK haven't adopted such an approach. From a Mutter dev perspective that would be great news of course :)
I guess it would be good to have some input from GPU vendors here.

The only difference is that the "undefined results" isn't good for the web.

I'm not familiar with web standard development, but if the flag is opt-in and well documented, would it really be too bad for something as complex as WebGPU? I'd imagine there are plenty of ways to do things wrong and still accidentally get good results :/

Kangz · 2021-06-23T15:59:59Z

Another idea discussed a long time ago was to have a special texture format that you can only render to that's rgbx8unorm. When using it we would put alpha-false in the write mask such that alpha is guaranteed to stay what it was at the beginning of the pass (and it would start at 1). This way there is no overhead for the compositor at all. However it is more spec complexity and less flexibility for the application.
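To make the write-mask idea concrete, here is a small sketch of how a masked store behaves. The local `ColorWrite` object and the `writePixel` function are hypothetical and exist only for this illustration; the bit values are chosen to mirror WebGPU's `GPUColorWrite` flags as I understand them. With alpha left out of the mask, whatever the shader outputs for alpha is dropped and the value set at the start of the pass survives:

```typescript
// Local stand-in for a color write mask (assumed to mirror GPUColorWrite).
const ColorWrite = { RED: 0x1, GREEN: 0x2, BLUE: 0x4, ALPHA: 0x8 } as const;

type Rgba = [number, number, number, number];

// Hypothetical model of a masked store: channels enabled in `mask` take the
// value from `src`; disabled channels keep what `dst` already holds.
function writePixel(dst: Rgba, src: Rgba, mask: number): Rgba {
  const [dr, dg, db, da] = dst;
  const [sr, sg, sb, sa] = src;
  return [
    (mask & ColorWrite.RED) !== 0 ? sr : dr,
    (mask & ColorWrite.GREEN) !== 0 ? sg : dg,
    (mask & ColorWrite.BLUE) !== 0 ? sb : db,
    (mask & ColorWrite.ALPHA) !== 0 ? sa : da,
  ];
}

// Alpha excluded from the mask, as in the rgbx8unorm idea.
const rgbOnly = ColorWrite.RED | ColorWrite.GREEN | ColorWrite.BLUE;

// The target starts the pass with alpha = 255 (1.0).
const cleared: Rgba = [0, 0, 0, 255];

// The shader tries to write alpha = 0, but the mask discards it.
console.log(writePixel(cleared, [200, 100, 50, 0], rgbOnly)); // [200, 100, 50, 255]
```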

I'm not familiar with web standard development, but if the flag is opt-in and well documented, would it really be too bad for something as complex as WebGPU? I'd imagine there are plenty of ways to do things wrong and still accidentally get good results :/

One promise of the Web is effortless portability where your page will work exactly the same on your system as other systems. When practical we try to keep this property in WebGPU. This prevents writing code that works on one browser and breaks on others (see all these "works best in XXX" pages).

kvark · 2021-06-23T16:01:49Z

The general approach to building a web API here is to minimize the chances that something works on one platform but doesn't work on another. If there is a failure, and it's platform-specific, it should happen as early as possible. For example, if your program requests higher-than-base limits for the device, it will fail to request the logical device on some platforms. So this is a bold and explicit failure, done early.

kainino0x · 2021-06-23T20:10:54Z

The general portability concern would apply, like @magcius noted. We aren't going to be checking if the hint is correct, so we'd end up with non-portable behavior if the hint is wrong.

We already have this problem with non-opaque canvases (in WebGPU and WebGL): if you output pixels with R>A or G>A or B>A then you get undefined compositing results (not undefined web-observable behavior, notably). I think it would be palatable to extend this to opaque canvases.
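The R>A / G>A / B>A condition can be checked mechanically. The sketch below assumes a tightly packed RGBA8 buffer that is meant to hold premultiplied-alpha pixels; `findInvalidPremultiplied` is a hypothetical helper written for this illustration, not something WebGPU or WebGL provides:

```typescript
// Hypothetical check: in a premultiplied-alpha RGBA8 buffer, no color channel
// may exceed alpha. Returns the indices of the pixels that violate this.
function findInvalidPremultiplied(rgba: Uint8Array): number[] {
  const bad: number[] = [];
  for (let i = 0; i + 3 < rgba.length; i += 4) {
    const a = rgba[i + 3];
    if (rgba[i] > a || rgba[i + 1] > a || rgba[i + 2] > a) {
      bad.push(i / 4);
    }
  }
  return bad;
}

const pixels = new Uint8Array([
  10, 10, 10, 255, // valid: opaque
  200, 0, 0, 100,  // invalid: R > A
  0, 0, 0, 0,      // valid: fully transparent
]);
console.log(findInvalidPremultiplied(pixels)); // [1]
```

An implementation would not run a check like this at present time, which is why wrong output leads to undefined compositing results rather than an error.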

My understanding here: #1425 (comment)
is that it would benefit macOS and Android.

kainino0x · 2021-07-19T18:06:08Z

@jdashg and I chatted about this and we think we should seriously consider the special-storeOp solution before this one.

Originally posted by @kainino0x in #1425 (comment)

@kvark had an intriguing idea on chat:

yeah, that's a bit of an issue.
I wonder if storeOp = "present" could be a thing

I think how this would work is, if the render target is a swap chain texture, and the app submit()s work using storeOp: "present", then we would inject a clear if needed and early-detach the swap chain texture (so it can't be accessed anymore). If a canvas texture didn't use storeOp: "present", the browser would potentially inject a whole extra render pass to clear the alpha channel (and we could warn if this occurs). Browsers could also choose a simpler implementation where they just always do that instead of the optimized injected clear.

Kangz · 2021-07-19T20:03:02Z

we would inject a clear if needed

Would that clear be at the beginning of the render pass (at which point it becomes something kind of observable through alphaBlend) or is it a fullscreen quad with writeMask=alpha at the end of the render pass?

magcius · 2021-07-19T20:09:17Z

Would storeOp: "present" be an optional way to speed up performance, or would it be required to present? If required, would it be required everywhere, or just if you have compositingMode: "opaque" set on your swap chain config?

kainino0x · 2021-07-19T20:43:25Z

Would that clear be at the beginning of the render pass (at which point it becomes something kind of observable through alphaBlend) or is it a fullscreen quad with writeMask=alpha at the end of the render pass?

Fullscreen quad.

Would storeOp: "present" be an optional way to speed up performance, or would it be required to present? If required, would it be required everywhere, or just if you have compositingMode: "opaque" set on your swap chain config?

Optional. Browsers might issue a warning in some cases.

kainino0x · 2021-08-23T22:35:07Z

I think this can be closed in favor of #1988 which discusses several possible solutions to this problem.

kainino0x · 2021-08-23T22:36:00Z

(It says "get some implementation experience" - but once we do, we can finish resolving it in that issue.)

kainino0x added this to Needs Discussion in Main Jul 17, 2021

kainino0x mentioned this issue Jul 26, 2021

Investigate possible solution for cheaper opaque canvases #1988

Open

kainino0x closed this as completed Aug 23, 2021

kainino0x moved this from Needs Discussion to Specification Done in Main Jan 19, 2022

kainino0x moved this from Specification Done to Needs Discussion in Main Jan 19, 2022

kainino0x moved this from Needs Discussion to No Action in Main Jan 19, 2022
