GPUBuffer.mapAsync is overly pessimistic #2646

litherum · 2022-03-07T05:56:37Z

The spec says:

Let p be a new Promise.
...
Enqueue an operation on the default queue’s Queue timeline that will execute the following:
...
Resolve p.

This is a bit unfortunate; it means that mapping is blocked on queue commands which are entirely unrelated to the buffer being mapped. Consider a program like this:

let bufferA = device.createBuffer(...);
let bufferB = device.createBuffer(...);

let commandEncoder1 = device.createCommandEncoder();
commandEncoder1.copyBufferToBuffer(bufferA, 0, someOtherBuffer, 0, 4); // Super small operation. Doesn't touch bufferB
let commandBuffer1 = commandEncoder1.finish();

let commandEncoder2 = device.createCommandEncoder();
populateCommandEncoderWithSuperIntensiveOperationsWith(bufferB); // Doesn't touch bufferA
let commandBuffer2 = commandEncoder1.finish();

device.queue.submit([commandBuffer1, commandBuffer2]);

bufferA.mapAsync(...).then(...); // Requires waiting for the super intensive operation to finish, despite bufferA being idle much earlier than that

When you use a bind group in a render/compute pass, we already have to check the state of all the buffers in that bind group to make sure they're not mapped, so it would seem pretty cheap for each buffer to have a "in-flight count" counter inside it. The count would represent the number of command buffers which have been submitted and not yet finished that reference this buffer. Each command buffer would have to remember all the resources it references, so they can have their counts decremented when the command buffer finishes. Then, the map promise could resolve when the in-flight count for that buffer reaches 0, rather than having to wait for all work on the queue to complete like the spec says today.

The text was updated successfully, but these errors were encountered:

litherum · 2022-03-07T06:16:10Z

Imagine the above program was slightly modified to replace:

bufferA.mapAsync(...).then(...);

with

bufferB.mapAsync(...).then(...);
bufferA.mapAsync(...).then(...);

This "in-flight count" proposal would make it possible for bufferA's promise to resolve before bufferB, despite mapAsync() being called on bufferB before bufferA. I think this is a feature, though, rather than being a bug. If the author actually wants to wait for all work on the queue to finish, they can GPUQueue.onSubmittedWorkDone().

litherum · 2022-03-07T06:17:31Z

This probably has to be discussed for V1 because changing the order of promises could break production code.

kainino0x · 2022-03-09T01:32:31Z

Thanks for filing this, it's something I've thought about from time to time but never dug to see what the spec actually says. I don't know whether we should allow mapping to complete early or not, though in principle it seems great since it would let mapAsync be used as a special kind of fence, different from onSubmittedWorkDone, a "map as soon as the last thing using this buffer is done, I don't care about stuff that got enqueued afterward".
Very related to #962, and #2119 (comment).

kainino0x · 2022-03-09T02:12:44Z

Though in many cases, it should be possible for apps to issue mapAsync right after they're done using the buffer, so maybe the more straightforward scheduling model is worthwhile. Definitely a tradeoff.

Kangz · 2022-03-16T15:10:54Z

Yeah the potential extra latency is unfortunate. Though with graphical applications using RAF there's already some latency due to frame pipelining so an extra bit of latency might not show. And for more expensive compute operations developers would split submissions right after the expensive operation (because all they are doing is the expensive operation). So I don't know how much of a problem it is in practice.

A premise of doing fine-grained mapping promise resolution like this is that it is cheap to track for a buffer whether it is in flight or not. Submission resource tracking is one of the most expensive things in the API, and iterating on all buffers more, or additional, or additional branches would show. Maybe there's a way to make it low overhead (but more complex for the implementation, I'm not sure).

But also making the spec allow fine grained resolution would mean that the order of promise resolution could depend on the browser, making the API less portable.

We can make the spec do things either way, or even start with what we have now and relax later as a non-breaking change (except for the ordering of promise resolution). But there's a clear tradeoff to be aware of.

Kangz · 2022-04-14T10:18:24Z

As discussed in yesterday's meeting, I will detail my claim that we can't at the same time make the buffer mapping less pessimistic and the order of promises defined, in an architecture like WebKit's or Chromium's.

My understanding is that Chromium's architecture and WebKit's architecture for asynchronous calls is the same:

On the content side a call like mapAsync always sends a message to the GPU process "map async was requested"
The promise is pending until a message from the GPU process side is received with "here is the result of your map async".
Some special cases apply, like unmapping a buffer early, or the connection to the GPU process has been lost.

My understanding of what @litherum suggested for promise ordering in the meeting is that promises still resole in "queue order" with the point in the queue considered for buffer mapping being "the end of the last use of the buffer in commands submitted prior to the mapAsync".

It is not possible to have both at the same time (without extensive content-side state tracking of mappable buffers used commands / bindgroups (with UMA ext) / etc, which we very much want to not do in Chromium). The reason is that there is a data race between the content process and the GPU process. Consider the following code:

const mappableBuffer;
const commands = createCommandsThatUse(mappableBuffer);
queue.submit([commands]);

queue.onSubmittedWorkDone().then(() => console.log("work done"));

// Assuming no operations on the queue are done here.
await something();

mappedBuffer.mapAsync(...).then(() => console.log("mapping done"))

Depending on the order of execution, both ordering of console logs can happen. If by the time the GPU process receives the mapAsync call it hasn't sent the callback for onSubmittedWorkDone then the ordering is "work done" "mapping done". However if by the time it gets the mapAsync call it already set the onSubmittedWorkDone (and for example the renderer process already consumed it), then the ordering may be "mapping done" "work done".

Like you said, it's better to make the ordering of promises well-specified so that JS code is more portable. The improved version here would need to lose that property, and the gains we expect are things applications can already do themselves (by mapping right after submit). That's why I think the current spec is what we should keep. But if you're less concerned about promise ordering and really want to change this, we could lift that restriction, and it would be a non-breaking change to add strict ordering in the future. Either way seems ok. (as long as we don't need to do extensive content-side state tracking).

kainino0x · 2022-05-03T04:27:03Z

I read the meeting minutes for April 20th and I'm a bit confused on the state of the issue - sounds like Myles said WebKit wouldn't have the ordering problem Corentin described, but I don't understand why.

I'm pretty wary about introducing undefined promise ordering, and I am pretty sure I agree with Corentin's example that we can't fully de-pessimize without either undefined ordering or a lot of client-side tracking.

litherum · 2022-05-03T04:42:54Z

I believe the conclusion from the last meeting is that ordering between onSubmittedWorkDone() and mapAsync() is less important than mapAsync() completing as soon as it’s able to.

kainino0x · 2022-05-03T18:48:36Z

OK, as long as we're clear that we'll have to have undefined ordering.

I think it's fine. I'd love to have full ordering but we can find a solution later to better-define the ordering if we start seeing compatibility issues between browsers in the real world.

Kangz · 2022-05-17T13:56:18Z

Moving to editorial since we have consensus to relax the constraint on ordering for mapAsync.

kainino0x · 2022-07-21T00:57:17Z

A note for when we're doing the editorial work: I'd like to have a non-normative note that says UAs should consider issuing a dev tools warning if you (successfully) try to getMappedRange() without having awaited (attached a .then to) the return value of mapAsync().

kainino0x · 2022-08-08T17:38:36Z

On second thought I think this needs a prose description (not just the algorithm details), including explaining the ordering with onSubmittedWorkDone. (I believe we agreed that onSubmittedWorkDone resolution guarantees the completion of maps issued before it.)

litherum added this to the V1.0 milestone Mar 7, 2022

kainino0x added this to Needs Discussion in Main Mar 9, 2022

kainino0x added for webgpu editors meeting and removed for webgpu editors meeting labels Mar 9, 2022

This comment was marked as resolved.

Sign in to view

Kangz modified the milestones: V1.0, Polish post-V1 Apr 25, 2022

This comment was marked as resolved.

Sign in to view

Kangz modified the milestones: Polish post-V1, V1.0 Apr 25, 2022

This comment was marked as resolved.

Sign in to view

Kangz added the copyediting Pure editorial stuff (copyediting, bikeshed, etc.) label May 17, 2022

kainino0x moved this from Needs Discussion to Needs Specification in Main May 19, 2022

kainino0x mentioned this issue Jul 22, 2022

Un-nest steps for mapAsync() #3224

Merged

kainino0x closed this as completed in #3224 Jul 23, 2022

kainino0x reopened this Aug 8, 2022

kainino0x mentioned this issue Aug 16, 2022

Better formalize and document mapAsync promise ordering #3338

Merged

kainino0x closed this as completed in #3338 Aug 19, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GPUBuffer.mapAsync is overly pessimistic #2646

GPUBuffer.mapAsync is overly pessimistic #2646

litherum commented Mar 7, 2022 •

edited

litherum commented Mar 7, 2022 •

edited

litherum commented Mar 7, 2022

kainino0x commented Mar 9, 2022

kainino0x commented Mar 9, 2022

Kangz commented Mar 16, 2022

Kangz commented Apr 14, 2022

This comment was marked as resolved.

This comment was marked as resolved.

This comment was marked as resolved.

kainino0x commented May 3, 2022

litherum commented May 3, 2022

kainino0x commented May 3, 2022

Kangz commented May 17, 2022

kainino0x commented Jul 21, 2022 •

edited

kainino0x commented Aug 8, 2022

GPUBuffer.mapAsync is overly pessimistic #2646

GPUBuffer.mapAsync is overly pessimistic #2646

Comments

litherum commented Mar 7, 2022 • edited

litherum commented Mar 7, 2022 • edited

litherum commented Mar 7, 2022

kainino0x commented Mar 9, 2022

kainino0x commented Mar 9, 2022

Kangz commented Mar 16, 2022

Kangz commented Apr 14, 2022

This comment was marked as resolved.

This comment was marked as resolved.

This comment was marked as resolved.

kainino0x commented May 3, 2022

litherum commented May 3, 2022

kainino0x commented May 3, 2022

Kangz commented May 17, 2022

kainino0x commented Jul 21, 2022 • edited

kainino0x commented Aug 8, 2022

litherum commented Mar 7, 2022 •

edited

litherum commented Mar 7, 2022 •

edited

kainino0x commented Jul 21, 2022 •

edited