-
Notifications
You must be signed in to change notification settings - Fork 302
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GPUBuffer.mapAsync is overly pessimistic #2646
Comments
Imagine the above program was slightly modified to replace:
with
This "in-flight count" proposal would make it possible for |
This probably has to be discussed for V1 because changing the order of promises could break production code. |
Thanks for filing this, it's something I've thought about from time to time but never dug to see what the spec actually says. I don't know whether we should allow mapping to complete early or not, though in principle it seems great since it would let mapAsync be used as a special kind of fence, different from onSubmittedWorkDone, a "map as soon as the last thing using this buffer is done, I don't care about stuff that got enqueued afterward". |
Though in many cases, it should be possible for apps to issue mapAsync right after they're done using the buffer, so maybe the more straightforward scheduling model is worthwhile. Definitely a tradeoff. |
Yeah the potential extra latency is unfortunate. Though with graphical applications using RAF there's already some latency due to frame pipelining so an extra bit of latency might not show. And for more expensive compute operations developers would split submissions right after the expensive operation (because all they are doing is the expensive operation). So I don't know how much of a problem it is in practice. A premise of doing fine-grained mapping promise resolution like this is that it is cheap to track for a buffer whether it is in flight or not. Submission resource tracking is one of the most expensive things in the API, and iterating on all buffers more, or additional, or additional branches would show. Maybe there's a way to make it low overhead (but more complex for the implementation, I'm not sure). But also making the spec allow fine grained resolution would mean that the order of promise resolution could depend on the browser, making the API less portable. We can make the spec do things either way, or even start with what we have now and relax later as a non-breaking change (except for the ordering of promise resolution). But there's a clear tradeoff to be aware of. |
As discussed in yesterday's meeting, I will detail my claim that we can't at the same time make the buffer mapping less pessimistic and the order of promises defined, in an architecture like WebKit's or Chromium's. My understanding is that Chromium's architecture and WebKit's architecture for asynchronous calls is the same:
My understanding of what @litherum suggested for promise ordering in the meeting is that promises still resole in "queue order" with the point in the queue considered for buffer mapping being "the end of the last use of the buffer in commands submitted prior to the mapAsync". It is not possible to have both at the same time (without extensive content-side state tracking of mappable buffers used commands / bindgroups (with UMA ext) / etc, which we very much want to not do in Chromium). The reason is that there is a data race between the content process and the GPU process. Consider the following code: const mappableBuffer;
const commands = createCommandsThatUse(mappableBuffer);
queue.submit([commands]);
queue.onSubmittedWorkDone().then(() => console.log("work done"));
// Assuming no operations on the queue are done here.
await something();
mappedBuffer.mapAsync(...).then(() => console.log("mapping done")) Depending on the order of execution, both ordering of console logs can happen. If by the time the GPU process receives the Like you said, it's better to make the ordering of promises well-specified so that JS code is more portable. The improved version here would need to lose that property, and the gains we expect are things applications can already do themselves (by mapping right after submit). That's why I think the current spec is what we should keep. But if you're less concerned about promise ordering and really want to change this, we could lift that restriction, and it would be a non-breaking change to add strict ordering in the future. Either way seems ok. (as long as we don't need to do extensive content-side state tracking). |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
I read the meeting minutes for April 20th and I'm a bit confused on the state of the issue - sounds like Myles said WebKit wouldn't have the ordering problem Corentin described, but I don't understand why. I'm pretty wary about introducing undefined promise ordering, and I am pretty sure I agree with Corentin's example that we can't fully de-pessimize without either undefined ordering or a lot of client-side tracking. |
I believe the conclusion from the last meeting is that ordering between onSubmittedWorkDone() and mapAsync() is less important than mapAsync() completing as soon as it’s able to. |
OK, as long as we're clear that we'll have to have undefined ordering. I think it's fine. I'd love to have full ordering but we can find a solution later to better-define the ordering if we start seeing compatibility issues between browsers in the real world. |
Moving to editorial since we have consensus to relax the constraint on ordering for mapAsync. |
A note for when we're doing the editorial work: I'd like to have a non-normative note that says UAs should consider issuing a dev tools warning if you (successfully) try to |
On second thought I think this needs a prose description (not just the algorithm details), including explaining the ordering with onSubmittedWorkDone. (I believe we agreed that onSubmittedWorkDone resolution guarantees the completion of maps issued before it.) |
The spec says:
This is a bit unfortunate; it means that mapping is blocked on queue commands which are entirely unrelated to the buffer being mapped. Consider a program like this:
When you use a bind group in a render/compute pass, we already have to check the state of all the buffers in that bind group to make sure they're not mapped, so it would seem pretty cheap for each buffer to have a "in-flight count" counter inside it. The count would represent the number of command buffers which have been submitted and not yet finished that reference this buffer. Each command buffer would have to remember all the resources it references, so they can have their counts decremented when the command buffer finishes. Then, the map promise could resolve when the in-flight count for that buffer reaches 0, rather than having to wait for all work on the queue to complete like the spec says today.
The text was updated successfully, but these errors were encountered: