MapAsync with subranges. #605
Thank you for YADUP! I like how it both allows sub-ranges to be mapped and elegantly replaces I don't think this replaces
If we don't have a continuous persistent allocation, we'd have to either be copying data from the GPU buffer, or zeroing it, and that changes the performance characteristics of this API. Is there really a choice here? I thought that the proposal addresses the problem of mem-zeroing in #594 , but that "if we don't want" puts that fix in question. Also, since we know about buffer state on the client side, is there a reason not to expose it to the user (e.g. as a read-only property of the |
Should your sample IDL function return
|
That's a good point and a good reason to stick with allowing overlapping subranges.
It's part of another of the choices, making the buffer itself the promise object. I wouldn't mind adding it so it's a bit easier for folks to debug and learn about buffer mapping:
I thought the |
WebIDL doesn't have |
Echoing my feedback from the call: this change at first seems like an improvement over the current
Things to improve in this proposal:
|
I would prefer keeping the read and write calls as one. I think that makes for a better API, and that we should instead work to satisfy the use case of a partial map. |
@jdashg but this proposal does not offer the use case of a partial map. I'm suggesting the ways to improve this proposal for what it's trying to do. What you are talking about is a different proposal that has client-side buffer tracking, and it will need to be considered independently, even if built upon this one. |
We're both suggesting modifications to this proposal to make it better in our own opinions. |
Feedback I heard is that we should split the read and write calls so the read path can be simplified and can take an additional range argument. After spending more time on it, and discussions with @kvark offline that considered some alternatives, the rationale for keeping range-less map calls is that a ranged readback can be built on top of them:

```js
GPUQueue.prototype.readBufferBack = async function(buffer, offset, size) {
  const readbackBuffer = device.createBuffer({
    size,
    usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
  });
  const encoder = device.createCommandEncoder();
  encoder.copyBufferToBuffer(buffer, offset, readbackBuffer, 0, size);
  this.submit([encoder.finish()]);
  await readbackBuffer.mapReadAsync();
  const content = readbackBuffer.getMappedRange();
  content.detach = function() {
    readbackBuffer.destroy(); // has an implicit unmap
  };
  return content;
};
```

So the IDL would be the following:

```webidl
partial interface GPUBufferUsage {
    const GPUBufferUsageFlags MAP_READ = 0x0001;
    const GPUBufferUsageFlags MAP_WRITE = 0x0002;
};

partial interface GPUBuffer {
    Promise<void> mapReadAsync();
    Promise<void> mapWriteAsync();
    ArrayBuffer getMappedRange(unsigned long offset = 0, unsigned long size = 0);
    void unmap();
};

partial dictionary GPUBufferDescriptor {
    boolean mappedAtCreation = false;
};
```

and the behavior the same as the original proposal (splitting the single `mapAsync` into read and write variants). Like the original proposal, this one assumes |
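The "unmapped" → "mapping" → "mapped" state machine this thread keeps referring to can be modeled in plain JavaScript. This `MockBuffer` class is an invented illustration of the validation rules only, not the real WebGPU API (the real `getMappedRange` would alias the buffer memory rather than copy it):

```javascript
// Plain-JS mock of the proposed buffer states; NOT the real WebGPU API.
class MockBuffer {
  constructor(size) {
    this.size = size;
    this.state = "unmapped"; // "unmapped" -> "mapping" -> "mapped"
    this.data = new ArrayBuffer(size);
  }

  // Rejects unless the buffer is "unmapped"; resolving moves it to "mapped".
  mapReadAsync() {
    if (this.state !== "unmapped") {
      return Promise.reject(new Error("validation error: buffer is " + this.state));
    }
    this.state = "mapping";
    return Promise.resolve().then(() => { this.state = "mapped"; });
  }

  // Returns null outside the "mapped" state; size 0 means "rest of the buffer".
  // slice() copies, which is enough to illustrate the validation.
  getMappedRange(offset = 0, size = 0) {
    if (this.state !== "mapped") return null;
    if (size === 0) size = this.size - offset;
    if (offset + size > this.size) throw new RangeError("range out of bounds");
    return this.data.slice(offset, offset + size);
  }

  // Error when already unmapped; the real API would also detach the ArrayBuffers.
  unmap() {
    if (this.state === "unmapped") throw new Error("validation error: already unmapped");
    this.state = "unmapped";
  }
}
```

Note how `getMappedRange(4)` on a 16-byte mock returns the remaining 12 bytes, matching the "size of 0 means the remainder" rule discussed above.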
@litherum @kvark @jdashg I'd like to make progress on this issue outside of the meetings. It seems that the only discussions left are:
|
The (2) question needs to be answered first because if we do specify the range, then the calls are clearly different and (1) is no longer a question. I do think a range is useful for We don't need to change any rules to support this range, i.e. calling |
That could work, and
I'm very slightly in favor of |
In last week's meeting, @jdashg had a concern that this proposal could be difficult for native game engines to adopt if it differs too much from the way they use buffer mapping, in particular being able to map some ranges of a buffer while using the rest as staging. I looked at the most advanced open-source engine I know of, Godot, and if you look at the description of their buffer management, it seems to be doing exactly what this proposal allows for: use a buffer per frame, and if it is not enough, create an additional buffer for this frame. I can't point at source code or describe it in detail, but this shouldn't be too concerning for Unreal Engine either. I couldn't check for Unity. Hopefully this resolves the concern there was about native game engines being able to use the form of buffer mapping proposed in this issue. |
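The per-frame strategy described above (one buffer per frame, plus extras when a frame needs more space) can be sketched abstractly. This `FrameStagingAllocator` is a made-up illustration, not Godot's actual code, and its buffers are plain mock records:

```javascript
// Toy model of a per-frame staging strategy; buffers are plain records here.
class FrameStagingAllocator {
  constructor(bufferSize) {
    this.bufferSize = bufferSize;
    this.buffers = [{ size: bufferSize, used: 0 }]; // one buffer to start
    this.current = 0;
  }

  // Reset all buffers at the start of a frame (real code would recycle them
  // only once the GPU is done with the previous frame).
  beginFrame() {
    for (const b of this.buffers) b.used = 0;
    this.current = 0;
  }

  // Bump-allocate from the current buffer; if it can't fit the request,
  // create an additional buffer for this frame.
  allocate(size) {
    let b = this.buffers[this.current];
    if (b.used + size > b.size) {
      b = { size: Math.max(this.bufferSize, size), used: 0 };
      this.buffers.push(b);
      this.current = this.buffers.length - 1;
    }
    const offset = b.used;
    b.used += size;
    return { buffer: b, offset };
  }
}
```

With a proposal-style API, each of these per-frame buffers would simply be mapped, filled through `getMappedRange`, and unmapped before submission.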
The range would be helpful because it provides a way for applications to not have to transfer the entire contents of the buffer from the GPU Process to the Web Process. If it's difficult for applications to use, they can specify the entirety of the buffer. But, without this, it's impossible for applications to specify a smaller range if they do happen to know which range they want. |
Ok, let's have a range argument with the behavior that @kvark described:
So the IDL would be the following:

```webidl
partial interface GPUBufferUsage {
    const GPUBufferUsageFlags MAP_READ = 0x0001;
    const GPUBufferUsageFlags MAP_WRITE = 0x0002;
};

partial interface GPUBuffer {
    Promise<void> mapReadAsync(unsigned long offset = 0, unsigned long size = 0);
    Promise<void> mapWriteAsync();
    ArrayBuffer getMappedRange(unsigned long offset = 0, unsigned long size = 0);
    void unmap();
};

partial dictionary GPUBufferDescriptor {
    boolean mappedAtCreation = false;
};
```
This also does a number of cleanups to match the style of other WebGPU functions with the valid usage section.
Merged but will need a small tweak (split read/write or add a flag for read/write). Once that PR is up, this goes back to Discussion, then to Testing. |
…b#708) This also does a number of cleanups to match the style of other WebGPU functions with the valid usage section.
Buffer mapping is in the spec now. Closing. |
Reopening because the spec still references this issue. Search the spec for "allowed buffer usages". |
These 4 inline issues are just editorial things for stuff not fully specified in the spec yet, that link back here as reference for the design. I don't think that means this issue has to be open? We have plenty of inline editorial issues in the spec that don't link to an open issue at all. |
I think we should close again. |
YADUP (Yet Another Data Upload Proposal)
Last meeting there seemed to be appetite for asynchronous mapping that would allow requesting subranges but the group wanted to see a more fleshed out proposal.
This version of data upload is very similar to what we have in the spec today with `mapWriteAsync` and `mapReadAsync`, but the resolution of the mapping promise doesn't give an `ArrayBuffer`; instead the `ArrayBuffer` is stored in an internal slot of the `GPUBuffer`, and there's a `GPUBuffer.getMappedRange` method that allows getting subranges of the internal `ArrayBuffer`. This is close to @kainino0x's old `GPUMappedMemory` idea.

Proposal

Calling `GPUBuffer.mapAsync` is an error if the buffer is not valid or if it is not in the "unmapped" state (which means it is not destroyed either). Upon error, `mapAsync` returns a promise that will reject. Upon success, `mapAsync` puts the buffer in the "mapping" state and returns a promise that, when it resolves, puts the buffer in the "mapped" state.

Calling `GPUBuffer.getMappedRange` when the buffer is not in the "mapped" state returns null. If called in the "mapped" state, it returns a new `ArrayBuffer` that's a view into the content of the buffer at the range `[offset, offset + size[` (obviously with a JS exception on a bad range). `size` and `offset` default to 0, and a `size` of 0 means the remaining size of the buffer after `offset`, so `buffer.getMappedRange()` returns the whole range.

Calling `GPUBuffer.unmap` is an error if the buffer is not valid or if it is in the "unmapped" state. On success, the `ArrayBuffer`s returned by `GPUBuffer.getMappedRange()` are detached and the buffer is put in the "unmapped" state.

Note that modifications to the content of an `ArrayBuffer` returned by `getMappedRange` are semantically modifications of the content of the buffer itself.

Calling `GPUDevice.createBuffer` with `descriptor.mappedAtCreation` can be done even if `descriptor.usage` doesn't contain the `MAP_READ` or `MAP_WRITE` flags. If `mappedAtCreation` is true, the buffer is created in the "mapped" state and its content can be modified before `unmap()` and other uses like in a `queue.submit()`.

As usual, other uses of a `GPUBuffer`, like in a `GPUQueue.submit()`, would validate that the buffer is in the "unmapped" state. And similar to other proposals, there would be restrictions on the usages that can be used in combination with `MAP_READ` and `MAP_WRITE`. Contrary to other proposals, `MAP_READ` and `MAP_WRITE` could be set at the same time, and I suggest the following rules:

- If `MAP_WRITE` is present, `COPY_SRC` is allowed.
- If `MAP_READ` is present, `COPY_DST` is allowed.
- If both `MAP_READ` and `MAP_WRITE` are present, then both `COPY_SRC` and `COPY_DST` are allowed.
- If `MAP_WRITE` is present, then `VERTEX` and `UNIFORM` are also allowed.

This mapping mechanism would live side-by-side with a `writeToBuffer` path.

There are also threading constraints: all calls to `getMappedRange` and `unmap()` must be made in the same worker so the `ArrayBuffer`s can be detached.

Alternative choices

A single `mapAsync` is present instead of `mapWriteAsync` and `mapReadAsync`. The proposal talks about the `ArrayBuffer` being the content of the `GPUBuffer` directly, so it was a bit weird to have two map functions. The downside is that if the implementation can't wrap shmem in a GPU resource:

- `unmap()`, even for `MAP_READ` buffers, has to update the content with the writes the application did in the `ArrayBuffer`
- some tracking is needed for `MAP_READ` buffers so the implementation knows what to overwrite

It could be possible to not return a promise from `mapAsync` and instead make the `GPUBuffer` itself act like a promise, with a `.then` method and maybe a synchronous "state" member.

The assumption is that multi-process browsers will allocate one large shmem corresponding to the whole size of mapped buffers, so multiple `ArrayBuffer`s could look at the same memory and overlap. If we don't want to force one large contiguous allocation, `getMappedRange` could enforce that the ranges are all disjoint between calls to `unmap`.
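The usage-flag rules proposed above can be sketched as a validation function. This is only an illustration: `validateMapUsage` is a hypothetical helper, and every flag value other than the proposal's `MAP_READ`/`MAP_WRITE` is an invented placeholder:

```javascript
// Illustrative flag values; only MAP_READ and MAP_WRITE come from the
// proposal's IDL, the rest are placeholders for this sketch.
const USAGE = {
  MAP_READ: 0x0001, MAP_WRITE: 0x0002,
  COPY_SRC: 0x0004, COPY_DST: 0x0008,
  VERTEX: 0x0020, UNIFORM: 0x0040,
};

// Returns true if `usage` only combines flags allowed by the proposed rules:
// MAP_WRITE permits COPY_SRC (plus VERTEX and UNIFORM), MAP_READ permits
// COPY_DST, and having both mappable flags permits both copy usages.
function validateMapUsage(usage) {
  const mappable = usage & (USAGE.MAP_READ | USAGE.MAP_WRITE);
  if (!mappable) return true; // the rules only constrain mappable buffers
  let allowed = mappable;
  if (usage & USAGE.MAP_WRITE) allowed |= USAGE.COPY_SRC | USAGE.VERTEX | USAGE.UNIFORM;
  if (usage & USAGE.MAP_READ) allowed |= USAGE.COPY_DST;
  return (usage & ~allowed) === 0;
}
```

Under these rules, `MAP_WRITE | COPY_SRC` and `MAP_WRITE | UNIFORM` validate, while `MAP_READ | COPY_SRC` does not (a read-mappable buffer only gains `COPY_DST`).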