We're deviating from the WebGPU API with respect to how to read/write buffer data, because it's hard to reproduce that API in Python without making it very easy for the user to access released memory and thus cause a segfault.
The WebGPU API
The WebGPU API for synchronizing data between a GPUBuffer and the CPU makes use of "mapping". The API works as follows:
You request the buffer to map its data (mapAsync()), using a specific range.
Then you obtain an ArrayBuffer using getMappedRange(), again using a subrange (within the mapped range).
You copy to/from that array-buffer (via a typed array view).
You unmap the buffer.
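To make the four steps concrete in Python terms, here is a toy model. `ToyBuffer` and its methods are hypothetical names made up for this sketch (they are not wgpu-py or WebGPU API), and a bytearray stands in for GPU memory, so nothing here can actually segfault.

```python
class ToyBuffer:
    """Hypothetical stand-in for a GPU buffer, for illustration only."""

    def __init__(self, nbytes):
        self._data = bytearray(nbytes)  # pretend this is GPU-side storage
        self._mapped = None  # (offset, size) while mapped, else None

    def map(self, offset, size):
        # Step 1: map a specific range of the buffer.
        if offset < 0 or size < 0 or offset + size > len(self._data):
            raise ValueError("Range is outside the buffer.")
        self._mapped = (offset, size)

    def get_mapped_range(self, offset, size):
        # Step 2: get a view on a subrange that lies within the mapped range.
        if self._mapped is None:
            raise RuntimeError("Buffer is not mapped.")
        m_offset, m_size = self._mapped
        if offset < m_offset or offset + size > m_offset + m_size:
            raise ValueError("Range is outside the mapped range.")
        return memoryview(self._data)[offset:offset + size]

    def unmap(self):
        # Step 4: from here on, views on the mapped range must not be used.
        self._mapped = None


buf = ToyBuffer(16)
buf.map(4, 8)
view = buf.get_mapped_range(4, 4)
view[:] = b"\x01\x02\x03\x04"  # Step 3: copy data into the mapped memory
view.release()
buf.unmap()
```

Note that the toy only stays safe because the caller releases `view` before `unmap()`; nothing in the class enforces that.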
This API offers appealing advantages:
Allows reading and writing data in a buffer.
Allows doing that with subranges.
Allows doing that with multiple subranges in one go, i.e. without mapping/unmapping multiple times.
Without unnecessary data copies.
The problem
It is challenging to find a Pythonic API to replicate this behavior. I think that any solution that we implement should make it impossible for the user to access the memory after it is unmapped. (I mean impossible, unless the user is deliberately using ffi or something to do so.)
However, this appears to be oddly hard to do with the way buffers, arrays and views interact in Python.
As an example, one can invalidate a memoryview object by calling its release() method. But if another memoryview or numpy array has been mapped to the same memory, these continue to work.
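A minimal demonstration of that behavior, using only the standard library. The bytearray here is just a stand-in for mapped memory (with real mapped memory the second view would be pointing at storage that has been freed):

```python
backing = bytearray(b"\x01\x02\x03\x04")  # stand-in for the mapped memory

view_a = memoryview(backing)
view_b = memoryview(backing)  # a second, independent view on the same memory

view_a.release()

try:
    view_a[0]
    released_raises = False
except ValueError:
    # A released memoryview refuses any further access.
    released_raises = True

# The other view is unaffected by view_a.release().
first_byte = view_b[0]
view_b.release()
```

So releasing the view that we handed out does not help if the user has created another view (or array) from it in the meantime.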
What we have now
The solution so far has been to implement a much simpler API using map_read() and map_write(data), without the option to read/write a subrange of the buffer. So basically only the first bullet point.
What we need
At the very least the first two bullet points. I very much hope to also include the third bullet-point. If we can't avoid data-copies (the fourth bullet point), that's unfortunate, but not problematic as Python is not super-duper-fast anyway.
Some options ...
Stick to read_data and write_data
What we have now, but with extra arguments to specify a range.
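A rough sketch of what that could look like. The class name `ToyBuffer`, the parameter names, and the choice to return a read-only copy are all assumptions for illustration; a bytearray stands in for the actual buffer storage.

```python
class ToyBuffer:
    """Hypothetical stand-in for a buffer with range-aware read/write."""

    def __init__(self, nbytes):
        self._data = bytearray(nbytes)

    def write_data(self, data, offset=0):
        # Accept anything that supports the buffer protocol, as flat bytes.
        src = memoryview(data).cast("B")
        if offset < 0 or offset + src.nbytes > len(self._data):
            raise ValueError("Data does not fit in the buffer at this offset.")
        self._data[offset:offset + src.nbytes] = src

    def read_data(self, offset=0, size=None):
        if size is None:
            size = len(self._data) - offset  # assumption: default to the rest
        if offset < 0 or size < 0 or offset + size > len(self._data):
            raise ValueError("Range is outside the buffer.")
        # Return a copy, so the caller never holds a view on buffer memory.
        return memoryview(bytes(self._data[offset:offset + size]))


buf = ToyBuffer(8)
buf.write_data(b"\xaa\xbb", offset=2)
chunk = buf.read_data(offset=2, size=2)
```

This covers the first two bullet points, at the cost of a data copy on every call.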
Chunked writing / reading
I think this would cover most use-cases, without the need to expose the mapping stuff:
    def write_chunks(sequence, offset=0, size=0):
        # With sequence an iterable (or even a generator) providing tuples (offset, data)
        ...
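To give an idea of how this might work, here is a sketch written as a free function. The extra `target` argument (a bytearray standing in for the mapped memory), the meaning of `size=0` as "up to the end", and chunk offsets being relative to `offset` are all assumptions made for this example.

```python
def write_chunks(target, sequence, offset=0, size=0):
    """Copy (offset, data) chunks into a window of target (hypothetical sketch)."""
    if size == 0:
        size = len(target) - offset  # assumption: 0 means "up to the end"
    window = memoryview(target)[offset:offset + size]
    try:
        for chunk_offset, data in sequence:
            src = memoryview(data).cast("B")
            if chunk_offset < 0 or chunk_offset + src.nbytes > size:
                raise ValueError("Chunk does not fit in the requested range.")
            window[chunk_offset:chunk_offset + src.nbytes] = src
    finally:
        # The view never leaves this function, so the user cannot keep it.
        window.release()


storage = bytearray(8)
# A generator works too: three 2-byte chunks at offsets 0, 2 and 4.
chunks = ((i * 2, bytes([i + 1, i + 1])) for i in range(3))
write_chunks(storage, chunks, offset=2)
```

The memory is only "mapped" for the duration of the call, which is what makes this variant safe: multiple subranges are written in one go, and no view is ever exposed.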
Mapping, but via a custom class so that we can restrict access
    import ctypes


    class BufferMapping:
        def __init__(self, mem):
            self._never_touch_this_mem = mem  # a memoryview
            self._ismapped = True  # The buffer will set this to False when it's unmapped

        def cast(self, format, shape=None):
            if not self._ismapped:
                raise RuntimeError("Cannot use a buffer mapping after it's unmapped.")
            # memoryview.cast() does not accept shape=None, so only pass it when given
            args = (format,) if shape is None else (format, shape)
            self._never_touch_this_mem = self._never_touch_this_mem.cast(*args)
            return self

        def __getitem__(self, index):
            if not self._ismapped:
                raise RuntimeError("Cannot use a buffer mapping after it's unmapped.")
            res = self._never_touch_this_mem.__getitem__(index)
            if isinstance(res, memoryview):
                raise IndexError("Cannot get a sub-view")  # or also wrap in a BufferMapping?
            return res

        def __setitem__(self, index, value):
            if not self._ismapped:
                raise RuntimeError("Cannot use a buffer mapping after it's unmapped.")
            self._never_touch_this_mem.__setitem__(index, value)

        def to_memoryview(self):
            # Make a copy. Both sides are cast to "B", because a ctypes array
            # exposes format "<B" and memoryview assignment needs equal formats.
            new_obj = (ctypes.c_uint8 * self._never_touch_this_mem.nbytes)()
            new_mem = memoryview(new_obj).cast("B")
            new_mem[:] = self._never_touch_this_mem.cast("B")
            return new_mem
The thing is ... when would you use this? To map the data and then set data elements one by one? That would be slow because of the overhead that we introduce. In batches then? Well, in that case you could just call write_data(subdata, offset) a few times ...
The use-cases where a mapped API has an advantage (in Python) seem flaky, and the API is much more complex. Therefore we don't currently expose this API in wgpu-py.
However ... I could be missing a use-case. And I could be missing a possibly elegant solution.
In the current API (per #156) we have buffer.map_read() and buffer.map_write() for somewhat lower-level I/O, and queue.read_buffer, write_buffer, read_texture and write_texture for more convenience (these use a temporary buffer behind the scenes).
Also see: