Making buffer mapping part of the public API? #114

Closed
Tracked by #387
almarklein opened this issue Jul 1, 2020 · 1 comment · Fixed by #377
Labels
question Further information is requested

Comments

@almarklein
Member

almarklein commented Jul 1, 2020

Also see:

Intro

We're deviating from the WebGPU API with respect to how buffer data is read and written, because it's hard to reproduce that API in Python without making it very easy for the user to access released memory and thus cause a segfault.

The WebGPU API

The WebGPU API for synchronizing data between a GPUBuffer and the CPU makes use of "mapping". The API works as follows:

  • You request the buffer to map its data (mapAsync()), using a specific range.
  • Then you obtain an ArrayBuffer using getMappedRange(), again specifying a subrange (within the mapped range).
  • You copy to/from that array-buffer (via a typed array view).
  • You unmap the buffer.
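
As a rough mental model only, the four steps can be mimicked in plain Python. `ToyBuffer` and its methods are hypothetical stand-ins, not WebGPU or wgpu-py API; a bytearray plays the role of GPU memory, and offsets are assumed to be relative to the start of the buffer.

```python
class ToyBuffer:
    """Hypothetical stand-in for a GPUBuffer; a bytearray plays the GPU memory."""

    def __init__(self, nbytes):
        self._storage = bytearray(nbytes)
        self._mapped = None  # (offset, size) while mapped, else None

    def map(self, offset, size):
        # Step 1: map a range of the buffer (mapAsync() in WebGPU).
        if offset < 0 or size < 0 or offset + size > len(self._storage):
            raise ValueError("range out of bounds")
        self._mapped = (offset, size)

    def get_mapped_range(self, offset, size):
        # Step 2: obtain a view on a subrange of the mapped range.
        if self._mapped is None:
            raise RuntimeError("buffer is not mapped")
        start, length = self._mapped
        if offset < start or offset + size > start + length:
            raise ValueError("subrange lies outside the mapped range")
        return memoryview(self._storage)[offset:offset + size]

    def unmap(self):
        # Step 4: end the mapping.
        self._mapped = None


buf = ToyBuffer(16)
buf.map(4, 8)
view = buf.get_mapped_range(6, 4)
view[:] = b"\x01\x02\x03\x04"  # Step 3: copy into the mapped view, no extra copy
buf.unmap()
result = bytes(buf._storage[4:12])
```

Note that in this toy the `view` object stays usable after `unmap()`, which is harmless for a bytearray but would not be for real mapped memory.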

This API offers appealing advantages:

  • Allows reading and writing data in a buffer.
  • Allows doing that with subranges.
  • Allows doing that with multiple subranges in one go, i.e. without mapping/unmapping multiple times.
  • Without unnecessary data copies.

The problem

It is challenging to find a Pythonic API to replicate this behavior. I think that any solution that we implement should make it impossible for the user to access the memory after it is unmapped. (I mean impossible, unless the user is deliberately using ffi or something to do so.)

However, this appears to be oddly hard to do with the way buffers, arrays and views interact in Python.

As an example, one can invalidate a memoryview object by calling its release() method. But if another memoryview or numpy array has been mapped to the same memory, these continue to work.
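
A minimal sketch of this behavior, using a bytearray as a stand-in for mapped memory (an assumption; real mapped memory would come from the backend). Releasing one view does nothing to a second view on the same memory:

```python
# A bytearray stands in for the mapped memory (assumption for illustration).
backing = bytearray(b"\x01\x02\x03\x04")

first = memoryview(backing)
second = memoryview(backing)  # an independent view on the same memory

first.release()

# The released view refuses access ...
try:
    first[0]
    released_view_usable = True
except ValueError:
    released_view_usable = False

# ... but the other view still reads and writes the underlying memory.
second_value = second[0]
second[0] = 9
backing_value = backing[0]
```

If `backing` were real mapped memory that had since been unmapped, the access through `second` is exactly what could crash the process.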

What we have now

The solution so far has been to implement a much simpler API using map_read() and map_write(data), without the option to read/write a subrange of the buffer. So basically only the first bullet point.

What we need

At the very least the first two bullet points. I very much hope to also include the third bullet point. If we can't avoid data copies (the fourth bullet point), that's unfortunate, but not problematic as Python is not super-duper-fast anyway.

Some options ...

Stick to read_data and write_data

This is what we have now, but with added arguments to specify a range.
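
A sketch of what range arguments could look like, under assumptions: `RangedToyBuffer` is hypothetical, a bytearray stands in for the GPU buffer, and the `offset`/`size` parameter names are illustrative rather than an agreed signature. Reads return a copy, so nothing handed to the user can dangle.

```python
class RangedToyBuffer:
    """Hypothetical buffer with range-aware read_data / write_data."""

    def __init__(self, nbytes):
        self._storage = bytearray(nbytes)

    def _check_range(self, offset, size):
        if offset < 0 or size < 0 or offset + size > len(self._storage):
            raise ValueError("range out of bounds")

    def read_data(self, offset=0, size=None):
        # size=None means "until the end of the buffer" (assumption).
        if size is None:
            size = len(self._storage) - offset
        self._check_range(offset, size)
        return bytes(self._storage[offset:offset + size])  # a copy

    def write_data(self, data, offset=0):
        data = memoryview(data).cast("B")  # treat any bytes-like input as raw bytes
        self._check_range(offset, data.nbytes)
        self._storage[offset:offset + data.nbytes] = data


buf = RangedToyBuffer(8)
buf.write_data(b"\xff\xfe", offset=3)
part = buf.read_data(offset=2, size=4)
```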

Chunked writing / reading

I think this would cover most use-cases, without the need to expose the mapping stuff:

def write_chunks(sequence, offset=0, size=0):
    # With sequence an iterable (or even a generator) providing tuples (offset, data)
    ...
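
One possible reading of this signature, as a sketch only: `ChunkedToyBuffer` is hypothetical, a bytearray stands in for the mapped buffer, and `offset`/`size` are assumed to bound the writable range, with `size=0` meaning "to the end". The buffer would be mapped once, all chunks written, then unmapped, without the user ever seeing the mapped memory.

```python
class ChunkedToyBuffer:
    """Hypothetical buffer; a bytearray stands in for the mapped memory."""

    def __init__(self, nbytes):
        self._storage = bytearray(nbytes)

    def write_chunks(self, sequence, offset=0, size=0):
        # Assumption: offset/size bound the writable range; size=0 means "to the end".
        end = len(self._storage) if size == 0 else offset + size
        if offset < 0 or offset > end or end > len(self._storage):
            raise ValueError("range out of bounds")
        # (A real implementation would map the range here ...)
        for chunk_offset, data in sequence:
            data = memoryview(data).cast("B")
            if chunk_offset < offset or chunk_offset + data.nbytes > end:
                raise ValueError("chunk falls outside the writable range")
            self._storage[chunk_offset:chunk_offset + data.nbytes] = data
        # (... and unmap it here.)


def make_chunks():
    # A generator works too: chunks are produced lazily as (offset, data) tuples.
    yield 0, b"\x01\x02"
    yield 6, b"\x09"


buf = ChunkedToyBuffer(8)
buf.write_chunks(make_chunks())
contents = bytes(buf._storage)
```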

Mapping, but via a custom class so that we can restrict access

import ctypes


class BufferMapping:
    """Wraps a mapped memoryview so that access can be cut off on unmap."""

    def __init__(self, mem):
        self._never_touch_this_mem = mem  # a memoryview
        self._ismapped = True  # The buffer will set this to False when it's unmapped

    def _check_mapped(self):
        if not self._ismapped:
            raise RuntimeError("Cannot use a buffer mapping after it's unmapped.")

    def cast(self, format, shape=None):
        self._check_mapped()
        mem = self._never_touch_this_mem
        # Only pass shape when given: memoryview.cast() expects a list or tuple
        self._never_touch_this_mem = mem.cast(format) if shape is None else mem.cast(format, shape)
        return self

    def __getitem__(self, index):
        self._check_mapped()
        res = self._never_touch_this_mem[index]
        if isinstance(res, memoryview):
            raise IndexError("Cannot get a sub-view")  # or also wrap in a BufferMapping?
        return res

    def __setitem__(self, index, value):
        self._check_mapped()
        self._never_touch_this_mem[index] = value

    def to_memoryview(self):
        self._check_mapped()
        # Make a copy; use byte views on both sides so the formats match
        new_obj = (ctypes.c_uint8 * self._never_touch_this_mem.nbytes)()
        new_mem = memoryview(new_obj).cast("B")
        new_mem[:] = self._never_touch_this_mem.cast("B")
        return new_mem
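
To make the intended behavior concrete, here is a trimmed-down, self-contained variant with a short usage run. `MiniMapping` is a hypothetical reduction of the class above (item access only), and a bytearray stands in for the mapped memory.

```python
class MiniMapping:
    """Trimmed-down, hypothetical variant of the BufferMapping idea."""

    def __init__(self, mem):
        self._mem = mem  # a memoryview
        self._ismapped = True

    def _check(self):
        if not self._ismapped:
            raise RuntimeError("Cannot use a buffer mapping after it's unmapped.")

    def __getitem__(self, index):
        self._check()
        res = self._mem[index]
        if isinstance(res, memoryview):
            raise IndexError("Cannot get a sub-view")
        return res

    def __setitem__(self, index, value):
        self._check()
        self._mem[index] = value


storage = bytearray(4)  # stand-in for mapped memory
mapping = MiniMapping(memoryview(storage))

mapping[0] = 7                # single elements can be written ...
mapping[1:3] = b"\x08\x09"    # ... and so can slices
first = mapping[0]

# Reading a slice would hand out a raw memoryview, so it is refused.
try:
    mapping[0:2]
    subview_allowed = True
except IndexError:
    subview_allowed = False

mapping._ismapped = False     # what the buffer would do on unmap
try:
    mapping[0]
    usable_after_unmap = True
except RuntimeError:
    usable_after_unmap = False
```

Every access goes through a Python-level check, which is where the per-element overhead comes from.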

The thing is ... when would you use this? To map the data and then set data elements one by one? That would be slow because of the overhead that we introduce. In batches then? Well, in that case you could call write_data(subdata, offset) a few times ...

The use-cases where a mapped API has an advantage (in Python) seem flaky, and the API is much more complex. Therefore we don't currently expose this API in wgpu-py.

However ... I could be missing a use-case. And I could be missing a possibly elegant solution.

@almarklein
Member Author

In the current API (per #156) we have buffer.map_read() and buffer.map_write() for somewhat lower-level I/O, and queue.read_buffer, write_buffer, read_texture and write_texture for more convenience (these use a temporary buffer behind the scenes).

@almarklein almarklein added the question Further information is requested label Dec 14, 2021