Currently, when reading uncompressed data, we have the store read the data into a temporary buffer and then copy those bytes into the output buffer.
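Schematically, the current path looks something like this (a minimal sketch; the `store.get` signature and the output indexing are illustrative stand-ins, not the actual zarr-python internals):

```python
async def read_chunk_with_copy(store, key: str, out: bytearray, start: int, stop: int) -> None:
    # 1. The store reads the chunk into a freshly allocated temporary buffer.
    tmp = await store.get(key)
    # 2. Those bytes are then copied (an extra memcpy) into the caller's output.
    out[start:stop] = tmp
```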
To get the best performance, we should consider adding an optional `get_into` API to `Store`. Instead of taking a `prototype`, this would take the actual output `Buffer` to read into. Stores must opt into this, for backwards compatibility, by overriding `supports_get_into`:
```python
async def get_into(
    self,
    key: str,
    out: Buffer,
    byte_range: ByteRequest | None = None,
) -> bool:
    raise NotImplementedError

@property
def supports_get_into(self) -> bool:
    """Does the store support get_into?"""
    return False
```
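As one sketch of how a store might opt in, a local, file-backed store could hand the file's `readinto` the caller's buffer directly. Everything here beyond the interface above is an assumption: `FileStore` and `_path_for_key` are hypothetical, byte-range handling is simplified to a start offset, and the `Buffer` is assumed to expose writable memory through the buffer protocol.

```python
import asyncio

from zarr.abc.store import ByteRequest, Store  # import paths approximate
from zarr.core.buffer import Buffer


class FileStore(Store):  # hypothetical file-backed store
    @property
    def supports_get_into(self) -> bool:
        return True

    async def get_into(
        self,
        key: str,
        out: Buffer,
        byte_range: ByteRequest | None = None,
    ) -> bool:
        path = self._path_for_key(key)  # hypothetical key -> path helper

        def _read() -> bool:
            with open(path, "rb") as f:
                if byte_range is not None:
                    # Simplified: assumes a start/stop-style range request.
                    f.seek(byte_range.start)
                # readinto fills the caller's buffer in place, skipping the
                # temporary allocation and the extra memcpy.
                n = f.readinto(memoryview(out))  # assumes Buffer supports the buffer protocol
                return n == len(out)

        return await asyncio.to_thread(_read)
```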
For the special case of

- uncompressed data, and
- the chunk being read is a contiguous subset of the output ndarray,

the bytes on disk can be interpreted directly as an ndarray (when combined with a shape, an itemsize, and maybe an endianness), and we can avoid a memcpy. Some early testing indicates that this might be worth doing. Over in https://github.com/TomAugspurger/zarr-python/blob/tom/zero-copy-alt/simple.py, I see about 7.5x higher throughput for reading uncompressed data with `read_into` (compared to about 2.5x higher throughput for compressed data, where this `get_into` optimization isn't an option).
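The zero-copy interpretation itself is simple to express with NumPy: allocate the output array once and let `readinto` fill its memory in place. A minimal sketch, assuming an uncompressed, C-contiguous chunk on the local filesystem whose shape and dtype come from the array metadata (`chunk_path`, `chunk_shape`, and `dtype` are illustrative names):

```python
import numpy as np

def read_uncompressed_chunk(chunk_path: str, chunk_shape: tuple[int, ...], dtype: np.dtype) -> np.ndarray:
    # Allocate the output once; its bytes will be filled directly from disk.
    out = np.empty(chunk_shape, dtype=dtype)
    with open(chunk_path, "rb") as f:
        # Given the shape, itemsize, and matching endianness, the on-disk
        # bytes *are* the ndarray's bytes, so readinto avoids the
        # intermediate buffer and the memcpy entirely.
        f.readinto(memoryview(out).cast("B"))
    return out
```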
Real-world gains will probably be lower, and remote file system APIs typically don't offer a way to read directly into a user-allocated output buffer like `.readinto` does.