Currently, when reading uncompressed data, we have the store read the data into a temporary buffer and then copy those bytes into the output buffer.
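Schematically, the current path looks something like this (a minimal sketch; the `store.get` signature and the output indexing are illustrative stand-ins, not the actual zarr-python internals):

```python
async def read_chunk_with_copy(store, key: str, out: bytearray, start: int, stop: int) -> None:
    # 1. The store reads the chunk into a freshly allocated temporary buffer.
    tmp = await store.get(key)
    # 2. Those bytes are then copied (an extra memcpy) into the caller's output.
    out[start:stop] = tmp
```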
To get the best performance, we should consider adding an optional `get_into` API to `Store`. Instead of taking a `prototype`, this would take the actual output `Buffer` to read into. Stores must opt into this, for backwards compatibility, by overriding `supports_get_into`:
```python
async def get_into(
    self,
    key: str,
    out: Buffer,
    byte_range: ByteRequest | None = None,
) -> bool:
    raise NotImplementedError

@property
def supports_get_into(self) -> bool:
    """Does the store support get_into?"""
    return False
```
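As one sketch of how a store might opt in, a local, file-backed store could hand the file's `readinto` the caller's buffer directly. Everything here beyond the interface above is an assumption: `FileStore` and `_path_for_key` are hypothetical, byte-range handling is simplified to a start offset, and the `Buffer` is assumed to expose writable memory through the buffer protocol.

```python
import asyncio

from zarr.abc.store import ByteRequest, Store  # import paths approximate
from zarr.core.buffer import Buffer


class FileStore(Store):  # hypothetical file-backed store
    @property
    def supports_get_into(self) -> bool:
        return True

    async def get_into(
        self,
        key: str,
        out: Buffer,
        byte_range: ByteRequest | None = None,
    ) -> bool:
        path = self._path_for_key(key)  # hypothetical key -> path helper

        def _read() -> bool:
            with open(path, "rb") as f:
                if byte_range is not None:
                    # Simplified: assumes a start/stop-style range request.
                    f.seek(byte_range.start)
                # readinto fills the caller's buffer in place, skipping the
                # temporary allocation and the extra memcpy.
                n = f.readinto(memoryview(out))  # assumes Buffer supports the buffer protocol
                return n == len(out)

        return await asyncio.to_thread(_read)
```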
For the special case of

- uncompressed data, and
- the chunk being read is a contiguous subset of the output ndarray,

the bytes on disk can be interpreted directly as an ndarray (when combined with a shape, an itemsize, and maybe an endianness), and we can avoid a memcpy. Some early testing indicates that this might be worth doing. Over in https://github.com/TomAugspurger/zarr-python/blob/tom/zero-copy-alt/simple.py, I see about 7.5x higher throughput for reading uncompressed data with `read_into` (compared to about 2.5x higher throughput for compressed data, where this `get_into` optimization isn't an option).
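The zero-copy interpretation itself is simple to express with NumPy: allocate the output array once and let `readinto` fill its memory in place. A minimal sketch, assuming an uncompressed, C-contiguous chunk on the local filesystem whose shape and dtype come from the array metadata (`chunk_path`, `chunk_shape`, and `dtype` are illustrative names):

```python
import numpy as np

def read_uncompressed_chunk(chunk_path: str, chunk_shape: tuple[int, ...], dtype: np.dtype) -> np.ndarray:
    # Allocate the output once; its bytes will be filled directly from disk.
    out = np.empty(chunk_shape, dtype=dtype)
    with open(chunk_path, "rb") as f:
        # Given the shape, itemsize, and matching endianness, the on-disk
        # bytes *are* the ndarray's bytes, so readinto avoids the
        # intermediate buffer and the memcpy entirely.
        f.readinto(memoryview(out).cast("B"))
    return out
```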
Real-world gains will probably be lower, and remote file system APIs typically don't offer a way to read directly into a user-allocated output buffer like `.readinto` does.