Prototype of object-store-based Store implementation #1661
base: main
Conversation
Amazing @kylebarron! I'll spend some time playing with this today.
With roeap/object-store-python#9 it should be possible to fetch multiple ranges within a file concurrently, with range coalescing (using …). That PR also adds a …
src/zarr/v3/store/object_store.py
Outdated
```python
async def get_partial_values(
    self, key_ranges: List[Tuple[str, Tuple[int, int]]]
) -> List[bytes]:
    # TODO: use rust-based concurrency inside object-store
```
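Until that Rust-side concurrency exists, one way to prototype this method on the Python side is to fan the range requests out with `asyncio.gather`. This is a sketch only: `_get_range` is a hypothetical stand-in for a real object-store call, not part of any actual API.

```python
import asyncio
from typing import List, Tuple


async def _get_range(key: str, start: int, stop: int) -> bytes:
    # Hypothetical stand-in for a real object-store request;
    # returns dummy zero bytes of the requested length here.
    await asyncio.sleep(0)
    return b"\x00" * (stop - start)


async def get_partial_values(
    key_ranges: List[Tuple[str, Tuple[int, int]]]
) -> List[bytes]:
    # Fan the range requests out concurrently on the event loop;
    # gather preserves the order of the requests in its result list.
    return await asyncio.gather(
        *(_get_range(key, start, stop) for key, (start, stop) in key_ranges)
    )


results = asyncio.run(get_partial_values([("a", (0, 4)), ("b", (10, 13))]))
```

This buys request-level concurrency at the Python layer; moving the fan-out into the Rust binding would additionally avoid event-loop overhead per request.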
How I did it in rfsspec: https://github.com/martindurant/rfsspec/blob/main/src/lib.rs#L141
object-store has a built-in function for this: `get_ranges`, with the caveat that it only manages multiple ranges within a single file. `get_ranges` also automatically handles request merging for nearby ranges in a file.
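The request merging that `get_ranges` performs can be illustrated with a small pure-Python sketch. The function name and `max_gap` threshold here are illustrative, not object-store's actual API:

```python
def coalesce_ranges(ranges, max_gap=1024):
    """Merge byte ranges whose gaps are at most max_gap bytes,
    mimicking the nearby-range merging object-store performs
    before issuing requests."""
    merged = []
    for start, stop in sorted(ranges):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], stop)
        else:
            merged.append([start, stop])
    return [tuple(r) for r in merged]
```

Fewer, larger requests generally win on object storage, where per-request latency dominates; the trade-off is fetching (and discarding) the bytes in the gaps.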
Yes I know, but mine already did the whole thing, so I am showing how I did that.
Great work @kylebarron!

I suggest we see whether it makes any improvements first, so it's the author's choice for now.

While @rabernat has seen some impressive perf improvements in some settings when making many requests with Rust's tokio runtime, which would possibly also trickle down to a Python binding, the biggest advantage I see is improved ease of installation. A common hurdle I've seen is dependency management, especially around boto3, aioboto3, etc. Versions need to be compatible at runtime with any other libraries the user has in their environment, and Python doesn't allow multiple versions of the same dependency in one environment. With a Python library wrapping a statically-linked Rust binary, you can remove all Python dependencies and, with them, this class of hardship.

The underlying Rust object-store crate is stable and under open governance via the Apache Arrow project. We'll just have to wait on some discussion in object-store-python for exactly where that should live. I don't have an opinion myself on where this should live, but it should be on the order of 100 lines of code wherever it is (unless the v3 store API changes dramatically).
👍
I want to keep an open mind about what the core stores provided by Zarr-Python are. My current thinking is that we should just do a …

This is no longer an issue; s3fs has much more relaxed deps than it used to. Furthermore, it's very likely to already be part of an installation environment.

I agree with that. I think it is beneficial to keep the number of dependencies of core zarr-python small, but I am open to discussion.

Sure! That is certainly useful.

This is awesome work, thank you all!!!
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
I'd like to update this PR soonish to use that library (object-store-rs) instead.
If the zarr group prefers object-store-rs, we can move it into the zarr-developers org, if you like. I would like to be involved in developing it, particularly if it can grow more explicit fsspec-compatible functionality.
I have a few questions because the …

I like that …

This came up in the discussion at https://github.com/zarr-developers/zarr-python/pull/2426/files/5e0ffe80d039d9261517d96ce87220ce8d48e4f2#diff-bb6bb03f87fe9491ef78156256160d798369749b4b35c06d4f275425bdb6c4ad. By default, it's passed as … Does it look compatible with what you need?
I think I'm confused why it's a parameter at all. Why shouldn't it return a protocol, so the store can implement whatever interface is most convenient for returning data? Put another way: when the store chooses the return interface, it can ensure no memory copies, and the caller of the store can decide whether they need to copy the memory elsewhere.

Yeah, I'm not familiar with that. Looks like @madsbk added it in #1910, so presumably it's related to whether or not the data will end up on the GPU? I guess that's one bit of context the Store won't necessarily have, assuming it can place the data in host or device memory, and so it being a parameter might be necessary.

I do think making the concrete return type of …

It makes sense that we'll always need a copy for CPU -> GPU, but I'd like to avoid situations where a store must copy data for CPU -> CPU. Right now that could be unavoidable depending on the buffer class the user passes in. Are we saying that the user needs to know the copy semantics of the underlying store?
Actually no, I fully expect that the RAPIDS team should be able to make a direct object-store/NIC->GPU store class and also do filter decoding on the GPU (https://docs.rapids.ai/api/kvikio/stable/zarr/). Whether any of that ends up here is another matter.

Sure, I really meant to say "if the store loads data into the CPU, then we'll need to make a copy for CPU to GPU". I'm not surprised that it's possible to make direct-to-GPU readers.
@kylebarron - in terms of testing this, you should take a look at how we're doing this for other stores. Basically, we've created a reusable test harness in zarr-python/tests/test_store/test_remote.py (lines 107 to 134 at 5807cba).
The idea with … @kylebarron, you should be able to create a Buffer from a …:

```python
async def get(
    self,
    key: str,
    prototype: BufferPrototype,
    byte_range: tuple[int | None, int | None] | None = None,
) -> Buffer | None:
    the_rust_buffer: RustBuffer = ...  # load data into a rust buffer
    return prototype.buffer.from_buffer(memoryview(the_rust_buffer))
```

Now, if the user requests a GPU buffer, a later codec can decide to move the data to the GPU and maybe use nvCOMP to decompress the data, etc.
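The `from_buffer(memoryview(...))` pattern works without copying because a memoryview shares memory with the object it wraps (for a Rust-owned buffer, via the buffer protocol). A minimal stdlib illustration of that sharing, using a `bytearray` in place of a Rust buffer:

```python
# A bytearray stands in for a Rust-owned buffer exposing the buffer
# protocol; wrapping it in memoryview() creates a view, not a copy.
backing = bytearray(b"zarr chunk bytes")
view = memoryview(backing)

# Mutating the backing storage is visible through the view, which
# demonstrates that no copy was made when the view was created.
backing[0:4] = b"ZARR"
```

A CPU buffer prototype can keep that zero-copy property; only a GPU prototype forces an actual host-to-device transfer.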
Can someone detail the semantics of the byte-range type defined at zarr-python/src/zarr/abc/store.py, line 18 (at 9dd9ac6)?
But that type hint on its own isn't fully descriptive, and I can't find any documentation about it. This is what I think it means:
I'm not really a fan of this API, but I don't know the GPU side well enough to propose something else.
That is certainly what it means when used in fsspec: None in the first place is the same as 0, and None in the second place is the same as "end"/"size". Note that they can be negative, so a suffix range would be expressed with a negative start. I can't guarantee that the same convention is used here, but zarr blocks are either whole (None, None) or the exact range (start, stop) is known.
If zarr blocks are either whole or known, then shouldn't the type hint for the store be `ByteRangeRequest: TypeAlias = tuple[int, int] | None`?

or maybe …
The suffix request is required for sharding. In shard files, the index containing the byte ranges of the chunks is, by default, at the end of the file. The size of the index can be statically determined from the array metadata, but the size of the shard file cannot be inferred in the general case. To avoid a preflight request to determine the file size, the suffix request is required. Most HTTP servers, including object storage services, support suffix requests.
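To make the convention concrete, here is a hypothetical helper (not part of zarr-python) that renders such a byte-range tuple as an HTTP `Range` header value, assuming the semantics discussed above: a None start means byte 0, a None stop means end of object, and a negative start with no stop means a suffix request.

```python
def to_range_header(byte_range):
    """Render a (start, stop) tuple as an HTTP Range header value.

    Assumed semantics (per the discussion, not a settled zarr API):
    - None            -> whole object, no Range header needed
    - (None, stop)    -> from byte 0
    - (start, None)   -> from start to end of object
    - (-n, None)      -> suffix request: the last n bytes
    Stop is end-exclusive, like Python slicing; HTTP ranges are
    inclusive, hence the `stop - 1` below.
    """
    if byte_range is None:
        return None
    start, stop = byte_range
    if start is not None and start < 0 and stop is None:
        return f"bytes={start}"  # suffix form, e.g. "bytes=-16384"
    lo = start or 0
    if stop is None:
        return f"bytes={lo}-"
    return f"bytes={lo}-{stop - 1}"
```

A sharding read of a trailing index then maps naturally onto the suffix form, with the index size known statically from the array metadata.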
I introduced that type so that we could have exactly this conversation -- prior to its definition, we had various functions across the codebase that were taking a byte range parameter, but the type of that parameter wasn't defined in a central place. I'm not attached to this particular type! We can totally change it to something nicer, provided the semantics of that type cover all required use cases.
Is there an example of a suffix request somewhere in this repo, so we can see how the range is passed as an argument to the store?
Except for Azure 😢. The default implementation of the …
I think it's worth considering changing it to a dataclass, because the semantics of the tuple are not always clear. And elsewhere in the codebase, a "chunk slice" refers to start and length, not start and end.
Agreed, we definitely need to bump up the literacy of this type. I opened #2437 for this discussion.

https://github.com/zarr-developers/zarr-python/blob/main/src/zarr/codecs/sharding.py#L700-L702
Prototype of an object-store-based store.

object-store is a Rust crate for interoperating with remote object stores like S3, GCS, Azure, etc. See the highlights section of its docs. It doesn't even try to implement a filesystem interface, instead focusing on the core atomic operations supported across object stores. This makes it a good candidate for use with a Zarr v3 Store.
object-store-python is a Python binding to object-store. With roeap/object-store-python#6, I added async methods to the library. So the underlying Rust binary will return a Python coroutine that can be awaited.
That and related PRs haven't been merged yet, but you can try this out locally by installing …

Note that you need the Rust compiler on your computer; install it by following these docs.
TODO: