Rebase on object_store #22

Open
clbarnes opened this issue Nov 20, 2023 · 3 comments

Comments

@clbarnes (Owner) commented Nov 20, 2023

https://crates.io/crates/object_store

Pros

  • Offload complexity of different backends, gaining any new implementations for free
  • More consistent (and probably better thought out) interface
  • Async-first, which we should be moving towards anyway

Cons

With any move to async, the bytes-to-bytes (BB) codecs in particular would need major rewrites, and would likely lean on https://crates.io/crates/async-compression for compression.
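
For illustration, a minimal sketch of what an async decompression step could look like, assuming tokio and async-compression's gzip feature (the real codec chain would wrap whichever codecs zarr3-rs actually supports):

```rust
// A minimal sketch, assuming tokio and async-compression's "gzip" feature;
// zarr3-rs would wrap whichever codecs it actually supports.
use async_compression::tokio::bufread::GzipDecoder;
use tokio::io::{AsyncReadExt, BufReader};

async fn decompress_gzip(raw: &[u8]) -> std::io::Result<Vec<u8>> {
    // &[u8] implements tokio's AsyncRead; BufReader supplies the AsyncBufRead
    // that async-compression's decoders require.
    let mut decoder = GzipDecoder::new(BufReader::new(raw));
    let mut out = Vec::new();
    decoder.read_to_end(&mut out).await?;
    Ok(out)
}
```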

@clbarnes (Owner, Author) commented Jan 8, 2024

Async traits are stable, and object_store has a release candidate with suffix requests, so this can get going.

Some design decisions tie in to the async strategy.

object_store asynchronously provides the whole block of bytes rather than a reader. IRL, anything providing a Read/Write interface will probably be doing so over an opaque buffer anyway, and chunks should be small compared to RAM, so I don't think this is a problem. The difference is how that buffer is allocated, I suppose.
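
For reference, fetching a whole chunk that way is roughly the following (method names as in current object_store releases; exact signatures may shift between versions):

```rust
// Sketch: fetch a whole chunk as one Bytes buffer via object_store.
use bytes::Bytes;
use object_store::{path::Path, ObjectStore};

async fn fetch_chunk(store: &dyn ObjectStore, key: &Path) -> object_store::Result<Bytes> {
    // `get` resolves the request; `bytes` buffers the whole object into RAM,
    // which is acceptable while chunks stay small relative to memory.
    store.get(key).await?.bytes().await
}
```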

Async functions should not block, i.e. all CPU-heavy work (e.g. compression) should be pushed down into async functions. But at the end of the day, those functions are probably just going to call spawn_blocking, which has its own overhead, especially for lots of small reads, as we expect to be doing. We like the async nature of the original fetch, because there's real waiting involved, but we don't gain much from async read/write through the codec chain. It might be better, therefore, to have CPU-heavy blocking code inside the (async) codec chain, with the expectation that the caller tucks each chunk's codec chain into a spawn_blocking to parallelise at the top level (fetching multiple chunks to serve a multi-chunk read).
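
A sketch of that layout, assuming tokio; decode_chunk and read_chunks are hypothetical stand-ins for the codec chain and the multi-chunk read, not existing APIs:

```rust
// Sketch: await the IO on the async executor, then push the blocking codec
// work onto the blocking pool, one task per chunk.
use std::sync::Arc;

use bytes::Bytes;
use object_store::{path::Path, ObjectStore};

fn decode_chunk(encoded: Bytes) -> Vec<u8> {
    // CPU-heavy decompression etc. runs synchronously here.
    encoded.to_vec() // placeholder for the real codec chain
}

async fn read_chunks(
    store: Arc<dyn ObjectStore>,
    keys: Vec<Path>,
) -> object_store::Result<Vec<Vec<u8>>> {
    let mut handles = Vec::with_capacity(keys.len());
    for key in keys {
        let store = Arc::clone(&store);
        handles.push(tokio::spawn(async move {
            // Await the fetch on the async executor...
            let encoded = store.get(&key).await?.bytes().await?;
            // ...then run the blocking codec chain on the blocking pool
            // so it doesn't starve the executor.
            Ok::<_, object_store::Error>(
                tokio::task::spawn_blocking(move || decode_chunk(encoded))
                    .await
                    .expect("decode task panicked"),
            )
        }));
    }
    let mut out = Vec::with_capacity(handles.len());
    for handle in handles {
        out.push(handle.await.expect("fetch task panicked")?);
    }
    Ok(out)
}
```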

@JackKelly has probably thought about concurrency models as they apply to IO more than I have - any thoughts? The options (both sketched after this list) are:

  • have BB codecs asynchronously return a Bytes object rather than an (async) reader, putting blocking code into that async function on the basis that concurrency should be controlled by thread spawning higher up the chain. This is easier than the alternative.
  • have BB codecs return async readers/writers, more like the current API. We would need to manually do some async wrapping for a bunch of codecs. This might end up spawning more threads (under the hood), and so introduce more overhead, but may be more RAM-efficient depending on the buffering done by each codec.
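
Rough sketches of the two shapes; all trait and method names here are invented for illustration (async-trait used for object safety, since native async-in-trait is stable but not object-safe):

```rust
use bytes::Bytes;

// Option 1: codecs asynchronously take and return whole buffers. Blocking
// work happens inside the async fn, on the assumption that the caller
// manages concurrency (e.g. spawn_blocking) higher up the chain.
#[async_trait::async_trait]
trait BBCodecBuffered {
    async fn decode(&self, encoded: Bytes) -> std::io::Result<Bytes>;
    async fn encode(&self, decoded: Bytes) -> std::io::Result<Bytes>;
}

// Option 2: codecs wrap async readers/writers, closer to the current sync API.
trait BBCodecStreaming {
    fn decoder<'r>(
        &self,
        r: Box<dyn tokio::io::AsyncRead + Unpin + Send + 'r>,
    ) -> Box<dyn tokio::io::AsyncRead + Unpin + Send + 'r>;
}
```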

@clbarnes (Owner, Author) commented Jan 8, 2024

Incompatibilities

  • object_store has no partial writes - does HTTP?!
  • object_store's get_ranges fetches all ranges from a single location. This isn't insurmountable, just annoying to map back and forth over

So we'll keep the current traits (with some signature changes) and just impl<S: ObjectStore> (Readable/Listable/Writeable)Store for S.
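
A sketch of what that blanket impl could look like; ReadableStore here is a simplified stand-in, not the crate's actual signatures:

```rust
// Sketch: blanket-impl this crate's store trait for anything implementing
// object_store::ObjectStore.
use bytes::Bytes;
use object_store::{path::Path, ObjectStore};

#[async_trait::async_trait]
trait ReadableStore {
    /// Fetch a whole key, or None if it doesn't exist.
    async fn get(&self, key: &str) -> object_store::Result<Option<Bytes>>;
}

#[async_trait::async_trait]
impl<S: ObjectStore> ReadableStore for S {
    async fn get(&self, key: &str) -> object_store::Result<Option<Bytes>> {
        // Fully qualified call to disambiguate from ReadableStore::get.
        match ObjectStore::get(self, &Path::from(key)).await {
            Ok(result) => Ok(Some(result.bytes().await?)),
            Err(object_store::Error::NotFound { .. }) => Ok(None),
            Err(e) => Err(e),
        }
    }
}
```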

@JackKelly commented

Hey Chris! This all sounds very interesting.

FWIW, I've been slowly hacking away on light-speed-io. I'm currently focused on using io_uring to fetch large numbers of chunks from local disks. light-speed-io isn't a Zarr implementation. Instead, the aim is to provide an easy-to-use API for fetching large numbers of file chunks concurrently, using different optimisations for different storage systems. I'm toying with the idea of having light-speed-io take a user-supplied function, so light-speed-io can also orchestrate (and parallelise) the processing of each chunk. Or maybe light-speed-io will just expose an iterator of loaded chunks, and it'll be entirely the user's responsibility to process them.

On the topic of async: I was originally thinking of using async. But I've shifted towards using rayon to process chunks in parallel. And all the async IO will be handled by the operating system (using io_uring). Here's some very hacky proof-of-concept Rust code using io_uring and rayon.

One day, I'd hope that light-speed-io could be the IO backend for zarr3-rs. But that day isn't today! It'll be months before light-speed-io does anything very useful, and even longer before it has mature support for cloud storage, so I wouldn't want to block your work!
