Group operations #68

Open
JackKelly opened this issue Feb 20, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@JackKelly
Owner

JackKelly commented Feb 20, 2024

Allow users to submit operations in groups. All operations in group n will be delivered to the user before any operations in group n+1.

Related

Implementation ideas

1) Use IOSQE_IO_DRAIN

To quote the liburing manual page for io_uring_enter:

IOSQE_IO_DRAIN
When this flag is specified, the SQE will not be started before previously submitted SQEs have completed, and new SQEs will not be started before this one completes. Available since 5.2.

This is perfect when groups have multiple files: we'd set IOSQE_IO_DRAIN on the first SQE of each group.

But if each group has only a tiny number of ops, we might not want to set IOSQE_IO_DRAIN on the first SQE of every group, because nearly every SQE would then carry the flag and we'd be forcing the storage subsystem to load files sequentially. Let's benchmark it and see how it performs before implementing a more complicated strategy.
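A minimal sketch of idea 1, assuming a simplified stand-in for a real SQE. The `Sqe` struct and `build_sqes` function are hypothetical names for illustration, not LSIO code. The constant mirrors the `IOSQE_IO_DRAIN` bit in the Linux io_uring header, but a real implementation would take it from its io_uring bindings rather than redefine it.

```rust
/// Mirrors `IOSQE_IO_DRAIN` from the Linux io_uring UAPI header (bit 1).
/// Redefined here only so the sketch is self-contained.
pub const IOSQE_IO_DRAIN: u8 = 1 << 1;

/// Hypothetical, simplified stand-in for a submission queue entry.
/// A real SQE carries an opcode, file descriptor, buffer, user_data, etc.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Sqe {
    pub path: String,
    pub group: usize,
    pub flags: u8,
}

/// Flatten the groups into submission order, setting IOSQE_IO_DRAIN
/// on the first SQE of each group and leaving the rest unflagged.
pub fn build_sqes(groups: &[Vec<&str>]) -> Vec<Sqe> {
    let mut sqes = Vec::new();
    for (group, paths) in groups.iter().enumerate() {
        for (i, path) in paths.iter().enumerate() {
            // Only the first SQE in a group acts as the barrier.
            let flags = if i == 0 { IOSQE_IO_DRAIN } else { 0 };
            sqes.push(Sqe {
                path: (*path).to_string(),
                group,
                flags,
            });
        }
    }
    sqes
}
```

With one op per group, every SQE ends up flagged, which is the fully sequential case worth benchmarking.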

2) Fill the SQ once, but don't top up the SQ until group n has finished

Let's say the user submits 64 groups, each group has only 1 file, and the SQ size is 32. We'd fill the SQ with 32 files (1 file from each of the first 32 groups) in one go. After that, we'd only top up the SQ when we deliver a buffer to the user: we'd top up with 1 file once the first group's data is delivered, top up again when the second group is delivered, and so on.
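Idea 2 can be sketched as a small state machine, assuming exactly one operation per group as in the example above. `TopUpScheduler`, `Delivery` and their methods are hypothetical names, and no real io_uring calls are made: submission is modelled by returning the group indices that would be pushed onto the SQ.

```rust
use std::collections::HashMap;

/// What the user receives for one group, plus the group (if any)
/// that was submitted to the SQ as a result of this delivery.
#[derive(Debug, PartialEq, Eq)]
pub struct Delivery {
    pub group: usize,
    pub buf: Vec<u8>,
    pub topped_up: Option<usize>,
}

/// Hypothetical scheduler for the one-op-per-group case.
pub struct TopUpScheduler {
    n_groups: usize,
    sq_capacity: usize,
    next_to_submit: usize,
    next_to_deliver: usize,
    /// Completed but not yet delivered buffers, keyed by group.
    completed: HashMap<usize, Vec<u8>>,
}

impl TopUpScheduler {
    pub fn new(n_groups: usize, sq_capacity: usize) -> Self {
        Self {
            n_groups,
            sq_capacity,
            next_to_submit: 0,
            next_to_deliver: 0,
            completed: HashMap::new(),
        }
    }

    /// Fill the SQ once. Returns the groups to submit.
    pub fn initial_fill(&mut self) -> Vec<usize> {
        let n = self.sq_capacity.min(self.n_groups);
        self.next_to_submit = n;
        (0..n).collect()
    }

    /// Record a completion, which may arrive in any order.
    pub fn on_completion(&mut self, group: usize, buf: Vec<u8>) {
        self.completed.insert(group, buf);
    }

    /// Deliver the next group in order if its data is ready,
    /// and top up the SQ with one more op only at that point.
    pub fn try_deliver(&mut self) -> Option<Delivery> {
        let buf = self.completed.remove(&self.next_to_deliver)?;
        let group = self.next_to_deliver;
        self.next_to_deliver += 1;
        let topped_up = if self.next_to_submit < self.n_groups {
            let g = self.next_to_submit;
            self.next_to_submit += 1;
            Some(g)
        } else {
            None
        };
        Some(Delivery { group, buf, topped_up })
    }
}
```

In this sketch, a completion for group 1 that arrives before group 0 is held back, and the SQ is not topped up until group 0 has been delivered.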

@JackKelly
Owner Author

JackKelly commented May 1, 2024

How to group instructions in LSIO's new trait-based IO interface?

The latest design for LSIO defines the "instruction interface" for the IO crates using a Reader trait and a Writer trait. (See crates/lsio_io/src/lib.rs in the new-io-uring branch for the current draft code defining these traits. But note that the new-io-uring branch will be merged into main soon.)

@JackKelly
Owner Author

  • The grouping should also affect downstream processing (not just IO). E.g. should all decompression for group 1 finish before group 2 starts? Or, at least, group 1 should have priority in the queue. If one decompression task for group 1 is taking ages on one CPU core, it'd still be good to use the other cores to make a start on group 2.
  • It might be nice to be able to use different compute functions per group. E.g. decompress one group using zstd and another group using LZ4 (for example, when reading from multiple datasets).
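One way to sketch both bullets together is a priority queue keyed by group index, with one compute function registered per group. Everything here is hypothetical (`GroupedQueue`, `ComputeFn`), and `reverse_bytes` and `double_bytes` are trivial placeholders standing in for real codecs such as zstd or LZ4. A worker calling `run_next` always takes the lowest-numbered group available, so earlier groups get priority, but an idle core can still start on a later group when nothing earlier is queued.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical signature for a per-group compute function.
pub type ComputeFn = fn(&[u8]) -> Vec<u8>;

/// Placeholder "codec": reverses the bytes. Stands in for, say, zstd.
pub fn reverse_bytes(input: &[u8]) -> Vec<u8> {
    input.iter().rev().copied().collect()
}

/// Placeholder "codec": doubles each byte. Stands in for, say, LZ4.
pub fn double_bytes(input: &[u8]) -> Vec<u8> {
    input.iter().map(|b| b.wrapping_mul(2)).collect()
}

/// Hypothetical queue of compute tasks, ordered by (group, arrival order).
pub struct GroupedQueue {
    compute_per_group: Vec<ComputeFn>,
    /// `Reverse` turns the max-heap into a min-heap on (group, seq).
    heap: BinaryHeap<Reverse<(usize, usize, Vec<u8>)>>,
    next_seq: usize,
}

impl GroupedQueue {
    /// `compute_per_group[g]` is the function applied to tasks in group `g`.
    pub fn new(compute_per_group: Vec<ComputeFn>) -> Self {
        Self {
            compute_per_group,
            heap: BinaryHeap::new(),
            next_seq: 0,
        }
    }

    /// Queue a buffer for processing under the given group.
    pub fn push(&mut self, group: usize, data: Vec<u8>) {
        assert!(group < self.compute_per_group.len(), "unknown group");
        self.heap.push(Reverse((group, self.next_seq, data)));
        self.next_seq += 1;
    }

    /// Take the task from the lowest-numbered group and run that
    /// group's compute function on it.
    pub fn run_next(&mut self) -> Option<(usize, Vec<u8>)> {
        let Reverse((group, _seq, data)) = self.heap.pop()?;
        let compute = self.compute_per_group[group];
        Some((group, compute(&data)))
    }
}
```

This gives priority rather than a strict barrier. Delivering results to the user strictly in group order would still need a reorder step on the output side.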

JackKelly added this to the 0040: Group operations milestone on May 21, 2024
@JackKelly
Owner Author

See this comment onwards: #104 (comment)
