
[WIP] Operation pools POC #209

Draft
wants to merge 1 commit into
base: master

Conversation

DXist commented Sep 5, 2023

This PR is unfinished; it exists mainly to discuss my ideas and get feedback.

I'm exploring monoio for my project and would like to split execution of a single-threaded application into synchronous and asynchronous parts (sync/async).

In other words, I see it as a loop that alternates between

  • a sync part: a deterministic state machine that manipulates in-memory state and runs in discrete ticks
  • an async part that drives network/storage IO until either all IO work completes or the tick timeout (~10 ms) is hit

I see the async part as a single future that always completes within the timeout limit.

Besides, I need some way to pass execution context between the sync and async parts. I don't like approaches that return BoxFutures or spawn async tasks, because I prefer to allocate only during application startup.

One option, which I started to implement in this PR, is an API oriented around batched operations, similar to io_uring:

  1. Synchronously enqueue IO operations with a user-provided index (user_data). This index is distinct from the io_uring user_data and is not sent through the ring. If the ring is full, don't issue the submit syscall; instead, queue the operation in the driver to postpone the context switch to asynchronous code.
  2. Asynchronously wait, via some driver-level API function (OpPools::submit?), until at least one operation in the batch completes.
  3. Synchronously process the IO results of completed operations:
    • use the returned user_data index to get the context of the application-level operation, and
    • update the dependent in-memory state.
  4. Synchronously flush any operations enqueued to the driver but not yet submitted to the ring.
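The four steps above could be sketched roughly as follows. This is a minimal simulation, not monoio code: the names (MockDriver, Completion, enqueue, submit_and_wait, reap, flush) are all hypothetical, and completion is simulated rather than performed by a real ring.

```rust
use std::collections::VecDeque;

// Hypothetical completion record carrying the user-provided index.
#[derive(Debug, Clone, Copy)]
struct Completion {
    user_data: usize,
    result: i32,
}

// Illustrative stand-in for a driver with a fixed-size ring plus an
// overflow queue for operations enqueued past ring capacity.
struct MockDriver {
    ring_capacity: usize,
    in_ring: Vec<usize>,       // user_data of ops currently in the ring
    overflow: VecDeque<usize>, // enqueued in the driver, not yet submitted
    completions: VecDeque<Completion>,
}

impl MockDriver {
    fn new(ring_capacity: usize) -> Self {
        Self {
            ring_capacity,
            in_ring: Vec::new(),
            overflow: VecDeque::new(),
            completions: VecDeque::new(),
        }
    }

    // Step 1: enqueue without issuing a submit syscall when the ring is full.
    fn enqueue(&mut self, user_data: usize) {
        if self.in_ring.len() < self.ring_capacity {
            self.in_ring.push(user_data);
        } else {
            self.overflow.push_back(user_data);
        }
    }

    // Step 2: wait until at least one operation completes (simulated here
    // by completing everything currently in the ring).
    fn submit_and_wait(&mut self) {
        for ud in self.in_ring.drain(..) {
            self.completions.push_back(Completion { user_data: ud, result: 0 });
        }
    }

    // Step 3: hand completed results back for synchronous processing.
    fn reap(&mut self) -> Vec<Completion> {
        self.completions.drain(..).collect()
    }

    // Step 4: flush ops queued in the driver but not yet in the ring.
    fn flush(&mut self) {
        while self.in_ring.len() < self.ring_capacity {
            match self.overflow.pop_front() {
                Some(ud) => self.in_ring.push(ud),
                None => break,
            }
        }
    }
}

fn main() {
    let mut drv = MockDriver::new(2);
    for ud in 0..3 {
        drv.enqueue(ud); // the third op overflows into the driver queue
    }
    drv.submit_and_wait(); // completes the two ring ops
    let done = drv.reap();
    assert_eq!(done.len(), 2);
    drv.flush(); // the overflowed op moves into the ring
    assert_eq!(drv.in_ring, vec![2]);
}
```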

Another idea is to establish resource limits configurable at compile time:

  • a maximum total number of outstanding io_uring operations

  • a maximum number of outstanding operations of each type; for example,

    • the limit for the Accept op is driven by the anticipated maximum number of incoming connections, or
    • a maximum number of in-flight read/write operations

The POC includes an AcceptPool structure that

  • allocates and reuses socket-address storage for N Accept ops, and
  • holds the data of Accept operations until user code processes the completed results.

Result processing is expected to return the acquired resources to the pool.

Each IO operation could have its own pool, and the driver could have configured pools for all operations (see the OpPools structure).
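A compile-time-limited, slot-reusing pool along these lines might look like the sketch below. It is not the PR's actual AcceptPool; the type AddrPool, its methods, and the const-generic limit N are all illustrative assumptions.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

// Sketch of a fixed-capacity pool whose size is a compile-time constant.
// Slot storage is allocated once at startup; acquire/release reuse slots
// via a free list, so no allocation happens on the IO path.
struct AddrPool<const N: usize> {
    slots: [SocketAddr; N],
    free: Vec<usize>, // indices of unused slots (allocated once, at startup)
}

impl<const N: usize> AddrPool<N> {
    fn new() -> Self {
        let placeholder = SocketAddr::new(IpAddr::V4(Ipv4Addr::UNSPECIFIED), 0);
        Self {
            slots: [placeholder; N],
            free: (0..N).rev().collect(),
        }
    }

    // Hand out a slot index for a pending Accept, or None once the
    // configured limit N of outstanding ops is reached.
    fn acquire(&mut self) -> Option<usize> {
        self.free.pop()
    }

    // Result processing returns the slot to the pool for reuse.
    fn release(&mut self, idx: usize) {
        self.free.push(idx);
    }

    // Storage the kernel would fill in with the peer address.
    fn addr_mut(&mut self, idx: usize) -> &mut SocketAddr {
        &mut self.slots[idx]
    }
}

fn main() {
    let mut pool: AddrPool<2> = AddrPool::new();
    let a = pool.acquire().unwrap();
    let b = pool.acquire().unwrap();
    assert!(pool.acquire().is_none()); // limit of 2 outstanding Accepts
    pool.release(a);
    assert!(pool.acquire().is_some());
    let _ = pool.addr_mut(b);
}
```

Making N a const generic means each operation type's limit is fixed at compile time, which matches the idea of configuring resource limits without any runtime allocation.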

I would like to get feedback and check whether the project is aligned with the ideas illustrated here.

CLAassistant commented Sep 5, 2023

CLA assistant check
All committers have signed the CLA.

@DXist DXist marked this pull request as draft September 5, 2023 21:22
ihciah (Member) commented Sep 15, 2023

Sorry, I can't quite get the problem you want to solve. Now the model is like:

  1. Run sync code directly, or use spawn_blocking and await it if it's heavy (this turns it into async code).
  2. Eventually wrap the sync code into async code.

Under what circumstances would we want to call asynchronous code from synchronous code?

DXist (Author) commented Sep 15, 2023

I decided not to use the async approach because I don't want memory allocation on task::spawn().

I plan to run my application with a callback-based approach like this:

// preallocate memory on startup
let mut app_state_machine = AppStateMachine::with_config(config);
let mut io_queue: VecDeque<OpKey> = VecDeque::new();

let mut finished = false;
while !finished {
    finished = app_state_machine.tick(&mut io_queue);
    // app_state_machine can dispatch completions
    io_driver.run_until_timeout(&mut app_state_machine, &mut io_queue, timeout)?; // fails only on fatal errors
}

So I need a lower-level IO driver rather than an IO runtime. I don't know whether monoio wants to expose a cross-platform driver, but I found an alternative that provides a public driver.
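The narrow driver-side interface that loop relies on could be sketched as a pair of traits. This is a hypothetical reading of the loop, not an existing API: IoDriver, CompletionDispatch, OpKey, and the mock implementation are all made up for illustration.

```rust
use std::collections::VecDeque;
use std::time::Duration;

// Hypothetical key identifying an application-level operation.
type OpKey = usize;

// The state machine receives completions keyed by the index it handed out.
trait CompletionDispatch {
    fn on_completion(&mut self, key: OpKey, result: i32);
}

// The driver takes the queue of pending ops, drives IO until the tick
// timeout, and dispatches completions back into the state machine.
trait IoDriver {
    fn run_until_timeout(
        &mut self,
        dispatch: &mut dyn CompletionDispatch,
        io_queue: &mut VecDeque<OpKey>,
        timeout: Duration,
    ) -> Result<(), std::io::Error>;
}

// Trivial mock: "completes" every queued op immediately.
struct MockIoDriver;

impl IoDriver for MockIoDriver {
    fn run_until_timeout(
        &mut self,
        dispatch: &mut dyn CompletionDispatch,
        io_queue: &mut VecDeque<OpKey>,
        _timeout: Duration,
    ) -> Result<(), std::io::Error> {
        while let Some(key) = io_queue.pop_front() {
            dispatch.on_completion(key, 0);
        }
        Ok(())
    }
}

struct CountingMachine {
    completed: usize,
}

impl CompletionDispatch for CountingMachine {
    fn on_completion(&mut self, _key: OpKey, _result: i32) {
        self.completed += 1;
    }
}

fn main() {
    let mut machine = CountingMachine { completed: 0 };
    let mut queue: VecDeque<OpKey> = (0..3).collect();
    MockIoDriver
        .run_until_timeout(&mut machine, &mut queue, Duration::from_millis(10))
        .unwrap();
    assert_eq!(machine.completed, 3);
}
```

The key property is that neither trait forces a heap allocation per operation: ops travel as plain OpKey values and completions come back through a callback, which is what distinguishes this "driver" shape from a spawning runtime.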

ihciah (Member) commented Sep 19, 2023

Please give some time to investigate it...

nyabinary commented:

> Please give some time to investigate it...

Any updates?

DXist (Author) commented Jan 4, 2024

I have an update from my side: I've built a customized completion-style IO driver backed by io_uring/IOCP/kqueue. I integrated it into my application, which preallocates memory for IO buffers and IO operation arguments during startup and runs a custom event loop.

Summary of customizations:

  • a fixed-size submission queue for operations
  • an external runtime can submit an external queue of not-yet-queued operations as a single batch
  • timers are exposed as a Timeout operation and use the suspend-aware CLOCK_BOOTTIME clock source when it is available
  • file descriptors can be registered to remove refcounting overhead in io_uring
  • separate implementations of non-vectored and vectored Recv/Send and RecvFrom/SendTo, to exclude msghdr-related overhead
  • send/recv syscalls/opcodes are used instead of read/write to remove the associated overhead
  • Read/Write operations were added for non-seekable files
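The "suspend-aware clock when available" point can be illustrated with a startup probe that prefers CLOCK_BOOTTIME (which keeps counting across system suspend) and falls back to CLOCK_MONOTONIC on kernels that reject it. This is a Linux-specific sketch of the idea, not the driver's actual code; the constants and the pick_clock function are assumptions for illustration.

```rust
// Minimal FFI declaration so the sketch needs no external crates.
// Field layout matches struct timespec on 64-bit Linux.
#[repr(C)]
struct Timespec {
    tv_sec: i64,
    tv_nsec: i64,
}

// Linux clockid_t values.
const CLOCK_MONOTONIC: i32 = 1;
const CLOCK_BOOTTIME: i32 = 7;

extern "C" {
    fn clock_gettime(clk_id: i32, tp: *mut Timespec) -> i32;
}

// Probe CLOCK_BOOTTIME once at startup; older kernels return an error,
// in which case timers fall back to CLOCK_MONOTONIC.
fn pick_clock() -> i32 {
    let mut ts = Timespec { tv_sec: 0, tv_nsec: 0 };
    let boottime_ok = unsafe { clock_gettime(CLOCK_BOOTTIME, &mut ts) } == 0;
    if boottime_ok {
        CLOCK_BOOTTIME
    } else {
        CLOCK_MONOTONIC
    }
}

fn main() {
    let clk = pick_clock();
    assert!(clk == CLOCK_BOOTTIME || clk == CLOCK_MONOTONIC);
    println!("using clock id {clk}");
}
```

The practical difference: a Timeout armed against CLOCK_MONOTONIC stops advancing while the machine is suspended, whereas CLOCK_BOOTTIME keeps counting, so timers fire at the intended wall-clock-ish deadline after resume.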
