Discussing an alternative std.event.Loop architecture #5311
Consider `pread`, `preadv`, `pwrite`, `pwritev`: these specify the position in the syscall itself, so they are safe to call concurrently. One could imagine a database application that writes multiple records to a backing file at the same time.
Related to the last question: what would be some example use cases where an operation is waiting for readable or writable, but not only readable or only writable?
@andrewrk thanks, I was focusing only on sockets and forgot about seekable streams. @kprotty good point; ultimately, I'm not sure it's a thing. This is what I know (and how I got there):
In any case, since it was there originally, I wanted to make sure it would be possible to support the case if needed. If anybody who knows more can confirm it's not a thing, I'd be super happy to remove that list.
You've already put more thought into it than I originally did when writing it.
@kristoff-it AFAIK another thing that doesn't seem to be currently handled is errors that arrive after an event comes in. The current implementation re-performs the operation to catch the error, but this could trigger a SIGPIPE being generated if trying to write() to a peer-closed socket on Linux without a `MSG_NOSIGNAL` flag.
At the very least, I think we can say that multiple simultaneous awaiters on one file is a fairly specialty case, and that most Zig programs using evented I/O won't do this, or will do it only in highly specialized contexts. If there's any chance that making this work will cause a performance pessimization in the more common cases (which putting a linked list in `std.fs.File` would definitely do), it should be something that the programmer opts into within a scope. Maybe something more like this:

```zig
// register the event with an object that contains the linked list
const file_multi_awaiter = FileWithMultiAwait(&underlying_file, event_loop_with_multi_await_support, allocator);

// do stuff that may suspend and await the event
var a = async writeAtPosition(&file_multi_awaiter, event_loop_with_multi_await_support, pos1, data1);
var b = async writeAtPosition(&file_multi_awaiter, event_loop_with_multi_await_support, pos2, data2);

// wait for it to finish
await a;
await b;

// deinit the multi-awaiter
file_multi_awaiter.deinit();
```

This form makes it clear that you can't relocate the `FileWithMultiAwait`, since a pointer is passed into the async functions. It also lets the user control when the multi-await event is active, to avoid extra queue events for files that won't be awaited. And it forces the user to choose the allocator that will be used for the linked list.
One of the beautiful things about linked lists is that they are "bring your own memory" data structures. Each node in a linked list can be allocated with a different allocator, or with no allocator, and operations on linked lists do not need a reference to an allocator. So no allocator would be needed here: the linked list node can live inside the async function's frame.
Small status update: event loop integration was successfully moved out of the low-level syscall wrappers. This change is probably even more compelling if we want to set ourselves up for io_uring, since it would break the assumption that evented I/O must always be implemented as a failed syscall plus a retry notification in a loop.
After thinking about this for a while (and having had a short break from it because of other engagements), I'm back at it and I'd like to discuss some practical actions with anybody that's interested.

While I started with the idea of changing the internal architecture of the event loop to support a different way of interfacing with epoll/kqueue, I think this change should be postponed; instead I would recommend proceeding with a different change first, as mentioned in my last comment. To reiterate what I mean: I think we should first make sure to pull all integration with the event loop up out of the low-level syscall wrappers. Additionally, I would recommend pulling the platform-specific polling code inside the event loop itself.

I think at that point it will become much easier to experiment with different architectures and integrate with more system APIs.

@andrewrk thoughts? If the plan sounds fine I could proceed with the changes and file a PR, but I wouldn't mind some guidance on how to test that I didn't break I/O for other platforms, etc.
I agree on both points! Thanks for the overview, it helps to understand the context when looking at PRs. As far as testing goes, it's about time to add CI testing for async I/O modes, but we don't have it set up yet. That would probably be worth investing in directly before or after these changes. Async I/O is still "experimental", so it's OK to break master branch by accident. Personally I would move forward with your improvements and look into the additional testing infrastructure afterwards.
Current architecture

Currently `std.event.Loop` relies on the following general process to orchestrate async frames (to my understanding):

1. I/O is performed through functions such as `listen`/`accept`/`connect`/`read`/`write`. If the syscall response is `EAGAIN` (or an equivalent), the function adds a oneshot event in epoll/kqueue/... for the specific action it needs (i.e. read or write). The event contains a pointer to the async frame itself. Once the event is added, the frame suspends. The expectation is that the event loop will resume it once the socket becomes actionable again. This is used to retry some syscalls (e.g. `read`, `write`) and to know when one has completed (e.g. `connect`). If a syscall completes without `EAGAIN`, no event is slotted in epoll/kqueue/..., and the async frame does not suspend.
2. Event loop threads run a `while(true)` loop and block on the polling request. The request unblocks when an event slotted in (1) triggers. The thread then resumes the frame pointer present in the event just obtained and, once the frame is done executing (the top-level `return` in the frame "chain" is hit) or suspends again, the thread returns to its main `while(true)` polling loop.

Event diagram
This diagram shows a possible set of events slotted in epoll/kqueue/... during a program execution.
Pros of this system:

Cons:

- There is no single `writable OR readable` event when used in this fashion, which is what originally made me look into the event loop code (Making evented I/O work on macOS #5286).

Alternative architecture
An example of an alternative architecture is one where, when a new file descriptor is opened, we arm epoll/kqueue/... with a "multishot" event (i.e. with the oneshot flag disabled) that triggers every time the file descriptor goes from non-actionable to actionable (e.g. from non-readable to readable).
At this point the question becomes: if we only slot one event per file descriptor, then what information should we include in the event struct, so that we can then resume work once the event comes back from epoll/kqueue/...?
One reasonable way is to have the event point to the corresponding `std.fs.File` struct in the application, and add a field to the struct so that it points to the head of a linked list of suspended frames. See the following diagram.

Event diagram
Things not shown in the above graph:

- How each event refers back to the corresponding `std.fs.File` struct.
- `std.fs.File` structs don't have a single list of suspended frames, but rather three: one list for frames awaiting the file descriptor to become readable, one for writable, and one for readable or writable. The event loop inspects which condition caused the event to fire and resumes frames from the corresponding list(s).

Exploration work
A couple of days ago Andrew came on stream to briefly explain some details of the current implementation and provide some advice on how to proceed. See this VOD; he shows up twice, once near the beginning and once near the end of the stream.
Yesterday I worked on the new implementation. Since it's my first time trying this kind of work, I just aimed to get the thing working in the most straightforward way possible, ignoring error conditions, platforms other than the one I'm developing on (macOS), and necessary cleanup.
The result is the first commit of this branch: https://github.com/kristoff-it/zig/tree/loop-b
It's able to run some test code (see this gist; the `zag_` version compiles against my branch, while the `zig_` version works with current Zig). It's also currently single-threaded.
I will report any progress & findings. For now I have two.
Findings

- Passing `std.fs.File` pointers to epoll/kqueue/... makes those structs non-copiable. This is an annoying and pervasive change.

Questions
One big question that I'm left with is the following: all these systems expect multiple frames to be concurrently waiting to operate on the same stream, while this is not what you would want to happen in any reasonable use case that I can think of, since you could mess up read/write order. For example, Python even throws an exception when this situation occurs (here you can see me demonstrate this case).
Can anybody give me an example of why this is considered an important situation to support?