RFC #0007: Event loop refactor #7

straight-shoota · 2024-05-02T19:06:00Z

This RFC is very much a work in progress and has a lot of unresolved questions, which I intend to discuss here.

Preview: https://github.com/crystal-lang/rfcs/blob/rfc/0005/text/0007-event_loop-refactor.md

crysbot · 2024-05-03T12:49:07Z

This pull request has been mentioned on Crystal Forum. There might be relevant details there:

https://forum.crystal-lang.org/t/rfc-0007-event-loop-refactor/6812/1

yxhuvud · 2024-05-03T16:44:37Z

text/0007-even_loop-refactor.md

+
+Should events from the Crystal runtime be part of the event loop as well?
+
+- Fiber: sleep


Yes, assuming it is sleep with an argument and not an endless sleep. At least for io_uring, the easiest implementation is to just emit the appropriate op (i.e. a timeout), and it works in an extremely straightforward way. (Also note the even simpler implementation of Fiber#yield a few lines beneath it. That is scheduler territory, but in practice event loops and schedulers are not necessarily all that separate.)

yxhuvud · 2024-05-03T16:50:25Z

text/0007-even_loop-refactor.md

+- **Generic API**: Independent of a specific event loop design
+- **Black Box**: No interference with internals of the event loop
+- **Pluggable**: It must be possible to compile multiple event loop implementations and choose at runtime.
+  Only one type of implementation will typically be active (it would be feasible to have different event loop implementations in different execution contexts, if there's a use case for that).


(Feasible, but problematic). Some defaults may need to differ when file descriptors are used in different event loops, for example whether sockets are opened in nonblocking mode or not. This is also quite problematic for the pluggable-at-runtime switching. :(

To properly solve this, I suppose it would require the opening of file descriptors to be passed to the event loop as well.

Deferring it to the event loop helps a little but doesn't really solve it, as it means fds cannot be passed around. So for example, it would not work to open the socket in one context, and then start to listen on it and spawn handlers in another. And if descriptors are not set up properly, the error symptoms would be weird. Not having that confusion and that footgun around is worth more than what it would fix.

My preference for Linux in the really long term would be to keep the default at nonblocking until no supported Linux version lacks a sufficiently modern io_uring, and then switch the default once we have an implementation we are happy with. The latter can be implemented using poll ops until the switch (similar to the uring branch): less neat, and it does more work than necessary, but it would work. At some point it would be nice to switch, though, as the polling adds a little overhead to both code and performance.

(FWIW, note that there are still cases where wait_readable and wait_writable are useful even with io_uring. Perhaps their fate needs to be a separate discussion, as they don't provide value or make sense on certain other platforms.)

Yeah, I don't think it makes sense to move open fds between different event loop implementations. That's just going to be chaos. And I'm not sure it would even be very useful anyway.
However I could potentially see use cases for using one kind of event loop implementation for a part of an application, and another one in another part. Not saying that this definitely makes sense, but it might.

Yeah, I don't think it makes sense to move open fds between different event loop implementations.

So what you are saying is that each event loop needs to open its own copy of std_* and keep track of them separately? And that it shouldn't be possible to safely pass file descriptors around without restrictions? That seems a lot more chaotic to me. The gains from imposing restrictions like that on the user seem very tiny.

At least on Linux/Mac. No idea what would be the best choice on Windows.

edit: I am not against allowing loop-private fds, but I don't think it makes sense as a default choice.

edit2: It could also make sense to leave it unspecified: allowing globality where it makes sense (because the fds actually are global, like the Linux default), or locality (when they are not global, e.g. handles on Windows or privately registered fds, which is something uring can do).

As a use case I was imagining a set of file descriptors being handled by a separate event loop from the rest of the application. This could be a set of sockets for some specific communication purpose. There doesn't need to be any standard IO on such an event loop. Or any other interference with file descriptors that are not part of that set.

If I understand correctly we want to have multiple event loop implementations compiled together (i.e. have io_uring + epoll on Linux) but only one implementation will be activated after a runtime check for the whole program.

Couldn't the event loop implementation tell whether it wants O_NONBLOCK for example?
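As a rough sketch of that question, an event loop could expose its preferred default through a hook that the code opening a descriptor consults. Everything here is hypothetical and not part of the RFC: `SketchEventLoop`, `nonblocking_default?`, `open_flags`, and the illustrative `NONBLOCK_FLAG` value. It also does not address the concern raised above about descriptors being passed between different event loops.

```crystal
# Illustrative flag value only; real code would use the platform's O_NONBLOCK.
NONBLOCK_FLAG = 0o4000

# Hypothetical base class, not the actual Crystal::EventLoop.
abstract class SketchEventLoop
  # Hypothetical hook: does this implementation want nonblocking descriptors?
  abstract def nonblocking_default? : Bool
end

# A readiness-based loop (epoll/kqueue style) wants nonblocking descriptors.
class ReadinessLoop < SketchEventLoop
  def nonblocking_default? : Bool
    true
  end
end

# A completion-based loop (io_uring/IOCP style) may prefer blocking descriptors.
class CompletionLoop < SketchEventLoop
  def nonblocking_default? : Bool
    false
  end
end

# Combines the caller's flags with the event loop's preference.
def open_flags(event_loop : SketchEventLoop, base_flags : Int32) : Int32
  event_loop.nonblocking_default? ? (base_flags | NONBLOCK_FLAG) : base_flags
end
```

With this shape, the decision stays inside the event loop implementation, and the code that opens descriptors only has to ask for it.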

yxhuvud · 2024-05-03T16:53:22Z

text/0007-even_loop-refactor.md

+
+### Optional event loop features
+
+Some activities are managed on the event loop on one platform but not on others. Example would be `Process#wait` which goes through IOCP on Windows but on Unix it’s part of signal handling. (Note: Perhaps we could try to get that on the event loop on Unix as well? **🤔** But there are other examples of system differences)


Linux has signalfd, which could help, but I am not aware of any good solutions on Mac/BSD.

BSD and Darwin have kqueue, which can receive signals (and many other things).
Recent Linux kernels have pidfd, which looks even more lovely than signalfd.

It's good to know we have the option to put signals on the event loop. It's a different question whether we'd want that. It could be useful as it would help simplify the implementation, I imagine?

But this is just an example. The real question here is how to handle events that need to be on the event loop on one system but cannot be put on it on another. Not sure if there's any currently relevant use case except signals, but it's something to ponder for future extensibility.

yxhuvud · 2024-05-03T17:00:17Z

text/0007-even_loop-refactor.md

+
+### Bulk events without fibers
+
+For some applications it might be useful to interact with the event loop directly, being able to push operations in bulk without having to spawn (and wait) a fiber for each one.


One could also think of the reverse. For example, io_uring supports multishot accept, i.e. the op does not need to be rearmed after each accepted connection and no further submissions are needed. A user of it could then either spawn a new fiber or reuse one of a set of existing fibers on each trigger.
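The "reuse a set of existing fibers" variant can be simulated without io_uring. In this sketch a channel stands in for the stream of completions produced by a single multishot submission, and a fixed pool of fibers consumes them. The names `accepted`, `handled`, and the integer "connections" are placeholders, not an actual event loop API.

```crystal
# Stands in for the completions of one multishot accept; each value is a
# placeholder for an accepted connection.
accepted = Channel(Int32).new

# Buffered so that workers never block while reporting their results.
handled = Channel(Int32).new(5)

# A fixed set of fibers is reused for every trigger instead of spawning
# one fiber per accepted connection.
3.times do
  spawn do
    while conn = accepted.receive?
      handled.send(conn)
    end
  end
end

# Simulate five triggers from the single armed operation, then shut down.
5.times { |conn| accepted.send(conn) }
accepted.close

results = Array.new(5) { handled.receive }.sort
```

The alternative mentioned above would replace the worker pool with a `spawn` per received value.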

crysbot · 2024-05-07T16:41:43Z

This pull request has been mentioned on Crystal Forum. There might be relevant details there:

https://forum.crystal-lang.org/t/curious-about-the-eventloop-updates/6825/3

straight-shoota · 2024-05-07T17:17:22Z

text/0007-event_loop-refactor.md

+  # Opens a connection on *socket* to the target *address* and continues fiber
+  # when the connection has been established.
+  # Returns `IO::Error` but does not raise.
+  abstract def connect(socket : ::Socket, address : ::Socket::Addrinfo | ::Socket::Address, timeout : ::Time::Span?) : IO::Error?


The address type could perhaps be abstracted. We need to support both types, though they both boil down to the same interface: #to_unsafe returning a pointer to a LibC::Sockaddr and #size returning its size in bytes.

So it could be an option to define an interface for that and use that as type restriction. But we cannot actually make it strictly type-safe because the return type of #to_unsafe is Void* and we cannot enforce a pointer to LibC::Sockaddr (so Array would implement this interface for example).

Alternatively, we could remove the type restriction entirely because any type that fulfills the implicit interface should work.

I think it's better to be explicit, either with ::Socket::Addrinfo | ::Socket::Address or an abstract module interface.
Introducing a new model would also solve the issue that the Socket name space is not in core lib (see https://github.com/crystal-lang/rfcs/blob/rfc/0005/text/0007-event_loop-refactor.md#socket).

It's actually hard to stub out ::Socket::Addrinfo and ::Socket::Address because that would require defining the Socket namespace, which is a class that inherits IO and that's so much complexity we could just include socket.cr.

I've considered introducing a module which could be included in both types. But that's actually not a great solution for when we return an address.
So I think it's probably best to introduce an internal struct type which holds a reference to LibC::Sockaddr and its size.
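A minimal sketch of what such an internal type could look like, assuming a byte pointer as a stand-in for a pointer to `LibC::Sockaddr`. `RawAddress` and `raw_address` are hypothetical names, not existing stdlib types. The sketch also illustrates the weakness of the implicit interface mentioned earlier: anything with `#to_unsafe` and `#size`, here a plain `Bytes`, can be wrapped.

```crystal
# Hypothetical internal handle: a pointer to the raw address plus its size.
# Pointer(UInt8) stands in for a pointer to LibC::Sockaddr.
struct RawAddress
  getter pointer : Pointer(UInt8)
  getter size : Int32

  def initialize(@pointer : Pointer(UInt8), @size : Int32)
  end

  # Exposes the raw bytes, e.g. for passing to a system call wrapper.
  def to_slice : Bytes
    Bytes.new(@pointer, @size)
  end
end

# Wraps any value that fulfills the implicit interface (#to_unsafe and #size).
def raw_address(address) : RawAddress
  RawAddress.new(address.to_unsafe.as(Pointer(UInt8)), address.size.to_i32)
end

# A byte buffer is accepted just like a real address type would be.
buffer = Bytes[2, 0, 0x1f, 0x90]
raw = raw_address(buffer)
```

Because the struct only holds a pointer, the wrapped value has to outlive it; an actual implementation would need to decide who owns the underlying memory.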

So I think it's probably best to introduce an internal struct type which holds a reference to LibC::Sockaddr and its size.

Or to leave it as the union?

Those types are not in core lib. Their existence depends on require "socket". So leaving the union won't work as is. Trying to see if we can stub it out (or replace with an internal type) is one possible way to deal with that.

But I think a better approach is described in #7 (comment)

dsisnero · 2024-05-15T21:35:43Z

Here's a good article on cancel tokens. Instead of handling timeouts or even signals separately, use a CancelToken and make those subclasses of it: https://vorpus.org/blog/timeouts-and-cancellation-for-humans/

straight-shoota · 2024-05-27T10:17:04Z

text/0007-event_loop-refactor.md

+#### Socket
+
+One instance of this problem shows already in the core features: The event loop interface has type restrictions of the `Socket` namespace in abstract defs, but `Socket` is not in the core lib.
+
+Options:
+
+- Omit those abstract defs (dilutes the interface, so not ideal)
+- Split `EventLoop` interface and add parts of it only with `require “socket”`
+- Add stub declarations for the involved types (`Socket::Handle` and `Socket::Address` - `Socket` itself is only used as parameter type which is technically okay for abstract methods)


I considered option 3 (stubbing the affected types in core lib) as maybe the most approachable solution, but I don't think it is.
We technically don't need to stub Socket because it's only used as type restriction in the parameters which are not validated by the abstract method checker. However, I believe this should eventually be covered as well, which would break this solution in the future. And I think it's generally better to be strict about abstract method implementations, even if the compiler won't complain.
Either way, I don't think it makes much sense to stub that many types (2 or 3) for the event loop API. It would probably be easier to just use the actual types directly.

Considering that we may need to handle optional event loop features eventually (see previous section), it might be a good exercise to start with sockets.

Here's how this could work:

Define a module Crystal::System::EventLoop::Socket

This module is empty by default.

    module Crystal::System::EventLoop::Socket
      # empty
    end

In crystal/system/socket.cr, add abstract defs to Crystal::System::EventLoop::Socket (this could probably be in a different file).

    module Crystal::System::EventLoop::Socket
      abstract def close(socket : ::Socket) : Nil

      # etc.
    end

All current event loop implementations implement the socket interface, so they can unconditionally include Crystal::System::EventLoop::Socket. There's no need to hide the implementation when sockets are not used.
We can even include it in Crystal::System::EventLoop (and pull it out when we ever get an event loop implementation that doesn't implement sockets).

    class Crystal::LibEvent::EventLoop
      include Crystal::EventLoop::Socket

      def close(socket : ::Socket) : Nil
        # ...
      end

      # etc.
    end

In a program where sockets are not used, the abstract defs are not checked because the module is empty. The implementation methods are never called, so their type restrictions don't matter.

In a program where sockets are used, the module has abstract defs and their implementation will be checked.

I know I was an advocate for this style, but now that I see it I'm having a little itch. What would it look like with a split interface for the EventLoop? Say, EventLoop::SocketInterface, EventLoop::FileInterface, ... and then let each EL implementation include the module interfaces it complies with.

Yeah, that's basically what this is.
However, there's the problem that at the event loop implementation we can't know whether sockets are used in the program (i.e. if Socket is defined) or not. The event loop gets required as part of the prelude while require "socket" will only be in user code.

So the trick here is to have the socket interface empty by default and only fill it with abstract defs if sockets are used.
That makes it possible for the event loop implementation to always include the socket interface. We get abstract def checks if sockets are used.
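The trick can be tried in isolation with stand-in names. In this sketch `SketchLoop`, `Polling`, and `FakeSocket` are hypothetical; the first half plays the role of code loaded with the prelude, and the second half plays the role of code that would only be loaded by `require "socket"`.

```crystal
# --- Loaded unconditionally (stands in for the prelude) ---
module SketchLoop
  # The socket interface starts out empty.
  module Socket
  end

  # An implementation can always include the interface and provide the
  # methods, even though FakeSocket is not defined at this point.
  class Polling
    include Socket

    def close(socket : FakeSocket) : Nil
      socket.closed = true
    end
  end
end

# --- Loaded only when sockets are used (stands in for `require "socket"`) ---
class FakeSocket
  property? closed = false
end

# Reopening the module adds the abstract defs, so implementations are
# checked against them only in programs that load this part.
module SketchLoop::Socket
  abstract def close(socket : ::FakeSocket) : Nil
end

socket = FakeSocket.new
SketchLoop::Polling.new.close(socket)
```

Leaving out the second half should still compile, since `Polling#close` is then never called and its type restriction is never resolved.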

beta-ziliani · 2024-05-27T19:20:32Z

text/0007-event_loop-refactor.md

+  # Opens a connection on *socket* to the target *address* and continues fiber
+  # when the connection has been established.
+  # Returns `IO::Error` but does not raise.
+  abstract def connect(socket : ::Socket, address : ::Socket::Addrinfo | ::Socket::Address, timeout : ::Time::Span?) : IO::Error?


So I think it's probably best to introduce an internal struct type which holds a reference to LibC::Sockaddr and its size.

Or to leave it as the union?

beta-ziliani · 2024-05-27T19:58:26Z

text/0007-event_loop-refactor.md

+#### Socket
+
+One instance of this problem shows already in the core features: The event loop interface has type restrictions of the `Socket` namespace in abstract defs, but `Socket` is not in the core lib.
+
+Options:
+
+- Omit those abstract defs (dilutes the interface, so not ideal)
+- Split `EventLoop` interface and add parts of it only with `require “socket”`
+- Add stub declarations for the involved types (`Socket::Handle` and `Socket::Address` - `Socket` itself is only used as parameter type which is technically okay for abstract methods)


I know I was an advocate for this style, but now that I see it I'm having a little itch. What would it look like with a split interface for the EventLoop? Say, EventLoop::SocketInterface, EventLoop::FileInterface, ... and then let each EL implementation include the module interfaces it complies with.

straight-shoota · 2024-05-28T13:13:38Z

text/0007-event_loop-refactor.md

+    # Reads at least one byte from the socket into *slice* and continues fiber
+    # when the read is complete.
+    # Returns the number of bytes read.
+    abstract def read(socket : ::Socket, slice : Bytes) : Int32
+
+    # Writes at least one byte from *slice* to the socket and continues fiber
+    # when the write is complete.
+    # Returns the number of bytes written.
+    abstract def write(socket : ::Socket, slice : Bytes) : Int32


I'm wondering if we should use the terms read/write or send/receive. Their meaning is essentially equivalent.
read/write would be more similar to the FileDescriptor equivalents.
But the implementations tend to use the send/receive system APIs. So I might prefer that.

straight-shoota · 2024-05-29T17:51:43Z

I have created two PRs implementing the base event loop interfaces:

FileDescriptor: Add EventLoop::FileDescriptor module crystal#14639
Socket: Add EventLoop::Socket module crystal#14643

ysbaddaden

I left some suggestions for the API documentation (typos, improve FD#read and FD#write).

straight-shoota self-assigned this May 2, 2024

Event loop refactor

9a93501

straight-shoota force-pushed the rfc/0005 branch from fe631ed to 9a93501 Compare May 2, 2024 19:07

straight-shoota marked this pull request as draft May 2, 2024 19:07

straight-shoota changed the title ~~Event loop refactor~~ RFC #0007: Event loop refactor May 2, 2024

straight-shoota mentioned this pull request May 3, 2024

RFC: Refactor Crystal::EventLoop to disconnect it from LibEvent crystal-lang/crystal#10766

Open

yxhuvud reviewed May 3, 2024

Blacksmoke16 reviewed May 7, 2024

straight-shoota added 3 commits May 7, 2024 18:44

Fix filename

04a7c37

Fill guide-level explanation

c70c896

Add abstract interface description

ba1e1a4

straight-shoota commented May 7, 2024

straight-shoota mentioned this pull request May 24, 2024

Extract #system_read and #system_write for FileDescriptor and Socket crystal-lang/crystal#14626

Merged

straight-shoota added 2 commits May 27, 2024 12:15

Fix associate *Socket* subsection with *Optional event loop features*

3e40217

fixup

3bff4d1

straight-shoota commented May 27, 2024

beta-ziliani reviewed May 27, 2024

straight-shoota mentioned this pull request May 28, 2024

Drop Crystal::System::Socket#system_send crystal-lang/crystal#14637

Merged

straight-shoota added 5 commits May 28, 2024 15:03

Resolve *read and write behaviour*

9afc76f

Resolve *Timeout and Resume events*

9ab3a8e

Fix typo

0647def

Drop EventLoop#send, #receive

113e401

Separate EventLoop interface into modules

b1978fa

straight-shoota commented May 28, 2024

straight-shoota added 2 commits May 28, 2024 15:14

typo

a42930c

Add run and interrupt for completeness

5db0edb

straight-shoota mentioned this pull request May 28, 2024

Add EventLoop::FileDescriptor module crystal-lang/crystal#14639

Merged

straight-shoota added 2 commits May 29, 2024 19:36

fixup

a08f935

Improve API docs for EventLoop::Socket

5b7f663

straight-shoota mentioned this pull request May 29, 2024

Add EventLoop::Socket module crystal-lang/crystal#14643

Merged

beta-ziliani reviewed May 30, 2024

Update text/0007-event_loop-refactor.md

8b9b09b

Co-authored-by: Beta Ziliani <beta@manas.tech>

ysbaddaden reviewed May 31, 2024

Apply suggestions from code review

04462e8

Co-authored-by: Julien Portalier <julien@portalier.com>
