
M:N Threads, now w/ macOS support (kqueue) #9178

Merged: 2 commits into ruby:master from jpcamara:kqueue_mn_threads, Dec 20, 2023

Conversation

@jpcamara (Contributor) commented Dec 9, 2023

Adds macOS support to M:N threads by adding in kqueue/kevent calls when present. Technically this would open up support for FreeBSD as well, but I don't have a way of testing that, so I'm not sure how well it does or doesn't work there.

I wanted to get support going for macOS so more devs can try out M:N threads and test it. I do think there should be a larger discussion around the potential relationship between this and some of the awesome fiber scheduler work over the last few years (and the potential of utilizing anything from io-event?). But that can be a topic for another day - let's get macOS support in the meantime!

Disclaimer: C isn't my day-to-day language, so I could definitely use feedback there. I'm also more of a kernel event queue (kqueue, epoll, io_uring) enthusiast than an expert - kqueue isn't something I have specific experience writing against, just lots of reading code and small toy programs up to this point.

https://bugs.ruby-lang.org/issues/20053

cc @ko1 @byroot @ioquatix

@@ -3081,7 +3094,7 @@ rb_thread_create_timer_thread(void)

CLOSE_INVALIDATE_PAIR(timer_th.comm_fds);
#if HAVE_SYS_EPOLL_H && USE_MN_THREADS
@jpcamara (Contributor Author):

I originally made this #if (HAVE_SYS_EPOLL_H || HAVE_SYS_EVENT_H) && USE_MN_THREADS, but several specs failed with EBADF errors. The docs (https://man.openbsd.org/kqueue.2) say: "The queue is not inherited by a child created with fork(2). Similarly, kqueues cannot be passed across UNIX-domain sockets." Possibly that is the reason it's invalid at this point?
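For illustration, a minimal standalone sketch (not part of this PR) of the documented fork behavior: the descriptor number is inherited by the child, but it no longer refers to a kqueue, so using it fails - typically with EBADF, which would line up with the spec failures.

#include <sys/event.h>
#include <sys/wait.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int
main(void)
{
    int kq = kqueue();
    if (kq == -1) { perror("kqueue"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {
        // Child: the descriptor number was inherited, but per kqueue(2)
        // the queue itself was not, so this fails (typically EBADF).
        struct kevent ev;
        struct timespec ts = {0, 0};
        int r = kevent(kq, NULL, 0, &ev, 1, &ts);
        printf("child kevent => %d (%s)\n", r, r == -1 ? strerror(errno) : "ok");
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return 0;
}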

@@ -575,7 +585,12 @@ timer_thread_register_waiting(rb_thread_t *th, int fd, enum thread_sched_waiting
}
else {
VM_ASSERT(fd >= 0);
#if HAVE_SYS_EVENT_H
EV_SET(&ke[num_events], fd, EVFILT_READ, EV_ADD, 0, 0, (void *)th);
@jpcamara (Contributor Author):

Depending on the usage, you often see EV_ADD combined with EV_CLEAR or EV_ONESHOT. Since the MN code automatically handles unregistering events, those flags are possibly not necessary here (see the sketch below).
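For context, a small sketch of the difference (register_read is a hypothetical helper, not code from this PR): EV_ADD alone is level-triggered and stays registered until EV_DELETE, while EV_ADD|EV_ONESHOT is removed by the kernel after the first delivery.

#include <sys/event.h>

// Hypothetical helper contrasting the flags. `th` stands in for the
// rb_thread_t pointer the PR stores in udata.
static void
register_read(int kq, int fd, void *th, int oneshot)
{
    struct kevent ke;
    // EV_ADD alone: level-triggered, stays registered until EV_DELETE --
    // fine here because the MN code unregisters fds itself.
    // EV_ADD|EV_ONESHOT: the kernel deletes the event after the first
    // delivery, so no explicit EV_DELETE is needed.
    EV_SET(&ke, fd, EVFILT_READ, oneshot ? (EV_ADD | EV_ONESHOT) : EV_ADD, 0, 0, th);
    kevent(kq, &ke, 1, NULL, 0, NULL); // apply the change immediately
}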

}
break;

default:
RUBY_DEBUG_LOG("%d event(s)", r);

#if HAVE_SYS_EVENT_H
@jpcamara (Contributor Author):

This block of code is very similar to the epoll code, with small kevent-specific changes. Maybe they should be consolidated more?


th->sched.waiting_reason.flags = thread_sched_waiting_none;
th->sched.waiting_reason.data.fd = -1;
th->sched.waiting_reason.data.result = filter;
@jpcamara (Contributor Author):

We could receive multiple filters back, so setting this over and over again and ending on the last value probably doesn't make sense.

@jpcamara (Contributor Author):

It's actually fine for now. Because the flags get set back to thread_sched_waiting_none, any subsequent event in the loop pointing at this thread will act like a no-op and hit the // already released else clause. It is not granular about which event fired, just that the thread can be active again.

@jpcamara (Contributor Author) commented Dec 9, 2023

Got some Mac test failures that I'll look into.

@ioquatix (Member) commented Dec 9, 2023

Do you mind me asking: if two threads wait for the same IO to become readable (or writable) or both, how is that handled?

// Linux 2.6.9 or later is needed to pass NULL as data.
if (epoll_ctl(timer_th.epoll_fd, EPOLL_CTL_DEL, fd, NULL) == -1) {
if (epoll_ctl(timer_th.kq_fd, EPOLL_CTL_DEL, fd, NULL) == -1) {
@ioquatix (Member):

Is it wrong to use epoll_ctl with kq_fd?

@jpcamara (Contributor Author):

I was somewhat worried this might cause confusion 😕

I renamed the existing epoll_fd to kq_fd. Since both epoll and kqueue are kernel event queues, it felt reasonable as a name.

But because it sounds so much like kqueue, maybe it's confusing. Since they're both int descriptors, the shared name allowed me to avoid some #if checks. So I could:

  1. Change the name to something less overloaded sounding? event_fd?
  2. Keep both and just have one or two additional checks.
  3. Leave it?

@jpcamara (Contributor Author):

I changed it to event_fd

@jpcamara (Contributor Author) commented Dec 9, 2023

Do you mind me asking: if two threads wait for the same IO to become readable (or writable) or both, how is that handled?

Definitely the type of question I want - thanks! I'm not sure I have an exact answer, but I'll share some thoughts.

  1. That seems like undefined/corrupting behavior in normal threaded Ruby - it doesn't seem safe/advisable to operate on the same IO across 1:1 threads. And since M:N threads act like regular threads, the same basic answer probably applies? I.e., it's a user bug. But maybe @ko1 could provide more context as to whether the MN implementation attempts to do anything about this (I don't think so?).
  2. If nothing is in place to stop it, I think it would cause events to not fire on the appropriate thread. kqueue itself is thread-safe, but when I set the filter in EV_SET I set the udata to the thread: EV_SET(&ke[num_events], fd, EVFILT_READ, EV_ADD, 0, 0, (void *)th). It seems like if multiple threads did this, only the last thread would be returned once the event fires (see the sketch below). Do you have any insight into that kqueue behavior, @ioquatix, and is that accurate?

I'm not sure whether it'd be expected to handle this scenario in this code or not, but I'm open to it if there is an appropriate solution or behavior for MN threads.
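As a way to test the speculation in (2), here is a minimal standalone sketch (not PR code). kqueue keys events by (ident, filter), so a second EV_ADD for the same fd and filter updates the existing registration in place - including its udata - rather than adding a duplicate:

#include <sys/event.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    int fds[2];
    if (pipe(fds) == -1) return 1;

    int kq = kqueue();
    struct kevent ke;

    // First registration, udata pretending to be "thread A".
    EV_SET(&ke, fds[0], EVFILT_READ, EV_ADD, 0, 0, "thread A");
    kevent(kq, &ke, 1, NULL, 0, NULL);

    // Second registration for the same (fd, filter): modifies the existing
    // event in place, replacing udata with "thread B".
    EV_SET(&ke, fds[0], EVFILT_READ, EV_ADD, 0, 0, "thread B");
    kevent(kq, &ke, 1, NULL, 0, NULL);

    write(fds[1], "x", 1); // make the read end ready

    struct kevent out;
    kevent(kq, NULL, 0, &out, 1, NULL);
    printf("udata: %s\n", (const char *)out.udata); // prints "thread B"
    return 0;
}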

@ioquatix: using the fiber scheduler with io-event on Mac (i.e. kqueue), what would be the behavior if multiple fibers tried to wait for the same IO to be readable or writable?

I am curious to see the behavior, so I will give it a try.

@jpcamara (Contributor Author) commented Dec 9, 2023

Do you mind me asking: if two threads wait for the same IO to become readable (or writable) or both, how is that handled?

Hmmm, I do see now that epoll can return EEXIST, which means another thread has already registered an fd, and that has special handling in the current code - might need additional handling here.
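For reference, a sketch of that epoll behavior (register_read_epoll is a hypothetical helper, not the PR's code): EPOLL_CTL_ADD on an already-registered fd fails with EEXIST, and one common fallback is EPOLL_CTL_MOD.

#include <sys/epoll.h>
#include <errno.h>

// Hypothetical helper showing the EEXIST case: EPOLL_CTL_ADD on an fd
// that is already registered fails, and one response is to modify the
// existing registration instead.
static int
register_read_epoll(int epfd, int fd, void *data)
{
    struct epoll_event ev = { .events = EPOLLIN, .data = { .ptr = data } };

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1) {
        if (errno == EEXIST) {
            // Another waiter got here first; overwrite the registration.
            return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
        }
        return -1;
    }
    return 0;
}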

@ioquatix (Member) commented Dec 11, 2023

I think it would be better to split the implementations out into different files, i.e. I'm against the complexity of having 2+ different OS implementations in the same file. In addition, it feels like we are solving problems that are already solved by the io-event gem. Maybe it would be better to just use that instead, via the existing hooks.

@jpcamara (Contributor Author) commented Dec 11, 2023

I think it would be better to split the implementations out into different files, i.e. I'm against the complexity of having 2+ different OS implementations in the same file.

Totally open to changing the preprocessor approach and splitting things out into more functions. The implementation is very small, though, so a dedicated file seems like overkill. Considering how flat the CRuby codebase is, I did not want to add an extra file without feedback from @ko1 on the approach. But I could at least get rid of all the preprocessor chunks if requested.

In addition, it feels like we are solving problems that are already solved by the io-event gem. Maybe it would be better to just use that instead, via the existing hooks.

As a production user of async/io-event, this was my thought when I first saw the implementation of M:N threads. I think the current M:N approach will hit limits that the fiber scheduler is already capable of handling. But I couldn't see an easy way to utilize io-event - the implementations seem pretty incompatible.

The existing fiber scheduler hooks are a well-defined interface, and it'd be great to utilize them. I would love to see these two approaches come together, but that's also a large, fundamental change.

I haven't seen activity or discussion around bringing the two together, and I wanted to see macOS support for M:N threads. That's why I've gone with this current approach.

@jpcamara (Contributor Author):

Do you mind me asking: if two threads wait for the same IO to become readable (or writable) or both, how is that handled?

Hmmm, I do see now that epoll can return EEXIST, which means another thread has already registered an fd, and that has special handling in the current code - might need additional handling here.

I'm not sure I can even come up with a valid use case for this - do you have a suggestion? When would it be correct for multiple threads to listen to the same IO? It'd still be good to be able to handle it, but it seems like it would be a bug to begin with?

@ioquatix (Member) commented Dec 12, 2023

I'm not sure I can even come up with a valid use case for this - do you have a suggestion? When would it be correct for multiple threads to listen to the same IO? It'd still be good to be able to handle it, but it seems like it would be a bug to begin with?

It's fairly normal to have one thread reading and several threads writing to the same network socket, e.g. HTTP/2.

Right now, it's absolutely possible to have several threads writing to $stdout for example (via puts) so for sure this should work. The naive implementation of the selector won't work at all, so I advise you to use libev instead if you want something that works with minimal effort. libev takes care of multiplexing multiple waiters on a single file descriptor using whatever is most efficient for the implementation. I recommend you look at the implementation of either libev or io-event in that respect.

Using libev might present a license issue though...

@jpcamara force-pushed the kqueue_mn_threads branch 3 times, most recently from 45c1999 to 12d9b03 (December 13, 2023)
@hsbt assigned ko1 (Dec 13, 2023)
@hsbt requested a review from ko1 (December 13, 2023)
@jpcamara force-pushed the kqueue_mn_threads branch 3 times, most recently from 15db3e3 to 18acf94 (December 14, 2023)
@jpcamara (Contributor Author) commented Dec 14, 2023

@ioquatix thanks for the reply! Fundamentally I agree that multiple registrations on the same IO is a scenario that should be accounted for - I'm just not sure how to create a valid scenario to test with.

It's fairly normal to have one thread reading and several threads writing to the same network socket, e.g. HTTP/2.

Considering you've implemented an HTTP/2 server, you definitely have a lot more experience here than me! But in my understanding, while multiple threads/fibers can share the same socket in HTTP/2, they don't read/write to that socket simultaneously. There would have to be coordinated access to the shared socket to make sure valid chunks were read and written.

  1. Even though they'd coordinate reads/writes, are you saying that a sensible threaded HTTP/2 server would have each thread individually register for read/write availability on that same socket, rather than a coordinator registering once and handing off to each thread?
  2. Does Falcon work this way? Does every fiber that shares a particular socket also register its own events with it?

It's tough in the Ruby world to find a concrete example of this since, afaik, Falcon is the only HTTP/2-enabled Ruby server. Is there a place in the Falcon code I can look at for an example of how multiple reads/writes register on the same socket?

Right now, it's absolutely possible to have several threads writing to $stdout for example (via puts) so for sure this should work.

This is also a tough example, since $stdout seems to always be writable, so it never tries to wait for the event to be ready. Even using io-event I haven't been able to trigger any waiting behavior. I think I may have gotten it to trigger in io-event by handing enormous strings to puts (tens to hundreds of megabytes?), which may have to do with the buffer usage there and its maximum_size:

// https://github.com/socketry/io-event/blob/main/ext/io/event/selector/kqueue.c#L697C2-L709C38
rb_io_buffer_get_bytes_for_reading(arguments->buffer, &base, &size);
size_t length = arguments->length;
size_t offset = arguments->offset;
size_t total = 0;
size_t maximum_size = size - offset;

But in MN-threaded code I have not been able to get it to register, even with multiple threads calling puts with huge amounts of content.
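For what it's worth, here is a minimal standalone sketch (not PR code, and using a pipe rather than $stdout) of one way to genuinely trigger a write-readiness wait: make the write end non-blocking and fill the kernel buffer until write fails with EAGAIN, at which point an EVFILT_WRITE wait actually has something to wait for.

#include <sys/event.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    int fds[2];
    if (pipe(fds) == -1) return 1;
    fcntl(fds[1], F_SETFL, O_NONBLOCK);

    // Fill the pipe's kernel buffer until write() fails with EAGAIN,
    // i.e. until the write end genuinely isn't writable anymore.
    char buf[4096] = {0};
    while (write(fds[1], buf, sizeof(buf)) > 0)
        ;
    printf("pipe full, write failed with errno=%d\n", errno);

    int kq = kqueue();
    struct kevent ke, out;
    EV_SET(&ke, fds[1], EVFILT_WRITE, EV_ADD, 0, 0, NULL);
    kevent(kq, &ke, 1, NULL, 0, NULL);

    read(fds[0], buf, sizeof(buf));     // drain some data...
    kevent(kq, NULL, 0, &out, 1, NULL); // ...and the wait completes
    printf("write end ready again, %ld bytes of space\n", (long)out.data);
    return 0;
}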

The naive implementation of the selector won't work at all, so I advise you to use libev instead if you want something that works with minimal effort. libev takes care of multiplexing multiple waiters on a single file descriptor using whatever is most efficient for the implementation.
Using libev might present a license issue though...

Yea - even license aside, it does not seem well maintained. And it would require the Ruby team agreeing to bring in that dependency. And it would still likely mean a total rewrite of how MN threads currently work.

It'd be nice to find an example that doesn't represent buggy behavior, which I could use to actually test this scenario from Ruby code.

@ioquatix (Member):

Even though they'd coordinate reads/writes, are you saying that a sensible threaded HTTP/2 server would have each thread individually register for read/write availability on that same socket, rather than a coordinator registering once and handing off to each thread?

Reads and writes don't need to be coordinated. However, usually there is one thread doing the reading and several threads doing the writing (usually with a mutex). So yes, potentially.

Yea - even license aside, it does not seem well maintained.

Don't mistake extremely stable for "not well maintained".

It'd be nice to find an example that doesn't represent buggy behavior, which I could use to actually test this scenario from Ruby code.

Yes, I agree. I tried to create one, but I'm unsure if it's running on the MN model. Is there any way to tell? E.g. RubyVM.mn_threads? or something?

@jpcamara (Contributor Author):

Even though they'd coordinate reads/writes, are you saying that a sensible threaded HTTP/2 server would have each thread individually register for read/write availability on that same socket, rather than a coordinator registering once and handing off to each thread?

Reads and writes don't need to be coordinated. However, usually there is one thread doing the reading and several threads doing the writing (usually with a mutex). So yes, potentially.

A Falcon example of this would be great 👍🏼

Yea - even license aside, it does not seem well maintained.

Don't mistake extremely stable for "not well maintained".

Fair enough - but it's a big lift, and if there were going to be a big lift, a better one would be to use io-event. Neither would be trivial, and both would involve decisions well beyond what I can make in isolation.

It'd be nice to find an example that doesn't represent buggy behavior, which I could use to actually test this scenario from Ruby code.

Yes, I agree. I tried to create one, but I'm unsure if it's running on the MN model. Is there any way to tell? E.g. RubyVM.mn_threads? or something?

To enable it you use RUBY_MN_THREADS=1, and when running ruby -v you'd see +MN in the output. For example:

ruby 3.3.0dev (2023-12-14T02:02:44Z kqueue_mn_threads 18acf94fe4) +MN [arm64-darwin23]

It's not 100% clear right now which paths will lead to the MN kernel event queue code and which won't. I've found that using Net::HTTP.get, for instance, runs through the MN code, but a socket#readpartial isn't doing so for me. Still looking into it.

@ioquatix (Member):

A Falcon example of this would be great 👍🏼

Falcon uses async and supports HTTP/2. Every HTTP/2 connection has a background reader, while individual HTTP/2 streams will write packets to the socket. I'm not sure this example will help you with the MN scheduler though.

Fair enough - but it's a big lift, and if there were going to be a big lift, a better one would be to use io-event. Neither would be trivial, and both would involve decisions well beyond what I can make in isolation.

Makes sense.

To enable it you use RUBY_MN_THREADS=1, and when running ruby -v you'd see +MN in the output.

Thanks, that's helpful - yes, I can see the output, but I can't see any debug logs from the MN scheduler... so I don't think it's working?

@ioquatix (Member):

I'm rethinking this comment a bit. EEXIST would only get triggered if another native thread had already registered an fd - but MN shares native threads. So if multiple Ruby threads are sharing the same native thread, this won't be triggered. And epoll registrations, afaik, will overwrite previous event registrations for the same fd.

I think the result you'd expect to see here is a deadlock or an errno/exception if it wasn't working correctly.

@ioquatix (Member) commented Dec 14, 2023

It looks to me like the epoll support is very optimistic. It will try to register the fd here:

ruby/thread.c, line 1689 in b7e89d4:

thread_sched_wait_events(TH_SCHED(th), th, fd, waitfd_to_waiting_flag(events), NULL);

but if it fails (which is the case with more than one thread waiting on the same IO), it will fall back to a blocking system call. In fact, it looks like the return value isn't even checked, so maybe it still does a blocking system call on every wait operation? I don't really understand what the intention was, and there are very few comments explaining the behaviour.

@jpcamara (Contributor Author):

It looks to me like the epoll support is very optimistic. It will try to register the fd here:

ruby/thread.c, line 1689 in b7e89d4:

thread_sched_wait_events(TH_SCHED(th), th, fd, waitfd_to_waiting_flag(events), NULL);

but if it fails (which is the case with more than one thread waiting on the same IO), it will fall back to a blocking system call. In fact, it looks like the return value isn't even checked, so maybe it still does a blocking system call on every wait operation? I don't really understand what the intention was, and there are very few comments explaining the behaviour.

Yea, the primary place I see MN actually take effect is in rb_thread_wait_for_single_fd. I agree that it's a bit hard to discern what behavior is expected in different scenarios.

@jpcamara (Contributor Author):

This has been a good discussion, @ioquatix, because it's made me dig deeper into a few different things. After reviewing a lot of the MN code and codepaths, I also think this kqueue approach is likely good as it is, because there are lots of improvements needed in MN in general. It should behave almost identically to the current epoll work. It gets macOS running MN code, and I think there should be some real discussions about how to piggyback on all the clear hooks into blocking behavior that are well-defined by the fiber scheduler (and ideally port io-event over in a way that works for both systems).

I know you know this part in particular, but I'd like to lay it out:

The fiber scheduler has a very well-defined interface for hooking into blocking operations and handling them efficiently using kernel-specific technologies (epoll, kqueue, io_uring, etc). At just about every possible blocking operation, a fiber scheduler implementation can take over. So io-event/async handles everything mentioned in scheduler.c:

 *  Hook methods are:
 *
 *  * #io_wait, #io_read, #io_write, #io_pread, #io_pwrite, and #io_select, #io_close
 *  * #process_wait
 *  * #kernel_sleep
 *  * #timeout_after
 *  * #address_resolve
 *  * #block and #unblock
 *  * (the list is expanded as Ruby developers make more methods having non-blocking calls)

It's pretty exhaustive 👏!

The MN implementation is two parts:

  1. A green thread implementation - basically a thread wrapper around a coroutine, with a shared pool of native threads. See native_thread_create_shared for that creation process.
  2. A very basic event loop implementation

(1) seems like it was the biggest lift. It's tied into ractors and pthreads pretty deeply.
(2) seems like a more nascent initiative. Unlike the fiber scheduler, it is implemented in very few places. There are only three places it is hooked into right now:

  • native_sleep
  • rb_thread_io_blocking_call
  • rb_thread_wait_for_single_fd

And even in those places, again unlike the fiber scheduler, it does not hijack the operation. It uses epoll/kqueue to wait for readiness, then falls back to the rest of the normal code. For instance, in rb_thread_wait_for_single_fd it will use kevent to wait for readiness on a socket, and then, because it doesn't return from the method, it still calls select as well - which doesn't block, because kevent already waited for readiness. But this is clearly a WIP - I can't see how that's the final desired behavior.
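To make that concrete, here is a minimal standalone sketch (not the PR's code) of the double wait just described: kevent establishes readiness first, so the select that follows returns immediately instead of blocking.

#include <sys/event.h>
#include <sys/select.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    int fds[2];
    if (pipe(fds) == -1) return 1;
    write(fds[1], "x", 1); // make the read end ready

    // Step 1: wait for readiness with kevent (the MN wait).
    int kq = kqueue();
    struct kevent ke, out;
    EV_SET(&ke, fds[0], EVFILT_READ, EV_ADD, 0, 0, NULL);
    kevent(kq, &ke, 1, &out, 1, NULL);

    // Step 2: the select() the normal code path still performs. It
    // returns immediately because readiness was already established.
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fds[0], &rfds);
    struct timeval tv = {0, 0};
    printf("select => %d (no blocking)\n", select(fds[0] + 1, &rfds, NULL, NULL, &tv));
    return 0;
}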

And when it hits scenarios it is not in control of, the MN runtime falls back to creating actual native threads. I'm not 100% following that whole flow from end to end yet, but the final decision is made in this method:

static int
native_thread_check_and_create_shared(rb_vm_t *vm)
{
    bool need_to_make = false;

    rb_native_mutex_lock(&vm->ractor.sched.lock);
    {
        unsigned int snt_cnt = vm->ractor.sched.snt_cnt;
        if (!vm->ractor.main_ractor->threads.sched.enable_mn_threads) snt_cnt++; // do not need snt for main ractor

        if (((int)snt_cnt < MINIMUM_SNT) ||
            (snt_cnt < vm->ractor.cnt  &&
             snt_cnt < vm->ractor.sched.max_cpu)) {

            RUBY_DEBUG_LOG("added snt:%u dnt:%u ractor_cnt:%u grq_cnt:%u",
                           vm->ractor.sched.snt_cnt,
                           vm->ractor.sched.dnt_cnt,
                           vm->ractor.cnt,
                           vm->ractor.sched.grq_cnt);

            vm->ractor.sched.snt_cnt++;
            need_to_make = true;
        }
        else {
            RUBY_DEBUG_LOG("snt:%d ractor_cnt:%d", (int)vm->ractor.sched.snt_cnt, (int)vm->ractor.cnt);
        }
    }
    rb_native_mutex_unlock(&vm->ractor.sched.lock);

    if (need_to_make) {
        struct rb_native_thread *nt = native_thread_alloc();
        nt->vm = vm;
        return native_thread_create0(nt);
    }
    else {
        return 0;
    }
}

If you run Ruby code using MN with Ruby debug logging, you'll see a lot of either "added snt" or "snt:". If you see a lot of "added snt", the runtime is just creating native threads. (This can happen a little during normal MN processing too, since the shared pool does not have to be size 1. But it shouldn't end up 1:1.)

I'll provide two examples - one that properly uses shared MN threads and kqueue, and one that falls back to full native threads and totally bypasses kqueue.

10.times.map do |i|
  Thread.new do
    # I've got a local server which just sleeps for 15 seconds
    Net::HTTP.get(URI("http://localhost:8080/15000ms"))
  end
end.each(&:join)

In this example you'd see the following debug logs. It repeatedly shares the same native thread (if you monitor activity, it only ever creates one additional native thread) and uses kqueue.

native_thread_check_and_create_shared added snt:0 dnt:1 ractor_cnt:1 grq_cnt:0
thread_sched_wait_events	wait fd:14
thread_sched_wait_events	sleep

native_thread_check_and_create_shared snt:2 ractor_cnt:1
thread_sched_wait_events	wait fd:15
thread_sched_wait_events	sleep

...and on and on 10ish times

This example will instead fall back to native threads - I'm assuming it must not hit the limited kqueue/epoll paths set up by MN (i.e. native_sleep, rb_thread_io_blocking_call, rb_thread_wait_for_single_fd):

hostname = 'localhost'
port = 8080
path = '/15000ms'

10.times.map do
  Thread.new do
    socket = TCPSocket.open(hostname, port)
    request = "GET #{path} HTTP/1.1\r\nHost: #{hostname}\r\n\r\n"
    socket.puts(request)
    response = socket.readpartial(1024)
    socket.close
  end
end.each(&:join)

You'd see the following debug logs. It repeatedly creates new threads (if you monitor activity, it is a 1:1 native-thread-to-Ruby-thread ratio) and uses select exclusively.

native_thread_check_and_create_shared added snt:0 dnt:1 ractor_cnt:1 grq_cnt:0
native_thread_check_and_create_shared added snt:0 dnt:2 ractor_cnt:1 grq_cnt:1
native_thread_check_and_create_shared added snt:0 dnt:3 ractor_cnt:1 grq_cnt:1
...and on and on creating 10 threads...

Lots of room for growth and improvement as future initiatives past this PR 👍🏼

@@ -753,14 +883,51 @@ timer_thread_polling(rb_vm_t *vm)
// simply retry
break;
default:
perror("epoll_wait");
rb_bug("epoll_wait errno:%d", errno);
perror("kq");
Contributor:

kq specific?

@jpcamara (Contributor Author):

Earlier naming, sorry - I changed it to event_wait to match the name at the top of the function.

}
}

rb_native_mutex_lock(&timer_th.waiting_lock);
{
#if HAVE_SYS_EVENT_H
if (num_events > 0) {
if (kevent(timer_th.event_fd, ke, num_events, NULL, 0, NULL) == -1) {
Contributor:

I'm not familiar with the kevent system call, but is there any failure we can ignore?

@ko1 (Contributor), Dec 15, 2023:

In other words, should kevent succeed every time? (Otherwise, is it a BUG?)

@jpcamara (Contributor Author):

Here's a list of the possible errors: https://man.freebsd.org/cgi/man.cgi?query=kevent&apropos=0&sektion=0&manpath=FreeBSD+9.0-RELEASE&arch=default&format=html#end

[Screenshot: the kevent(2) errors list from the man page]

I mostly followed the lead of the epoll_ctl call, in particular for EBADF. I'm not really sure if ignoring them is safe/a good idea.

Contributor:

io1 = open("/tmp/x", 'w')
io2 = IO.for_fd(io1.fileno)
p io1.close #=> nil
p io2.close #=> Bad file descriptor (Errno::EBADF)

I think EBADF is possible.
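For reference, kqueue can also report failures per change rather than failing the whole kevent() call: with EV_RECEIPT, each change comes back as an EV_ERROR receipt whose data field holds 0 or that change's errno. A sketch under that assumption (apply_changes is a hypothetical helper, not PR code):

#include <sys/event.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: apply a batch of changes and inspect failures per
// fd instead of treating any kevent() failure as fatal.
static int
apply_changes(int kq, struct kevent *changes, int n)
{
    struct kevent receipts[64];
    if (n > 64) return -1; // sketch-only bound

    // EV_RECEIPT forces one EV_ERROR "receipt" per change; its data field
    // holds 0 on success or the errno for that individual change.
    for (int i = 0; i < n; i++) changes[i].flags |= EV_RECEIPT;

    int r = kevent(kq, changes, n, receipts, n, NULL);
    if (r == -1) return -1; // the call itself failed

    for (int i = 0; i < r; i++) {
        if ((receipts[i].flags & EV_ERROR) && receipts[i].data != 0) {
            fprintf(stderr, "fd %d: %s\n", (int)receipts[i].ident,
                    strerror((int)receipts[i].data));
        }
    }
    return 0;
}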

thread.c (outdated):
#ifdef RUBY_THREAD_PTHREAD_H

static bool
thread_sched_wait_events_timeval(int fd, int events, struct timeval *timeout)
Contributor:

Let's receive th as a function parameter.

@ko1 (Contributor), Dec 15, 2023:

Also, the if (!th_has_dedicated_nt(th)) check can be located first:

thread_sched_wait_events_timeval(th, ...)
{
#ifdef RUBY_THREAD_PTHREAD_H
  if (!th->nt->dedicated) {
    ...
  }
#endif // RUBY_THREAD_PTHREAD_H
  return 0;
}

and the caller can be simpler, like:

  if (thread_sched_wait_events_timeval()) { return ...; // timeout }

@jpcamara (Contributor Author), Dec 16, 2023:

Great suggestions! Much cleaner now - you can review the specific change made here: 19fd62d

@ko1 (Contributor) commented Dec 15, 2023

I don't use macOS, so I can't check it. I'm okay with merging it if it has good quality, but I can't judge that. @nurse, do you have any thoughts on merging it now?


BTW, I'd planned to restructure the code when merging kqueue support, but the restructuring can be applied later, after Ruby 3.3.

@ko1 (Contributor) commented Dec 15, 2023

To me (if my kqueue knowledge is correct) it seems fine.

@ko1 (Contributor) commented Dec 15, 2023

You'd see the following debug logs. It repeatedly creates new threads (if you monitor activity, it is a 1:1 native-thread-to-Ruby-thread ratio) and uses select exclusively.

Do you know which method is generating more native threads? TCPSocket.open?

@jpcamara force-pushed the kqueue_mn_threads branch 2 times, most recently from 418b558 to 5543d5c (December 16, 2023)
@jpcamara (Contributor Author):

You'd see the following debug logs. It repeatedly creates new threads (if you monitor activity, it is a 1:1 native-thread-to-Ruby-thread ratio) and uses select exclusively.

Do you know which method is generating more native threads? TCPSocket.open?

@ko1 The readpartial call is where it hangs and where the native threads start getting created. But I think I know at least part of the problem: rb_thread_io_blocking_region always calls rb_thread_io_blocking_call with an events of 0, so it never tries to use the MN thread_sched_wait_events function, because it's hard-coded to always fail the check:

VALUE
rb_thread_io_blocking_region(rb_blocking_function_t *func, void *data1, int fd)
{
    return rb_thread_io_blocking_call(func, data1, fd, 0);
}

VALUE
rb_thread_io_blocking_call(rb_blocking_function_t *func, void *data1, int fd, int events)
{
    // ...
    // is always false because `events` is always 0
    if (events && !th_has_dedicated_nt(th)) {
        // ...
        thread_sched_wait_events(TH_SCHED(th), th, fd, waitfd_to_waiting_flag(events), NULL);
    }
#endif

@jpcamara force-pushed the kqueue_mn_threads branch 4 times, most recently from a9dfe23 to b71c0cd (December 17, 2023)
@jpcamara requested a review from ko1 (December 17, 2023)
* Allows macOS users to use M:N threads (and technically FreeBSD, though it has not been verified on FreeBSD)

* Include sys/event.h header check for macros, and include sys/event.h when present

* Rename epoll_fd to more generic kq_fd (Kernel event Queue) for use by both epoll and kqueue

* MAP_STACK is not available on macOS, so conditionally apply it to the mmap flags

* Set fd to close on exec

* Log debug messages specific to kqueue and epoll on creation

* close_invalidate raises an error for the kqueue fd on child process fork. It's unclear right now if that's a bug or if it's kqueue-specific behavior

Use kq with rb_thread_wait_for_single_fd

* Only platforms with `USE_POLL` (Linux) had changes applied to take advantage of kernel event queues; the changes needed to be applied to the `select` path as well so that kqueue could be used properly

* Clean up kqueue specific code and make sure only flags that were actually set are removed (or an error is raised)

* Also handle kevent-specific errnos, since most don't carry over from epoll to kqueue

* Use the more platform-standard close-on-exec approach of `fcntl` and `FD_CLOEXEC`. The io-event gem uses `ioctl`, but fcntl seems to be the recommended choice. It is also what Go, Bun, and Libuv use (a sketch of the pattern follows these notes)

* We're making changes in this file anyway - may as well fix a couple of spelling mistakes while here

Make sure FD_CLOEXEC carries over in dup

* Otherwise the kqueue descriptor should have FD_CLOEXEC but doesn't, and fails in assert_close_on_exec
* When we have the thread already, it saves a lookup

* `event_wait`, not `kq`

Clean up the `thread_sched_wait_events_timeval` calls

* By handling the PTHREAD check inside the function, all the other code can become much simpler and just call the function directly without additional checks
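As a side note on the close-on-exec bullets above, the fcntl pattern referred to is the standard one; a minimal sketch (set_cloexec is a hypothetical helper, not the PR's code):

#include <fcntl.h>

// Standard fcntl close-on-exec pattern: read the current descriptor
// flags and OR in FD_CLOEXEC so the fd does not leak across exec.
static int
set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}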
ko1 added a commit to ko1/ruby that referenced this pull request Dec 19, 2023
use `rb_thread_io_blocking_call()` instead of
`rb_thread_io_blocking_region()` more.

See ruby#9178 (comment)
ko1 added a commit that referenced this pull request Dec 19, 2023
use `rb_thread_io_blocking_call()` instead of
`rb_thread_io_blocking_region()` more.

See #9178 (comment)
@ko1 (Contributor) commented Dec 20, 2023

I got kevent failures:

[129/176] TestSocket_TCPSocket#test_accept_multithreadkevent: No such file or directory
../../src/trunk/test/runner.rb: [BUG] unregister/kevent fails. errno:2
ruby 3.3.0dev (2023-12-19T23:29:36Z :detached: 32ecda354f) +MN [arm64-darwin21]

RUBY_MN_THREADS=1 make test-all TESTS=socket blocks, so I killed the process (kill PID) and got this rb_bug().

@nurse (Member) commented Dec 20, 2023

I don't use macOS, so I can't check it. I'm okay with merging it if it has good quality, but I can't judge that. @nurse, do you have any thoughts on merging it now?

BTW, I'd planned to restructure the code when merging kqueue support, but the restructuring can be applied later, after Ruby 3.3.

Sounds good. Let's try it!

@ko1 merged commit 256f34a into ruby:master, Dec 20, 2023
97 checks passed
@ko1 (Contributor) commented Dec 20, 2023

🎉

@jeremyevans (Contributor):

I checked and OpenBSD can also use M:N threads with this. Haven't done any testing with it yet, though.

@jpcamara (Contributor Author):

I checked and OpenBSD can also use M:N threads with this. Haven't done any testing with it yet, though.

@jeremyevans yes, it should! I would have given it a try there too, but I'm not sure of a way to test it without having OpenBSD installed on a machine. Is there an easy way to give it a try?

@jeremyevans (Contributor):

Setting up OpenBSD in a virtual machine is probably the best way to give it a try.
