Take advantage of Ruby 3 scheduler. #111

Merged
merged 9 commits into from
Jul 16, 2021

ioquatix
Member

Description

Take direct advantage of the Ruby 3 scheduler interface. While we will try to maintain compatibility as much as possible, there might be some internal scheduler changes which impact the concurrency model (order of operations).
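For readers unfamiliar with the interface being adopted, here is a minimal sketch of a Ruby 3 fiber scheduler. It is a toy, not Async's implementation: `SleepOnlyScheduler` and `run_sleep_demo` are hypothetical names, only `Kernel#sleep` is supported, and fibers are suspended with `Fiber.yield` for brevity. It assumes Ruby 3.0 or later.

```ruby
# Toy scheduler (hypothetical, for illustration only): it handles Kernel#sleep
# and nothing else.
class SleepOnlyScheduler
  def initialize
    @sleeping = {} # fiber => monotonic wake-up time
  end

  # Invoked by Fiber.schedule: create and start a non-blocking fiber.
  def fiber(&block)
    fiber = Fiber.new(blocking: false, &block)
    fiber.resume
    fiber
  end

  # Invoked instead of a real sleep when a non-blocking fiber calls Kernel#sleep.
  def kernel_sleep(duration = nil)
    raise NotImplementedError, "sleep without a duration" if duration.nil?
    @sleeping[Fiber.current] = now + duration
    Fiber.yield
  end

  # The remaining required hooks are stubbed out in this sketch.
  def block(blocker, timeout = nil)
    raise NotImplementedError
  end

  def unblock(blocker, fiber)
    raise NotImplementedError
  end

  def io_wait(io, events, timeout)
    raise NotImplementedError
  end

  # Invoked when the scheduler is replaced or the thread exits: wake sleepers
  # in wake-up order until none remain.
  def close
    until @sleeping.empty?
      fiber, wake_at = @sleeping.min_by { |_, time| time }
      @sleeping.delete(fiber)
      delay = wake_at - now
      sleep(delay) if delay > 0 # runs on the blocking main fiber, so this really sleeps
      fiber.resume
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

def run_sleep_demo
  order = []
  Fiber.set_scheduler(SleepOnlyScheduler.new)
  Fiber.schedule { sleep 0.02; order << :slow }
  Fiber.schedule { sleep 0.01; order << :fast }
  Fiber.set_scheduler(nil) # replacing the scheduler calls #close, which runs the loop
  order
end
```

Both fibers sleep concurrently, so `run_sleep_demo` records `:fast` before `:slow` even though the slow fiber was scheduled first.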

Types of Changes

  • New feature.
  • Breaking change.
  • Performance improvement.

@ioquatix ioquatix changed the title Change result into attribute and add specs. Take advantage of Ruby 3 scheduler. Apr 24, 2021
@ioquatix ioquatix force-pushed the native-scheduler branch 3 times, most recently from 0fa7c7b to e567794 Compare May 7, 2021 05:27
@ioquatix
Member Author

ioquatix commented May 7, 2021

Okay, interesting, macOS is passing, Linux is failing.

rspec ./spec/async/scheduler_spec.rb:126 is particularly racy.

@ioquatix
Member Author

ioquatix commented May 7, 2021

Async Scheduler thread tests:

-- C level backtrace information -------------------------------------------
/home/runner/.rubies/ruby-head/lib/libruby.so.3.1(rb_print_backtrace+0x11) [0x7f0a11fdd08a] vm_dump.c:759
/home/runner/.rubies/ruby-head/lib/libruby.so.3.1(rb_vm_bugreport) vm_dump.c:1041
/home/runner/.rubies/ruby-head/lib/libruby.so.3.1(rb_bug_for_fatal_signal+0xf4) [0x7f0a11de2d14] error.c:800
/home/runner/.rubies/ruby-head/lib/libruby.so.3.1(sigbus+0x4d) [0x7f0a11f324cd] signal.c:946
/lib/x86_64-linux-gnu/libc.so.6(0x7f0a11b54210) [0x7f0a11b54210]
/home/runner/.rubies/ruby-head/lib/libruby.so.3.1(RBASIC_CLASS+0x0) [0x7f0a11fcfbd8] ./include/ruby/internal/globals.h:128
/home/runner/.rubies/ruby-head/lib/libruby.so.3.1(gccct_method_search) vm_eval.c:422
/home/runner/.rubies/ruby-head/lib/libruby.so.3.1(rb_funcallv_scope) vm_eval.c:1013
/home/runner/.rubies/ruby-head/lib/libruby.so.3.1(rb_funcallv) vm_eval.c:1039
/home/runner/.rubies/ruby-head/lib/libruby.so.3.1(rb_fiber_scheduler_unblock+0x3e) [0x7f0a11f31f1e] scheduler.c:162
/home/runner/.rubies/ruby-head/lib/libruby.so.3.1(rb_threadptr_join_list_wakeup+0x2a) [0x7f0a11f7eb5a] thread.c:548
/home/runner/.rubies/ruby-head/lib/libruby.so.3.1(thread_start_func_2) thread.c:893
/home/runner/.rubies/ruby-head/lib/libruby.so.3.1(register_cached_thread_and_wait+0x0) [0x7f0a11f7f2a2] thread_pthread.c:1033
/home/runner/.rubies/ruby-head/lib/libruby.so.3.1(thread_start_func_1) thread_pthread.c:1040
/lib/x86_64-linux-gnu/libpthread.so.0(start_thread+0xd9) [0x7f0a11ad8609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f0a11c30293]

Seems like rb_fiber_scheduler_unblock is racing with something.

@ioquatix
Member Author

ioquatix commented May 7, 2021

This failure only occurs with EPoll on Linux:

image

@ioquatix
Member Author

ioquatix commented May 7, 2021

Okay, at least part of this was a bug due to the GVL releasing code being backwards in the EPoll backend of event: socketry/io-event@181c7a0

@ioquatix
Copy link
Member Author

ioquatix commented Jun 7, 2021

I found an interesting problem today. In async-pool, both entering and exiting Semaphore#acquire can be context-switching points. Because of this, once we have acquired the semaphore we need to be careful to release it again if a later step fails. socketry/async-pool@8c6ef4d

In addition, I realised that some parts of async-pool can invoke non-blocking interfaces, e.g. any time we call a method on the supplied class we should be cautious, because it can fail with an exception... so we probably need to improve the robustness of the implementation.
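A minimal sketch of the release-on-failure pattern described above. This is not the async-pool code from the linked commit: `TinySemaphore` and `checkout` are hypothetical stand-ins, and the lambda plays the role of the user-supplied constructor.

```ruby
# Hypothetical stand-in for a counting semaphore; not the async-pool API.
class TinySemaphore
  attr_reader :count

  def initialize(limit)
    @limit = limit
    @count = 0
  end

  def acquire
    raise "limit of #{@limit} reached" if @count >= @limit
    @count += 1
  end

  def release
    @count -= 1
  end
end

# Acquire a slot, then build the resource. If the user-supplied constructor
# raises, give the slot back before re-raising so it is not leaked.
def checkout(semaphore, constructor)
  semaphore.acquire
  begin
    constructor.call
  rescue Exception # also covers non-StandardError interruptions
    semaphore.release
    raise
  end
end

semaphore = TinySemaphore.new(2)
checkout(semaphore, -> { :connection }) # succeeds, so the slot stays held

begin
  checkout(semaphore, -> { raise IOError, "connect failed" })
rescue IOError
  # The failed attempt has already released its slot.
end

semaphore.count # => 1
```

On success the caller keeps the slot and is responsible for releasing it later; only the failure path releases inside `checkout`.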

Ruby 3 allows mixing `Fiber#resume/#yield` and `Fiber#transfer`. We can
take advantage of that to minimise the impact of non-blocking operations
on user flow control.

Previously, non-blocking operations would invoke `Fiber.yield` and this
was a user-visible side-effect. We did take advantage of it, but it also
meant that integration of Async with existing usage of Fiber could be
problematic. We tracked the most obvious issues in `enumerator_spec.rb`.

Now, non-blocking operations transfer directly to the scheduler fiber
and thus don't impact other usage of resume/yield.
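
The effect can be sketched in plain Ruby, without Async. In this hypothetical example (`run_transfer_demo`, `scheduler`, `wait` and `generator` are illustrative names, not Async internals), a generator-style fiber is driven by `resume`/`Fiber.yield`, while its simulated non-blocking waits use `transfer` to reach a stand-in scheduler fiber. It assumes Ruby 3.0 or later.

```ruby
def run_transfer_demo
  log = []

  # Stand-in for the scheduler fiber. It is only ever entered via #transfer,
  # and hands control straight back to whichever fiber was waiting.
  scheduler = Fiber.new do |waiting|
    loop do
      log << :scheduler
      waiting = waiting.transfer
    end
  end

  # Simulated non-blocking operation: transfer to the scheduler, not Fiber.yield.
  wait = -> { scheduler.transfer(Fiber.current) }

  # User code with its own resume/yield flow control.
  generator = Fiber.new do
    wait.call
    Fiber.yield 1
    wait.call
    Fiber.yield 2
    :done
  end

  values = [generator.resume, generator.resume, generator.resume]
  [values, log]
end

values, log = run_transfer_demo
values # => [1, 2, :done]
log    # => [:scheduler, :scheduler]
```

The caller of `generator.resume` only ever sees the values passed to `Fiber.yield`; the trips through the scheduler fiber are invisible to it.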
@ioquatix ioquatix merged commit e7dd8c1 into master Jul 16, 2021
@ioquatix ioquatix deleted the native-scheduler branch July 16, 2021 05:16