-
Notifications
You must be signed in to change notification settings - Fork 5.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Always deliver postponed job to main ractor #4336
Conversation
Profilers use `rb_postponed_job_register_one()` in signal handlers, and signals don't necessarily land on a thread that runs Ruby code and has a thread local execution context. The MJIT worker thread doesn't have an execution context, for example. Without an execution context, postponed job APIs were crashing. Always use the main ractor's execution context so postponed job APIs can work from non Ruby threads like in older versions. Tests courtesy of John Crepezzi <john.crepezzi@gmail.com> and John Hawthorn <john@hawthorn.email>. [Bug #17573]
|
It's Stackprof's signal handler that's running So it's not an MJIT problem, really. The crashes I want to solve happen on threads that C extensions spawn. I'm using MJIT to reproduce a similar situation without the need for extra gem dependencies. |
@XrXr @jhawthorn How about to mix #4108 and this PR? diff --git a/vm_trace.c b/vm_trace.c
index 63e4c3ac1a..3bc55f2c3a 100644
--- a/vm_trace.c
+++ b/vm_trace.c
@@ -1582,6 +1582,14 @@ postponed_job_register(rb_execution_context_t *ec, rb_vm_t *vm,
return PJRR_SUCCESS;
}
+static rb_execution_context_t *
+get_valid_ec(rb_vm_t *vm)
+{
+ rb_execution_context_t *ec = GET_EC();
+ if (ec == NULL) ec = rb_vm_main_ractor_ec(vm);
+ return ec;
+}
+
/*
* return 0 if job buffer is full
* Async-signal-safe
@@ -1589,8 +1597,8 @@ postponed_job_register(rb_execution_context_t *ec, rb_vm_t *vm,
int
rb_postponed_job_register(unsigned int flags, rb_postponed_job_func_t func, void *data)
{
- rb_execution_context_t *ec = GET_EC();
- rb_vm_t *vm = rb_ec_vm_ptr(ec);
+ rb_vm_t *vm = GET_VM();
+ rb_execution_context_t *ec = get_valid_ec(vm);
begin:
switch (postponed_job_register(ec, vm, flags, func, data, MAX_POSTPONED_JOB, vm->postponed_job_index)) {
@@ -1608,8 +1616,8 @@ rb_postponed_job_register(unsigned int flags, rb_postponed_job_func_t func, void
int
rb_postponed_job_register_one(unsigned int flags, rb_postponed_job_func_t func, void *data)
{
- rb_execution_context_t *ec = GET_EC();
- rb_vm_t *vm = rb_ec_vm_ptr(ec);
+ rb_vm_t *vm = GET_VM();
+ rb_execution_context_t *ec = get_valid_ec(vm);
rb_postponed_job_t *pjob;
rb_atomic_t i, index;
|
It complete your test. Advantages:
|
Sounds good to me. |
ebc6cc4
to
89f5e36
Compare
…Backport #17573] Use valid `ec` for postponed job. Postponed job can be registered from non-Ruby thread, which means `ec` in TLS can be NULL. In this case, use main thread's `ec` instead. See #4108 and #4336 --- vm_trace.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) Avoid assert failure when NULL EC is expected After 5680c38, postponed job APIs now expect to be called on native threads not managed by Ruby and handles getting a NULL execution context. However, in debug builds the change runs into an assertion failure with GET_EC() which asserts that EC is non-NULL. Avoid the assertion failure by passing `false` for `expect_ec` instead as the intention is to handle when there is no EC. Add a test from John Crepezzi and John Hawthorn to exercise this situation. See GH-4108 See GH-5094 [Bug #17573] Co-authored-by: John Hawthorn <john@hawthorn.email> Co-authored-by: John Crepezzi <john.crepezzi@gmail.com> --- ext/-test-/postponed_job/postponed_job.c | 31 ++++++++++++++++++++++++++ test/-ext-/postponed_job/test_postponed_job.rb | 7 ++++++ vm_trace.c | 2 +- 3 files changed, 39 insertions(+), 1 deletion(-)
Profilers use
rb_postponed_job_register_one()
in signal handlers, andsignals don't necessarily land on a thread that runs Ruby code and has a
thread local execution context. The MJIT worker thread doesn't have an
execution context, for example.
Without an execution context, postponed job APIs were crashing. Always
use the main ractor's execution context so postponed job APIs can work
from non Ruby threads like in older versions.
Tests courtesy of John Crepezzi and John Hawthorn.
Bug #17573
The test crashes without the code change.