-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support ruby 3.2 + 3.3+ #17
Conversation
* There are lots of changes to how Ruby 3.3+ works with thread instrumentation. * There is no guarantee which thread will trigger hooks. For instance, in Ruby 3.3 the RUBY_INTERNAL_THREAD_EVENT_STARTED event is always called from a different thread * As a result, you have to use the `event_data->thread` member in conjunction with rb_internal_thread_specific_get/set to emulate "thread local" data tracking. You can't use actual thread locals anymore because the hooks are not fired on the same thread * As a result, we can't reliably retrieve the native thread id because ruby does not expose a way to do this and using system calls won't work on different threads. Now uses object_id in ruby 3.3+ * These changes also work better with the M:N thread scheduler, since even if we could access native thread ids the M:N scheduler mostly reuses threads so the associated thread ids would be wrong * Because the events can fire from different threads, the current way of getting the thread name (pthread_getname_np), will probably be wrong * Using rb_thread_current() in the instrumentation hook calls causes a segfault, but it seems to work fine in the GC methods and when starting/stopping tracing * Because we malloc the thread data in ruby 3.3, we free it up when we stop tracing, or whenever a thread exits
Windows is still broken which I thought was turned off since it would never work but I'll check it another time |
Any chance you could weigh in on the approach @byroot? In terms of the need to manually Also not sure the best approach for getting thread names to display. Most of my testing involves the threads exiting before the block can pull them from Thread.list so the names are always either wrong or empty |
thread_meta = (thread_meta_t*)malloc(sizeof(thread_meta_t)); | ||
memset(thread_meta, 0, sizeof(thread_meta_t)); | ||
thread_meta->thread = thread; | ||
thread_meta->thread_object_id = RB_NUM2INT(rb_obj_id(thread)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could Thread#native_thread_id
work here?
Although there's a stipulation that might make it unusable here:
Note
If the thread is not associated yet or already deassociated with a native thread, it returns nil. If the Ruby implementation uses M:N thread model, the ID may change depending on the timing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the input @amomchilov !
There are a couple obstacles, unfortunately.
First, as far as I can tell there is no way to safely access the native_thread_id
method from C without the GVL. The thread api exposed to C extension code is pretty limited, that is safe to run without the GVL. With the way this code works it is always updating the file directly during the hook events, which have no GVL guarantees. Part of me is hoping you'll tell me i'm wrong here and there is a way to invoke thread methods safely in this context :D
Second is like you mentioned - ideally we'd solve M:N support here as well. The M:N model is actually even more strict than that note, it always returns nil in the current implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense! I’m just squinting and doing loose pattern matching here. I have no expertise in this domain :)
There aren’t all that many threads. Touching their object IDs would be totally fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since for M:N threads the notion of a native thread id no longer exists in a lot of cases, I think it's a good idea to use the object id instead.
Having said that, I expect that for a while most folks will not be using M:N threads for Ruby 3.3 (e.g. most people won't be using ractors + won't be using M:N scheduling for the main ractor).
It would be nice to still use the native thread for such cases, methinks!
Having said all that, I realize there's already quite a few moving parts for this PR so I'm perfectly fine with starting with object id for all on 3.3 and punt on doing a fancier thing ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't reviewed too deeply, but the changes overall make sense to me.
For a sec I was worried the state allocation in the callback may be racy, but actually it's fine because a given thread can't process two events concurrently, so seems OK to me.
thread_meta = (thread_meta_t*)malloc(sizeof(thread_meta_t)); | ||
memset(thread_meta, 0, sizeof(thread_meta_t)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thread_meta = (thread_meta_t*)malloc(sizeof(thread_meta_t)); | |
memset(thread_meta, 0, sizeof(thread_meta_t)); | |
thread_meta = ZALLOC(thread_meta_t); |
#elif HAVE_GETTID | ||
native_thread_id = gettid(); | ||
#else | ||
native_thread_id = thread_local_meta->current_thread_serial; // TODO: Better fallback for Windows? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if you know, but the GVL API is a noop on Windows
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh, I didn't know about that!
In that case, maybe go with removing current_thread_serial
and adding an ifdef
to make the gem become a no-op on windows may be the way to go.
(I think it's still nice if the gem builds/installs, so that folks don't need to have conditions on their Gemfile
s)
@casperisfine Thanks for the feedback! I'll take a look at GVL tracing and see if I can clean this up a bit based on that. Do you have any thoughts on a way to track thread names that die before we can call |
// Used to mark function arguments that are deliberately left unused | ||
#ifdef __GNUC__ | ||
#define UNUSED_ARG __attribute__((unused)) | ||
#else | ||
#define UNUSED_ARG | ||
#endif | ||
|
||
typedef struct { | ||
VALUE thread; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed this. If you store a VALUE reference in that struct, you MUST mark it. Otherwise it can be GCed and the object slot re-used, or GC compaction could move that object in another slot.
But if you do that, it also solves your question about how to get the thread name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed this. If you store a VALUE reference in that struct, you MUST mark it. Otherwise it can be GCed and the object slot re-used, or GC compaction could move that object in another slot.
@byroot Thanks for the feedback here! Can you provide an example of how to do this in the context of this code? I've tried piecing it together hopping between gvltools and cruby, but hours in and I haven't gotten anything working. I can keep chipping away, but tbh the appropriate ways of when and where to mark in cruby are still pretty opaque to me.
Same goes for things like TypedData_Make_Struct
. I thought that might be a good approach because of the free/mark options it has, but I landed on this current version because I just couldn't get that working.
But if you do that, it also solves your question about how to get the thread name.
I think this lines up with @ivoanjo's comment:
(E.g. keep it in a C array, that we need to mark but otherwise can operate on without the GVL)
So i'm assuming a combo of mark + a large-ish C array (maybe 1024 like the CRuby instrumentation tests?) would suffice to hold onto the metadata I need so I can get the thread names once the GvlTracing block finishes.
I am curious - for something like vernier, which presumably would be used to wrap long running ruby processes, how would you expect it to efficiently keep track of thread names? For gvl-tracing, you're not terribly likely to be running 1k+ threads. But for something like vernier there really are no limits to what the code running inside of it might do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpcamara Not sure if you know this guide -- https://silverhammermba.github.io/emberb/c/#data -- it has a pretty good guide of how to setup a TypedData
object, including the marking.
Currently, gvl-tracing is all class methods ("static"), but I guess to be able to mark we'd need to introduce a TypedData object, so maybe something like a GvlTracing::State
object, which would contain inside it the threads array, that would then get marked.
I am curious - for something like vernier, which presumably would be used to wrap long running ruby processes, how would you expect it to efficiently keep track of thread names? For gvl-tracing, you're not terribly likely to be running 1k+ threads. But for something like vernier there really are no limits to what the code running inside of it might do.
This is a good point. I haven't looked at how Vernier does it but we actually don't need to necessarily do it at the end of the tracing. We could check from time to time if a name is available (e.g. to avoid doing it on every event), and then render it and move on, rather than waiting until the end.
The "wait until the end" strategy was just a bit of a simplification ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey!
First of all, big thanks for working on this! And for the patience with my sloooow review haha >_>
Having to free the data at the end of tracing. I guess that's ok? It was the only way I could figure out how to get the new rb_internal_thread_specific_* api working properly without a memory leak
This seems very fine! In fact, it's better than the previous design, where the data would "live forever" as part of the native thread's thread-local stuff.
Name is not going to work correctly in ruby 3.3+ as this code is right now, since the events don't guarantee to fire from their associated thread. Ideally we'd get the name from the thread, but that is kind of tricky since ruby does not expose a method for it, and Thread.list doesn't contain any threads that have exited at the end of the tracing
Yeah... This is a bit annoying. Maybe something similar to @casperisfine suggested would work: we could keep references to all the threads we've seen (even those that die) and then in stop
, we'd use that instead of Thread.list
to iterate and get names from.
(E.g. keep it in a C array, that we need to mark but otherwise can operate on without the GVL)
I know object_id is considered inefficient (Reduce usage of object_id in core gems ruby/ruby#9276), but the overhead here seems minimal enough? I like it better as an identifier than a serial id and it lets MN threads work the same as native threads
I think if you create enough threads for object_id
calls to add any measurable overhead, you're probably already holding it wrong 🤣 .
There is no guarantee which thread will trigger hooks. For instance, in Ruby 3.3 the RUBY_INTERNAL_THREAD_EVENT_STARTED event is always called from a different thread
I'm curious -- is that the case even when not using M:N scheduling on the main ractor? What thread does it get called from? 🤔
void *data = rb_internal_thread_specific_get(thread, gvl_tracing_key); | ||
thread_meta_t* thread_meta = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: Assigning directly to thread_meta simplifies a few things below I think :)
void *data = rb_internal_thread_specific_get(thread, gvl_tracing_key); | |
thread_meta_t* thread_meta = NULL; | |
thread_meta_t* data = (thread_meta_t *) rb_internal_thread_specific_get(thread, gvl_tracing_key); |
thread_meta = (thread_meta_t*)malloc(sizeof(thread_meta_t)); | ||
memset(thread_meta, 0, sizeof(thread_meta_t)); | ||
thread_meta->thread = thread; | ||
thread_meta->thread_object_id = RB_NUM2INT(rb_obj_id(thread)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since for M:N threads the notion of a native thread id no longer exists in a lot of cases, I think it's a good idea to use the object id instead.
Having said that, I expect that for a while most folks will not be using M:N threads for Ruby 3.3 (e.g. most people won't be using ractors + won't be using M:N scheduling for the main ractor).
It would be nice to still use the native thread for such cases, methinks!
Having said all that, I realize there's already quite a few moving parts for this PR so I'm perfectly fine with starting with object id for all on 3.3 and punt on doing a fancier thing ;)
thread_meta->thread = thread; | ||
thread_meta->thread_object_id = RB_NUM2INT(rb_obj_id(thread)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking there's a few places where we're initializing the thread-local structure. What do you think of adding a function to initialize it that can be used throughout?
static inline void initialize_thread_id(thread_meta_t *thread_meta) { | ||
thread_meta->current_thread_seen = true; | ||
thread_meta->thread_object_id = RB_NUM2INT(rb_obj_id(thread_meta->thread)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems somewhat redundant/duplicate from current_thread_meta
above?
#elif HAVE_GETTID | ||
native_thread_id = gettid(); | ||
#else | ||
native_thread_id = thread_local_meta->current_thread_serial; // TODO: Better fallback for Windows? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh, I didn't know about that!
In that case, maybe go with removing current_thread_serial
and adding an ifdef
to make the gem become a no-op on windows may be the way to go.
(I think it's still nice if the gem builds/installs, so that folks don't need to have conditions on their Gemfile
s)
static thread_meta_t* current_thread_meta(VALUE thread) { | ||
return &thread_local_meta; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this approach!
@@ -49,6 +49,10 @@ def start(file) | |||
|
|||
def stop | |||
thread_list = Thread.list | |||
thread_list.each { |t| | |||
GvlTracing._clean(t) | |||
thread_label(t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is calling thread_label
needed here? 🤔
Yes, it's fired from the parent thread. That simplified the implementation quite a bit. |
Thanks for the feedback! I'll make some changes in the next few days so we can get this buttoned up 🎉 |
Closing in favor of #18 |
Fix #17 Paired with @byroot This PR adds support for Ruby 3.3 by using the new `rb_internal_thread_specific_get` which allows accessing data specific to a Ruby thread without requiring the GVL to be locked. It also handles N:M threads automatically. More info here: ruby/ruby@352a885 In Ruby 3.3, the thread events don't necessarily run from the thread they correspond to, so the thread corresponding to the event is sent in `event_data`, and now `GT_LOCAL_STATE` needs to take the thread too. To handle GC, the thread specific data is wrapped into a TypedData struct and stored in the Ruby thread as a instance variable. This requires getting the GVL, so we have an `allocate` flag on `GT_LOCAL_STATE` too, which is false when we're not running from the thread and can't allocate. The hook only runs from its corresponding thread for the `RUBY_INTERNAL_THREAD_EVENT_STARTED` and `RUBY_INTERNAL_THREAD_EVENT_RESUMED` events, so we can allocate there. If the local state hasn't been initialized yet, we just return early. This means we would skipping events from already existing threads that haven't been resumed yet. To avoid this, we initialize all threads when tracing starts. We do this from Ruby, which ensures we have the GVL. We added a spec for that, which passes on main on 3.2, but not on 3.3. To support N:M threads (and differentiate events from different threads but the same native thread), instead of returning the native thread id, we know return the address of the Ruby thread in `tid` (on Ruby 3.3).
This is still WIP, but I wanted to open it up so the general approach is clear. Main oddities for me right now:
rb_internal_thread_specific_*
api working properly without a memory leakThread.list
doesn't contain any threads that have exited at the end of the tracingobject_id
is considered inefficient (Reduce usage ofobject_id
in core gems ruby/ruby#9276), but the overhead here seems minimal enough? I like it better as an identifier than a serial id and it lets MN threads work the same as native threadsCommit notes
There are lots of changes to how Ruby 3.3+ works with thread instrumentation.
There is no guarantee which thread will trigger hooks. For instance, in Ruby 3.3 the RUBY_INTERNAL_THREAD_EVENT_STARTED event is always called from a different thread
As a result, you have to use the
event_data->thread
member in conjunction with rb_internal_thread_specific_get/set to emulate "thread local" data tracking. You can't use actual thread locals anymore because the hooks are not fired on the same threadAs a result, we can't reliably retrieve the native thread id because ruby does not expose a way to do this and using system calls won't work on different threads. Now uses object_id in ruby 3.3+
These changes also work better with the M:N thread scheduler, since even if we could access native thread ids the M:N scheduler mostly reuses threads so the associated thread ids would be wrong
Because the events can fire from different threads, the current way of getting the thread name (pthread_getname_np), will probably be wrong
Using rb_thread_current() in the instrumentation hook calls causes a segfault, but it seems to work fine in the GC methods and when starting/stopping tracing
Because we malloc the thread data in ruby 3.3, we free it up when we stop tracing, or whenever a thread exits