Draft sketch of "external debug section" for feedback #6706

KJTsanaktsidis · 2022-11-10T12:53:25Z

This is a draft PR sketch to collect some feedback about how we might integrate external profiling tools with Ruby processes.

Bug tracker issue: https://bugs.ruby-lang.org/issues/19119

rb_method_entry_t (and its const cousin rb_callable_method_entry_t) need to be RVALUEs (they are stored in T_IMEMO objects managed by the GC). However, these structs currently use all five words of the RVALUE, so it is not possible to add further fields to them. To solve this, we define a rb_method_entry_ext_t structure to hold additional attributes and manage it in a similar way to how rb_classext_struct is managed for RClass.

If USE_RVARGC is on (i.e. multiple size pools in the GC are enabled), we store the method entry in a larger size pool and place the rb_method_entry_ext_t data inline with the object. Otherwise, we store a pointer to a C heap-allocated rb_method_entry_ext_t in one of the RVALUE words. The details of either approach are abstracted behind the METHOD_ENTRY_EXT() and CALLABLE_METHOD_ENTRY_EXT() macros, which act analogously to RCLASS_EXT(). To make room for the ext pointer in non-RVARGC configurations, the "owner" field of rb_method_entry_t has been moved to the rb_method_entry_ext_t structure.
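To make the two layouts concrete, here is a minimal sketch of the pattern described above. Every name in it (demo_method_entry, demo_method_entry_ext, DEMO_USE_RVARGC, DEMO_METHOD_ENTRY_EXT) is a hypothetical stand-in chosen for illustration; the field list is an assumption and does not reproduce the actual definitions in the PR.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical switch standing in for USE_RVARGC. */
#ifndef DEMO_USE_RVARGC
#define DEMO_USE_RVARGC 0
#endif

typedef uintptr_t demo_value; /* stand-in for VALUE */

/* Extra attributes that do not fit in the fixed five-word slot. */
typedef struct demo_method_entry_ext {
    demo_value owner;
    /* further fields can be added here without touching the slot layout */
} demo_method_entry_ext;

typedef struct demo_method_entry {
    demo_value flags;
    demo_value defined_class;
    void *def;
    uintptr_t called_id;
#if DEMO_USE_RVARGC
    demo_method_entry_ext ext;  /* inline: the object lives in a larger slot */
#else
    demo_method_entry_ext *ext; /* fifth word: pointer to heap-allocated data */
#endif
} demo_method_entry;

/* Callers go through the macro and never see which layout is in use. */
#if DEMO_USE_RVARGC
#define DEMO_METHOD_ENTRY_EXT(me) (&(me)->ext)
#else
#define DEMO_METHOD_ENTRY_EXT(me) ((me)->ext)
#endif

/* Returns 0 on success, -1 if the ext allocation fails. */
static int demo_method_entry_init(demo_method_entry *me, demo_value owner)
{
#if !DEMO_USE_RVARGC
    me->ext = calloc(1, sizeof(*me->ext));
    if (me->ext == NULL) return -1;
#endif
    DEMO_METHOD_ENTRY_EXT(me)->owner = owner;
    return 0;
}

static void demo_method_entry_free(demo_method_entry *me)
{
#if !DEMO_USE_RVARGC
    free(me->ext);
    me->ext = NULL;
#else
    (void)me; /* inline data is released together with the object */
#endif
}
```

Code that reads the owner would write DEMO_METHOD_ENTRY_EXT(me)->owner in both configurations, which is the property the real macros are meant to provide.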

This commit adds a method Module#debug_name, which prints a human-readable name intended for use in profiling tools for describing a class. The rules are documented in a test in test_module.rb, but boil down to:

* "<refinement Foo of Bar>" for a refinement module adding methods to Bar
* "<singleton of Foo>" for Foo's singleton class
* "<instance of Foo>" for the singleton class of a particular instance of Foo
* "<anonymous subclass of Foo>" for an anonymous subclass
* The usual classpath string otherwise

Importantly, none of these strings contain any addresses (i.e. no %p of VALUEs). Addresses are arguably useless anyway; now that the Ruby GC is a compacting GC, and with moves afoot to compact by default, they are not even guaranteed to be stable from moment to moment within the same process. They are _especially_ useless for aggregating across different processes. The intended use for these strings is to build up fully-qualified method names for profiling tools; addresses in the class parts of those method names would just cause under-aggregation.
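The naming rules above can be sketched as a small formatting function. This is only an illustration of the output formats listed in the commit message: the enum, the function name demo_debug_name, and the idea of passing the kind in explicitly are all assumptions, and the real implementation works on Ruby class objects rather than C strings.

```c
#include <stdio.h>

/* Hypothetical classification of the class being named. */
enum demo_class_kind {
    DEMO_NAMED,                 /* ordinary class or module with a classpath */
    DEMO_REFINEMENT,            /* refinement module */
    DEMO_SINGLETON_OF_CLASS,    /* singleton class of a class */
    DEMO_SINGLETON_OF_INSTANCE, /* singleton class of a plain instance */
    DEMO_ANONYMOUS_SUBCLASS     /* anonymous class; name is its superclass */
};

/*
 * Writes the address-free debug name into buf and returns the snprintf
 * result. `name` is the relevant classpath; `refined` is only used for
 * refinements and names the class being refined.
 */
static int demo_debug_name(char *buf, size_t len, enum demo_class_kind kind,
                           const char *name, const char *refined)
{
    switch (kind) {
      case DEMO_REFINEMENT:
        return snprintf(buf, len, "<refinement %s of %s>", name, refined);
      case DEMO_SINGLETON_OF_CLASS:
        return snprintf(buf, len, "<singleton of %s>", name);
      case DEMO_SINGLETON_OF_INSTANCE:
        return snprintf(buf, len, "<instance of %s>", name);
      case DEMO_ANONYMOUS_SUBCLASS:
        return snprintf(buf, len, "<anonymous subclass of %s>", name);
      case DEMO_NAMED:
      default:
        return snprintf(buf, len, "%s", name);
    }
}
```

Because no branch formats a pointer, two processes running the same program produce identical strings, which is what lets a profiler aggregate across them.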

This commit adds Method#debug_name, the method analogue of Module#debug_name; it prints the name of the method qualified with the name of the class it's on, as would be printed by Module#debug_name. It therefore gives a name for the method that is guaranteed not to contain any addresses and is suitable for aggregation across processes, e.g. in profilers.

This prints a thread backtrace using the same format as Method#debug_name.

This commit adds some extra atomic operations to atomic.h:

* ATOMIC_PTR_SET & ATOMIC_SIZE_SET; these are like ATOMIC_SET (which already exists), but for `void *` and `size_t` types respectively. These do a store in a way that is a) guaranteed not to tear and b) ordered with respect to other stores using the atomic.h macros.
* ATOMIC_LOAD, ATOMIC_PTR_LOAD and ATOMIC_SIZE_LOAD; these do an atomic load operation and work with `ruby_atomic_t`, `void *`, and `size_t` respectively. Again, these are needed to perform variable reads that a) are guaranteed to read a valid, non-torn value and b) are ordered with respect to other loads/stores through atomic.h.
* ATOMIC_BARRIER, which issues a memory fence/barrier instruction and orders loads/stores before the barrier with respect to loads/stores after the barrier, even if those loads/stores are not done using the atomic.h macros.

The motivation for adding these is their use in the debug_external structures. These structures are intended to be read from other processes, so it is important that they are written through instructions that ensure the external process reads valid, non-torn data.
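As a rough illustration of the semantics described above, the macros could be approximated with C11 <stdatomic.h>. This is an assumption made for the sketch: the DEMO_ names are hypothetical, and the actual atomic.h in Ruby selects between several compiler- and platform-specific implementations rather than relying on C11 atomics alone.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical C11 approximations of the macros named in the commit. */
#define DEMO_ATOMIC_PTR_SET(var, val)  atomic_store(&(var), (val))
#define DEMO_ATOMIC_PTR_LOAD(var)      atomic_load(&(var))
#define DEMO_ATOMIC_SIZE_SET(var, val) atomic_store(&(var), (val))
#define DEMO_ATOMIC_SIZE_LOAD(var)     atomic_load(&(var))
#define DEMO_ATOMIC_BARRIER()          atomic_thread_fence(memory_order_seq_cst)

/* A pointer/length pair of the kind an external reader might inspect. */
struct demo_slot {
    _Atomic(void *) ptr;
    atomic_size_t len;
};

/*
 * Each field is written with a single non-tearing store, and the two
 * stores are ordered with respect to each other. The pair as a whole is
 * NOT updated atomically: a reader can still observe the new pointer
 * together with the old length.
 */
static void demo_slot_store(struct demo_slot *s, void *p, size_t n)
{
    DEMO_ATOMIC_PTR_SET(s->ptr, p);
    DEMO_ATOMIC_SIZE_SET(s->len, n);
}

static void *demo_slot_ptr(struct demo_slot *s)
{
    return DEMO_ATOMIC_PTR_LOAD(s->ptr);
}

static size_t demo_slot_len(struct demo_slot *s)
{
    return DEMO_ATOMIC_SIZE_LOAD(s->len);
}

/* Fence between plain (non-atomic) writes and whatever follows them. */
static void demo_fill_then_fence(int *plain, int value)
{
    *plain = value;
    DEMO_ATOMIC_BARRIER();
}
```

The default sequentially consistent ordering is used here for simplicity; a real implementation might choose weaker orderings where they are sufficient.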

This commit introduces a "debug_external" interface for Ruby. This is a block of memory inside a Ruby process that exposes information about the program for consumption by external tools in a documented manner. The entrypoint is the rb_debug_ext_section global variable, which is stored in its own ELF/PE/MachO/etc section so that its address is discoverable even in Ruby binaries which have had their symbols stripped. Ruby will keep the information in that structure up-to-date as the program executes. The first piece of information stored there is a list of Ruby execution contexts (i.e. fibres & threads) and the current call stack for each one; this is also introduced in this commit.
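Placing a global in its own named section can be sketched with the GCC/Clang `section` attribute, as below. The section name demo_dbg, the struct layout, and the magic value are all hypothetical and are not taken from the PR; compilers without this attribute (for example MSVC, which uses a different mechanism) fall through to an ordinary global in this sketch.

```c
#include <stdint.h>

/* Mach-O requires a "segment,section" pair; ELF and PE take a bare name. */
#if defined(__APPLE__)
#define DEMO_DEBUG_SECTION __attribute__((used, section("__DATA,demo_dbg")))
#elif defined(__GNUC__)
#define DEMO_DEBUG_SECTION __attribute__((used, section("demo_dbg")))
#else
#define DEMO_DEBUG_SECTION /* no section placement on other compilers */
#endif

/* Hypothetical header an external tool would locate via the section table. */
struct demo_debug_ext_section {
    uint32_t magic;    /* lets a reader confirm it found the right block */
    uint32_t version;  /* layout version of the fields that follow */
    uintptr_t ec_list; /* would point at the execution context list */
};

DEMO_DEBUG_SECTION
struct demo_debug_ext_section demo_debug_ext = { 0x44424758u, 1u, 0 };
```

Section headers normally survive symbol stripping, so an external tool can parse the binary's section table, find the named section, and compute the runtime address of the structure without any symbol information.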

KJTsanaktsidis added 6 commits November 10, 2022 19:58

Implement Thread::Backtrace::Location#debug_label

f4f9132


KJTsanaktsidis closed this Nov 28, 2022
