
rbspy architecture

rbspy is a little complicated. I want other people to be able to contribute to it easily, so here is an architecture document to help you understand how it works.

Here’s what happens when you run rbspy snapshot --pid $PID. This is the simplest subcommand (it takes a PID and prints the current stack trace of that process), and once you understand how snapshot works, the rest of the rbspy subcommands are relatively easy to follow.

The implementation of the snapshot function in main.rs is really simple: just 6 lines of code. The goal of this document is to explain how that code works behind the scenes.

fn snapshot(pid: pid_t) -> Result<(), Error> {
    let getter = initialize::initialize(pid)?;
    let trace = getter.get_trace()?;
    for x in trace.iter().rev() {
        println!("{}", x);
    }
    Ok(())
}

Phase 1: Initialize. (initialize.rs + address_finder.rs)

Our first goal is to create a struct (StackTraceGetter) which we can call .get_trace() on to get a stack trace. This struct contains a PID, a stack trace function, and the address in the target process where the current thread pointer is stored. The initialization code is somewhat complicated but has a simple interface: you give it a PID, and it returns a struct that you can call .get_trace() on:

let getter = initialize::initialize(pid)?;
let trace = getter.get_trace()?;

Here's what happens when you call initialize(pid).

Step 1: Find the Ruby version of the process. The code to do this is in a function called get_ruby_version.

Step 2: Find the address of the ruby_current_thread global variable. This address is the starting point for getting a stack trace from our Ruby process -- we start there every time. How we do this depends on two things: whether the Ruby process we’re profiling has symbols, and the Ruby version (in 2.5.0+ there are some small differences).

If there are symbols, we find the address of the current thread using the symbol table (the current_thread_address_location_symbol_table function). This is pretty straightforward: we look up ruby_current_thread or ruby_current_execution_context_ptr, depending on the Ruby version.
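As a rough sketch of the symbol-table case: the code below picks the symbol name from the version string and looks it up in a map. The helper names (parse_version, current_thread_symbol, lookup_current_thread_addr) and the HashMap standing in for the symbol table are made up for illustration, and the 2.5.0 cutoff is assumed from the note above. This is not rbspy's actual code.

```rust
use std::collections::HashMap;

/// Parse "major.minor.patch" into a tuple that compares in version order.
/// Hypothetical helper, not part of rbspy.
fn parse_version(version: &str) -> Option<(u32, u32, u32)> {
    let mut parts = version.split('.').map(|p| p.parse::<u32>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    Some((major, minor, patch))
}

/// Pick the name of the global to look up.
/// Assumption: the name changes at 2.5.0, as described in the text above.
fn current_thread_symbol(version: &str) -> Option<&'static str> {
    let v = parse_version(version)?;
    if v >= (2, 5, 0) {
        Some("ruby_current_execution_context_ptr")
    } else {
        Some("ruby_current_thread")
    }
}

/// Look the symbol up in a (hypothetical) name -> address symbol table.
fn lookup_current_thread_addr(
    symbols: &HashMap<String, usize>,
    version: &str,
) -> Option<usize> {
    let name = current_thread_symbol(version)?;
    symbols.get(name).copied()
}
```

Comparing parsed tuples rather than raw strings avoids ordering surprises such as "2.10.0" sorting before "2.5.0".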

If there aren’t symbols, instead we use a heuristic (current_thread_address_location_search_bss) where we search through the .bss section of our binary’s memory for something that plausibly looks like the address of the current thread. This assumes that the address we want is in the .bss section somewhere. How this works:

  • Find the address of the .bss section and read it from memory
  • Cast the .bss section to an array of usize (so an array of addresses).
  • Iterate through that array and for every address run the is_maybe_thread function on that address. is_maybe_thread is a Ruby-version-specific function (we compile a different version of this function for every Ruby version). We'll explain this later.
  • Return an address if is_maybe_thread returns true for any of them. Otherwise abort.
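The steps above can be sketched roughly like this. It's a simplified illustration, not the real implementation: the .bss contents are passed in as a byte slice that has already been copied out of the target process, is_maybe_thread is an arbitrary closure standing in for the version-specific check, and the names bytes_to_words and search_bss are made up. The sketch returns both the location inside .bss and the candidate value stored there, since the description above doesn't pin down which one the real function returns.

```rust
use std::mem::size_of;

/// Reinterpret raw bytes as native-endian, pointer-sized words.
/// Trailing bytes that don't fill a whole word are ignored.
fn bytes_to_words(bytes: &[u8]) -> Vec<usize> {
    bytes
        .chunks_exact(size_of::<usize>())
        .map(|chunk| {
            let mut buf = [0u8; size_of::<usize>()];
            buf.copy_from_slice(chunk);
            usize::from_ne_bytes(buf)
        })
        .collect()
}

/// Scan a copy of the .bss section for the first word that the predicate
/// accepts. Returns (location of the word in the target, value of the word).
/// `bss_start` is the address of the section in the target process.
fn search_bss<F>(bss_start: usize, bss: &[u8], is_maybe_thread: F) -> Option<(usize, usize)>
where
    F: Fn(usize) -> bool,
{
    bytes_to_words(bss)
        .into_iter()
        .enumerate()
        .find(|&(_, candidate)| is_maybe_thread(candidate))
        .map(|(i, candidate)| (bss_start + i * size_of::<usize>(), candidate))
}
```

Returning an Option leaves it to the caller to decide how to fail when nothing in the section looks like a thread.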

Step 3: Get the right stack trace function. We compile 30+ different functions to get stack traces (we'll explain why later). The code that decides which function to use is basically a huge match expression on the Ruby version:

  "1.9.1" => self::ruby_1_9_1_0::get_stack_trace,
  "1.9.2" => self::ruby_1_9_2_0::get_stack_trace,
  "1.9.3" => self::ruby_1_9_3_0::get_stack_trace,
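To make the dispatch idea concrete, here is a small self-contained sketch of a version string being mapped to a function pointer. Everything in it is simplified and assumed for illustration: the StackTraceFn alias, frames represented as plain Strings, the two stub modules, and returning None for unknown versions (the real code may handle that case differently).

```rust
/// Simplified signature: (address of current thread pointer, pid) -> frames.
/// The real function returns StackFrame values and a MemoryCopyError.
type StackTraceFn = fn(usize, i32) -> Result<Vec<String>, String>;

// Stub modules standing in for the per-version generated modules.
mod ruby_1_9_3_0 {
    pub fn get_stack_trace(_addr: usize, _pid: i32) -> Result<Vec<String>, String> {
        Ok(vec!["frame read with the 1.9.3 layout".to_string()])
    }
}

mod ruby_2_4_0 {
    pub fn get_stack_trace(_addr: usize, _pid: i32) -> Result<Vec<String>, String> {
        Ok(vec!["frame read with the 2.4.0 layout".to_string()])
    }
}

/// Pick the implementation that matches the target's Ruby version.
fn get_stack_trace_function(version: &str) -> Option<StackTraceFn> {
    let f: StackTraceFn = match version {
        "1.9.3" => ruby_1_9_3_0::get_stack_trace,
        "2.4.0" => ruby_2_4_0::get_stack_trace,
        _ => return None, // unknown version: no matching layout compiled in
    };
    Some(f)
}
```

Because the choice is made once during initialization and stored as a plain function pointer, taking a sample afterwards doesn't need to look at the version again.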

Step 4: Return the getter struct.

Now we're done! We return our StackTraceGetter struct.

pub fn initialize(pid: pid_t) -> Result<StackTraceGetter, Error> {
    let version = get_ruby_version_retry(pid).context("Couldn't determine Ruby version")?;
    debug!("version: {}", version);
    Ok(StackTraceGetter {
        pid: pid,
        current_thread_addr_location: os_impl::current_thread_address(pid, &version)?,
        stack_trace_function: stack_trace::get_stack_trace_function(&version),
    })
}

impl StackTraceGetter {
    pub fn get_trace(&self) -> Result<Vec<StackFrame>, MemoryCopyError> {
        let stack_trace_function = &self.stack_trace_function;
        stack_trace_function(self.current_thread_addr_location, self.pid)
    }
}

Phase 2: Get stack traces (ruby_version.rs, ruby-bindings/ crate, bindgen.sh)

Once we've initialized, all that remains is calling the get_trace function. How does that function work?

Like we said before -- we compile a different version of the code to get stack traces for every Ruby version. This is because every Ruby version has slightly different struct layouts.

The Ruby structs are defined in a ruby-bindings crate. All the code in that crate is autogenerated by bindgen, using a hacky script called bindgen.sh.

These functions are defined through a bunch of macros (4 different macros, for different ranges of Ruby versions) which implement get_stack_trace for every Ruby version. Each one uses the struct definitions for its own Ruby version.
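Here is a toy version of that pattern, to show why one macro body can compile to different code per version. The bindings module, the frame_t struct, and its fields are invented stand-ins for the bindgen output (the layouts are not real Ruby layouts), and the macro names are hypothetical.

```rust
// Stand-in for the generated bindings: same struct name, different layout.
#[allow(non_camel_case_types)]
mod bindings {
    pub mod ruby_2_0_0 {
        #[repr(C)]
        pub struct frame_t { pub pc: usize, pub sp: usize, pub iseq: usize }
    }
    pub mod ruby_2_3_0 {
        #[repr(C)]
        pub struct frame_t { pub pc: usize, pub sp: usize, pub bp: usize, pub iseq: usize }
    }
}

// Shared function bodies. `frame_t` resolves to whichever struct is in
// scope in the module where the macro is expanded.
macro_rules! frame_helpers {
    () => {
        pub fn frame_size() -> usize {
            std::mem::size_of::<frame_t>()
        }
        pub fn iseq_of(frame: &frame_t) -> usize {
            frame.iseq
        }
    };
}

// One module per Ruby version, each importing its own bindings.
macro_rules! ruby_version {
    ($ruby_version:ident) => {
        pub mod $ruby_version {
            use super::bindings::$ruby_version::*;
            frame_helpers!();
        }
    };
}

ruby_version!(ruby_2_0_0);
ruby_version!(ruby_2_3_0);
```

The source text of frame_size and iseq_of is identical in both modules, but the field offsets and sizes baked into the compiled functions differ because each module sees a different frame_t.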

There's a lot of code in ruby_version.rs but this is the core of how it works. First, it defines a $ruby_version module and inside that module uses bindings::$ruby_version which includes all the required struct definitions for that Ruby version.

Then it invokes more macros, which together make up the body of that module. The body is split up this way because some functions are the same across all Ruby versions (like get_ruby_string) and some are different (like get_stack_frame, which changes frequently because the way Ruby organizes that code changes a lot).

macro_rules! ruby_version_v_2_0_to_2_2(
    ($ruby_version:ident) => (
       pub mod $ruby_version {
            use bindings::$ruby_version::*;
            ...
            get_stack_trace!(rb_thread_struct);
            get_ruby_string!();
            get_cfps!();
            get_lineno_2_0_0!();
            get_stack_frame_2_0_0!();
            is_stack_base_1_9_0!();
}