
Profiling for Ruby #2013

Closed
cleptric opened this issue Mar 3, 2023 · 10 comments · Fixed by #2024
@cleptric (Member) commented Mar 3, 2023

We want to look into extending our Profiling product with support for Ruby.

On other platforms, we used an existing profiler as a starting point or completely relied on external packages.

When a transaction starts, a new profiling_sample_rate option, applied relative to traces_sample_rate, determines whether the transaction should be profiled. Our default sampling frequency on other platforms is 101 Hz.
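
A minimal sketch of how that sampling decision could look, assuming a profiling_sample_rate config option as described above (the names here are illustrative, not the final SDK API):

# Hypothetical sketch of the per-transaction profiling decision.
# profiling_sample_rate is applied on top of the tracing decision, so the
# effective profiling rate is traces_sample_rate * profiling_sample_rate.
def profile_transaction?(config, transaction_sampled)
  return false unless transaction_sampled   # only profile sampled transactions
  rate = config.profiling_sample_rate
  return false if rate.nil? || rate <= 0.0
  Random.rand < rate
end

# The profiler itself would then sample stacks at ~101 Hz (about every 10 ms).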

On finish, we convert the raw output to our own profiling format and attach it as a profile envelope item to the transaction.

https://github.com/rbspy/rbspy might be worth looking into, as it's a sampling profiler that can also output a speedscope-compatible format, which our own format is derived from.

@sl0thentr0py (Member) commented Mar 6, 2023

Notes:

  • rbspy will probably be the most performant, but it's an external binary, so it's mostly out
  • the other standard sampling profiler is stackprof; this is what we'll likely go with (basic usage sketched below)
  • the standard tracing profiler is ruby-prof, which is also out since it's not a sampling profiler
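
For reference, stackprof's block API looks roughly like this; a sketch of standalone usage, not the eventual SDK integration:

require 'stackprof'

# Sample the wall clock every 10 ms (interval is in microseconds for :wall mode)
# and keep raw per-sample stacks so they can be converted to another format later.
results = StackProf.run(mode: :wall, interval: 10_000, raw: true) do
  1_000_000.times { Math.sqrt(rand) }
end

puts results[:mode]     # => :wall
puts results[:samples]  # total number of samples collected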

@indragiek (Member)

@sl0thentr0py I agree with this: we can't use an external process, and we can't use a deterministic (non-sampling) profiler, so stackprof seems like a reasonable choice. Are there any other options for in-process sampling profilers?

@sl0thentr0py (Member)

Not that I know of or can find. @st0012, do you know of any others?
The table in this post is a good survey of the Ruby profiler landscape:
https://jvns.ca/blog/2017/12/17/how-do-ruby---python-profilers-work-/

[image: comparison table of Ruby profilers from the linked post]

@st0012 (Collaborator) commented Mar 14, 2023

For Rails/Rack apps, I think we can use Shopify's app_profiler.

At first glance:

  • It's a gem with native Rails integration
  • It also provides a Rack middleware or can be invoked with a block (sketched below)
  • It's still actively maintained
  • It's battle-tested at Shopify
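
If it helps, block usage looks roughly like this (written from memory, so treat the exact API as an assumption rather than a reference):

require 'app_profiler'

# Assumed block API: app_profiler wraps stackprof and returns a profile
# object that can then be viewed or uploaded (method names vary by version).
AppProfiler.run(mode: :wall) do
  expensive_work   # hypothetical placeholder for the code being profiled
end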

However, I don't have much experience with profiling in production either, so we'll need to do some testing with it ourselves. Do we have a good application/partner for that?

@sl0thentr0py (Member)

Copypasta from an internal convo, for public documentation:


Given this, how do we want to proceed? The options are:

  • patch stackprof upstream / fork it to send threading info somehow
  • write our own profiler that captures samples directly in our format using a thread-based timer (no signals); a rough sketch follows this list
    • in Ruby
    • in C
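
A rough sketch of the thread-timer approach (illustrative only; it samples a single target thread from a dedicated sampler thread using Thread#backtrace_locations):

# Minimal sampling-profiler sketch: a background thread wakes up on a fixed
# interval and records the backtrace of the thread being profiled.
class ThreadTimerSampler
  Sample = Struct.new(:elapsed_ns, :stack)

  def initialize(target_thread, interval_s: 0.01) # ~100 Hz
    @target = target_thread
    @interval = interval_s
    @samples = []
  end

  def start
    @start = Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)
    @running = true
    @sampler = Thread.new do
      while @running
        stack = @target.backtrace_locations
        now = Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)
        @samples << Sample.new(now - @start, stack) if stack
        sleep(@interval)
      end
    end
  end

  def stop
    @running = false
    @sampler&.join
    @samples
  end
end

# Usage:
sampler = ThreadTimerSampler.new(Thread.current)
sampler.start
100_000.times { Math.log(rand + 1) }
puts "collected #{sampler.stop.size} samples"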

@st0012 (Collaborator) commented Mar 22, 2023

Ah sorry, I didn't notice app_profiler is based on stackprof 🤦‍♂️

> sample profile from that

That looks amazing.

> it does not include threading info on the captured stacks

Can you explain more about this?

> patch stackprof upstream / fork to send threading info somehow

Given that we don't have experience writing profilers yet, maybe we can start with this? Even if the result isn't accepted upstream, at least we'll learn more about what we actually want to do and/or can do.

@sl0thentr0py (Member) commented Mar 27, 2023

Yes, of course, and thanks for the feedback!

> threading info

In a multi-threaded server like Puma, work is happening on each thread, and our Sample structure takes a thread_id so you can record and store call stacks from each thread.
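
For illustration, each sample in the payload carries a thread_id alongside the stack, roughly like this (field names follow Sentry's sample profile format, but treat the exact schema as an assumption and check the spec):

# Hypothetical example of a single sample entry:
sample = {
  "elapsed_since_start_ns" => 12_345_678,                    # when the sample was taken
  "thread_id"              => Thread.current.object_id.to_s, # which thread the stack belongs to
  "stack_id"               => 0                              # index into a deduplicated list of stacks
}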

Here's the thread switcher in the UI from a multi-threaded Python profile:

[image: thread switcher in the Sentry profiling UI for a multi-threaded Python profile]


How stackprof works (in :wall mode) is that it samples on an interval timer and only captures call stacks from the thread it was started on, so other threads' stacks are not recorded.

> maybe we can go with this first?

For the first MVP, we decided to go with a limited single-thread use case, i.e. stackprof can only run on one thread at a time and will return early if it's already running. Eventually, based on adoption, we might invest time in doing something similar to stackprofx above. :)
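
A sketch of what that single-thread limitation could look like (illustrative; not the actual implementation in the PR):

# Hypothetical guard: only one profile at a time; starting again while a
# profile is already running is a no-op.
class Profiler
  @running = false

  def self.start
    return if @running
    @running = true
    StackProf.start(mode: :wall, interval: 10_000, raw: true)
  end

  def self.stop
    return unless @running
    StackProf.stop
    @running = false
    StackProf.results
  end
end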

@sl0thentr0py (Member) commented Mar 27, 2023

I also fixed some timing issues (but I'm still seeing some spurious samples that I'm trying to debug). I'll post some better profiles once everything's done.

If you're curious, I also tried out some simple multi-threaded use cases with the code below.

require 'sentry-ruby'
require 'debug'

Sentry.init do |config|
  config.release = "profiling"
  config.debug = true
  config.traces_sample_rate = 1.0
end

def t1
  10000.times { 2 + 2 }
  sleep 0.2
  puts('t1')
end

def t2
  # the transaction is started on this child thread
  # ("transaction on child thread" profile linked below)
  transaction = Sentry.start_transaction(name: 'profiling')
  Sentry.get_current_scope.set_span(transaction)

  sleep 0.5
  t1
  20000.times { 2 * 2 }
  puts('t2')

  transaction.finish
end

def main
  sleep 0.2
  20000.times { 2 ** 2 }
  puts('main')
end


# Alternative: start the transaction on the main thread instead of inside t2
# (this produces the "transaction on main thread" profile linked below)
# transaction = Sentry.start_transaction(name: 'profiling')
# Sentry.get_current_scope.set_span(transaction)

threads = [Thread.new { t1 }, Thread.new { t2 }]
main
threads.each(&:join)

# transaction.finish

transaction on main thread
https://sentry-sdks.sentry.io/profiling/profile/sentry-ruby/6cde0f4d2f2f405a9246d1f93c8e836b/flamechart/?colorCoding=by+system+vs+application+frame&query=&sorting=call+order&tid=42&view=top+down

transaction on child thread
https://sentry-sdks.sentry.io/profiling/profile/sentry-ruby/3794ecdb8ad84140af610646017422c0/flamechart/?colorCoding=by+system+vs+application+frame&query=&sorting=call+order&tid=42&view=top+down

As you can see, we always record call stacks from the thread StackProf.start is called on.

@sl0thentr0py (Member)

Opened an issue upstream re: the timing issues on Puma:
tmm1/stackprof#201

@sl0thentr0py (Member) commented Apr 13, 2023
