
Profiling for Ruby #2013

Closed
cleptric opened this issue Mar 3, 2023 · 10 comments · Fixed by #2024
@cleptric (Member) commented Mar 3, 2023

We want to look into extending our Profiling product with support for Ruby.

On other platforms, we used an existing profiler as a starting point or completely relied on external packages.

When a transaction starts, a new profiling_sample_rate option, applied relative to traces_sample_rate, determines whether the transaction should be profiled. Our default sampling frequency on other platforms is 101 Hz.
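
A minimal sketch of how that sampling decision could look, assuming a profiling_sample_rate config option as described above (the names here are illustrative, not the final SDK API):

# Hypothetical sketch of the per-transaction profiling decision.
# profiling_sample_rate is applied on top of the tracing decision, so the
# effective profiling rate is traces_sample_rate * profiling_sample_rate.
def profile_transaction?(config, transaction_sampled)
  return false unless transaction_sampled   # only profile sampled transactions
  rate = config.profiling_sample_rate
  return false if rate.nil? || rate <= 0.0
  Random.rand < rate
end

# The profiler itself would then sample stacks at ~101 Hz (about every 10 ms).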

On finish, we convert the raw output to our own profiling format and attach it as a profile envelope item to the transaction.

https://github.com/rbspy/rbspy might be worth looking into, as it's a sampling profiler that can also output a speedscope-compatible format, which our own format is derived from.

@sl0thentr0py (Member) commented Mar 6, 2023

Notes:

  • rbspy will probably be the most performant, but it's an external binary, so it's mostly out
  • the other standard sampling profiler is stackprof; this is what we'll likely go with (basic usage sketched below)
  • the standard tracing profiler is ruby-prof, which is also out since it's not a sampling profiler
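
For reference, stackprof's block API looks roughly like this; a sketch of standalone usage, not the eventual SDK integration:

require 'stackprof'

# Sample the wall clock every 10 ms (interval is in microseconds for :wall mode)
# and keep raw per-sample stacks so they can be converted to another format later.
results = StackProf.run(mode: :wall, interval: 10_000, raw: true) do
  1_000_000.times { Math.sqrt(rand) }
end

puts results[:mode]     # => :wall
puts results[:samples]  # total number of samples collected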

@indragiek (Member)

@sl0thentr0py I agree with this: we can't use an external process, and we can't use a deterministic (non-sampling) profiler, so stackprof seems like a reasonable choice. Are there any other options for in-process sampling profilers?

@sl0thentr0py (Member)

Not that I know of or can find. @st0012, do you know of any others?
The table in this post is a good survey of the Ruby profiler landscape:
https://jvns.ca/blog/2017/12/17/how-do-ruby---python-profilers-work-/

[image: comparison table of Ruby profilers from the linked post]

@st0012 (Collaborator) commented Mar 14, 2023

For Rails/Rack apps, I think we can use Shopify's app_profiler.

At first glance:

  • It's a gem with native Rails integration
  • It also provides a Rack middleware or can be invoked with a block (sketched below)
  • It's still actively maintained
  • It's battle-tested at Shopify
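
If it helps, block usage looks roughly like this (written from memory, so treat the exact API as an assumption rather than a reference):

require 'app_profiler'

# Assumed block API: app_profiler wraps stackprof and returns a profile
# object that can then be viewed or uploaded (method names vary by version).
AppProfiler.run(mode: :wall) do
  expensive_work   # hypothetical placeholder for the code being profiled
end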

However, I don't have much experience with profiling in production either, so we'll need to do some testing with it ourselves. Do we have a good application/partner for that?

@sl0thentr0py (Member)

Copypasta from an internal convo, for public documentation:


Given this, how do we want to proceed? The options are:

  • patch stackprof upstream / fork it to send threading info somehow
  • write our own profiler that captures samples directly in our format using a thread-based timer (no signals); a rough sketch follows this list
    • in Ruby
    • in C
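
A rough sketch of the thread-timer approach (illustrative only; it samples a single target thread from a dedicated sampler thread using Thread#backtrace_locations):

# Minimal sampling-profiler sketch: a background thread wakes up on a fixed
# interval and records the backtrace of the thread being profiled.
class ThreadTimerSampler
  Sample = Struct.new(:elapsed_ns, :stack)

  def initialize(target_thread, interval_s: 0.01) # ~100 Hz
    @target = target_thread
    @interval = interval_s
    @samples = []
  end

  def start
    @start = Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)
    @running = true
    @sampler = Thread.new do
      while @running
        stack = @target.backtrace_locations
        now = Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)
        @samples << Sample.new(now - @start, stack) if stack
        sleep(@interval)
      end
    end
  end

  def stop
    @running = false
    @sampler&.join
    @samples
  end
end

# Usage:
sampler = ThreadTimerSampler.new(Thread.current)
sampler.start
100_000.times { Math.log(rand + 1) }
puts "collected #{sampler.stop.size} samples"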

@st0012 (Collaborator) commented Mar 22, 2023

Ah sorry, I didn't notice app_profiler is based on stackprof 🤦‍♂️

> sample profile from that

That looks amazing.

> it does not include threading info on the captured stacks

Can you explain more about this?

> patch stackprof upstream / fork to send threading info somehow

Given that we don't have experience writing profilers yet, maybe we can start with this? Even if the result isn't accepted upstream, at least we'll learn more about what we actually want to do and/or can do.

@sl0thentr0py (Member) commented Mar 27, 2023

Yes, of course, and thanks for the feedback!

> threading info

In a multi-threaded server like Puma, work is happening on each thread, and our Sample structure takes a thread_id so you can record and store call stacks from each thread.
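
For illustration, each sample in the payload carries a thread_id alongside the stack, roughly like this (field names follow Sentry's sample profile format, but treat the exact schema as an assumption and check the spec):

# Hypothetical example of a single sample entry:
sample = {
  "elapsed_since_start_ns" => 12_345_678,                    # when the sample was taken
  "thread_id"              => Thread.current.object_id.to_s, # which thread the stack belongs to
  "stack_id"               => 0                              # index into a deduplicated list of stacks
}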

Here's the thread switcher in the UI from a multi-threaded Python profile:

[image: thread switcher in the Sentry profiling UI for a multi-threaded Python profile]


How stackprof works (in :wall mode) is that it samples on an interval timer and only captures call stacks from the thread it was started on, so other threads' stacks are not recorded.

> maybe we can go with this first?

For the first MVP, we decided to go with a limited single-thread use case, i.e. stackprof can only run on one thread at a time and will return early if it's already running. Eventually, based on adoption, we might invest time in doing something similar to stackprofx above. :)
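
A sketch of what that single-thread limitation could look like (illustrative; not the actual implementation in the PR):

# Hypothetical guard: only one profile at a time; starting again while a
# profile is already running is a no-op.
class Profiler
  @running = false

  def self.start
    return if @running
    @running = true
    StackProf.start(mode: :wall, interval: 10_000, raw: true)
  end

  def self.stop
    return unless @running
    StackProf.stop
    @running = false
    StackProf.results
  end
end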

@sl0thentr0py (Member) commented Mar 27, 2023

I also fixed some timing issues (but I'm still seeing some spurious samples that I'm trying to debug). I'll post some better profiles once everything's done.

If you're curious, I also tried out some simple multi-threaded use cases with the code below.

require 'sentry-ruby'
require 'debug'

Sentry.init do |config|
  config.release = "profiling"
  config.debug = true
  config.traces_sample_rate = 1.0
end

def t1
  10000.times { 2 + 2 }
  sleep 0.2
  puts('t1')
end

def t2
  # the transaction is started on this child thread
  # ("transaction on child thread" profile linked below)
  transaction = Sentry.start_transaction(name: 'profiling')
  Sentry.get_current_scope.set_span(transaction)

  sleep 0.5
  t1
  20000.times { 2 * 2 }
  puts('t2')

  transaction.finish
end

def main
  sleep 0.2
  20000.times { 2 ** 2 }
  puts('main')
end


# Alternative: start the transaction on the main thread instead of inside t2
# (this produces the "transaction on main thread" profile linked below)
# transaction = Sentry.start_transaction(name: 'profiling')
# Sentry.get_current_scope.set_span(transaction)

threads = [Thread.new { t1 }, Thread.new { t2 }]
main
threads.each(&:join)

# transaction.finish

transaction on main thread
https://sentry-sdks.sentry.io/profiling/profile/sentry-ruby/6cde0f4d2f2f405a9246d1f93c8e836b/flamechart/?colorCoding=by+system+vs+application+frame&query=&sorting=call+order&tid=42&view=top+down

transaction on child thread
https://sentry-sdks.sentry.io/profiling/profile/sentry-ruby/3794ecdb8ad84140af610646017422c0/flamechart/?colorCoding=by+system+vs+application+frame&query=&sorting=call+order&tid=42&view=top+down

As you can see, we always record call stacks from the thread StackProf.start is called on.

@sl0thentr0py (Member)

Opened an issue upstream re: the timing issues on Puma:
tmm1/stackprof#201

@sl0thentr0py (Member) commented Apr 13, 2023
