Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sentry::SessionFlusher causes ThreadError and delay shutdown of puma worker #1848

Closed
arthurwozniak opened this issue Jul 20, 2022 · 3 comments · Fixed by #1851
Closed

Sentry::SessionFlusher causes ThreadError and delay shutdown of puma worker #1848

arthurwozniak opened this issue Jul 20, 2022 · 3 comments · Fixed by #1851
Assignees
Projects

Comments

@arthurwozniak
Copy link

Issue Description

After upgrading to sentry 5.3 we ran into a situation where the session tracking feature caused problems for the puma server.

In our stack, we integrate the rack_timeout gem (https://github.com/zombocom/rack-timeout) to prevent requests from running longer than a specific period of time. If the request should be killed, rack_timeout sends SIGTERM to the puma worker and puma starts the shutdown process.

This leads to puma threads being in an aborting state, but current requests are still being finished. Later then Sentry::SessionFlusher logic causes freeze of aborting process and pumas worker_shutdown_timeout is reached. By default, if hangs puma worker for 30 seconds waiting to shutdown.

I doubted this is a sentry-related issue, but it seems that modification of SessionFlusher solves the whole thing. If it is needed to provide more details which would help, please guide me.

Reproduction Steps

Currently was not possible to reproduce on blank rails application. Probably dependent on stack complexity.

Expected Behavior

Immediate shutdown puma worker, skip session tracking when it is not possible to finish.

Actual Behavior

SessionFlusher attempts to spawn a new thread, even the current thread is in an aborting state. This leads to raise of ThreadError (can't alloc thread) (https://github.com/ruby/ruby/blob/5ae83151b16858083e6443fadc76e7fde0ff1199/thread.c#L895)

From some unclear reason for me, when ThreadError is raised, it also causes a hangup of the whole shutdown process of the puma worker.

If we skip whole rack_timeout logic and cause kill of puma worker from controller action, we get following log

# frozen_string_literal: true

class BrokenController < ActionController::API

  def fail
    Thread.new do
      raise StandardError
    end.abort_on_exception = true
  end
end
source=rack-timeout id=9f46256b-7617-4a36-918c-ed89ad27c189 timeout=60000ms term_on_timeout=1 state=ready
Started GET "/fail" for 127.0.0.1 at 2022-07-20 10:56:47 +0200
source=rack-timeout id=9f46256b-7617-4a36-918c-ed89ad27c189 timeout=60000ms service=1ms term_on_timeout=1 state=active
[Tracing] Discarding <rails.request> transaction </fail> because it's not included in the random sample (sampling rate = 0.2)
   (0.7ms)  SELECT "schema_migrations"."version" FROM "schema_migrations" ORDER BY "schema_migrations"."version" ASC
[Tracing] Discarding <rack.request> transaction </fail> because it's not included in the random sample (sampling rate = 0.2)
Processing by BrokenController#fail as HTML
#<Thread:0x000000000e60e6a0 /home/khanus/git_old/c2e_backend/app/controllers/broken_controller.rb:6 run> terminated with exception (report_on_exception is true):
/home/khanus/git_old/c2e_backend/app/controllers/broken_controller.rb:7:in `block in fail': StandardError (StandardError)
Killing session flusher
Completed 406 Not Acceptable in 2ms (ActiveRecord: 0.0ms | Allocations: 983)


Shutting down background worker
User excluded error: #<ActionController::MissingExactTemplate: BrokenController#fail is missing a template for request formats: text/html>
  
ThreadError (can't alloc thread):
  
sentry-ruby-core (5.3.0) lib/sentry/session_flusher.rb:71:in `new'
sentry-ruby-core (5.3.0) lib/sentry/session_flusher.rb:71:in `ensure_thread'
sentry-ruby-core (5.3.0) lib/sentry/session_flusher.rb:34:in `add_session'
sentry-ruby-core (5.3.0) lib/sentry/hub.rb:192:in `end_session'
sentry-ruby-core (5.3.0) lib/sentry/hub.rb:201:in `ensure in with_session_tracking'
sentry-ruby-core (5.3.0) lib/sentry/hub.rb:201:in `with_session_tracking'
sentry-ruby-core (5.3.0) lib/sentry-ruby.rb:351:in `with_session_tracking'
sentry-ruby-core (5.3.0) lib/sentry/rack/capture_exceptions.rb:17:in `block in call'
sentry-ruby-core (5.3.0) lib/sentry/hub.rb:59:in `with_scope'
sentry-ruby-core (5.3.0) lib/sentry-ruby.rb:331:in `with_scope'
sentry-ruby-core (5.3.0) lib/sentry/rack/capture_exceptions.rb:16:in `call'
bullet (7.0.2) lib/bullet/rack.rb:12:in `call'
apollo_upload_server (2.0.5) lib/apollo_upload_server/middleware.rb:11:in `call'
rack (2.2.4) lib/rack/etag.rb:27:in `call'
rack (2.2.4) lib/rack/conditional_get.rb:27:in `call'
rack (2.2.4) lib/rack/head.rb:12:in `call'
activerecord (6.1.6.1) lib/active_record/migration.rb:601:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/callbacks.rb:27:in `block in call'
activesupport (6.1.6.1) lib/active_support/callbacks.rb:98:in `run_callbacks'
actionpack (6.1.6.1) lib/action_dispatch/middleware/callbacks.rb:26:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/actionable_exceptions.rb:18:in `call'
sentry-rails (5.3.0) lib/sentry/rails/rescued_exception_interceptor.rb:12:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/debug_exceptions.rb:29:in `call'
sentry-ruby-core (5.3.0) lib/sentry/rack/capture_exceptions.rb:26:in `block (2 levels) in call'
sentry-ruby-core (5.3.0) lib/sentry/hub.rb:199:in `with_session_tracking'
sentry-ruby-core (5.3.0) lib/sentry-ruby.rb:351:in `with_session_tracking'
sentry-ruby-core (5.3.0) lib/sentry/rack/capture_exceptions.rb:17:in `block in call'
sentry-ruby-core (5.3.0) lib/sentry/hub.rb:59:in `with_scope'
sentry-ruby-core (5.3.0) lib/sentry-ruby.rb:331:in `with_scope'
sentry-ruby-core (5.3.0) lib/sentry/rack/capture_exceptions.rb:16:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/show_exceptions.rb:33:in `call'
railties (6.1.6.1) lib/rails/rack/logger.rb:37:in `call_app'
railties (6.1.6.1) lib/rails/rack/logger.rb:26:in `block in call'
activesupport (6.1.6.1) lib/active_support/tagged_logging.rb:99:in `block in tagged'
activesupport (6.1.6.1) lib/active_support/tagged_logging.rb:37:in `tagged'
activesupport (6.1.6.1) lib/active_support/tagged_logging.rb:99:in `tagged'
railties (6.1.6.1) lib/rails/rack/logger.rb:26:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/remote_ip.rb:81:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/request_id.rb:26:in `call'
rack (2.2.4) lib/rack/runtime.rb:22:in `call'
rack-timeout (0.6.0) lib/rack/timeout/core.rb:151:in `block in call'
rack-timeout (0.6.0) lib/rack/timeout/support/timeout.rb:19:in `timeout'
rack-timeout (0.6.0) lib/rack/timeout/core.rb:150:in `call'
activesupport (6.1.6.1) lib/active_support/cache/strategy/local_cache_middleware.rb:29:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/executor.rb:14:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/static.rb:24:in `call'
rack (2.2.4) lib/rack/sendfile.rb:110:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/debug_locks.rb:41:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/host_authorization.rb:148:in `call'
rack-cors (1.1.1) lib/rack/cors.rb:100:in `call'
railties (6.1.6.1) lib/rails/engine.rb:539:in `call'
puma (5.6.4) lib/puma/configuration.rb:252:in `call'
puma (5.6.4) lib/puma/request.rb:77:in `block in handle_request'
puma (5.6.4) lib/puma/thread_pool.rb:340:in `with_force_shutdown'
puma (5.6.4) lib/puma/request.rb:76:in `handle_request'
puma (5.6.4) lib/puma/server.rb:441:in `process_client'
puma (5.6.4) lib/puma/thread_pool.rb:147:in `block in spawn_thread'
source=rack-timeout id=9f46256b-7617-4a36-918c-ed89ad27c189 timeout=60000ms service=80ms term_on_timeout=1 state=completed

At the moment we disabled automatic session tracking. However, following change to Sentry::SessionFlusher eliminates unexpected behavior.

def ensure_thread
  return if @thread&.alive?
  return if Thread.current.status == "aborting"

  @thread = Thread.new do
    loop do
      sleep(FLUSH_INTERVAL)
      flush
    end
  end
rescue ThreadError => e # capture exception due to potential possibility of terminating current thread between state check and child thread spawn.
  nil
end

Ruby Version

3.0.2

SDK Version

5.3.0

Integration and Its Version

Rails 6.1.4, Rack 2.2.4

Sentry Config

Sentry.init do |config|
  config.dsn = "dummy://12345:67890@sentry.localdomain/sentry/42"
  config.background_worker_threads = Rails.env.test? ? 0 : 5
  config.sample_rate = 1
  config.traces_sample_rate = 0.2
  config.auto_session_tracking = true

  config.send_default_pii = true
  filter = ActiveSupport::ParameterFilter.new(Rails.application.config.filter_parameters)
  config.before_send = ->(event, _hint) { filter.filter(event.to_hash) }
end
@st0012 st0012 added this to To do in 5.x via automation Jul 20, 2022
@st0012
Copy link
Collaborator

st0012 commented Jul 20, 2022

@sl0thentr0py Can you take a look?

@sl0thentr0py
Copy link
Member

hey @arthurwozniak thanks for the detailed issue, much appreciated! I will fix this shortly.
Just for a consistent repro, can you give me your puma config?

@sl0thentr0py
Copy link
Member

@arthurwozniak I made #1851, the only difference from your solution is we are not checking for aborting on the current thread.
I don't think checking Thread.current.status is that clean semantically, we should just rely on ThreadError to tell us when creating a new thread is not possible. Can you try it out and tell me if it fixes your problem?

5.x automation moved this from To do to Done Jul 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
5.x
Done
Development

Successfully merging a pull request may close this issue.

3 participants