Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Occasional deadlocks between Dependencies::Interlock and db adapter lock #34310

Open
tessereth opened this issue Oct 25, 2018 · 10 comments
Open
Labels

Comments

@tessereth
Copy link

tessereth commented Oct 25, 2018

In our rspec test environment, some feature specs occasionally hang. Generating stack traces using the sigdump gem and ActionDispatch::DebugLocks I was able to narrow it down to roughly the following.

There are two request threads, A and B, and another thread accessing the database C. When requests start they call ActiveSupport::Dependencies.interlock.start_sharing and if they need to autoload any files, they upgrade to interlock.exclusive. When threads use the database, they call AbstractAdapter#log which acquires a separate lock for the database connection (which is a LoadInterlockAwareMonitor). Then I have the following:

Thread A Thread B Thread C
gets interlock share lock
gets interlock share lock
gets db connection lock
calls AbstractAdapter#log
notes it can't get db lock right away (blocked on C)
releases interlock share and waits on db lock
tries to autoload a model
upgrades to an exclusive interlock
releases the db connection lock
takes the db lock
tries to reclaim interlock share lock
blocked on thread A which has exclusive lock
requires a db query to load schema
calls AbstractAdapter#log
notes it can't get db lock right away (blocked on B)
does not release interlock as exclusive and waits on db lock
waits on db connection lock

And then A and B wait forever. Also of note, rspec cleanup requires rolling back a transaction which requires the db connection lock so rspec also hangs forever.

Steps to reproduce

It's hard to actually reproduce this as it's so timing based but I created this unit test 7ab4f35. I also added the full ActionDispatch::DebugLocks output at the end.

Expected behavior

All requests complete

Actual behavior

The test suite hangs

System configuration

Rails version: 5.1.6 (the test also reproduces against master)

Ruby version: 2.3.6 (the test also reproduces against 2.5.1)

ActionDispatch::DebugLocks output

Thread 0 [0xa06e4b0 sleep]  No lock (yielded share)
  Waiting in yield_shares
  may be pre-empted for: "load", share
  blocked by: 1

/usr/local/lib/ruby/2.3.0/monitor.rb:111:in sleep' /usr/local/lib/ruby/2.3.0/monitor.rb:111:in wait'
/usr/local/lib/ruby/2.3.0/monitor.rb:111:in wait' /usr/local/lib/ruby/2.3.0/monitor.rb:123:in wait_while'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/concurrency/share_lock.rb:221:in wait_for' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/concurrency/share_lock.rb:189:in block in yield_shares'
/usr/local/lib/ruby/2.3.0/monitor.rb:214:in mon_synchronize' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/concurrency/share_lock.rb:187:in yield_shares'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies/interlock.rb:45:in permit_concurrent_loads' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/concurrency/load_interlock_aware_monitor.rb:13:in mon_enter'
/usr/local/lib/ruby/2.3.0/monitor.rb:212:in mon_synchronize' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/abstract_adapter.rb:612:in block in log'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/notifications/instrumenter.rb:21:in instrument' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/abstract_adapter.rb:604:in log'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/postgresql_adapter.rb:614:in exec_no_cache' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/postgresql_adapter.rb:603:in execute_and_clear'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/postgresql/database_statements.rb:79:in exec_query' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/abstract/database_statements.rb:371:in select'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/abstract/database_statements.rb:42:in select_all' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/abstract/query_cache.rb:97:in select_all'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/relation/calculations.rb:176:in pluck' <app code which queries the database> /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/callbacks.rb:413:in instance_exec'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/callbacks.rb:413:in block in make_lambda' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/callbacks.rb:197:in block (2 levels) in halting'
/usr/local/bundle/gems/actionpack-5.1.6/lib/abstract_controller/callbacks.rb:12:in block (2 levels) in <module:Callbacks>' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/callbacks.rb:198:in block in halting'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/callbacks.rb:507:in block in invoke_before' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/callbacks.rb:507:in each'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/callbacks.rb:507:in invoke_before' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/callbacks.rb:130:in run_callbacks'
/usr/local/bundle/gems/actionpack-5.1.6/lib/abstract_controller/callbacks.rb:19:in process_action' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal/rescue.rb:20:in process_action'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal/instrumentation.rb:32:in block in process_action' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/notifications.rb:166:in block in instrument'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/notifications/instrumenter.rb:21:in instrument' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/notifications.rb:166:in instrument'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal/instrumentation.rb:30:in process_action' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal/params_wrapper.rb:252:in process_action'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/railties/controller_runtime.rb:22:in process_action' /usr/local/bundle/gems/actionpack-5.1.6/lib/abstract_controller/base.rb:124:in process'
/usr/local/bundle/gems/actionview-5.1.6/lib/action_view/rendering.rb:30:in process' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal.rb:189:in dispatch'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal.rb:253:in dispatch' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/routing/route_set.rb:49:in dispatch'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/routing/route_set.rb:31:in serve' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/routing/mapper.rb:16:in block in class:Constraints'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/routing/mapper.rb:46:in serve' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/journey/router.rb:50:in block in serve'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/journey/router.rb:33:in each' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/journey/router.rb:33:in serve'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/routing/route_set.rb:844:in call' /usr/local/bundle/gems/warden-1.2.7/lib/warden/manager.rb:36:in block in call'
/usr/local/bundle/gems/warden-1.2.7/lib/warden/manager.rb:35:in catch' /usr/local/bundle/gems/warden-1.2.7/lib/warden/manager.rb:35:in call'
/usr/local/bundle/gems/rack-2.0.5/lib/rack/etag.rb:25:in call' /usr/local/bundle/gems/rack-2.0.5/lib/rack/conditional_get.rb:38:in call'
/usr/local/bundle/gems/rack-2.0.5/lib/rack/head.rb:12:in call' /usr/local/bundle/gems/rack-2.0.5/lib/rack/session/abstract/id.rb:232:in context'
/usr/local/bundle/gems/rack-2.0.5/lib/rack/session/abstract/id.rb:226:in call' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/cookies.rb:613:in call'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/callbacks.rb:26:in block in call' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/callbacks.rb:97:in run_callbacks'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/callbacks.rb:24:in call' /usr/local/bundle/gems/bugsnag-6.7.3/lib/bugsnag/integrations/rack.rb:46:in call'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/debug_exceptions.rb:59:in call' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/show_exceptions.rb:31:in call'
/usr/local/bundle/gems/railties-5.1.6/lib/rails/rack/logger.rb:36:in call_app' /usr/local/bundle/gems/railties-5.1.6/lib/rails/rack/logger.rb:24:in block in call'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/tagged_logging.rb:69:in block in tagged' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/tagged_logging.rb:26:in tagged'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/tagged_logging.rb:69:in tagged' /usr/local/bundle/gems/railties-5.1.6/lib/rails/rack/logger.rb:24:in call'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/remote_ip.rb:79:in call' /usr/local/bundle/gems/request_store-1.4.1/lib/request_store/middleware.rb:19:in call'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/request_id.rb:25:in call' /usr/local/bundle/gems/rack-2.0.5/lib/rack/method_override.rb:22:in call'
/usr/local/bundle/gems/rack-2.0.5/lib/rack/runtime.rb:22:in call' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/cache/strategy/local_cache_middleware.rb:27:in call'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/executor.rb:12:in call' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/static.rb:125:in call'
/usr/local/bundle/gems/rack-2.0.5/lib/rack/sendfile.rb:111:in call' /usr/local/bundle/gems/rack-utf8_sanitizer-1.4.0/lib/rack/utf8_sanitizer.rb:20:in call'
/usr/local/bundle/gems/secure_headers-6.0.0.alpha02/lib/secure_headers/middleware.rb:13:in call' /usr/local/bundle/gems/railties-5.1.6/lib/rails/engine.rb:522:in call'
/usr/local/bundle/gems/rack-2.0.5/lib/rack/urlmap.rb:68:in block in call' /usr/local/bundle/gems/rack-2.0.5/lib/rack/urlmap.rb:53:in each'
/usr/local/bundle/gems/rack-2.0.5/lib/rack/urlmap.rb:53:in call' /usr/local/bundle/gems/capybara-3.7.2/lib/capybara/server/middleware.rb:48:in call'
/usr/local/bundle/gems/rack-2.0.5/lib/rack/handler/webrick.rb:86:in service' /usr/local/lib/ruby/2.3.0/webrick/httpserver.rb:140:in service'
/usr/local/lib/ruby/2.3.0/webrick/httpserver.rb:96:in run' /usr/local/lib/ruby/2.3.0/webrick/server.rb:314:in block in start_thread'


Thread 1 [0x331cad8 sleep] Exclusive (yielded share)
blocking: 0

/usr/local/lib/ruby/2.3.0/monitor.rb:187:in lock' /usr/local/lib/ruby/2.3.0/monitor.rb:187:in mon_enter'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/concurrency/load_interlock_aware_monitor.rb:13:in block in mon_enter' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies/interlock.rb:46:in block in permit_concurrent_loads'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/concurrency/share_lock.rb:185:in yield_shares' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies/interlock.rb:45:in permit_concurrent_loads'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/concurrency/load_interlock_aware_monitor.rb:13:in mon_enter' /usr/local/lib/ruby/2.3.0/monitor.rb:212:in mon_synchronize'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/abstract_adapter.rb:612:in block in log' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/notifications/instrumenter.rb:21:in instrument'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/abstract_adapter.rb:604:in log' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/postgresql/database_statements.rb:59:in query'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/postgresql_adapter.rb:765:in column_definitions' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/abstract_adapter.rb:167:in columns'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/schema_cache.rb:67:in columns' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/connection_adapters/schema_cache.rb:73:in columns_hash'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/model_schema.rb:471:in load_schema!' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/attributes.rb:233:in load_schema!'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/attribute_decorators.rb:50:in load_schema!' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/model_schema.rb:464:in block in load_schema'
/usr/local/lib/ruby/2.3.0/monitor.rb:214:in mon_synchronize' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/model_schema.rb:461:in load_schema'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/model_schema.rb:353:in attribute_types' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/attribute_methods.rb:164:in attribute_names'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/attribute_methods.rb:62:in block in define_attribute_methods' /usr/local/lib/ruby/2.3.0/mutex_m.rb:74:in synchronize'
/usr/local/lib/ruby/2.3.0/mutex_m.rb:74:in mu_synchronize' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/attribute_methods.rb:59:in define_attribute_methods'

/usr/local/bundle/gems/bootsnap-1.3.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:21:in require' /usr/local/bundle/gems/bootsnap-1.3.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:21:in block in require_with_bootsnap_lfi'
/usr/local/bundle/gems/bootsnap-1.3.0/lib/bootsnap/load_path_cache/loaded_features_index.rb:65:in register' /usr/local/bundle/gems/bootsnap-1.3.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:20:in require_with_bootsnap_lfi'
/usr/local/bundle/gems/bootsnap-1.3.0/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:29:in require' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies.rb:292:in block in require'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies.rb:258:in load_dependency' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies.rb:292:in require'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies.rb:379:in block in require_or_load' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies.rb:36:in block in load_interlock'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies/interlock.rb:12:in block in loading' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/concurrency/share_lock.rb:149:in exclusive'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies/interlock.rb:11:in loading' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies.rb:36:in load_interlock'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies.rb:357:in require_or_load' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies.rb:511:in load_missing_constant'
/usr/local/bundle/gems/bootsnap-1.3.0/lib/bootsnap/load_path_cache/core_ext/active_support.rb:43:in load_missing_constant' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies.rb:202:in const_missing'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/inflector/methods.rb:271:in const_get' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/inflector/methods.rb:271:in block in constantize'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/inflector/methods.rb:267:in each' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/inflector/methods.rb:267:in inject'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/inflector/methods.rb:267:in constantize' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/inflector/methods.rb:312:in safe_constantize'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies.rb:589:in safe_get' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/dependencies.rb:620:in safe_constantize'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/inheritance.rb:159:in block in compute_type' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/inheritance.rb:158:in each'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/inheritance.rb:158:in compute_type' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/reflection.rb:408:in compute_class'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/reflection.rb:404:in klass' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/reflection.rb:621:in automatic_inverse_of'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/reflection.rb:609:in block in inverse_name' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/reflection.rb:609:in fetch'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/reflection.rb:609:in inverse_name' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/reflection.rb:523:in has_inverse?'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/reflection.rb:236:in check_validity_of_inverse!' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/reflection.rb:474:in check_validity!'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/associations/association.rb:25:in initialize' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/associations.rb:265:in new'
/usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/associations.rb:265:in association' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/associations/builder/association.rb:111:in integration'

/usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal/basic_implicit_render.rb:4:in send_action' /usr/local/bundle/gems/actionpack-5.1.6/lib/abstract_controller/base.rb:186:in process_action'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal/rendering.rb:30:in process_action' /usr/local/bundle/gems/actionpack-5.1.6/lib/abstract_controller/callbacks.rb:20:in block in process_action'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/callbacks.rb:131:in run_callbacks' /usr/local/bundle/gems/actionpack-5.1.6/lib/abstract_controller/callbacks.rb:19:in process_action'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal/rescue.rb:20:in process_action' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal/instrumentation.rb:32:in block in process_action'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/notifications.rb:166:in block in instrument' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/notifications/instrumenter.rb:21:in instrument'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/notifications.rb:166:in instrument' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal/instrumentation.rb:30:in process_action'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal/params_wrapper.rb:252:in process_action' /usr/local/bundle/gems/activerecord-5.1.6/lib/active_record/railties/controller_runtime.rb:22:in process_action'
/usr/local/bundle/gems/actionpack-5.1.6/lib/abstract_controller/base.rb:124:in process' /usr/local/bundle/gems/actionview-5.1.6/lib/action_view/rendering.rb:30:in process'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal.rb:189:in dispatch' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_controller/metal.rb:253:in dispatch'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/routing/route_set.rb:49:in dispatch' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/routing/route_set.rb:31:in serve'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/routing/mapper.rb:16:in block in <class:Constraints>' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/routing/mapper.rb:46:in serve'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/journey/router.rb:50:in block in serve' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/journey/router.rb:33:in each'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/journey/router.rb:33:in serve' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/routing/route_set.rb:844:in call'
/usr/local/bundle/gems/warden-1.2.7/lib/warden/manager.rb:36:in block in call' /usr/local/bundle/gems/warden-1.2.7/lib/warden/manager.rb:35:in catch'
/usr/local/bundle/gems/warden-1.2.7/lib/warden/manager.rb:35:in call' /usr/local/bundle/gems/rack-2.0.5/lib/rack/etag.rb:25:in call'
/usr/local/bundle/gems/rack-2.0.5/lib/rack/conditional_get.rb:38:in call' /usr/local/bundle/gems/rack-2.0.5/lib/rack/head.rb:12:in call'
/usr/local/bundle/gems/rack-2.0.5/lib/rack/session/abstract/id.rb:232:in context' /usr/local/bundle/gems/rack-2.0.5/lib/rack/session/abstract/id.rb:226:in call'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/cookies.rb:613:in call' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/callbacks.rb:26:in block in call'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/callbacks.rb:97:in run_callbacks' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/callbacks.rb:24:in call'
/usr/local/bundle/gems/bugsnag-6.7.3/lib/bugsnag/integrations/rack.rb:46:in call' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/debug_exceptions.rb:59:in call'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/show_exceptions.rb:31:in call' /usr/local/bundle/gems/railties-5.1.6/lib/rails/rack/logger.rb:36:in call_app'
/usr/local/bundle/gems/railties-5.1.6/lib/rails/rack/logger.rb:24:in block in call' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/tagged_logging.rb:69:in block in tagged'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/tagged_logging.rb:26:in tagged' /usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/tagged_logging.rb:69:in tagged'
/usr/local/bundle/gems/railties-5.1.6/lib/rails/rack/logger.rb:24:in call' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/remote_ip.rb:79:in call'
/usr/local/bundle/gems/request_store-1.4.1/lib/request_store/middleware.rb:19:in call' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/request_id.rb:25:in call'
/usr/local/bundle/gems/rack-2.0.5/lib/rack/method_override.rb:22:in call' /usr/local/bundle/gems/rack-2.0.5/lib/rack/runtime.rb:22:in call'
/usr/local/bundle/gems/activesupport-5.1.6/lib/active_support/cache/strategy/local_cache_middleware.rb:27:in call' /usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/executor.rb:12:in call'
/usr/local/bundle/gems/actionpack-5.1.6/lib/action_dispatch/middleware/static.rb:125:in call' /usr/local/bundle/gems/rack-2.0.5/lib/rack/sendfile.rb:111:in call'
/usr/local/bundle/gems/rack-utf8_sanitizer-1.4.0/lib/rack/utf8_sanitizer.rb:20:in call' /usr/local/bundle/gems/secure_headers-6.0.0.alpha02/lib/secure_headers/middleware.rb:13:in call'
/usr/local/bundle/gems/railties-5.1.6/lib/rails/engine.rb:522:in call' /usr/local/bundle/gems/rack-2.0.5/lib/rack/urlmap.rb:68:in block in call'
/usr/local/bundle/gems/rack-2.0.5/lib/rack/urlmap.rb:53:in each' /usr/local/bundle/gems/rack-2.0.5/lib/rack/urlmap.rb:53:in call'
/usr/local/bundle/gems/capybara-3.7.2/lib/capybara/server/middleware.rb:48:in call' /usr/local/bundle/gems/rack-2.0.5/lib/rack/handler/webrick.rb:86:in service'
/usr/local/lib/ruby/2.3.0/webrick/httpserver.rb:140:in service' /usr/local/lib/ruby/2.3.0/webrick/httpserver.rb:96:in run'
/usr/local/lib/ruby/2.3.0/webrick/server.rb:314:in `block in start_thread'

@tessereth
Copy link
Author

Further tracking finds that the symmetric-encryption gem manually calls define_attribute_methods during class load, causing the database query to load the schema while still holding the exclusive dependency loading lock. While doing that is probably not overly desirable, it would be much nicer if rails either supported that behaviour or error/warned if it happened rather than just hanging forever.

@matthewd
Copy link
Member

Thanks, this is a super clear report!

Yeah, talking to the database inside a model definition is bad form, but it shouldn't hang.

It sounds like thread B needs to do the same "try to take lock, if unavailable then surrender other lock and wait" on its attempt to reclaim the share lock that it does on the DB lock.

@tessereth
Copy link
Author

Hm. But AFAICT you'd then have a thread constantly oselating between which lock it's waiting on, which either spins the CPU or requires manual sleeping in some way. And presumably loosing it's place in the queue each time. Which is not ideal but would perhaps solve the problem.

A bunch of the theory around deadlock prevention talks about acquiring all the locks in the same order. The way the LoadInterlockAwareMonitor works reverses the order creating the potential for this deadlock. I assume that class exists as it does to avoid another deadlock somewhere else but I wonder if it's possible to solve that a different way... I might have to continue playing around.

@tessereth
Copy link
Author

tessereth commented Oct 26, 2018

Okay, I checked out what happens if I revert the LoadInterlockAwareMonitor to just a vanilla Monitor. It deadlocks somewhere subtly different but it's basically the same problem. There's code that's roughly:

@lock = LoadInterlockAwareMonitor.new
...
@lock.synchronize do
  ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
    execute_sql_query
  end
end

So with thread A being roughly the same, thread B:

  1. Has the share lock
  2. Takes the db connection lock
  3. Releases the share lock
  4. Executes query
  5. Blocks reclaiming share lock (still holding db connection lock)

One possibility would be to just never hold both locks at the same time, eg:

class LoadInterlockAwareMonitor < Monitor
  def synchronize
    ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
      super do
        yield
      end
    end
  end
end

But this makes the rather strong assumption that no code inside AbstractAdapter#log will ever try to autoload something. That may be a safe enough claim for the built in adapters but I'm not sure how that would go with third party stuff.

Another possibility would be to not reclaim the interlock until after the lock is released, eg

@lock = Monitor.new # without the current overrides in LoadInterlockAwareMonitor
@lock.synchronize do
  do_stuff_that_can_autoload
  ActiveSupport::Dependencies.interlock.permit_concurrent_loads_until_mon_exit(@lock)
  do_known_safe_stuff
end # <- releases @lock then reacquires interlock share

This is more of a pain to code/maintain/understand but at least the caller has to opt-in. And a quick scan of the current uses of permit_concurrent_loads in active record suggests this might be a sufficient replacement.

What do you think? I'd be happy to give that last option a go if you think it might be worth trying.

@matthewd
Copy link
Member

I was feeling pretty enthusiastic about the idea of just legislating that the locks shall not meet... until I remembered The Tricky One: 😕

@connection.lock.synchronize do
begin
transaction = begin_transaction options
yield

IIRC that's the only place we invoke user-application code while holding the DB lock... but it's enough to complicate our lives substantially. For everything else we could probably get away with something like your permit-around-synchronize suggestion, but this one needs the interlock to be able to run the application code.

Your latter idea feels like it could almost still work here: begin_transaction currently drops the interlock (sometimes), but it probably doesn't need to: a BEGIN statement isn't in danger of getting stuck on a DB-level lock or anything, so we could retain the interlock until after the yield, and only then permit loads while we do the commit (because that one can get stuck). The "almost", however, comes in the application code: it could be doing anything, and thus has a good chance of deciding to permit_concurrent_loads for its own [also important] reasons -- at which point we're back to the same problem.

My earlier suggestion also doesn't work: when we're inside a transaction, we cannot give up the DB lock for any reason, because the connection is in a state no-one else should see.

@tessereth
Copy link
Author

tessereth commented Oct 26, 2018

Ah gross transactions. Yeah, that'll be a problem.

The more I think about it, the more I wonder if this is actually solvable in the general case. Reminding myself of the 4 necessary conditions for deadlocks:

  1. Mutual exclusion: I'd say it's fairly obvious that this is necessary given the reasons you just stated plus your interlock documentation.
  2. Hold and wait: Again fairly clear from the conversation so far that we can't avoid holding the db lock while waiting for the interlock. And I assume we can't let go of an exclusive interlock lock while waiting for the db lock.
  3. No preemption: Having one thread steal the lock from someone else seems super bad given all the reasons for 1 and 2.
  4. Circular wait: This is the one I've been trying to find ways to get around but I just don't think it's possible. Even just the way share locks get upgraded to exclusive locks could theoretically cause issues. Eg
Thread A Thread B
gets share lock (S) gets share lock (S)
gets unrelated lock (L)
tries to upgrade (S) to exclusive
yields (S) share and waits for B
blocked getting (L)

Which is all somewhat depressing. Maybe this is actually a problem of mitigation and/or detection. My plan would probably still make deadlocks less likely, though whether it's worth it for the added complexity is a question for a maintainer to decide. Googling suggests true deadlock detection requires knowing about all the locks, who has them and who's blocked by them which is a tall order in this situation. Unless we decide to only care about this pair of locks.

I guess at the end of the day there's always timeouts and exceptions...

@rails-bot
Copy link

rails-bot bot commented Jan 24, 2019

This issue has been automatically marked as stale because it has not been commented on for at least three months.
The resources of the Rails team are limited, and so we are asking for your help.
If you can still reproduce this error on the 5-2-stable branch or on master, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

@rails-bot rails-bot bot added the stale label Jan 24, 2019
@rails-bot rails-bot bot closed this as completed Jan 31, 2019
@trcarden
Copy link

trcarden commented Mar 29, 2019

Came across a similar issue with globalize.

Because globalize's main method translate eventually calls a postgresql adapter method clear_cache! to mount its translated columns in the Model definition we get deadlocks during autoload in our test environment using system-tests.

Essentially any gem that has a DB call as part of a class/module definition that can be autoloaded will fall prey to deadlock issues. I would like to reopen this and see what we can do. It does seem like #34308 was making some progress against a related issue and this issue just plain stopped despite it being fairly important to larger existing apps adopting the new rails testing systems.

@matthewd I am bit new to the threading systems in rails do you know what we can do to help?

@tessereth
Copy link
Author

I just realised I never put our workaround in this thread. For what it's worth, we avoided this by eager loading classes in CI. This worked well enough for us. Locally it hangs rarely enough that people can just restart the test and we saw no performance change in CI.

test.rb

config.eager_load = !!ENV['CI']

I also noticed this news post which may or may not make this bug redundant in rails 6. But I haven't looked into it in particular detail.

https://weblog.rubyonrails.org/2019/2/22/zeitwerk-integration-in-rails-6-beta-2/

@trcarden
Copy link

trcarden commented Mar 29, 2019

Haha, thats funny. After trying to apply the patch and debug this thing for a few hours myself I came to a very similar conclusion independently and also forgot to drop it here.

We did ours in a slightly different way because we could reproduce the deadlock locally not just in a CI environment and call XXX::Application.eager_load! before the test suite kicks off.

Maybe also a quick reference to @eileencodes as I know she has been so helpful in getting the system test system online (#28083) so she is aware of the issue too.

Also for people who stumble upon this in the future I found this a related bug for others reference: rails/spring#519

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants