New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eager autoload mail gem when eager load is true #32808

Merged
merged 4 commits into from May 23, 2018

Conversation

Projects
None yet
7 participants
@sj26
Contributor

sj26 commented May 4, 2018

We had a production issue where our Sidekiq worker threads all became deadlocked while autoloading a file within the mail gem, required via ActionMailer, despite setting config.eager_load = true for our Rails application. Mail uses ruby's native autoload instead of ActiveSupport, so ActiveSupport can't eager load it automatically. Mail.eager_autoload! exists and works great, ActionMailer just doesn't call it during eager loading. Adding it to the ActionMailer Railtie's eager_load_namespaces takes care of calling Mail.eager_autoload! during the eager_load! initializer.

Eager autoload mail gem when eager load is true
We had a production issue where our Sidekiq worker threads all became
deadlocked while autoloading a file within the mail gem, required via
ActionMailer, despite setting our Rails applicaiton to eager load.
`Mail.eager_autoload!` exists and works great, ActionMailer just doesn't
call it during eager loading. Adding it to the ActionMailer Railtie's
eager_load_namespaces takes care of calling `Mail.eager_autoload!`
during the `eager_load!` initializer.
@rails-bot

This comment has been minimized.

rails-bot commented May 4, 2018

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @kaspth (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

sj26 added some commits May 4, 2018

@sikachu

This comment has been minimized.

Member

sikachu commented May 6, 2018

Thank you for the patch.

Is it possible for you to create a simple Rails app that also simulate the issue so we can confirm that this issue has been fixed?

@sikachu sikachu added the actionmailer label May 6, 2018

@sj26

This comment has been minimized.

Contributor

sj26 commented May 6, 2018

It's a race condition within Ruby's native autoload functionality, I'm not sure if there's a way to replicate that except emergently under high load. But I can demonstrate that Mail is not eager loaded when Rails is asked to eager load:

gem "railties", "5.2.0"
gem "actionmailer", "5.2.0"

require "rails/application"
require "action_mailer/railtie"

class Application < Rails::Application
  config.eager_load = true
end

$LOADED_FEATURES.size
# => 440

Application.initialize!

$LOADED_FEATURES.size
# => 887

Mail.autoload? :ContentTypeElement
# => "mail/elements/content_type_element" # i.e. true

$LOADED_FEATURES.grep /content_type_element/
# => [] # empty, so mail/elements/content_type_element is not loaded

Mail::ContentTypeElement

Mail.autoload? :ContentTypeElement
# => nil # i.e. false

$LOADED_FEATURES.size
# => 901

$LOADED_FEATURES.grep /content_type_element/
# => [".../mail-2.7.0/lib/mail/elements/content_type_element.rb"]

Mail.eager_autoload!

$LOADED_FEATURES.size
# => 921

This means if two threads in the same process trigger these autoloads at the same time they may become deadlocked because Ruby's built-in autoload is infamously not thread safe.

Here's the same thing with a monkey patch per this PR:

gem "railties", "5.2.0"
gem "actionmailer", "5.2.0"

require "rails/application"
require "action_mailer/railtie"

ActionMailer::Railtie.config.before_eager_load do
  require "mail"
  Mail.eager_autoload!
end

class Application < Rails::Application
  config.eager_load = true
end

$LOADED_FEATURES.size
# => 440

Application.initialize!

$LOADED_FEATURES.size
# => 921

Mail.autoload? :ContentTypeElement
# => nil # i.e. false

$LOADED_FEATURES.grep /content_type_element/
# => [".../mail-2.7.0/lib/mail/elements/content_type_element.rb"]

Mail.eager_autoload!

$LOADED_FEATURES.size
# => 921

@deivid-rodriguez deivid-rodriguez referenced this pull request May 6, 2018

Merged

Upgrade Rails to 5.2 #3225

1 of 2 tasks complete
@sj26

This comment has been minimized.

Contributor

sj26 commented May 7, 2018

We discovered this issue after a lot of digging into frozen background job workers. Here's the thread list we saw:

$ rbtrace -p 1281 --interactive
*** attached to process 1281
irb> Thread.list
=> [#<Thread:0x00007f97ead06090@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep>,
#<Thread:0x00007f97eacff538@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep>,
#<Thread:0x00007f97eacf3558@concurrent-ruby-1.0.5/lib/concurrent/executor/ruby_thread_pool_executor.rb:317 sleep_forever>,
#<Thread:0x00007f97eacafbf0@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep>,
#<Thread:0x00007f97eacafa38@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep>,
#<Thread:0x00007f97eacaf790@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97eacaf4e8@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97eacaf268@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97eacaef70@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97eacaecf0@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97eacae9f8@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97eacae840@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97eacae660@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97eacae278@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97eacade40@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97eacad5d0@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97eacad350@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97eacad198@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97eacacfb8@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97e9ff7e08@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97e9ff7c50@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97e9ff79f8@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97e9ff7840@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97e9ff7688@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97e9ff74d0@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97e9ff7318@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97e9ff7160@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97e9ff6f08@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97e9ff6d50@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>,
#<Thread:0x00007f97e9ff6b98@sidekiq-5.1.1/lib/sidekiq/util.rb:23 sleep_forever>]

The regular sidekiq threads — the heartbeat and dispatcher — were working fine. The processor threads all had a backtrace to the same line in the mail gem:

irb> Thread.list.last.backtrace
=> ["mail-2.7.0/lib/mail/fields/content_type_field.rb:39:in `element'",
"mail-2.7.0/lib/mail/fields/content_type_field.rb:33:in `parse'",
...
"mail-2.7.0/lib/mail/message.rb:1327:in `[]='",
"actionmailer-5.1.5/lib/action_mailer/base.rb:962:in `block in create_parts_from_responses'",
...
"actionmailer-5.1.5/lib/action_mailer/base.rb:833:in `mail'",
"app/mailers/some_mailer.rb:21:in `some_action'",
"actionpack-5.1.5/lib/abstract_controller/base.rb:186:in `process_action'",
...
"sidekiq-5.1.1/lib/sidekiq/processor.rb:73:in `run'",
"sidekiq-5.1.1/lib/sidekiq/util.rb:16:in `watchdog'",
"sidekiq-5.1.1/lib/sidekiq/util.rb:25:in `block in safe_thread'"]

The line is pretty innocuous:

https://github.com/mikel/mail/blob/2.7.0/lib/mail/fields/content_type_field.rb#L39

        @element ||= Mail::ContentTypeElement.new(value)

It shouldn't be causing a thread to sleep. So I went looking deeper with gdb:

Thread 27 (Thread 0x7f97e5afe700 (LWP 1344)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f98022abb89 in ?? () from /usr/lib/x86_64-linux-gnu/libruby-2.5.so.2.5
#2  0x00007f98022ad8bd in ?? () from /usr/lib/x86_64-linux-gnu/libruby-2.5.so.2.5
#3  0x00007f98022b1d62 in ?? () from /usr/lib/x86_64-linux-gnu/libruby-2.5.so.2.5
#4  0x00007f98022c953d in ?? () from /usr/lib/x86_64-linux-gnu/libruby-2.5.so.2.5
#5  0x00007f98021c0ce6 in rb_ensure () from /usr/lib/x86_64-linux-gnu/libruby-2.5.so.2.5
#6  0x00007f98022cccf0 in rb_autoload_load () from /usr/lib/x86_64-linux-gnu/libruby-2.5.so.2.5
#7  0x00007f98022cce8e in ?? () from /usr/lib/x86_64-linux-gnu/libruby-2.5.so.2.5
#8  0x00007f98022cd161 in ?? () from /usr/lib/x86_64-linux-gnu/libruby-2.5.so.2.5
#9  0x00007f98022dc659 in ?? () from /usr/lib/x86_64-linux-gnu/libruby-2.5.so.2.5
#10 0x00007f98022e1067 in ?? () from /usr/lib/x86_64-linux-gnu/libruby-2.5.so.2.5

rb_autoload_load — they were blocked on a futex within autoload because all the threads were trying to autoload the Mail::ContentTypeElement constant, the first time this constant was seen in this process, at the same time. I did a little more diagnosis in the shape of the examples demonstrating the lack of autoloading above to confirm this constant wasn't loaded yet at the time.

We were experiencing high load on the same particular mailer with very similar arguments at the time. It seemed like enough threads were doing exactly the same work at the same time in order to create the condition.

Adding a Mail.eager_autoload! to our production env fixed it for us, but my expectation was that this should be done by rails when eager load is turned on.

@sj26

This comment has been minimized.

Contributor

sj26 commented May 7, 2018

Also note that the Mail.eager_autoload! definitely exists since mail 2.5.4, the minimum targeted version in actionmailer.gemspec, through to the latest version (2.7.0) and master.

@matthewd

This comment has been minimized.

Member

matthewd commented May 7, 2018

rb_autoload_load — they were blocked on a futex within autoload because all the threads were trying to autoload the Mail::ContentTypeElement constant, the first time this constant was seen in this process, at the same time.

FWIW that sounds like a Ruby bug if they're all stuck there: per the issue you linked, @nahi made autoload threadsafe in 2.0. (😍)

autoload's lock can get tangled with the Rails load lock, though that shouldn't happen in production.

Either way, this sounds like a reasonable change to me. r? @rafaelfranca

@sj26

This comment has been minimized.

Contributor

sj26 commented May 7, 2018

I was gonna cc you @matthewd but thought you were busy enough 😅

I know they claim it’s fixed, but I don’t trust Ruby’s autoload any more. But it makes sense to me to eager load in production if we’re meant to eager load, regardless. 🙇🏻

@sj26

This comment has been minimized.

Contributor

sj26 commented May 21, 2018

How's this looking, folks? Anything else needed from me?

@f-mer

This comment has been minimized.

Contributor

f-mer commented May 22, 2018

Sorry for nittpicking 😅 but could Mail not be appended to config.eager_load_namespaces like ActionMailer is?

@sj26

This comment has been minimized.

Contributor

sj26 commented May 22, 2018

@f-mer please see the commits — I tried that but the Mail constant isn’t yet defined when evaluating the Railtie in some cases so it fails. Hence the additional require, too.

Also, Mail isn’t actually using ActiveSupport::Dependencies and exposes .eager_autoload, not .eager_load

@@ -10,6 +10,11 @@ class Railtie < Rails::Railtie # :nodoc:
config.action_mailer = ActiveSupport::OrderedOptions.new
config.eager_load_namespaces << ActionMailer
config.before_eager_load do
require "mail"
Mail.eager_autoload!

This comment has been minimized.

@rafaelfranca

rafaelfranca May 22, 2018

Member

Can we define a eager_load! in ActionMailer and move this code to there?

This comment has been minimized.

@sj26

sj26 May 22, 2018

Contributor

ActionMailer already has an implicit eager_load! as part of ActiveSupport::Dependencies which autoloads the ActionMailer namespace's autoloads. There doesn't seem to be a clean way to integrate non-ActiveSupport::Dependencies loads into the ActiveSupport::Dependencies eager load gear. The config.eager_load_namespaces << ActionMailer is causing ActionMailer.eager_load! to be called when Rails.application.config.eager_loader = true during Rails boot.

I'd also argue that this is orthogonal to ActionMailer's ActiveSupport::Dependencies load. This change is instead making sure that a third party dependency that ActionMailer introduces also respects and integrates with the Rails application lifecycle. imho, having this before eager load block is the cleanest solution.

This comment has been minimized.

@rafaelfranca

rafaelfranca May 22, 2018

Member

Yes, I get that. And the place to make sure a third party dependency that Action Mailer introduces is eager loaded is on the Action Mailer constant, since applications what are not Rails applications but use Action Mailer will not execute the railtie.

We should do what we do in ActiveRecord, defining a eager_load! method in ActionMailer and call Mail.eager_autoload! there. This way we avoid this before_eager_load lifecycle and applications that uses Action Mailer but not are rails can also eager load it.

This comment has been minimized.

@sj26

sj26 May 22, 2018

Contributor

Sorry, I didn’t see that prior art, and misinterpreted your comment. I’ll update. Thanks!

This comment has been minimized.

@sj26

sj26 May 22, 2018

Contributor

I'm also sorry for the tone of my original reply here, it's really brusque, patronising and argumentative on reading. Open source is hard! I appreciate your time and patience!

This comment has been minimized.

@sj26

sj26 May 22, 2018

Contributor

I've pushed another commit which moves the require and Mail.eager_autoload! into ActionMailer.eager_load!, like ActiveRecord.eager_load! I also tried removing the require, but the Mail constant doesn't always seem to be loaded or autoloadable at that point, even after eager loading ActionMailer itself.

Would you like me to squash the commits? They've become pretty noisy... Or will you do that on merge? Or not at all?

Thanks @rafaelfranca!

This comment has been minimized.

@rafaelfranca

rafaelfranca May 23, 2018

Member

No worries! Your comments were fine to me, it was a valid questioning. The patch looks good, I'll squash with the merge.

Thanks.

@rafaelfranca rafaelfranca merged commit 4d43b05 into rails:master May 23, 2018

2 checks passed

codeclimate All good!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

rafaelfranca added a commit that referenced this pull request May 23, 2018

Eager autoload mail gem when eager load is true (#32808)
* Eager autoload mail gem when eager load is true

We had a production issue where our Sidekiq worker threads all became
deadlocked while autoloading a file within the mail gem, required via
ActionMailer, despite setting our Rails applicaiton to eager load.
`Mail.eager_autoload!` exists and works great, ActionMailer just doesn't
call it during eager loading. Adding it to the ActionMailer Railtie's
eager_load_namespaces takes care of calling `Mail.eager_autoload!`
during the `eager_load!` initializer.

* 'Mail' isn't defined yet, use before_eager_load instead

* Make sure mail is loaded

* Move eager load of Mail into ActionMailer.eager_load!

[Samuel Cochran + Rafael Mendonça França]
@sj26

This comment has been minimized.

Contributor

sj26 commented Jun 5, 2018

Thanks for getting this merged, @rafaelfranca! 💖

@sj26 sj26 deleted the sj26:eager-autoload-mail branch Jun 5, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment