Skip to content

after_fork hook for daemonize without clustering? #335

Closed
grk opened this Issue Jul 18, 2013 · 22 comments
@grk
grk commented Jul 18, 2013

Hi,

One of our customers ran into an issue with running newrelic instrumentation on puma servers with the following config:

environment 'production'
pidfile '/tmp/puma.0.pid'
state_path '/tmp/puma.0.state'
stdout_redirect '/tmp/log/puma.stdout', '/tmp/log/puma.stderr'
threads 4, 16
bind 'tcp://0.0.0.0:2500'
daemonize true

It looks like the newrelic agent loads in the main process, but doesn't send any data after forking. We did a workaround for this by running puma in clustered mode with 1 worker and the following addition to the config:

on_worker_boot do
  require 'newrelic_rpm'
  NewRelic::Agent.manual_start
end

and adding require: false to newrelic_rpm in the Gemfile.

Do you have any suggestions for how to make it work in non-clustered mode with daemonize? I think a after_fork hook similar to on_worker_boot would be a solution.

@grk grk referenced this issue Jul 18, 2013
Closed

New Relic support #128

@jasonrclark

Hey @grk. I'd mentioned in another thread that I was hopeful this might be addressed by our cluster mode fix. Turns out we're only partway there.

The newly released 3.6.6.147 version of the Ruby agent has cluster mode support. From my testing, you shouldn't need to :require => false or add the on_worker_boot workarounds anymore, but you do continue to need to set cluster mode with 1 worker to get things working right.

If you're interested, I'd love to hear whether that works for you. I've also entered the Puma daemonize problem onto our issues list, so at some point we can hopefully get it fixed and comment back here.

You're absolutely right that an after_fork hook in Puma itself would make this a lot cleaner for us in the agent, but we might be able to find a fix without requiring Puma to change.

@grk
grk commented Jul 30, 2013

@jasonrclark that's great to hear! This fixes the issue for our customer, since we can run puma in clustered mode for him. However, clustered mode doesn't work on jruby, so a fix for daemonize would be nice. Luckily so far noone used a combination of jruby + puma + newrelic yet.

@renatosnrg

I'm having this same issue: NewRelic is not reporting data when running Puma in non-clustered mode with daemonize. If I enable the clustered mode it works fine.

I'm using the latest versions of both newrelic_rpm (3.6.6.147) and puma (2.4.0) gems. I also have an open ticket with NewRelic support. This issue thread finally explain what is happening :)

Thanks!

@ondrejbartas

Hello, I had same issue. Daemonize and 0 workers.

When I set workers to 3 (didn't try 1) it just started to work.

Thank you guys. This really helped me.

@HDLP9
HDLP9 commented Sep 18, 2013

Hi,

Still no fix for non-clustered mode? With clustered mode it works like a charm but it would be great to have a solution for this.

Thanks!

@jasonrclark

Hey @Helio-Pereira

We don't have a fix to this yet, but it is definitely on our list to address.

That said, spent a few minutes researching this morning and there's a hook since 2.3 for on_booted after the Process.daemon call that could be what we need. However, I'm not seeing where we can get to it from the agent (https://github.com/puma/puma/blob/master/lib/puma/events.rb#L95-L97) since I haven't found a way to access the current Runner or CLI objects Puma's running with.

Anyone more familiar with Puma have any ideas if there's a good way to do that I'm missing?

@evanphx
Puma member
evanphx commented Sep 19, 2013

I guess I need to dig into what changes newrelic made...

@jasonrclark

@evanphx Happy to help out in any way I can.

What we did with cluster-mode was to have our instrumentation automatically add an on_worker_boot handler. https://github.com/newrelic/rpm/blob/master/lib/new_relic/agent/instrumentation/puma.rb

I think we could do something similar for daemonize if we had a way to tap into the on_booted event, but haven't figured how to get to it from the agent.

@jasonrclark

Just released newrelic_rpm 3.7.1.180 which contains a (currently optional) mode that I believe should get the New Relic thread restarted cleanly even after daemonizing. If this works out for folks, we may enable it by default and be able to get away from having to register hooks in Puma itself, which would be awesome!

To enable the feature, update your agent to 3.7.1.180 and add this to your newrelic.yml

  restart_thread_in_children: true

Keen for any feedback on whether that helps. Thanks!

Note: original comment had the wrong key name (restart_threads_in_child) which would not work. I've updated the comment here so anyone finding it won't have to follow the comment trail to find my typo.

@HDLP9
HDLP9 commented Dec 26, 2013

Hi @jasonrclark!

Thanks for all the work. I've just updated to 3.7.1.180 version, added restart_thread_in_child: true to newrelic.yml and deployed the app. (Gemfile with gem 'newrelic_rpm')

In newrelic the app is now active (green) but no data is being sent. When checking the HTML I see that newrelic js tags are missing.

Do I need to make any more configuration?

This is really anoying beacause for a small, but important app, I need to have 2 ruby processes (more memory usage) to be able to send data to newrelic.

Many thanks!

@seuros
seuros commented Dec 26, 2013

same issue here!

@benweint

Hi @Helio-Pereira - thanks for trying this out! There shouldn't be any additional configuration that's needed.

It's possible that you're just hitting a bug in our implementation (as Jason alluded to, the restart_thread_in_children feature is still considered experimental), or there might be an unrelated reason that your app isn't sending data in this scenario.

It's difficult to say what the issue is remotely without more information, specifically:

  1. The contents of your newrelic_agent.log after a restart of Puma (extending for a few minutes after the restart, preferably). This will be most useful if you're able to set log_level: debug in your newrelic.yml file in order to increase the verbosity of the log.
  2. The exact command you're using to start puma
  3. The version of Puma you're using, and a brief description of your app (is it a Rails app, Sinatra, something else?)

If you don't want to post the log contents publicly, feel free to follow up via a support ticket (mention my name and it'll get routed to the right place) or with me directly.

Edit: original comment had an incorrect name for the configuration key. The correct name is restart_thread_in_children, not restart_thread_in_child.

@benweint

@Helio-Pereira we just uncovered an issue that may be causing the restart_thread_in_child functionality to not work, depending on what version of Ruby you're on.

If you're on MRI < 2.0.0-p353, you're likely hitting https://bugs.ruby-lang.org/issues/8616
If you're on Rubinius 2.x, you're likely hitting rubinius/rubinius#2861

In short: we rely on Thread#alive? to return an accurate value in order for our automatic restarting to work, and those two bugs cause it not to. We can probably figure out how to work around them, since a large number of folks are likely on versions of MRI / Rubinius that are affected.

@HDLP9
HDLP9 commented Dec 31, 2013

Hi @benweint,

sorry for not responding but you know...holidays!!!!

I'm using:

  • Puma 2.7.1
  • Rails 4.0.2 with turbolink enabled
  • ruby-2.0.0-p247

So the problem must be the ruby version. For me I don´t think that you need to work around this issue, not worth the trouble.

I´ll try to update to the new stable version 2.1 and let you know.

@jeroenj
jeroenj commented Jan 2, 2014

Hi @benweint,

I've also tried this config but it doesn't seem to work either:

[01/02/14 09:57:08 +0100 server (5772)] INFO : Starting the New Relic agent in "production" environment.
[01/02/14 09:57:08 +0100 server (5772)] INFO : To prevent agent startup add a NEWRELIC_ENABLE=false environment variable or modify the "production" section of your newrelic.yml.
[01/02/14 09:57:08 +0100 server (5772)] INFO : Reading configuration from config/newrelic.yml
[01/02/14 09:57:08 +0100 server (5772)] INFO : Enabling the Request Sampler.
[01/02/14 09:57:08 +0100 server (5772)] INFO : Environment: production
[01/02/14 09:57:08 +0100 server (5772)] INFO : Dispatcher: puma
[01/02/14 09:57:08 +0100 server (5772)] INFO : Application: trouw
[01/02/14 09:57:09 +0100 server (5772)] INFO : Installing ActiveRecord instrumentation
[01/02/14 09:57:09 +0100 server (5772)] INFO : Installing Net instrumentation
[01/02/14 09:57:09 +0100 server (5772)] INFO : Installing deferred Rack instrumentation
[01/02/14 09:57:09 +0100 server (5772)] INFO : Installing Puma cluster mode support
[01/02/14 09:57:09 +0100 server (5772)] INFO : Installing Rails 3 Controller instrumentation
[01/02/14 09:57:09 +0100 server (5772)] INFO : Installing Rails 3.1/3.2 view instrumentation
[01/02/14 09:57:09 +0100 server (5772)] INFO : Installing Rails3 Error instrumentation
[01/02/14 09:57:09 +0100 server (5772)] INFO : Finished instrumentation
[01/02/14 09:57:09 +0100 server (5772)] INFO : Doing deferred dependency-detection before Rack startup
[01/02/14 09:57:09 +0100 server (5772)] INFO : Starting Agent shutdown
[01/02/14 09:57:09 +0100 server (5790)] INFO : Starting Agent shutdown

I don't have any special configuration set for puma and restart_thread_in_child: true is set. I'm using rails 3.2.15 on ruby 2.1.0. If you need anything else, please let me know. :)

@benweint
benweint commented Jan 2, 2014

@jeroenj well this is embarrassing - the setting is actually called restart_thread_in_children, not restart_thread_in_child. The CHANGELOG and Jason and I's comments above are incorrect (the setting is declared in the source here). Mind giving it a try with the correctly-named setting?

Sorry for the mix-up! I'll get the CHANGELOG corrected.

@jeroenj
jeroenj commented Jan 2, 2014

We all make mistakes, right?! :)

Anyway, that seems to fix it. My application is now reporting to New Relic. :thumbsup:

@benweint
benweint commented Jan 2, 2014

@jeroenj Glad to hear it, thanks for following up!

@jasonrclark

:disappointed: ack! I've revised my comment above with a note so anyone stumbling on it will hopefully get the right key name first.

@washu
washu commented Jan 15, 2014

I just created a new gem call daemon_signals, require it after puma in your gemfile and you will get a nice SIGCONT after daemonization is called.

@schneems

Is this still an issue? The original issue reads somewhat like a feature request. Can we close this issue?

@evanphx evanphx closed this Mar 1, 2014
@benweint

Hey folks - I know this issue's closed, but I wanted to give you a brief update. We've just released a new version of the newrelic_rpm gem (3.8.0) that should fix this issue.

In the latest version of the gem, we will automatically restart our background thread when we notice that a web transaction is processed and the PID we last started the thread in differs from the current PID.

This is effectively the same as the restart_thread_in_children setting mentioned above switching from default off to default on.

If you're still having trouble getting Puma (in any mode) working with newrelic_rpm, please let us know via a support ticket or in our forum and we'll be happy to help you out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.