Unable to use Infinite Tracing with a forking webserver (puma) #1706

Closed
chrisholmes opened this issue Dec 19, 2022 · 8 comments
Labels
community To tag external issues and PRs submitted by the community core technology technical debt

Comments

@chrisholmes

Description

We have a Rails application deployed with a forking web server (puma) that we recently tried to migrate to Infinite Tracing. It works locally when puma runs in single mode, but it doesn't work when puma runs in clustered (forking) mode with the Rails app preloaded (for performance reasons) before forking.

We've found that after a fork, while other parts of the agent are able to restart successfully, the infinite tracer is not restarted successfully.

If we try to manually restart the infinite tracer's connection, a gRPC error is raised: RuntimeError: grpc cannot be used before and after forking. The following gRPC issue discusses the problem and suggests that no fix is forthcoming soon: grpc/grpc#8798.

Do you have any recommendations for fixes/optimal configuration for puma? Is it possible to defer the infinite tracer to boot after a fork?

Expected Behavior

Tracing continues to work after a fork

Steps to Reproduce

  • Create a rails 7 app
  • Add infinite tracing
  • Run the rails app with puma in clustered mode with a preloaded app

Your Environment

Our application is a Rails 7 application running on Ruby 3.1, and we are currently running puma 5.6.

We have the following puma settings:

workers 2
preload_app!

@github-actions github-actions bot added the community To tag external issues and PRs submitted by the community label Dec 19, 2022
@hannahramadan
Contributor

Hi @chrisholmes! Thank you for bringing attention to this and sharing the GRPC issue! We will take a look at workarounds and follow up with updates.

@chrisholmes
Author

Hi @hannahramadan, thanks for the quick reply.

After raising this, I realised I could work around it by not requiring the infinite tracing source until after worker boot. The workaround looks like:

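# in Gemfile: keep Bundler from auto-requiring the gem at boot
gem 'newrelic-infinite_tracing', require: false

# in config/puma.rb: load infinite tracing and start the agent in each worker, after the fork
on_worker_boot do
  require 'newrelic/infinite_tracing'
  NewRelic::Agent.manual_start
end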

@hannahramadan
Contributor

@chrisholmes thanks for sharing your solution and it's great to see you've found a workaround. We are still interested in addressing this, so I'll leave the issue open while we continue to take a look :)

@hannahramadan
Contributor

Hi @chrisholmes! Thanks again for reporting this and sharing your workaround. I'm changing the "bug" label just for some internal bookkeeping that differentiates between agent code bugs and agent compatibility issues, but we're going to keep this issue open for others to find your workaround and for us to continue to brainstorm compatibility solutions with forking. Cheers!

@x-yuri

x-yuri commented Dec 28, 2022

Oh, I've been experimenting with newrelic, puma and sinatra as of late and have some understanding of how they work internally. But I've never used infinite tracing, so your steps to reproduce are kind of lacking for me. Still, you might want to try the fork_worker setting. It seems to be generally a better option than preload_app. Additionally, the NewRelic agent should start in the workers.

On a side note, from the description of newrelic-infinite_tracing:

If you want distributed tracing to use tail-based sampling (Infinite Tracing), you need to add both newrelic_rpm and newrelic-infinite_tracing to your application's Gemfile.

This seems outdated. newrelic-infinite_tracing-8.14.0 depends on newrelic_rpm, and newrelic/infinite_tracing requires newrelic_rpm, so requiring it should start the agent.

@tannalynn
Contributor

Hello @chrisholmes

I've spent some time looking into this issue but I haven't been able to reproduce it.
I was using the latest version of the agent (tried both 9.0 and dev branch), rails 7, grpc 1.53.0 and puma 5.6.5

Originally I saw issues with infinite tracing when forking and thought I had reproduced the problem. After digging in further, it turned out I was hitting a different issue that seems to affect only grpc forking on Macs. The workaround you provided didn't get it working, and I then noticed that the error was not the same RuntimeError: grpc cannot be used before and after forking you reported. In that situation, my puma workers kept crashing with the error

objc[42550]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.

whenever grpc tried to connect. This happened even if the agent was not started or loaded until after forking. However, it didn't happen if the process never forked, or if grpc was turned off. Since this seemed to be specific to Macs, I used Docker from this point on.

Once I switched to using the ruby docker image, that error was no longer present and I was able to focus on infinite tracing and puma, but I was not able to reproduce the original infinite tracing issue you reported. I tried changing a couple settings in puma, including number of workers, and using/not using preload_app!, but didn't see any change or see the grpc forking error.

The agent has built-in logic to defer startup until after forking when a forking dispatcher is detected. Since infinite tracing doesn't start up until after the agent has connected, this also forces grpc to wait until after forking. One thing you could check is which dispatcher the agent is detecting, by looking at the agent logs. You should be able to find a line that includes INFO : Dispatcher: puma. If the agent isn't properly detecting the dispatcher, the log would instead say INFO : No known dispatcher detected. If that's the case, you could try forcing the dispatcher to puma by adding dispatcher: :puma to your newrelic.yml and see whether that changes the behavior you're seeing.
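
For anyone who wants to script the log check described above, here is a minimal sketch in plain Ruby. It only relies on the two message fragments quoted in this comment; the method name detected_dispatcher and its return values are hypothetical and not part of the agent's API, and the rest of the log line format is assumed to be irrelevant to the match.

```ruby
# Hypothetical helper (not part of newrelic_rpm): scan agent log text for the
# dispatcher message. Only the two fragments below come from this thread; the
# surrounding log line format is assumed not to matter.
DISPATCHER_LINE    = /INFO : Dispatcher: (\S+)/
NO_DISPATCHER_LINE = /INFO : No known dispatcher detected/

# Returns the dispatcher name as a String (e.g. "puma"),
# :none if the agent logged that no dispatcher was detected,
# or nil if neither message appears in the text.
def detected_dispatcher(log_text)
  log_text.each_line do |line|
    match = DISPATCHER_LINE.match(line)
    return match[1] if match
    return :none if NO_DISPATCHER_LINE.match?(line)
  end
  nil
end
```

It could be called as detected_dispatcher(File.read(path)), where path points at wherever your agent writes its log. A result of :none or nil would be the cue to try the dispatcher setting mentioned above.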

For me though, even when no known dispatcher was detected I was still unable to reproduce the issue, but my test app is a very simple rails 7 app. It's possible my simple rails 7 test app is missing something that is affecting the issue.

What agent version are you using when you see the error? Any other details that you think might be helpful in reproducing the issue would also be appreciated.
Thank you!

@tannalynn
Contributor

Closing this, since we have been unable to reproduce this behavior and have not heard back. Please reopen it if setting the dispatcher doesn't change anything or if you can provide additional info to help with a reproduction.

Ruby Engineering Board automation moved this from Backlog to Done/Pending Release Jun 6, 2023