Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Agent shutdown adds 3 seconds to short scripts #1313

Closed
jdelStrother opened this issue Aug 10, 2022 · 15 comments
Closed

Agent shutdown adds 3 seconds to short scripts #1313

jdelStrother opened this issue Aug 10, 2022 · 15 comments
Labels
community To tag external issues and PRs submitted by the community enhancement Enhancement to existing features feature request To tag feature request after Hero Triage for PM to disposition hacktoberfest help wanted

Comments

@jdelStrother
Copy link
Contributor

jdelStrother commented Aug 10, 2022

Description

When running a short-lived process, the newrelic agent shutdown often adds 3 seconds to the total runtime.

Without newrelic:

$ time bin/rails runner "puts 'hi'"
hi

real	0m1.591s
user	0m0.907s
sys	0m0.399s

with newrelic:

$ time bin/rails runner "puts 'hi'"
hi

real	0m5.338s
user	0m0.991s
sys	0m0.558s

Expected Behavior

NewRelic doesn't stall for 3 seconds while trying to shut down the agent thread.

The problem occurs here -

def stop_event_loop
@event_loop.stop if @event_loop
# Wait the end of the event loop thread.
if @worker_thread
unless @worker_thread.join(3)
::NewRelic::Agent.logger.debug("Event loop thread did not stop within 3 seconds")
end
end
end
- if @event_loop is still nil because we're still waiting for the connect in deferred_work to complete. (On my machine, connect takes around 1.1 seconds)

Not sure of a clean fix that completely avoids race conditions... maybe stop_event_loop needs to loop a few times?

      def stop_event_loop
        30.times do
          @event_loop.stop if @event_loop
          # Wait the end of the event loop thread.
          if @worker_thread
            if @worker_thread.join(0.1)
              return
            end
          end
        end

        ::NewRelic::Agent.logger.debug "Event loop thread did not stop within 3 seconds"
      end

pretty gross though, there must be a better way.

Impact

Low, those 3 seconds aren't going to kill me. It's just irritating when I run a script from the terminal, see it finish, but then have to sit there waiting like an idiot for newrelic to fail to kill its worker thread.

For Maintainers Only or Hero Triaging this bug

Suggested Priority (P1,P2,P3,P4,P5):
Suggested T-Shirt size (S, M, L, XL, Unknown):

@github-actions github-actions bot added the community To tag external issues and PRs submitted by the community label Aug 10, 2022
@kaylareopelle
Copy link
Contributor

Hello, @jdelStrother! Great to hear from you!

We're returning from a company-wide vacation and will triage this today.

Thanks for your patience and your submission!

@kaylareopelle kaylareopelle added feature request To tag feature request after Hero Triage for PM to disposition enhancement Enhancement to existing features and removed bug labels Aug 17, 2022
@kaylareopelle
Copy link
Contributor

kaylareopelle commented Aug 17, 2022

Hi @jdelStrother! Thanks for pointing this out to us.

I've re-categorize this as an enhancement/feature request since the three second wait time was deliberately programmed into the stop_event_loop method to give Resque workers a chance to finish.

It doesn't seem like it's necessary in all cases and is considerably slowing down the shutdown process. I'll see if we can skip the loop if Resque instrumentation isn't installed.

@angelatan2 angelatan2 moved this from Triage to Feature Request in Ruby Engineering Board Aug 23, 2022
@angelatan2 angelatan2 removed the feature request To tag feature request after Hero Triage for PM to disposition label Sep 12, 2022
@workato-integration
Copy link

@angelatan2 angelatan2 added the feature request To tag feature request after Hero Triage for PM to disposition label Sep 12, 2022
@jdelStrother
Copy link
Contributor Author

Impact

Low, those 3 seconds aren't going to kill me.

Actually, I'm starting to succumb to death-by-a-thousand-cuts, thinking of this issue every time I run a simple command like rails routes.

@tannalynn
Copy link
Contributor

Actually, I'm starting to succumb to death-by-a-thousand-cuts, thinking of this issue every time I run a simple command like rails routes.

Interesting, the agent shouldn't be running during a command like rails routes by default. It's included as one of the defaults for tasks that the agent should be ignoring. autostart.denylisted_rake_tasks Are you setting that config option? If you are, overwriting the default could explain that if the new value doesn't include the task. However, if you are not using that config and it should just be using the defaults, it's possible there is a bug in there somewhere with how we are recognizing rails tasks.

Let me know if it is still using the default value and should be ignoring that task. If that is the case, I can look into what might be going on.

@jdelStrother
Copy link
Contributor Author

Ah - rails routes isn't running a Rake task, it's running a Thor command in Rails 7 : https://github.com/rails/rails/blob/main/railties/lib/rails/commands/routes/routes_command.rb - so isn't caught by the denylisted_rake_tasks config.

How about rails runner - is there a good way of disabling the agent for that? We never use it for anything we want instrumenting by NewRelic.

@tannalynn
Copy link
Contributor

Ah - rails routes isn't running a Rake task, it's running a Thor command in Rails 7 : https://github.com/rails/rails/blob/main/railties/lib/rails/commands/routes/routes_command.rb - so isn't caught by the denylisted_rake_tasks config.

How about rails runner - is there a good way of disabling the agent for that? We never use it for anything we want instrumenting by NewRelic.

Oh interesting, that makes sense! That change flew under our radar, we haven't made updates to adjust to that. I'll have to add a ticket to our backlog for us to look into making ignoring those tasks automatically work with Thor as well.

We do have another config option to control when the agent starts, autostart.denylisted_constants. The default for this value is just 'Rails::Console'. I tried adding a few other constants I saw for rails commands, and saw that you can configure the agent to not run during a rails command if that class is added to this config option. They aren't defined when running other commands, like rails s. I tried testing it with several commands in rails 7, and this seemed to work consistently.

# newrelic.yml
autostart.denylisted_constants: 'Rails::Console,Rails::Command::RunnerCommand,Rails::Command::RoutesCommand'

Hopefully that helps! Let me know if you run into any issues with that, or if something isn't working as expected with this.

@jdelStrother
Copy link
Contributor Author

Thanks - that looks promising, I'll give it a go

@fallwith
Copy link
Contributor

Hi @jdelStrother and anyone else who comes across this issue. Regarding not wanting to have the agent start in certain CLI/TUI contexts such as when running rails routes, the updated configuration approach that @tannalynn listed above has now been set as an agent default via #2239

This change will go out in a future agent release. In the meantime, Rails 7 users can update their newrelic.yml config like so to ignore CLI/TUI contexts:

# newrelic.yml
-  # autostart.denylisted_constants: Rails::Console
+  autostart.denylisted_constants: Rails::Command::ConsoleCommand,Rails::Command::CredentialsCommand,Rails::Command::Db::System::ChangeCommand,Rails::Command::DbConsoleCommand,Rails::Command::DestroyCommand,Rails::Command::DevCommand,Rails::Command::EncryptedCommand,Rails::Command::GenerateCommand,Rails::Command::InitializersCommand,Rails::Command::NotesCommand,Rails::Command::RoutesCommand,Rails::Command::SecretsCommand,Rails::Console,Rails::DBConsole

@workato-integration
Copy link

Work has been completed on this issue.

@jdelStrother
Copy link
Contributor Author

jdelStrother commented Apr 24, 2024

FWIW I keep running into commands that still load the agent and so make them a lot slower than they should be. (eg bin/rails db:prepare, or bin/rails runner 123)

For now I've just updated my bin/rails script so that it prevents NewRelic from starting, unless it's specifically running bin/rails server.

#!/usr/bin/env ruby
# frozen_string_literal: true

# Prevent NewRelic from starting when using one-off short commands like `bin/rails db:prepare`.
# (NewRelic adds 3 seconds to short scripts, which is annoying for stuff that might otherwise take 0.5 seconds
# https://github.com/newrelic/newrelic-ruby-agent/issues/1313)
if !["server", "s"].include?(ARGV[0])
  ENV["NEW_RELIC_AGENT_ENABLED"] ||= "false"
end

APP_PATH = File.expand_path("../config/application", __dir__)
require_relative "../config/boot"
require "rails/commands"

but it would be nice if StartWorkerThread didn't have this slow shutdown behaviour.

@fallwith
Copy link
Contributor

Thanks for letting us know about the undesired behavior, @jdelStrother, and for the workaround too!

Given that something like rails runner is extremely common and not at all unique to your setup, I consider this to be a bug and I have opened #2596 to track it.

@fallwith
Copy link
Contributor

fallwith commented May 3, 2024

Hi @jdelStrother. PR #2623 should take care of rails runner and rails db commands. If there are any others you're aware of that involve the agent but shouldn't, just let us know.

@kaylareopelle
Copy link
Contributor

Hi @jdelStrother, we had a bug report for 9.10.0. Including Rails::Command::RakeCommand into the autostart.denylisted_constants default prevented the agent from automatically starting Solid Queue (#2677)

We decided to remove Rails::Command::RakeCommand from the default list as part of 9.10.2.

We're going to look more closely into how we can prevent the agent from starting with certain rake commands executed by Rails. The issue to track this work is: #2692

If you were relying on the default, we recommend you manually update autostart.denylisted_constants to include Rails::Command::RakeCommand. You can copy this string and add it to your newrelic.yml or as the value to an environment variable.

"Rails::Command::ConsoleCommand,Rails::Command::CredentialsCommand,Rails::Command::Db::System::ChangeCommand,Rails::Command::DbConsoleCommand,Rails::Command::DestroyCommand,Rails::Command::DevCommand,Rails::Command::EncryptedCommand,Rails::Command::GenerateCommand,Rails::Command::InitializersCommand,Rails::Command::NotesCommand,Rails::Command::RakeCommand,Rails::Command::RoutesCommand,Rails::Command::RunnerCommand,Rails::Command::SecretsCommand,Rails::Console,Rails::DBConsole"

@jdelStrother
Copy link
Contributor Author

No problem, I'll continue disabling newrelic in our bin/rails entrypoint for now.

(Just to be clear, I have no issue with newrelic autostarting in any of our processes - even the short lived ones. If the 3 second pause on shutdown could be eliminated I'd happily just let newrelic start anywhere.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community To tag external issues and PRs submitted by the community enhancement Enhancement to existing features feature request To tag feature request after Hero Triage for PM to disposition hacktoberfest help wanted
Projects
None yet
Development

No branches or pull requests

5 participants