Discuss logging of AbruptTerminationException on ActorSystem shutdown #27465

Open
jrudolph opened this issue Aug 7, 2019 · 1 comment
Labels: 1 - triaged, discuss, t:core, t:stream


jrudolph commented Aug 7, 2019

During ActorSystem shutdown, AbruptTerminationExceptions are injected into still-running streams and may cause noisy error logging.

Why does it matter?

When you shut down a system, you often don't care as much about errors as you do during steady state. These errors then spam test and production logs. Unfortunately, the usual loggers are not used because this logging happens after the logging infrastructure has already mostly shut down. (See also #25021)

What happens?

  • system.terminate
    • runs CoordinatedShutdown and similar
    • tells system guardian to stop which
      • runs all termination hooks
      • terminates loggers and reinstates StandardOutLogger
      • stops itself which will trigger a cascade to stop all actors
        • StreamSupervisor is going to be stopped
          • ActorGraphInterpreters are stopped and execute their postStop
            • GraphInterpreter.tryAbort is run, which injects AbruptTerminationException wherever possible into the stream and continues running the stream for a bit so that the errors are actually handled to some extent
              • Some stages log errors they receive because that's the best they can do to report to the user

Which stages noisily log errors?

Basically all system-wide singletons that keep running streams and where it is non-obvious how to report errors otherwise.

  • akka-http connection pools
  • akka-http currently open server connections

In which cases is AbruptTerminationException actually helpful?

When running long-running streams in non-singleton materializers and those materializers are shut down explicitly or implicitly, you might wonder why those streams are gone.

Potential solutions

  • Ignore AbruptTerminationExceptions (or log with decreased level) where you would otherwise log (but might we miss some cases where the exception would have been helpful?)
  • Do not log errors received on the stream by default but report them back to the user (but how to do that if there's no "requester" which you could tell)
  • Try to run materializer shutdown before loggers are shut down
  • Clean up long-running streams gracefully before finally shutting down the actor system
  • Implement something like what is suggested in #25021 (Actor System Startup/Shutdown Logging via customizable backend), so that AbruptTerminationException could be filtered out during system shutdown (configurable and/or by default)
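
To make the first option concrete, here is a minimal Java sketch of "log with decreased level". It is not Akka code: `AbruptTerminationLike` is a hypothetical stand-in for `akka.stream.AbruptTerminationException` so that the example compiles with only the JDK, and `StreamFailureLevels.levelFor` is an invented helper name. The choice of `FINE` as the decreased level is likewise only an assumption.

```java
import java.util.logging.Level;

// Hypothetical stand-in for akka.stream.AbruptTerminationException,
// used only so this sketch compiles without Akka on the classpath.
class AbruptTerminationLike extends RuntimeException {
    AbruptTerminationLike(String message) {
        super(message);
    }
}

// Hypothetical helper: picks the level at which a stage would log a
// failure that it has no other way of reporting to the user.
final class StreamFailureLevels {
    private StreamFailureLevels() {}

    static Level levelFor(Throwable failure) {
        // Abrupt termination is expected during shutdown, so log it quietly;
        // every other failure keeps the loud error level.
        return (failure instanceof AbruptTerminationLike) ? Level.FINE : Level.SEVERE;
    }
}
```

A stage would then call something like `logger.log(StreamFailureLevels.levelFor(ex), "stream failed", ex)` instead of always logging at error level. The open question from the bullet remains: this also quiets the cases where the exception would have been helpful.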
@patriknw

Sounds good to remove the noise somehow. Since it's typically because of ActorSystem shutdown, could we add some filtering mechanism to logging when the shutdown is in progress? Could be useful for other things than AbruptTerminationException.
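
As a rough illustration of such a filtering mechanism, the following Java sketch uses `java.util.logging.Filter` purely to stay self-contained. `ShutdownAwareFilter` and `markShuttingDown` are hypothetical names, and how this would be wired into Akka's own logging (including the StandardOutLogger that is reinstated during shutdown) is not shown here.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Predicate;
import java.util.logging.Filter;
import java.util.logging.LogRecord;

// Hypothetical filter: passes everything through during normal operation,
// and drops records matching a user-supplied predicate once shutdown starts.
final class ShutdownAwareFilter implements Filter {
    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);
    private final Predicate<LogRecord> suppressDuringShutdown;

    ShutdownAwareFilter(Predicate<LogRecord> suppressDuringShutdown) {
        this.suppressDuringShutdown = suppressDuringShutdown;
    }

    // Would be called when shutdown begins, for example from a termination hook.
    void markShuttingDown() {
        shuttingDown.set(true);
    }

    @Override
    public boolean isLoggable(LogRecord record) {
        // Suppress only when both conditions hold: shutdown is in progress
        // and the record matches the predicate.
        return !(shuttingDown.get() && suppressDuringShutdown.test(record));
    }
}
```

Because the predicate is supplied by the caller, the same mechanism could cover things other than AbruptTerminationException, for example `new ShutdownAwareFilter(r -> r.getThrown() instanceof SomeShutdownNoise)` with whatever exception type is considered noise.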
