Generic Host is not stopped when BackgroundService crashes #43637
Comments
Tagging subscribers to this area: @eerhardt, @maryamariyan |
I don't think the lifetime of Generic Host depends on what its background services do and therefore what you see is an expected outcome. |
We are using the following base class for background services that should terminate the application on error:
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

public abstract class TerminatingBackgroundService : BackgroundService
{
    protected IServiceProvider ApplicationServices { get; }
    protected IHostApplicationLifetime ApplicationLifetime { get; }
    protected ILogger Logger { get; }

    protected TerminatingBackgroundService(IServiceProvider applicationServices)
    {
        ApplicationServices = applicationServices;
        ApplicationLifetime = applicationServices.GetRequiredService<IHostApplicationLifetime>();
        Logger = ApplicationServices.GetRequiredService<ILoggerFactory>().CreateLogger(GetType());
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        try
        {
            await ExecuteCoreAsync(stoppingToken);
            Logger.LogInformation("Execute has finished. IsCancellationRequested: {IsCancellationRequested}", stoppingToken.IsCancellationRequested);
        }
        catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested)
        {
            // We're shutting down already, so we don't have to do anything else.
            Logger.LogInformation("OperationCanceledException on shutdown");
        }
        catch (Exception exception)
        {
            Logger.LogCritical(exception, exception.Message);
            // We need StopApplication() so that it properly flushes and disposes the logging system.
            // However, it is a graceful shutdown that exits with code 0.
            Environment.ExitCode = 1;
            if (!stoppingToken.IsCancellationRequested)
            {
                ApplicationLifetime.StopApplication();
            }
        }
    }

    protected abstract Task ExecuteCoreAsync(CancellationToken stoppingToken);
} |
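For illustration only, a service built on the base class above might look like the sketch below. `QueueWorker`, its log message, and the five-second delay are hypothetical; only `TerminatingBackgroundService` and its members come from the comment above, and the snippet assumes that class is in the same project.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;

// Hypothetical worker; any exception escaping ExecuteCoreAsync is handled by
// the base class, which logs it, sets exit code 1 and stops the host.
public sealed class QueueWorker : TerminatingBackgroundService
{
    public QueueWorker(IServiceProvider applicationServices)
        : base(applicationServices)
    {
    }

    protected override async Task ExecuteCoreAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            Logger.LogInformation("Processing one batch");

            // Task.Delay throws OperationCanceledException on shutdown,
            // which the base class treats as a normal stop.
            await Task.Delay(TimeSpan.FromSeconds(5), stoppingToken);
        }
    }
}
```

It would be registered like any other hosted service, for example with `services.AddHostedService<QueueWorker>()`.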
How can it be an "expected outcome" that the host process continues to run when the background service doing the actual work has crashed without the host process even knowing? This is a very dangerous situation where all monitoring systems that monitor running Windows services report that everything is OK while no work is being performed. As the host is unaware that the worker service has stopped it also cannot take any corrective actions like restarting the worker and/or reporting the error... |
Related to #36388 (comment) and #36388 (comment) |
In .NET 6 we have fixed #36017. Now when an exception is thrown from a BackgroundService, it will get logged as an error. |
Just logging the exception is not enough. This will still have the same effect: the process keeps running but doesn't perform any work anymore, causing all monitoring systems to think that everything is ok, unless you have some alerting system on your logs. I agree with @GeeWee in #36388 (comment) that you should be able to specify (per BackgroundService) what the host should do when that BackgroundService stops, as some background services are more important than others. My options would be: ignore, restart and stop the host with a (specifiable) error code. |
As per @davidfowl's comment in #36388 (comment) there could be a new opt-in option on HostOptions that reacts to failures on a background service. It might not always be desirable for a failure/crash in a background service to stop the application altogether, and therefore logging the failure is the default behavior. |
Let's do this for 6.0. Add the option to stop the host on background service exception.
namespace Microsoft.Extensions.Hosting
{
    public class HostOptions
    {
        public bool StopOnBackgroundServiceException { get; set; }
    }
} |
I think this is a pretty good API suggestion. I would like to suggest that it either
|
That is most definitely a breaking change, since if a previously compiled app is running against a new framework version, there will be no matching signature for the emitted call. |
Good point - I hadn't thought about that. |
We can argue about defaults but currently as proposed the default is false and there's no breaking change |
namespace Microsoft.Extensions.Hosting
{
    public enum BackgroundServiceExceptionBehavior
    {
        Ignore,
        StopHost
    }

    public class HostOptions
    {
        public BackgroundServiceExceptionBehavior BackgroundServiceExceptionBehavior { get; set; }
    }
} |
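As a rough sketch of how an application might opt in, assuming the enum-based shape proposed above and assuming `HostOptions` is set through the usual options pattern: `FailingWorker` is a hypothetical service that exists only to trigger the behavior.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

public static class Program
{
    public static async Task Main(string[] args)
    {
        using IHost host = Host.CreateDefaultBuilder(args)
            .ConfigureServices(services =>
            {
                // Assumption: the property and enum names match the proposal above.
                services.Configure<HostOptions>(options =>
                {
                    options.BackgroundServiceExceptionBehavior =
                        BackgroundServiceExceptionBehavior.StopHost;
                });
                services.AddHostedService<FailingWorker>();
            })
            .Build();

        // With StopHost, RunAsync is expected to return once the worker faults.
        await host.RunAsync();
    }
}

// Hypothetical service that fails shortly after startup.
public sealed class FailingWorker : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        await Task.Delay(TimeSpan.FromSeconds(1), stoppingToken);
        throw new InvalidOperationException("Simulated failure");
    }
}
```

Setting the value to `Ignore` instead would keep the host running after the failure, which is the behavior this issue was opened about.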
I like it |
There are some different behaviours to consider here:
|
No, we can't track work done by IHostedService.
Yep, I think this makes sense though it might be confusing as it will be overridden when you set it. |
That's why I propose to make the setting nullable. Then, |
And then |
I hope that when setting BackgroundServiceExceptionBehavior to StopHost, the host still has a way to catch the exception, figure out what background service has failed and decide whether to stop or not. Or even has the option to restart the failed service? |
I'd imagine if you wanted to do that, wouldn't the "natural" course of action be to do a try/catch block inside the |
I just experienced this problem and I vote for changing the default to stop the application on unhandled exceptions. As a .NET developer I don't expect exceptions to be swallowed by the framework. I can always do that myself if that is the desired behavior. Would it be an alternative that instead of the Since either solution won't change the different behavior of unhandled exceptions in |
I think one of the problems here is the fact that arbitrary libraries can add background services that run as part of your process. If those misbehave should it crash the application? It's one thing to crash if it's code you control but it's not clear to me if I'd want arbitrary code to crash my app. It's the same reason ASP.NET catches unhandled exceptions that happen on the request. Otherwise it could result in a process crash. If you have an orchestrator or watchdog monitoring your process and restarting it, crashing might be better (you can "alert" based on number of restarts). But if not it might be annoying figuring out why something failed (especially if the logs aren't written anywhere persistent) |
@davidfowl You certainly got an important concern there. But the unhandled exception on a request in ASP.NET isn't just swallowed, it shows up in the developer exception page or as http status code 500. The main problem I have is that (some) unhandled exceptions are swallowed. The logging added in .NET 6 certainly will help, but as others have pointed out that might not be enough in a production environment. As I mentioned, the fact that the sync part of This confusing behavior of |
Well, when doing async you always have to; there is no way around it, and that's not just for this case. When you call an async method you can get an exception either at the method invocation (i.e. before the first await in the method) or while awaiting the returned Task. Yes, when your code is just
var task = MethodCallAsync(); // Might get a synchronous exception here
// Do some work while async does its thing
var result = await task; // Might get an asynchronous exception here
Also note that you might get both normal |
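The two failure points described in the comment above can be reproduced with a small console sketch. All names here (`ParseLaterAsync`, `ParseCoreAsync`) are hypothetical. One detail worth noting: in C#, an exception is thrown at the call site only when the Task-returning method is not itself marked `async`; a method with the `async` modifier stores the exception in the returned Task even if it is thrown before the first await.

```csharp
using System;
using System.Threading.Tasks;

public static class Program
{
    // No async modifier: the guard clause throws before any Task exists.
    private static Task<int> ParseLaterAsync(string text)
    {
        if (text is null)
        {
            throw new ArgumentNullException(nameof(text));
        }

        return ParseCoreAsync(text);
    }

    // async modifier: any exception is captured in the returned Task.
    private static async Task<int> ParseCoreAsync(string text)
    {
        await Task.Yield();
        return int.Parse(text); // FormatException surfaces when the Task is awaited
    }

    public static async Task Main()
    {
        try
        {
            _ = ParseLaterAsync(null!);
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("thrown at the call site");
        }

        // The call itself succeeds; the failure is deferred to the await.
        Task<int> pending = ParseLaterAsync("not a number");
        try
        {
            _ = await pending;
        }
        catch (FormatException)
        {
            Console.WriteLine("thrown at the await");
        }
    }
}
```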
Yes, of course. But when implementing |
Is this concern specific to background services? Doesn't every developer using any library in any type of application deal with this risk every day? To me, an unhandled exception in production indicates a bug, either in my code or in some library code. I'd like to know about that as soon as possible so I can fix it (or report it to the library author) and possibly catch a specific exception from a library while waiting for a fix. |
The error page only shows up if that middleware is enabled. If it isn't, the exception is caught and logged by the server (much like it is in .NET 6 with background services).
I agree with this, the behavior is confusing and I wish we had done a Task.Run from the start. There doesn't need to be an InitAsync because there's already a StartAsync. We can change it now but it'll be a breaking change.
I think that's a false equivalence. You don't usually worry about libraries throwing exceptions on arbitrary background threads crashing the process. Throwing exceptions is one thing but this specific abstraction is made for running background services and without everyone explicitly handling errors anyone can crash the entire application. I think having logs changes the equation and makes things easier to debug. That said, we agreed to change the default so we'll do that in .NET 6 and see if it's a net positive for everyone |
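Until the base class changes, the Task.Run idea mentioned above can be applied inside an individual service. This is a sketch of one possible workaround, not the framework's implementation; `OffloadedWorker` and `RunAsync` are hypothetical names, and the `Thread.Sleep` merely stands in for slow synchronous setup.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;

public sealed class OffloadedWorker : BackgroundService
{
    protected override Task ExecuteAsync(CancellationToken stoppingToken)
    {
        // Task.Run returns immediately, so host startup is not held up by the
        // synchronous prefix of RunAsync, and an exception thrown there
        // faults the returned task instead of escaping from StartAsync.
        return Task.Run(() => RunAsync(stoppingToken), stoppingToken);
    }

    private async Task RunAsync(CancellationToken stoppingToken)
    {
        Thread.Sleep(TimeSpan.FromSeconds(2)); // placeholder for blocking setup work

        while (!stoppingToken.IsCancellationRequested)
        {
            await Task.Delay(TimeSpan.FromSeconds(1), stoppingToken);
        }
    }
}
```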
When I commented out
I don't see the problem with this; it's usually a good thing to fail fast on unknown errors instead of continuing to run and hoping that whatever error occurred won't, for example, corrupt any data. The BCL is taking on a big responsibility here by assuming that the rest of the application won't be affected.
(From Best practices for exceptions)
👍😃 |
That's because the server is doing you a solid and preventing your application from crashing.
Easy to say that when logging is configured. I'm sure people would be annoyed if their application died without a trace. No logs, no crash dump.
This doesn't apply. |
I'm certainly annoyed when my background services die without a trace now in .NET 5. 😅 I think it would be preferable that an unhandled exception in the background service would be propagated to the application, but I guess that's not possible. Then logging the exception and shutting down the application as suggested is acceptable. |
Right, it was impossible because nothing was tracking the background task itself, and that's why it had to be done by the instance and not the host. Raising it to the application would mean some sort of OnBackgroundServiceUnhandledException event? |
Such an event would not help much since it requires that the developer adds a handler. |
Unfortunately there's no practical way to make the application aware other than to crash |
What do you mean by "crash"? |
The current plan is to do that yes, which will in most cases unblock main and will quit the program. |
I'm a bit hard pressed to think of a specific case at the moment, but one of the comments above got me thinking... Would it be possible to have a default value on the I'm sure the current API is fine for now and it doesn't seem like it would preclude such an enhancement in the future. Just throwing it out there.... |
I see this is |
@IEvangelist - I assigned it to you. Thanks for taking this. I'm looking forward to seeing it being completed. |
Updated by @maryamariyan:
Description:
We want to add a new opt-in option on HostOptions that reacts to failures on a background service.
It might not always be desirable for a failure/crash in a background service to stop the application altogether, and therefore logging the failure is the default behavior.
API Proposal:
API below would add the option to stop the host on background service exception.
Original Description
Description
According to the documentation of the BackgroundService.ExecuteAsync method that method should return a Task that represents the lifetime of the long running operation(s) being performed.
For me that implies that this Task is monitored by the generic host, so I would expect that when the BackgroundService.ExecuteAsync's Task instance is completed (successfully or due to an exception) the host itself would also stop. This is not the case. As a result, a Windows service where the BackgroundService.ExecuteAsync method has crashed or stopped will continue to appear running in the Services Management Console, but will no longer be performing any work, causing all monitoring systems to keep reporting that everything is ok.
Configuration
Sample application
The following sample demonstrates the problem. You can either run it on the console or install it as windows service and start it. The program will continue to run after the background service has crashed.