[API Proposal]: When stopping an IHost instance, IHostedService instances StopAsync method is called sequentially instead of asynchronously #68036
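Tagging subscribers to this area: @dotnet/area-extensions-hosting
Issue Details
Description
As it is right now, when an IHost is asked to StopAsync for graceful shutdown, each IHostedService is requested to StopAsync in sequential order. There's no reason why each IHostedService.StopAsync task can't be executed asynchronously and awaited altogether.
Why is this a problem? Well, technically everything is functioning to spec, but in practice, it's problematic.
So normally the idea of being able to set the ShutdownTimeout is great: we can set it to, say, 30 seconds, and say we expect all IHostedService instances to take no longer than 30 seconds to shut down gracefully. The problem is that this is an additive timeout. That is to say, if you have 10 IHostedService instances, and each one of them takes 20 seconds to shut down, your graceful shutdown now takes 200 seconds. Why do that when you can just have everything shut down asynchronously in 20 seconds?
The real problem with this is that, when deploying apps to ECS in AWS, for example, when ECS triggers an instance of an app to shut down, the maximum amount of time it gives for the app to gracefully shut down is 120 seconds (see stopTimeout); after that it just kills the process.
There's no reason that I can see that the IHostedService instances should be shut down sequentially. They are async for a reason, and if shutdown order is important, then it should be managed by the consumer themselves, or at the very least, there should be an option for asynchronous IHostedService shutdown.
Configuration
Regression?
I don't think it is
Data
Analysis
Link to code causing the issue: https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.Extensions.Hosting/src/Internal/Host.cs#L126
This code could easily be modified to something similar to:
HashSet<Task> tasks = new HashSet<Task>();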
foreach (IHostedService hostedService in _hostedServices.Reverse()) /* Why does this need to be reversed anyways? */
{
    tasks.Add(hostedService.StopAsync(token));
}
// WhenAll would be great here, but it will swallow up exceptions.. so this is the next best thing
foreach (Task task in tasks)
{
    try
    {
        await task.ConfigureAwait(false);
    }
    catch (Exception ex)
    {
        exceptions.Add(ex);
    }
}
|
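The comment in the snippet above says Task.WhenAll "will swallow up exceptions". One way to read that: awaiting the combined task rethrows only the first failure, while the complete set remains available on the task's Exception property. The following is a minimal sketch of that behavior using only the base class library; WhenAllDemo, FailAsync, and CollectFailuresAsync are hypothetical names for illustration and are not part of the hosting code.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for a hosted service whose StopAsync fails.
    public static async Task FailAsync(string name)
    {
        await Task.Delay(10).ConfigureAwait(false);
        throw new InvalidOperationException(name);
    }

    // Awaits all tasks together and returns the message of every failure.
    public static async Task<string[]> CollectFailuresAsync()
    {
        Task all = Task.WhenAll(FailAsync("a"), FailAsync("b"), Task.CompletedTask);
        try
        {
            // await surfaces only the first exception of the aggregate.
            await all.ConfigureAwait(false);
        }
        catch (InvalidOperationException)
        {
            // Ignored here on purpose: the full set is read from all.Exception below.
        }

        return all.Exception is null
            ? Array.Empty<string>()
            : all.Exception.InnerExceptions.Select(e => e.Message).ToArray();
    }

    public static async Task Main()
    {
        string[] failures = await CollectFailuresAsync();
        Console.WriteLine($"{failures.Length} failures: {string.Join(", ", failures)}");
    }
}
```

Under this reading, both approaches can report every failure; the per-task loop in the snippet simply makes the collection explicit.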
cc @davidfowl - we were just discussing this a couple weeks ago. |
Agree we should do this and list it as a breaking change (as stop async will run concurrently now). I don't know if we need a flag to opt out of this behavior though. |
@jerryk414 would you be willing to send a PR for this change? |
Any knob to keep the old "started first — stopped last" would be appreciated. We use it to stop stateful in-process services which have cross dependencies. |
I can do that. I'll get an initial implementation done this week to see what thoughts are once the changes are visible. |
@davidfowl @quixoticaxis There are no tests yet - I want to get input if this is the route you would prefer prior to spending time on that. |
For our product it would be easy to switch back to, looks great. |
@davidfowl @quixoticaxis |
I've updated the API proposal above and marked this proposal as ready for review. We will need to get the API proposal approved before we can make this change. |
Thank you @eerhardt! |
namespace Microsoft.Extensions.Hosting;

public partial class HostOptions
{
    public bool ServicesStartConcurrently { get; set; } // = false; Maybe change in 8?
    public bool ServicesStopConcurrently { get; set; } // = false; Maybe change in 8?
} |
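For readers wondering how the reviewed shape would be consumed, here is a hedged sketch. It assumes the two properties ship on HostOptions exactly as written above and that the app references a Microsoft.Extensions.Hosting package version that includes them; the 30-second timeout is an arbitrary illustrative value.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

HostApplicationBuilder builder = Host.CreateApplicationBuilder(args);

builder.Services.Configure<HostOptions>(options =>
{
    // Assumption: property name taken from the API shape in this thread.
    options.ServicesStopConcurrently = true;

    // With concurrent stop, this budget covers all services together
    // rather than accumulating across them one after another.
    options.ShutdownTimeout = TimeSpan.FromSeconds(30);
});

using IHost host = builder.Build();
await host.RunAsync();
```

Leaving both properties at false would keep the ordered behavior that stateful services with cross dependencies rely on.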
It would probably not only break dependencies, but also mess up the exception handling around |
@terrajobst I like the idea of just making it a bool, since there really should only ever be two behaviors. I can work on getting this change together and updating the initial PR changes to account for this.
Just to answer this question more specifically... when hosting apps in AWS ECS, you have an absolute maximum shutdown time of 120 seconds - even if you set that higher in the host itself via
From the description in the issue:
@quixoticaxis I can update it to behave the same as
Basically copy what is being done here: |
@jerryk414 I personally am not certain Starting seems to be fundamentally different from stopping, IMHO. |
I would disagree with that for two reasons.
Another fun point, assuming StartAsync is awaited using |
No, I believe I don't, but, sorry, I'll post the code snippet tomorrow. Your second point seems valid to me though. |
Yeah maybe I'm just unaware, but I'm not aware of any way to guarantee stopping on a first error without running each startup one by one. I don't have anything in front of me right now, but is there even a cancelation token that's passed into each start async? Even if there is, I doubt many people check it. And assuming they don't check it, you could end up in a situation where either you're killing threads with these tasks, which could lead to corrupted states, or you could end up just swallowing other real exceptions, which is also bad. |
@jerryk414, if someone starts the services concurrently, I suppose it's only fair to assume that their starting logic may run for a prolonged period of time, so there's a fair chance that they respect cancellation requests (at the very least, we do in our stateful services that pre-load data in
Considering your point about using
var stopper = CancellationTokenSource.CreateLinkedTokenSource(originalToken);
await Task.WhenAll(
    services
        .Select(
            service => service.StartAsync(stopper.Token))
        .Select(
            task => task.ContinueWith(original =>
            {
                if (original.IsFaulted)
                {
                    try
                    {
                        // Ask the services that are still starting to stop.
                        stopper.Cancel();
                    }
                    catch (AggregateException)
                    {
                        // probably report cancellation callback exceptions
                    }
                    // Rethrow the original failure so that WhenAll observes it.
                    var dispatcher = ExceptionDispatchInfo.Capture(
                        original.Exception!);
                    dispatcher.Throw();
                }
            })));
Anyway, the longer I think about it, the more I'm inclined to assume that concurrent start is much more tricky than concurrent stop. |
Yeah, I'm just thinking it doesn't have to be that complicated. With it being a new feature, it could just be accepted as being the way it is. If you have a problem with it and have interdependencies that could cause you to need to stop a start async method if one fails first or require order, you could revert back to the old way, or add your own inter-service communication. I personally would argue the ROI isn't there for the extra complexity of ensuring an immediate return on first exceptions. |
@jerryk414 my point was that I doubt that concurrent start should be implemented in the first place: too many different usage scenarios. |
@jerryk414 are you still planning to work on it? |
I am - I had a PR up but honestly, it's funny. I'm doing this on my own time - I've got kids, family, and a full-time job, and it's the holidays, so I just didn't get around to addressing all the comments in the PR, and my personal computer took a crap on me, so I've got to get this all set up on another machine. It certainly would be nice if someone else who is paid to do this took ownership and addressed the very minor issues with the PR below. If nobody does, I do have plans to get back around to it, but not until the new year. |
Sure, understood, no rush or pressure; this is open source, and it's up to you whether and when you work on it. Though we would like to assign the issue if somebody has already started working on it, so that other contributors do not start working on it simultaneously. If nobody is working on it, we would add
Thank you, for now, nobody has planned to work on it, sounds like we can add |
Hey @buyaa-n I want to give it a try, you can assign it to me |
Background and motivation
As it is right now, when an IHost is asked to StopAsync for graceful shutdown, each IHostedService is requested to StopAsync in sequential order. There's no reason why each IHostedService.StopAsync task can't be executed asynchronously and awaited altogether.
Why is this a problem? Well, technically everything is functioning to spec, but in practice, it's problematic.
So normally the idea of being able to set the ShutdownTimeout is great: we can set it to, say, 30 seconds, and say we expect all IHostedService instances to take no longer than 30 seconds to shut down gracefully. The problem is that this is an additive timeout. That is to say, if you have 10 IHostedService instances, and each one of them takes 20 seconds to shut down, your graceful shutdown now takes 200 seconds. Why do that when you can just have everything shut down asynchronously in 20 seconds?
One example of the problem with this is that, when deploying apps to ECS in AWS, when ECS triggers an instance of an app to shut down (which occurs under normal circumstances such as when scaling down), the maximum amount of time it gives for the app to gracefully shut down is 120 seconds (see stopTimeout); after that it just kills the process.
There's no reason that I can see that the IHostedService instances should be shut down sequentially. They are async for a reason, and if shutdown order is important, then it should be managed by the consumer themselves, or at the very least, there should be an option for asynchronous IHostedService shutdown.
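The arithmetic in the description (10 services at 20 seconds each giving 200 seconds sequentially versus roughly 20 seconds concurrently) can be checked with a small, self-contained sketch. It does not use the hosting library: StopAsync here is a hypothetical stand-in that simply delays, and milliseconds stand in for seconds so that it runs quickly.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ShutdownTimingDemo
{
    // Hypothetical stand-in for IHostedService.StopAsync taking a fixed time.
    public static Task StopAsync(TimeSpan duration) => Task.Delay(duration);

    // Awaits each stop before starting the next: total is roughly count * each.
    public static async Task<TimeSpan> StopSequentiallyAsync(int count, TimeSpan each)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            await StopAsync(each).ConfigureAwait(false);
        }
        return watch.Elapsed;
    }

    // Starts every stop first, then awaits them together: total is roughly each.
    public static async Task<TimeSpan> StopConcurrentlyAsync(int count, TimeSpan each)
    {
        Stopwatch watch = Stopwatch.StartNew();
        List<Task> tasks = Enumerable.Range(0, count)
            .Select(_ => StopAsync(each))
            .ToList();
        await Task.WhenAll(tasks).ConfigureAwait(false);
        return watch.Elapsed;
    }

    public static async Task Main()
    {
        TimeSpan each = TimeSpan.FromMilliseconds(20); // 20 ms stands in for 20 s
        TimeSpan sequential = await StopSequentiallyAsync(10, each);
        TimeSpan concurrent = await StopConcurrentlyAsync(10, each);
        Console.WriteLine($"sequential: {sequential.TotalMilliseconds:F0} ms");
        Console.WriteLine($"concurrent: {concurrent.TotalMilliseconds:F0} ms");
    }
}
```

Exact timings depend on the machine and timer resolution, but the sequential total should scale with the number of services while the concurrent total should stay close to the slowest single stop.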
Regression?
This is not a regression
Analysis
Link to code causing the issue:
https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.Extensions.Hosting/src/Internal/Host.cs#L126
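This code could easily be modified to something similar to:
HashSet<Task> tasks = new HashSet<Task>();
foreach (IHostedService hostedService in _hostedServices.Reverse()) /* Why does this need to be reversed anyways? */
{
    tasks.Add(hostedService.StopAsync(token));
}
// WhenAll would be great here, but it will swallow up exceptions.. so this is the next best thing
foreach (Task task in tasks)
{
    try
    {
        await task.ConfigureAwait(false);
    }
    catch (Exception ex)
    {
        exceptions.Add(ex);
    }
}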
API Proposal
API Usage
Risks
The default behavior will technically be changing, which would be a breaking change. However, it may be a "safe" breaking change in the sense that it is highly unlikely that anyone has built an app which depends on this previously internal behavior, and IF they did, there is a way to revert to the old behavior.