Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hang when calling start/stop on IHost #1363

Closed
glennc opened this issue Apr 7, 2019 · 36 comments
Closed

Hang when calling start/stop on IHost #1363

glennc opened this issue Apr 7, 2019 · 36 comments
Milestone

Comments

@glennc
Copy link

glennc commented Apr 7, 2019

The program below appears to hang after logging the application is shutting down message and I don't understand why. The OnProcessExit event appears to be waiting here https://github.com/aspnet/Extensions/blob/master/src/Hosting/Hosting/src/Internal/ConsoleLifetime.cs#L79 but I don't know why that would be or if that is a problem or just a red herring.

    public class Program
    {
        public static async Task Main(string[] args)
        {
            var host = CreateHostBuilder(args).Build();
            await host.StartAsync();
            await host.StopAsync();
        }

        public static IHostBuilder CreateHostBuilder(string[] args) =>
            Host.CreateDefaultBuilder(args)
                .ConfigureServices(services => services.AddHostedService<Worker>());
    }

@Tratcher @davidfowl

@glennc
Copy link
Author

glennc commented Apr 7, 2019

The StopAsync isn't actually required to reproduce, if you let the process exit the same thing happens.

@Tratcher
Copy link
Member

Tratcher commented Apr 7, 2019

@davidfowl
Copy link
Member

That might be problematic.

@Tratcher
Copy link
Member

Tratcher commented Apr 7, 2019

Yeah. We did it that way to try to let app cleanup run on app exit events.

@glennc
Copy link
Author

glennc commented Apr 8, 2019

OK, that's surprising. Can it not be tied to something like applicationLifetime.Stopped that gets raised after the host has finished stopping all the hosted services? I want to make sure I understand the tradeoff.

This means if you're not calling Run then you want to put the host in a using, which I don't think we've ever told anyone they'd have to do.

@xqrzd
Copy link

xqrzd commented Apr 9, 2019

It was done there to allow any hosted services / dependencies to be disposed before the process exits. Prior to that change on Linux sigterm would effectively kill the process. If the event was tied to applicationLifetime.Stopped it would allow any hosted services to stop, but not be disposed. I think a proper fix requires a .net runtime change, something like DontExitProcessUntilMainMethodHasReturned or something like that.

@Tratcher
Copy link
Member

Tratcher commented Jun 4, 2019

Note to self: Could address this with a finalizer?

@davidfowl
Copy link
Member

I'm not sure that would work. The host is rooted by the DI container which is rooted by the callback, which is stored in a static.

@tmds
Copy link
Member

tmds commented Jun 6, 2019

        private void OnProcessExit(object sender, EventArgs e)
        {
            ApplicationLifetime.StopApplication();
            _shutdownBlock.WaitOne();
        public void Dispose()
        {
            _shutdownBlock.Set();

StopApplication isn't leading to a Dispose which causes the block.

Can it not be tied to something like applicationLifetime.Stopped that gets raised after the host has finished stopping all the hosted services?

That's one option.
Another option is to call something more than StopApplication which causes the host to also be Disposed.

@davidfowl
Copy link
Member

I like @xqrzd 's suggestion

@tmds
Copy link
Member

tmds commented Jun 7, 2019

I like @xqrzd 's suggestion

All options are better than hanging.

To work that suggestion a bit more:
Now on SIGTERM the runtime will end the process and hence the AppDomain.CurrentDomain.ProcessExit is emitted. By blocking the event, the process exit is delayed.
Instead, you want to get an event on SIGTERM (other than ProcessExit, e.g. ProcessExitRequest) and the runtime to do nothing. In your event handler, you'll do something that causes Main to eventually return.

@Tratcher
Copy link
Member

Tratcher commented Jun 7, 2019

Yeah, that would align a lot better with how we handle CTL+C.
https://github.com/aspnet/Extensions/blob/0f0782d9bb45b2f86b882530224f792e48d971b7/src/Hosting/Hosting/src/Internal/ConsoleLifetime.cs#L85-L89
Who wants to chase this into corefx?

@davidmatson
Copy link

I ran across kind of the opposite problem in the same spot - some cleanup code, like ILoggerProvider.Dispose, doesn't get to run with the current Dispose-based shutdown block.

At least for Windows containers, there isn't any signal that comes before the shutdown signal, and that signal handler must block in order to prevent the process from exiting immediately.

If .NET provided some kind of API like AppDomain.ProgramMainReturned, that would give a way to know when all normal shutdown code has run (including things outside the using block and any exit code returned from Program.Main). Apart from that, it's pretty hard to get full normal shutdown coordination to happen.

The linked issue discusses some other options a bit. I'm currently using a workaround where a custom Shutdown event handler waits on an manual reset event that is somehow set by the very line of Program.Main (with exit codes set first via Environment.ExitCode since returning from Program.Main doesn't work in that case). Having to wire up a signal between the host and Program.Main is ugly (it can be abstracted from a manual reset event to a simple action, but it still has to be wired between Program.Main and the host lifetime somehow). Something along those lines is the best workaround I've been able to find absent an AppDomain.ProgramMainReturned-style event.

@analogrelay
Copy link

analogrelay commented Jul 16, 2019

We know there are some corefx APIs needed to make this super clean and we'll look in to that. The idea that shows the most promise is one that lets us actually just detect SIGTERM directly and tell corefx to cancel the default logic, handling it as a graceful shutdown instead. However, that's not really feasible for 3.0 and it seems user-unfriendly to hang the application with no way of knowing why it's hanging.

Until then, as a tactical fix to make this easier to understand, we're going to propose this:

  1. Wait a "reasonable time" (say, 1 minute?) for the _shutdownBlock to be unblocked in OnProcessExit
  2. If the timeout elapses without unblocking (.WaitOne() returns false), immediately log a warning Waiting for the host to be disposed. Ensure all 'IHost' instances are wrapped in 'using' blocks. (or similar, we can wordsmith in the PR). The logger could throw ObjectDisposedException if we're part-way through disposing though so we may need to be defensive.
  3. Continue waiting indefinitely on the _shutdownBlock after logging that message
  4. Ensure any samples we have for generic host (either in this repo, or in the docs) properly dispose of the host, to avoid this.

Why not just exit after the timeout? The console log is buffered so we have no guarantee that it was actually written after logging.

Why not just exit immediately? That would bypass any dispose/shutdown logic in the app. At least this way a user has some chance of understanding why the app is hanging.

cc @Tratcher @halter73

@analogrelay
Copy link

analogrelay commented Jul 16, 2019

To be clear, the "problem" area here is mostly normal process exit. The flow is:

    public class Program
    {
        public static async Task Main(string[] args)
        {
            var host = CreateHostBuilder(args).Build();
            await host.StartAsync();
            await host.StopAsync();
        }

        public static IHostBuilder CreateHostBuilder(string[] args) =>
            Host.CreateDefaultBuilder(args)
                .ConfigureServices(services => services.AddHostedService<Worker>());
    }
  1. App returns from Main
  2. AppDomain.CurrentDomain.OnProcessExit is triggered, which we are listening to in ConsoleLifetime
  3. We start blocking on _shutdownBlock, which is never triggered because nobody disposed the Host.

There's a second problem in SIGTERM handling, which can be handled in a different issue (https://github.com/aspnet/Extensions/issues/2030 probably).

maryamariyan added a commit to maryamariyan/Extensions that referenced this issue Jul 18, 2019
maryamariyan added a commit to maryamariyan/Extensions that referenced this issue Jul 18, 2019
@Tratcher
Copy link
Member

Timer log mitigation added. Backlog while working with CoreFx?

@rynowak
Copy link
Member

rynowak commented Aug 3, 2019

I'm here to say that this was the cause of a test hang that led to lots o badness and sadness :(

@0xced
Copy link

0xced commented Dec 20, 2019

Yeah, we need to look at designing a real solution to this problem now.

Any update on a potential solution? I also stumbled on this issue when trying to integrate Topshelf and a generic IHost.

@bmukes
Copy link

bmukes commented Jan 8, 2020

Is there some reason why the following code would not be an appropriate workaround for now.
The code below does not hang and still runs your MyServiceA hosted service

	public class ProgramNoDisposeHangs
	{
		public static async Task Main(string[] args)
		{
			var host = CreateHostBuilder(args).Build();
			using (var newHost = host.Services.GetService<IHost>())
			{
				await newHost.StartAsync();
				await newHost.StopAsync();
			}
		}

		public static IHostBuilder CreateHostBuilder(string[] args) =>
			Host.CreateDefaultBuilder(args)
				.ConfigureServices(services => services.AddHostedService<MyServiceA>());
	}

@halter73
Copy link
Member

halter73 commented Jan 9, 2020

@bmukes The workaround is to dispose the host prior to exiting Program.Main. That why we write a log after a timeout that states "Waiting for the host to be disposed. Ensure all 'IHost' instances are wrapped in 'using' blocks.". This should probably be warning-level rather than info-level like it is today though.

You don't have to manually pull the IHost out of the ServiceProvider, the following works just as well (note the using var):

public static async Task Main(string[] args)
{
    using var host = CreateHostBuilder(args).Build();
    await host.StartAsync();
    await host.StopAsync();
}

If we could tell the runtime in OnProcessExit not to exit until Program.Main does, we would probably do that instead of blocking OnProcessExit until the host is disposed.

@robbaman
Copy link

@bmukes The workaround is to dispose the host prior to exiting Program.Main. That why we write a log after a timeout that states "Waiting for the host to be disposed. Ensure all 'IHost' instances are wrapped in 'using' blocks.". This should probably be warning-level rather than info-level like it is today though.

You don't have to manually pull the IHost out of the ServiceProvider, the following works just as well (note the using var):

public static async Task Main(string[] args)
{
    using var host = CreateHostBuilder(args).Build();
    await host.StartAsync();
    await host.StopAsync();
}

If we could tell the runtime in OnProcessExit not to exit until Program.Main does, we would probably do that instead of blocking OnProcessExit until the host is disposed.

The combination of using var and stopping the IHost doesn't change this for me. The application still hangs during shutdown.

When I add this at the very end of the Main method it does complete:

var lifetime = host.Services.GetService<IHostLifetime>() as IDisposable;
lifetime?.Dispose();

@halter73
Copy link
Member

@robbaman In your case, what's triggering the shutdown? I was testing shutting down the process with a sigterm on linux.

@robbaman
Copy link

I used CTRL+C on a Windows machine

@dlyz
Copy link

dlyz commented Feb 20, 2020

There is one more way to reproduce hanging or related weird behavior.

Environment.Exit() call in the main thread leads to permanent hang.
Environment.Exit() in a background thread leads to hang for HostOptions.ShutdownTimeout if you call it in BackgroundService (IHostedService.StopAsync never completes). Also if you use a non-zero error code with the Environment.Exit(), it will be replaced to zero in the ConsoleLifetime.

I think the best solution is to create a way to detect something like SIGTERM separately and have an option to prevent immediate app termination. Then ConsoleLifetime can handle this separate event just like Console.CancelKeyPress and do not use ProcessExit event at all.

@Chebura
Copy link

Chebura commented Feb 20, 2020

I have same issue, when I try to start host in own task. After host.StartAsync(), when application was termnated, fires OnStopping() handler in IHostedService. In this state host hangs indefinite time, until docker kills process. The handler OnStopped() is never fired.
PS: I need start host in own concurrent task for running other services that not refered to kestrel. But only Run/RunAsync() runs without problems with graceful stopping host. But Run/RunAsync() locks his awaiting context and it is unable to dispose gracefully task with Run/RunAsync. To kill Run/RunAsync I wrapped task.Dispose() in "try catch" that seems like sh*t.

@GeeWee
Copy link

GeeWee commented Feb 21, 2020

The snippet of

var lifetime = host.Services.GetService<IHostLifetime>() as IDisposable;
lifetime?.Dispose();

Did not make a difference for me.

After some debugging I found out that I had an IHostedService (The Azure Event Hub Processor SDK) that took upwards of a minute to gracefully shut down.

If you're encountering this problem, potentially there's a long running stopAsync method somewhere you're waiting on.

@ghost ghost added the super-triage label Mar 10, 2020
@Daniele122898
Copy link

Daniele122898 commented Apr 18, 2020

I have a ASP.NET core 3.1 app that has a BackgroundService. When i call systemctl stop on the service that hosts the asp.net core app i get
Waiting for the host to be disposed. Ensure all 'IHost' instances are wrapped in 'using' blocks.
Is this connected to this issue above? Bcs everything that inherits from IHost is managed by the DI.

Just want to make sure if it's a problem in my code or if its indeed the same issue

@davidfowl
Copy link
Member

Yes, it happens when you don't call Dispose explicitly or Run (which calls Dispose).

@Daniele122898
Copy link

So if the backgroundservice runs executeasync over DI i have to manually call dispose? Am i understanding this correctly

@davidfowl
Copy link
Member

It has nothing to do with background services. It’s all about the code in Program.cs that starts the host.

@Daniele122898
Copy link

Oh i see so the initial Host service that you create in the Program.cs. My bad, then i misunderstood this. So i need to add a using. Will do that and see if that fixes the issue :)

@analogrelay
Copy link

This has the same root cause as https://github.com/dotnet/extensions/issues/2030 . The solution for that would also solve this problem since our SIGTERM handler would stop shutdown and allow Main to return.

Closing as a duplicate.

@ghost ghost locked as resolved and limited conversation to collaborators May 24, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests