Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConsoleLifetime doesn't allow full graceful shutdown for ProcessExit #35990

Closed
davidmatson opened this issue Jul 12, 2019 · 34 comments · Fixed by #56057
Closed

ConsoleLifetime doesn't allow full graceful shutdown for ProcessExit #35990

davidmatson opened this issue Jul 12, 2019 · 34 comments · Fixed by #56057

Comments

@davidmatson
Copy link

davidmatson commented Jul 12, 2019

Describe the bug

When ConsoleLifetime shuts down via the AppDomain.ProcessExit path, it skips a number of steps done in normal graceful program termination.

Related to dotnet/extensions#1363, but perhaps the opposite problem (in this case, it shuts down too soon rather than hanging/too late).

To Reproduce

Program.cs:

using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using System;
using System.Threading;

class Program
{
    static int Main()
    {
        var builder = new HostBuilder()
            .ConfigureLogging(l =>
            {
                l.AddConsole();
                l.Services.AddSingleton<ILoggerProvider, CustomLoggerProvider>();
            }).ConfigureServices(s =>
            {
                s.AddSingleton<OtherService>();
            });

        using (IHost host = builder.Build())
        {
            host.Services.GetRequiredService<OtherService>();
            host.Start();
            // Edit: ignore Environment.Exit; instead use CTRL+C or docker stop here, per comments below.
            //Thread thread = new Thread(() => Environment.Exit(456));
            //thread.Start();
            host.WaitForShutdown();
            Console.WriteLine("Ran cleanup code inside using host block.");
        }

        Console.WriteLine("Ran cleanup code outside using host block.");
        Console.WriteLine("Returning exit code from Program.Main.");
        return 123;
    }

    class CustomLoggerProvider : ILoggerProvider
    {
        public ILogger CreateLogger(string categoryName)
        {
            return NullLogger.Instance;
        }

        public void Dispose()
        {
            Console.WriteLine("Ran logger cleanup code.");
        }
    }

    class OtherService : IDisposable
    {
        public void Dispose()
        {
            Console.WriteLine("Ran most service's cleanup code.");
        }
    }
}

App.csproj:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>netcoreapp2.2</TargetFramework>
    <RuntimeIdentifier>win-x64</RuntimeIdentifier>
    <SelfContained>false</SelfContained>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.Extensions.Hosting" Version="2.2.0" />
    <PackageReference Include="Microsoft.Extensions.Logging.Console" Version="2.2.0" />
  </ItemGroup>
</Project>
  1. Run the code above as-is, and terminate with Ctrl+C. See all the cleanup code that runs.
  2. Uncomment the two Thread lines above to let the program self-terminate via Process.Exit.

Expected behavior

The same cleanup code runs when exiting via ProcessExit as it did with Ctrl+C:

Ran cleanup code inside using host block.
Ran most service's cleanup code.
Ran logger cleanup code.
Ran cleanup code outside using host block.
Returning exit code from Program.Main.

X:\path\to\App.exe (process nnn) exited with code 123.

Actual behavior

Only the following cleanup code runs:

Ran cleanup code inside using host block.
Ran most service's cleanup code.

X:\path\to\App.exe (process nnn) exited with code 456.

Note that the return code for a normal ProcessExit case (rather than one simulated by Environment.Exit) would likely be 3221225786 (STATUS_CONTROL_C_EXIT) on Windows for v2.2.0 of this package and 0 for the latest (I think; based on ConsoleLifetime source).

Additional context

Trapping ProcessExit and then deciding how long to wait overall is rather problematic; it could wait too long/block, as in dotnet/extensions#1363, or too short, as here. The main options I can think of are:

  1. Only have ProcessExit trigger the start of graceful shutdown (like Ctrl+C does), and give something else the responsibility to keep the shutdown from completing until Program.Main finishes.
  2. Have some automatic way to know when Program.Main returns. Not sure if this is possible - I tried doing a thread.Join on the main thread, but that doesn't quite work. Polling it for when IsBackground flips to false looks like it could work, but that seems not exactly ideal.
  3. Push the control up to the user for when to have the shutdown handler terminate (have the user pass in an event signaled from Program.Main maybe or something roughly like that; more complicated API to make that work).

Overall, the host itself doesn't own the entirety of Program.Main, so having it decide when to stop running is tricky.

I did find that at least the logger cleanup can likely be done better by changing the order of the disposables in the service container, including by adding a first constructor parameter to Host that gets registered first - currently, the IHostApplicationLifetime is after loggers in the list of things to dispose, and the logic to dispose in reverse order on the service provider/scope means loggers don't get to cleanup.

For application insights, for example, having the logger not get to have Dispose called means it can't call Flush, so logs currently get lost on shutdown.

The full list of things that currently get lost are:

  1. Earlier registered service provider disposables, including logger providers (likely could be fixed with some registration order/constructor chaining/dependency order changes).
  2. Code after the using host block.
  3. Program.Main's exit code.
@davidmatson
Copy link
Author

davidmatson commented Jul 15, 2019

A couple other things that likely currently get lost: full disposing of custom service providers, full disposing of custom hosts.

I ended up working around this with a wrapper around the entry point for now. Here's what it looks like:

class Program
{
    static void Main()
    {
        ProcessLifetimeHost.Run(MainCore);
    }

    static void MainCore(IProcessLifetime processLifetime)
    {
        new HostBuilder()
            // Configure etc
            .UseProcessLifetime(processLifetime)
            .Build()
            .Run();
    }
}

When run with the original set of things hooked up above, all of the cleanup code runs (including code outside the using block and setting the exit code).

The IProcessLifetime interface looks like this:

/// <summary>Defines a process lifetime.</summary>
interface IProcessLifetime
{
    /// <summary>
    /// Gets or sets the action to invoke to initiate graceful shutdown of currently-running application, if any.
    /// </summary>
    Action StopApplication { get; set; }
}

If you're interested in going down this route further, let me know and I can provide the other details as well (UseProcessLifetime extension method and ProcessLifetimeHost.Run method).

@analogrelay
Copy link
Contributor

We'll likely need some corefx support on this one to give us an API to properly handle SIGTERM.

The rough proposal we have right now is to have a new API in corefx to hook the SIGTERM directly (right now we hook ProcessExit which can happen gracefully or via SIGTERM). In the new API, there would be a Cancel argument that can be used to stop the default behavior of the CLR (to shut down). In our lifetime we can hook that new event, stop the termination and signal the IApplicationLifetime to stop.

It means that if the app doesn't gracefully shutdown, SIGTERM won't shut it down, but SIGTERM is intended to be "graceful" so that makes sense. If a user needs to forcibly terminate the app, they can use SIGKILL

@davidmatson
Copy link
Author

Unfortunately, I don't think it's possible for corefx to provide a cancellable API on Windows - the underlying signal sent by docker stop on Windows containers changed to CTRL_SHUTDOWN_EVENT, which can't be canceled. The MSDN docs could be a little clearer - they more say what can be done in this case (inline cleanup in the handler) rather than what can't be done (canceling the signal). See also a related moby issue for details.

@analogrelay
Copy link
Contributor

Good to know. I was referring to graceful termination on Linux which sends SIGTERM but it may be more challenging on Windows.

This will be a tricky one, we'll have to investigate some options.

@MikhaelSorkin
Copy link

Hello!
Any update here? Is it possible to achieve graceful shutdown in windows container?

@davidmatson
Copy link
Author

It is possible to achieve graceful shutdown in a Windows container, but it currently requires a bit of custom code to make it work. Specifically, it requires using P/Invoke to get a shutdown notification (see dotnet/extensions#2029 and moby/moby#25982). And then this signal cannot be canceled, so it requires different wiring to make it work with IHostBuilder/Host. The ProcessLifetimeHost snippet above shows how I'm currently doing that from a design standpoint (the implementation is rather complicated).

So yes it's possible, but it requires custom code and is not easy.

@MikhaelSorkin
Copy link

@davidmatson
Thank you for answer. Do you plan to publish your implementation? As a nuget package or as a source code?

@davidmatson
Copy link
Author

My hope with this bug was to get a solution into the official product, so I wasn't planning on publishing my implementation. I'd have to check, but I could probably paste it here as a temporary workaround if there's interest.

@MikhaelSorkin
Copy link

@davidmatson
Well, i'm very interested in this workaround :)

@davidmatson
Copy link
Author

Here's a zip of the source I'm using as a workaround. The usage is as described above. I'm using it with v2.2.0 of the hosting extensions package (it may work with a later version but I haven't tested that).

ProcessLifetimeWorkaround.zip

Disclaimer: as-is; just a workaround; not an official solution; not offering support for this code; etc.

@Tragetaschen
Copy link
Contributor

Tragetaschen commented Dec 17, 2019

This has cost me the better part of the day :-(

Whenever I ran my application on the command line on Linux, CTRL+C would gracefully exit and dispose my Autofac container. When running as systemd service, that would just start the disposing work, but then stop in the middle of it. Since that service would normally only be stopped when my embedded instrument with a volatile system log would be shut down, I had to unwrap a lot of layers before this became clear.

For the time being as a workaround, I have internalized the code for UseSystemd, replaced the ProcessExit with Console.CancelKeyPress and added KillSignal=SIGINT to my systemd unit.

Is there a planned path forward already? I think this should be more than "Backlog"…

@analogrelay
Copy link
Contributor

Triage: The right approach here is twofold:

  1. Get an event that runs before ProcessExit to capture SIGTERM (either by a full signal handler API or a SIGTERM) and gives the option to cancel the default behavior (terminating the process)

  2. Change ConsoleLifetime to hook that event and use it to signal a graceful shutdown.

@analogrelay
Copy link
Contributor

The above approach would also fix dotnet/extensions#1363

@davidmatson
Copy link
Author

@anurse - the equivalent of SIGTERM on Windows/Docker (CTRL_SHUTDOWN_EVENT) can't be canceled. Thoughts?

@tmds
Copy link
Member

tmds commented Apr 25, 2020

Sharing my view of dotnet/extensions#1363 (comment).

ConsoleLifetime shouldn't be used by default.
Though there should be an API to Run a Host from Main that includes it.

That avoids the issue because the Host no longer assumes it gets disposed on ProcessExit.
And, not all hosts should be tied to these events, like a host that runs shortly as part of a larger application.

@davidfowl
Copy link
Member

The console host should be used by default when you call CreateDefaultBuilder, I'd be ok removing it when you new up the host.

That avoids the issue because the Host no longer assumes it gets disposed on ProcessExit.

It doesn't we still need to fix it.

@tmds
Copy link
Member

tmds commented Apr 25, 2020

It doesn't we still need to fix it.

Why not? The issue is caused by ConsoleLifetime right?

@davidfowl
Copy link
Member

Why not? The issue is caused by ConsoleLifetime right?

Because when you call CreateDefaultBuilder the problem still exists.

@analogrelay analogrelay transferred this issue from dotnet/extensions May 7, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label May 7, 2020
@Dotnet-GitSync-Bot
Copy link
Collaborator

I couldn't figure out the best area label to add to this issue. Please help me learn by adding exactly one area label.

@maryamariyan
Copy link
Member

Moving to 6.0 as it involves API work.

@maryamariyan maryamariyan removed this from the 5.0.0 milestone Aug 20, 2020
@davidfowl
Copy link
Member

@davidmatson I know this issue is old but how are you exiting the process here? AFAIK there's no way to make this work without runtime changes. Ideally, we want Main to run to completion before exiting for these un-cancellable signals.

cc @jkotas @janvorli

@jkotas
Copy link
Member

jkotas commented Apr 22, 2021

I know this issue is old but how are you exiting the process here?

The repro steps at the top exit the process using Environment.Exit. Does it still accurately describe the problem that this issue is tracking?

Environment.Exit calls the process exit event and exits the process. I do not see us adding more steps or any implicit waiting to this sequence. If Environment.Exit is not compatible with your app model, avoid calling it.

@Tragetaschen
Copy link
Contributor

@jkotas

For me (#35990 (comment)), the issue became apparent through .NET's handling of SIGTERM and I moved to SIGINT as a workaround.

Environment.Exit is not part of my code base

@davidmatson
Copy link
Author

@jkotas - it's been awhile, but I believe the Environment.Exit was just a deterministic, programmatic way of simulating what happens when someone does Ctrl+C, docker stop, or similar. (I expect there are other ways to trigger this behavior as well, like an unhandled exception from another thread.)

@davidmatson
Copy link
Author

davidmatson commented Apr 22, 2021

@davidfowl

@davidmatson I know this issue is old but how are you exiting the process here? AFAIK there's no way to make this work without runtime changes. Ideally, we want Main to run to completion before exiting for these un-cancellable signals.

Normally, exiting the process by Ctrl+C or docker stop.
Yes, when I filed this issue, I expected runtime changes would be required (or a very different API for IHost startup, like in the workaround zip I posted above).

@jkotas
Copy link
Member

jkotas commented Apr 22, 2021

the Environment.Exit was just a deterministic, programmatic way of simulating what happens when someone does Ctrl+C, docker stop, or similar

Not exactly.

Environment.Exit is low-level API that is not designed to be overridable.

Ctrl-C and similar signals exit the process by default, but that behavior can be overriden. #50527 is about making the APIs for these signals better.

This issue looks like duplicate of #50527 and can be closed. @davidfowl Do you agree?

an unhandled exception from another thread

Unhandled exception is abnormal exit. It does not follow the same path as Environment.Exit

@davidmatson
Copy link
Author

Yeah, I'm not all that concerned with Environment.Exit per se as long as all the other cases (Ctrl+C, docker stop, system shutdown notification, and unhandled exceptions from other threads) are covered.

I'm guessing the other issue was filed with this kind of scenario in mind; as long as we're covering the same cases there, closing this one in favor of that one sounds fine to me.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jul 20, 2021
eerhardt added a commit to eerhardt/runtime that referenced this issue Jul 20, 2021
Don't listen to ProcessExit on net6.0+ in Hosting anymore. This allows for Environment.Exit to not hang the app.
Don't clobber ExitCode during ProcessExit now that SIGTERM is handled separately.

For non-net6.0 targets, only wait for the shutdown timeout, so the process doesn't hang forever.

Fix dotnet#55417
Fix dotnet#44086
Fix dotnet#50397
Fix dotnet#42224
Fix dotnet#35990
eerhardt added a commit that referenced this issue Jul 22, 2021
* Add NetCoreAppCurrent target to Microsoft.Extensions.Hosting

* Handle SIGTERM in Hosting and handle just like SIGINT (CTRL+C)

Don't listen to ProcessExit on net6.0+ in Hosting anymore. This allows for Environment.Exit to not hang the app.
Don't clobber ExitCode during ProcessExit now that SIGTERM is handled separately.

For non-net6.0 targets, only wait for the shutdown timeout, so the process doesn't hang forever.

Fix #55417
Fix #44086
Fix #50397
Fix #42224
Fix #35990

* Remove _shutdownBlock on netcoreappcurrent, as this is no longer waited on
* Change Console.CancelKeyPress to use PosixSignalRegistration SIGINT and SIGQUIT
* Use a no-op lifetime on mobile platforms

* Add docs for shutdown
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 22, 2021
@davidmatson
Copy link
Author

@eerhardt - to clarify the status here: with the merged fix, does the repro at the top now work as in the Expected behavior section above (run all the cleanup steps listed and return an exit code of 123 from Main)?

Thanks

@davidmatson
Copy link
Author

Also, does it work that way specifically on Windows?

@eerhardt
Copy link
Member

eerhardt commented Jul 22, 2021

does the repro at the top now work as in the Expected behavior section above (run all the cleanup steps listed and return an exit code of 123 from Main)?

No. That is unreasonable expected behavior. See https://docs.microsoft.com/en-us/dotnet/api/system.environment.exit?view=net-5.0#remarks for what to expect with Environment.Exit. Specifically:

Exit terminates an application immediately, even if other threads are running.

When calling Environment.Exit, your "Main" method doesn't complete. Also see @jkotas's comment above.

With the merged fix, Hosting no longer subscribes to ProcessExit. It was only handling ProcessExit in previous versions of .NET because it wanted to handle SIGTERM (for docker stop). In .NET 6, we can now specifically handle SIGTERM without using ProcessExit. If your app calls Environment.Exit and it requires specific clean up to happen, it is your apps responsibility for coordinating the shutdown. Hosting doesn't try to gracefully shutdown the process when Environment.Exit is called, because Environment.Exit isn't really a "graceful" way of shutting down a process. Your Main method doesn't complete, running threads are terminated, finally blocks aren't executed, etc.

In short, Environment.Exit != CTRL+C or docker stop. If your app is using Hosting, and you want to gracefully stop the host, you should be calling IHostApplicationLifetime.StopApplication().

See https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.Extensions.Hosting/docs/HostShutdown.md for more information.

@davidmatson
Copy link
Author

davidmatson commented Jul 22, 2021

@eerhardt - sorry, I should clarify ignoring Environment.Exit to trigger process stop as in jkota's comment above. If, at that point in the code, CTRL+C or docker stop happens, will it now behave as in the Expected behavior section, even on Windows?

Thanks

@davidmatson
Copy link
Author

I'm particularly interested in the docker stop case on Windows - at least last time I looked, docker stop used system shutdown, which is a non-cancellable event.

@eerhardt
Copy link
Member

If, at that point in the code, CTRL+C or docker stop happens, will it now behave as in the Expected behavior section, even on Windows?

Yes. When CTRL+C or docker stop is issued to the process, the Hosted process will now gracefully exit and return 123. Yes, even on Windows and in Windows containers.

using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using System;
using System.Threading;

class Program
{
    static int Main()
    {
        var builder = new HostBuilder()
            .ConfigureLogging(l =>
            {
                l.AddConsole();
                l.Services.AddSingleton<ILoggerProvider, CustomLoggerProvider>();
            }).ConfigureServices(s =>
            {
                s.AddSingleton<OtherService>();
            });

        using (IHost host = builder.Build())
        {
            host.Services.GetRequiredService<OtherService>();
            host.Start();
            //Thread thread = new Thread(() => Environment.Exit(456));
            //thread.Start();
            host.WaitForShutdown();
            Console.WriteLine("Ran cleanup code inside using host block.");
        }

        Console.WriteLine("Ran cleanup code outside using host block.");
        Console.WriteLine("Returning exit code from Program.Main.");
        return 123;
    }

    class CustomLoggerProvider : ILoggerProvider
    {
        public ILogger CreateLogger(string categoryName)
        {
            return NullLogger.Instance;
        }

        public void Dispose()
        {
            Console.WriteLine("Ran logger cleanup code.");
        }
    }

    class OtherService : IDisposable
    {
        public void Dispose()
        {
            Console.WriteLine("Ran most service's cleanup code.");
        }
    }
}
eerhardt@EERHARDT5  C:\DotNetTest\Net6Hosting                                                                                  [12:43]
❯ dotnet publish -c Release -r win-x64
Microsoft (R) Build Engine version 17.0.0-preview-21359-04+d150e93ff for .NET
Copyright (C) Microsoft Corporation. All rights reserved.

  Determining projects to restore...
  Restored C:\DotNetTest\Net6Hosting\Net6Hosting.csproj (in 92 ms).
  You are using a preview version of .NET. See: https://aka.ms/dotnet-core-preview
  Net6Hosting -> C:\DotNetTest\Net6Hosting\bin\Release\net6.0\win-x64\Net6Hosting.dll
  Net6Hosting -> C:\DotNetTest\Net6Hosting\bin\Release\net6.0\win-x64\publish\
eerhardt@EERHARDT5  C:\DotNetTest\Net6Hosting                                                                                  [12:48]
❯ docker build -t net6hosting .
Sending build context to Docker daemon  676.9MB
Step 1/4 : FROM mcr.microsoft.com/dotnet/runtime:6.0.0-preview.6-nanoserver-1809
 ---> dd2e42b52e4a
Step 2/4 : COPY bin/Release/net6.0/win-x64/publish/ App/
 ---> 4f3178d5e561
Step 3/4 : WORKDIR /App
 ---> Running in 2b68fe9fe4f3
Removing intermediate container 2b68fe9fe4f3
 ---> 875cb6faef2a
Step 4/4 : ENTRYPOINT ["/App/Net6Hosting"]
 ---> Running in 46e2987db95a
Removing intermediate container 46e2987db95a
 ---> 8d4a9591c61d
Successfully built 8d4a9591c61d
Successfully tagged net6hosting:latest

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
eerhardt@EERHARDT5  C:\DotNetTest\Net6Hosting                                                                                  [12:48]
❯ docker run net6hosting
info: Microsoft.Hosting.Lifetime[0]
      Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
      Hosting environment: Production
info: Microsoft.Hosting.Lifetime[0]
      Content root path: C:\App\
info: Microsoft.Hosting.Lifetime[0]
      Application is shutting down...
Ran cleanup code inside using host block.
Ran most service's cleanup code.
Ran logger cleanup code.
Ran cleanup code outside using host block.
Returning exit code from Program.Main.
eerhardt@EERHARDT5  C:\DotNetTest\Net6Hosting                                                                                  [12:49]
❯ $LASTEXITCODE
123

Above I ran docker stop 93c982cee9f8 from a separate command prompt after docker run.

@davidmatson
Copy link
Author

Excellent. Thanks, @eerhardt!

@ghost ghost locked as resolved and limited conversation to collaborators Aug 21, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.