-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CoreCLR's CNTRL_SHUTDOWN_EVENT handler prevents graceful exit of services during system shutdown #83093
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
Tagging subscribers to this area: @dotnet/area-extensions-hosting Issue DetailsDescriptionI am not sure if this belongs here, but this is a new weird behaviour I have been fighting with lately. I have a service, where the 'Main_Worker' is a 'IHostedService' and a 'BackgroundService'. I have overriden the 'StopAsync' function and the only thing it really does is write to the Windows Event Log a message and save a txt file with some data. While I was working with this structure in .NET 6, it all worked preatty well. Then I upgraded to .Net 7, and it seems that now 'StopAsync' is not triggering when I turn of the PC by a normal Power -> Shutdown or if I use 'Process.Start("cmd", @"/c shutdown -s -t 0");' When in .NET 6, both ways it worked without problems. Reproduction Steps
That is my StopAsync function, it used to work like this with no problems. But now, even if you take away the await line, not even the logger is appearing. Expected behaviorAt least I would expect to see the logged message in the Windows Event Logger when the PC is turned on again. Actual behaviorNo trace of the message being logged or the file being written. The service probably closes, yet I have no idea if it is in a forced mode or if it is actually being closed in a gracefull way. Regression?No response Known WorkaroundsI do not have workarounds at the moment. ConfigurationHave tested in Windows 10 & 11 Other informationNo response
|
As a matter of fact, just to give a bit more of context, here is a test I have been doing:
ApplicationStarted appears in the event viewer perfectly well. But the other two are not. This when restaring or shutdown. But if I stop gracefully the service manually, both messageds appear fine. |
Looks similar to #62579 and #81274, @gkapellmann could you try the workaround mentioned and let us know if that helps your issue? The workaround is for unblocking you for short term, if you could let us know how it goes, it would be a useful data for to the team to find appropriate fix. |
This issue has been marked |
This one is actually different than the error 1067 issue. In that case the service process was exiting before completing ServiceBase's OnStop method which is what notified SCM that the service cleanly stopped. In that case it would run all ApplicationStopping and ApplicationStopped. In this case the machine is shutting down. I was looking at a similar issue internally and found that the problem was actually due to #41101. During a shutdown a service will first receive a CNTRL_SHUTDOWN_EVENT. The default is to do nothing for a service process per https://learn.microsoft.com/en-us/windows/console/handlerroutine#remarks, but that was changed for .NET processes in #41101. I was able to workaround this by adding ConsoleControlHandler that cancels other handlers when seeing CNTRL_SHUTDOWN_EVENT - stopping the runtime's handler from shutting down the process and letting the service exit normally. In .NET 6.0 you can do this with a one-liner: using var registration = PosixSignalRegistration.Create(PosixSignal.SIGTERM, (context) => context.Cancel = true); Just make sure that registration stays alive for the entire duration of the service - I put it in the Main() routine. We should look into adding something like this to ServiceBase -- or having the runtime try to avoid handling CNTRL_SHUTDOWN_EVENT for services. |
Tagging subscribers to this area: @dotnet/area-system-serviceprocess Issue DetailsDescriptionI am not sure if this belongs here, but this is a new weird behaviour I have been fighting with lately. I have a service, where the 'Main_Worker' is a 'IHostedService' and a 'BackgroundService'. I have overriden the 'StopAsync' function and the only thing it really does is write to the Windows Event Log a message and save a txt file with some data. While I was working with this structure in .NET 6, it all worked preatty well. Then I upgraded to .Net 7, and it seems that now 'StopAsync' is not triggering when I turn of the PC by a normal Power -> Shutdown or if I use 'Process.Start("cmd", @"/c shutdown -s -t 0");' When in .NET 6, both ways it worked without problems. Reproduction Steps
That is my StopAsync function, it used to work like this with no problems. But now, even if you take away the await line, not even the logger is appearing. Expected behaviorAt least I would expect to see the logged message in the Windows Event Logger when the PC is turned on again. Actual behaviorNo trace of the message being logged or the file being written. The service probably closes, yet I have no idea if it is in a forced mode or if it is actually being closed in a gracefull way. Regression?No response Known WorkaroundsI do not have workarounds at the moment. ConfigurationHave tested in Windows 10 & 11 Other informationNo response
|
Ideally we don't do this only in ServiceBase since that wouldn't fix cases where folks were implementing services themselves without using ServiceBase. Instead we should try to replicate whatever the OS does in the default control handler to distinguish between normal console processes and services. |
Hello @ericstj, thank you for your answers! I have not tested your workaround in .NET 6.0, but I did test wit with a small example in .NET 7.0 and is not working for me. As a matter of fact neither the StopAsync or StartAsync functions are getting triggered. |
@gkapellmann if StartAsync is not triggered then you are having a problem with even starting up your service. Are you able to share your example? |
@janvorli as FYI.. |
@ericstj Sure, no problem, I have created another repo with my very simple service example here. It is based in .NET 7.0, when running you should be ableto see the events in the event manager, then shutting the PC down and then on again and I am not able to see the start async or stop async messages, but the main executeasyc events are there properly. |
@gkapellmann I took a look at your repro. I was able to find out what was wrong by using Time-Travel debugging and examining the behavior at shutdown. The
This was preventing you from running your SaveCurrentLog method. I was able to workaround that by wrapping the call to _logger.LogInformation in your stopasync with a try/catch. I'll take a deeper look at this to see why it's happening. Could be that this is normal during shutdown. Could be that a handle got disposed out from under the logger. |
Looks like those unhandled exceptions can also occur during the runtime/src/libraries/Microsoft.Extensions.Hosting.WindowsServices/src/WindowsServiceLifetime.cs Line 73 in 967250c
I saw that the handle was still valid - so not a case of being disposed while still logging. The very first failure to write to the event log fails with
So it looks to me like the machine is just shutting down the other services that would be needed to write to the event log. I tried adding service dependencies on the EventLog and RpcSs services to the test service, but it didn't seem to help. |
I tried retargeting this service to |
It is interesting, I am going to look back in my repo history. In the end the real priority is to have SaveCurrentLog working, the event logger is just for debugging, and I guess that what you say regarding the services makes sense. I did get the saved logs before, could it be that the order of the services shutting down is semi-random? |
It could just be timing that's machine dependent. I'm reproducing on a clean VM without many other services so that might mean that all the other services shut down very fast. If you happen to have a repro that shows things working consistently, I can have a look and try to explain the difference if I can repro in my environment. Another thing you could do to see if you were getting lucky with timing is to take your working repro and add delay (for instance, in StopAsync before calling the logger) to see if it will fail. Probably we should add documentation to the EventLog log provider that it's not safe to use during shutdown and we might want to go and add catch statements around calls to the logger that happen during shutdown. That feels like a pitfall in that logger cc @dotnet/area-extensions-logging |
Hello @ericstj I have been a bit busy but managed to do some tests. I guess I was getting lucky, I did a complete cleanup of the service and re-installed. Seems to behave different with every instalation. But then, what I have tried is to do a loop that awaits for the stoppingToken from ExecuteAsync, then try to do a EventLog, followed by a "SaveCurrentLog" action. But I dont get a log file either here. There are multiple issues here: In my head I see StartAsync first, followed by ExecuteAsync. When the stopping is requested, StopAsync is triggered, but that shouldnt prevent ExecuteAsync to finish completely, right? I have done this updates in the repo, if you want to test this changes. |
Double check the directory. Services run with a working directory of
ExecuteAsync is called by StartAsync which receives a cancellationToken that will be cancelled when the application is stopped.
There is no guarantee that StopAsync waits for any completion of ExecuteAsync. It looks to me like your ExecuteAsync method is just polling the cancellation token - so no cancellation registration is guaranteed to run -- and nothing else would wait on ExecuteAsync exiting gracefully. |
Right, so I think that things have gone really clearer for me with this. Part of the problem is that I was trying to use the backgroundservice in a way its not meant to be used. So:
I have modified how I do the logging, and seems to be working fine now. Thanks a lot @ericstj for your help through this. If we all are ok, I am good to close this issue. |
This issue did help us discover the shutdown problem with CoreCLR's CNTRL_SHUTDOWN_EVENT which I mentioned here: #83093 (comment). We've seen other reports of this same problem, so I'd like to keep this issue open to track a fix for that in CoreCLR. @janvorli was looking into a fix for that. |
@mangod9 this is a regression from a previous release and I don't think it should be moved to future. This bug prevents graceful service exit due to disabling exception handling and it should be addressed. |
believe there is a workaround in place to help with this issue? Will mark it for 9 to determine the feasibility of the fix. |
The workaround is required in the customer's application, and they would need to identify this problem (exceptions not being caught/stop not completing on shutdown) and understand what was going on to know they needed the workaround - which is a pretty big stretch given how hard it is to debug shutdown issues. We would still like to see a fix for this. |
Ok, chatted with @janvorli on this to follow up if there is a good solution on the runtime side. |
@janvorli @ericstj What do you expect the runtime fix to be? If I am interpreting the comments above correctly, this is what's happening in general:
To avoid unhandled exceptions in the last step, I think the service either needs to shutdown completely in the response to the shutdown callback and/or the service needs to start ignoring unhandled exceptions at some point during shutdown. #42275 may help with the later. |
The primary issue is that a service should not call ExitProcess when receiving the CTRL_SHUTDOWN_EVENT, as described in https://learn.microsoft.com/en-us/windows/console/handlerroutine#remarks:
The problem got introduced when .NET started to register the control handler in #55333. |
I haven't looked into the details of our implementation of the code related to the handler, maybe it would be sufficient to return FALSE from the HandlerRoutine if we detect we are running in the system session at this point: Lines 83 to 85 in ac07ea6
|
@janvorli has it right The problem is the process is getting two different shutdown signals. Services should only be shutdown through SCM - and that's how it works in .NETFramework and earlier .NET versions. It broke with #55333. The Windows default control handler identifies a service process and avoids sending it CTRL_SHUTDOWN_EVENT - the CoreCLR should do the same since it's behaving like the OS and registering a handler for all application types. To behave most like the OS here we should try to identify if the process is a service process - we can't count on it using System.ServiceProcess - some .NET processes opt to PInvoke directly to SCM. If possible we should replicate the OS's detection mechanism. |
#50527 and #55333 introduced a low-level signal handling API. I do not think that the introduction of these APIs alone broke anything. The break was introduced as side-effect of ConsoleLifeTime using these APIs in #56057. It does not make sense for this low-level signal handling API to be opinionated and handle app-model specific concerns, like skipping the handler registration for Windows Services. Windows Services are not the only ones with this type of issue. For example, the Win32 GUI apps should be only shutdown through WM_QUIT message. I think that the problem is caused by ConsoleLifeTime being used by Windows Services. Windows Services are not console apps, so it does not make sense for them to use Console app model. (And similarly, Win32 GUI apps are not console apps and so they should not use ConsoleLifeTime. Dtto for all other appmodels - WinUI, MAUI, Avalonia, ... .) |
It is true that the #55333 on its own without any SIGTERM registered handler doesn't cause the issue and the problem is that the |
Or I guess just not using the ConsoleLifetime. That's what you were trying to say, @jkotas, right? |
Right. ConsoleLifetime implements console appmodel that is not compatible with Windows services. Note that we have SystemdLifetime that handles same concern for Systemd processes ("Unix services"): runtime/src/libraries/Microsoft.Extensions.Hosting.Systemd/src/SystemdLifetime.netcoreapp.cs Lines 15 to 18 in d88c7ba
|
Description
I am not sure if this belongs here, but this is a new weird behaviour I have been fighting with lately.
I have a service, where the 'Main_Worker' is a 'IHostedService' and a 'BackgroundService'. I have overriden the 'StopAsync' function and the only thing it really does is write to the Windows Event Log a message and save a txt file with some data.
While I was working with this structure in .NET 6, it all worked preatty well. Then I upgraded to .Net 7, and it seems that now 'StopAsync' is not triggering when I turn of the PC by a normal Power -> Shutdown or if I use 'Process.Start("cmd", @"/c shutdown -s -t 0");'
When in .NET 6, both ways it worked without problems.
Reproduction Steps
public async override Task<Task> StopAsync(CancellationToken cancellationToken) { Log.Text("Stopping Worker_Main service..."); logger.LogWarning("Test Service is Stopping..."); await Log.SaveCurrentLog(); return base.StopAsync(cancellationToken); }
That is my StopAsync function, it used to work like this with no problems. But now, even if you take away the await line, not even the logger is appearing.
Expected behavior
At least I would expect to see the logged message in the Windows Event Logger when the PC is turned on again.
Actual behavior
No trace of the message being logged or the file being written. The service probably closes, yet I have no idea if it is in a forced mode or if it is actually being closed in a gracefull way.
Regression?
No response
Known Workarounds
I do not have workarounds at the moment.
Configuration
Have tested in Windows 10 & 11
Both are x64
Other information
No response
The text was updated successfully, but these errors were encountered: