Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for handling SIGTERM gracefully [WIP] #4309

Merged
merged 1 commit into from Apr 27, 2016

Conversation

@adityamandaleeka
Copy link
Member

@adityamandaleeka adityamandaleeka commented Apr 12, 2016

Today, our signal handling logic doesn't handle SIGTERM, so it just terminates the process abruptly. This change improves on that behavior by allowing us to treat SIGTERM similar to how we treat Environment.Exit() (so that the system is shutdown gracefully).

This change brings back the logic we had in place before for handling things like SIGINT, where we have a thread waiting to receive notifications about signals through a pipe. Doing so allows us to handle the signals outside of the context of the signal handler.

Addresses #2688

cc: @janvorli

@adityamandaleeka
Copy link
Member Author

@adityamandaleeka adityamandaleeka commented Apr 13, 2016

@janvorli PTAL

if (bytesRead == -1)
{
// Fatal error
abort();

This comment has been minimized.

@janvorli

janvorli Apr 13, 2016
Member

I think we might want to use PROCAbort instead of abort to shut down the debugger transport too.


default:
ASSERT("Unexpected signal %d in signal thread\n", signalCode);
abort();

This comment has been minimized.

@janvorli

janvorli Apr 13, 2016
Member

The same here

}
else
{
TRACE("SIGTERM signal was unhandled; chaining to previous sigaction\n");

This comment has been minimized.

@janvorli

janvorli Apr 13, 2016
Member

I am going to remove all the TRACE calls from signal handlers soon, since I have found they invoke functions that are not signal safe (e.g. string formatting). So I would suggest not adding it here now.

@janvorli
Copy link
Member

@janvorli janvorli commented Apr 13, 2016

LGTM modulo the few nits.

@adityamandaleeka adityamandaleeka force-pushed the adityamandaleeka:sigterm branch from b7d1b00 to 8499763 Apr 14, 2016
@adityamandaleeka
Copy link
Member Author

@adityamandaleeka adityamandaleeka commented Apr 14, 2016

@janvorli Thanks for taking a look. I've addressed the feedback and also removed all the TRACE calls from within signal handlers.

@janvorli
Copy link
Member

@janvorli janvorli commented Apr 14, 2016

@adityamandaleeka I actually meant removing the TRACE calls from the new handler only. I have removed them in my change that I am just about to check in, so it would create unnecessary conflicts.

@adityamandaleeka adityamandaleeka force-pushed the adityamandaleeka:sigterm branch from 8499763 to 3f99608 Apr 14, 2016
@adityamandaleeka
Copy link
Member Author

@adityamandaleeka adityamandaleeka commented Apr 14, 2016

@janvorli Ah ok, I added the others back.

@janvorli
Copy link
Member

@janvorli janvorli commented Apr 14, 2016

LGTM

@stephentoub
Copy link
Member

@stephentoub stephentoub commented Apr 14, 2016

Before this gets merged, I just wanted double-check that we don't have a way to achieve this without adding back the dedicated handler thread? I was wondering, for example, if we might somehow be able to use the finalizer thread in any way, e.g. on startup creating a finalizable object that's rooted, and then in the signal handler unrooting it to allow it to be collected and finalized, with its finalizer triggering shutdown, or something like that? I don't know if the work required to do that would be signal-safe, but if there's another route, it'd be nice to avoid having this dedicated thread.

@adityamandaleeka adityamandaleeka force-pushed the adityamandaleeka:sigterm branch from 3f99608 to 323d175 Apr 14, 2016
@adityamandaleeka
Copy link
Member Author

@adityamandaleeka adityamandaleeka commented Apr 14, 2016

Fixed the merge conflicts. @stephentoub I'll have to see if it's possible to do that safely. I'm not sure at this point what the ramifications of doing something like that would be if the SIGTERM is triggered while we are in the middle of a GC or holding some special lock.

@jkotas
Copy link
Member

@jkotas jkotas commented Apr 15, 2016

signal handler unrooting it to allow it to be collected and finalized, with its finalizer triggering shutdown

This does not work well because of there can be very long time between the object is unrooted and its finalizer running.

If you would like to piggyback on the finalizer thread for this, look for HaveExtraWorkForFinalizer in the VM. There are number of similar things piggybacking on it already.

@stephentoub
Copy link
Member

@stephentoub stephentoub commented Apr 15, 2016

If you would like to piggyback on the finalizer thread for this, look for HaveExtraWorkForFinalizer in the VM. There are number of similar things piggybacking on it already.

Great, that's really all I was suggesting; if we already have a way to do it, even better.

@Maoni0
Copy link
Member

@Maoni0 Maoni0 commented Apr 15, 2016

I was going to suggest the same thing. If you are still curious to know, as far as correctness goes, it's fine - finalizers are only run after GC is done (at the end of a GC, it will signal the finalizer thread to run if there're finalizers to be run, or there's extra work to be done). But perf wise, if your object is in gen2, we may not do a gen2 collection for a long time so we will not discover that the object is dead and its finalizer needs to be run before then.

@adityamandaleeka
Copy link
Member Author

@adityamandaleeka adityamandaleeka commented Apr 15, 2016

Thanks @jkotas and @Maoni0. I'll look at HaveExtraWorkForFinalizer.

@adityamandaleeka adityamandaleeka force-pushed the adityamandaleeka:sigterm branch 2 times, most recently from fb2197e to 975a4f9 Apr 26, 2016
@adityamandaleeka
Copy link
Member Author

@adityamandaleeka adityamandaleeka commented Apr 26, 2016

After investigating this, it turned out to be pretty easy to get into situations where using the "extra work" functionality of the finalizer thread wouldn't work well in practice (for instance, the finalizer can be too busy to run the extra work so we wouldn't be able to handle SIGTERM before whoever sent it gets impatient and sends something less polite).

Thankfully, there is another thread that seems to be appropriate for the job. The latest change uses the PAL synchronization manager's worker thread, which already waits for commands to come down a pipe. We simply add a new command for the termination request and once the worker thread receives it, it will spawn a new thread to perform the actual shutdown (to avoid deadlocks if code run during shutdown needs the worker thread to do something).

@jkotas @janvorli PTAL

case SynchWorkerCmdTerminationRequest:
// This worker thread is being asked to initiate process termination

HANDLE hShutdownThread;

This comment has been minimized.

@jkotas

jkotas Apr 26, 2016
Member

This confused me a bit - it may be nice to call this hThread or hTerminationRequestHandlingThread to be consistent with the name of everything else around this.

if (NO_ERROR != palErr)
{
ERROR("Unable to create worker thread\n");
}

This comment has been minimized.

@jkotas

jkotas Apr 26, 2016
Member

It may be nice to close thread handle. I know it is not strictly necessary for the shutdown case because of the process is likely go away, but it is a good hygiene.

@jkotas
Copy link
Member

@jkotas jkotas commented Apr 26, 2016

LGTM otherwise

@adityamandaleeka
Copy link
Member Author

@adityamandaleeka adityamandaleeka commented Apr 26, 2016

Thanks @jkotas

@janvorli
Copy link
Member

@janvorli janvorli commented Apr 27, 2016

LGTM

@adityamandaleeka adityamandaleeka force-pushed the adityamandaleeka:sigterm branch from 1e99c9e to a785c40 Apr 27, 2016
@adityamandaleeka adityamandaleeka merged commit 597f60b into dotnet:master Apr 27, 2016
8 checks passed
8 checks passed
CentOS7.1 x64 Debug Build and Test Build finished. 809 tests run, 0 skipped, 0 failed.
Details
FreeBSD x64 Checked Build Build finished. 809 tests run, 0 skipped, 0 failed.
Details
OSX x64 Checked Build and Test Build finished. No test results found.
Details
Ubuntu x64 Checked Build and Test Build finished. No test results found.
Details
Windows_NT x64 Debug Build and Test Build finished. 6414 tests run, 0 skipped, 0 failed.
Details
Windows_NT x64 Release Priority 1 Build and Test Build finished. 9316 tests run, 0 skipped, 0 failed.
Details
Windows_NT x86 Debug Legacy Backend Build and Test Build finished. No test results found.
Details
Windows_NT x86 Release Legacy Backend Build and Test Build finished. No test results found.
Details
@leecow leecow removed the 2 - In Progress label Apr 27, 2016
@MJomaa
Copy link

@MJomaa MJomaa commented May 14, 2016

Uhmm a stupid question, but how can I use that within C#?

Using ASP.NET Core RC1 I realized that process.Kill() doesn't always kill the process so a SIGTERM would be better in this case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

9 participants
You can’t perform that action at this time.