Skip to content

exit notebook cleanly on SIGINT, SIGTERM #1609

Merged
merged 3 commits into from Apr 16, 2012

2 participants

@minrk
IPython member
minrk commented Apr 15, 2012

makes it a bit more likely security files, etc. will be cleaned up.

We already handled SIGINT, but now SIGTERM results in clean exit as well.

Closes #1601

@minrk minrk exit notebook cleanly on SIGINT, SIGTERM
makes it a bit more likely security files, etc. will be cleaned up.
b2135e7
@fperez
IPython member
fperez commented Apr 16, 2012

Actually, I've been wondering, given that currently the only way to stop the notebook server is to use Ctrl-C, if we shouldn't at least do a 'are you sure you wan to exit? [Y/n]' kind of thing on SIGINT... Since accidentally killing the nb server right now can potentially kill multiple computations for a user, a minimal check might not be a bad idea. Thoughts?

@minrk
IPython member
minrk commented Apr 16, 2012

That makes good sense, though there is one problem: asking y/n blocks, so while it is waiting for the response, the notebook server is actually unresponsive. If you do kill -INT <nbserver>, you wouldn't cause a shutdown, but you would render the notebook useless, unless you can get back to the shell where the notebook was started (may not exist anymore). Any idea to get around that?

@fperez
IPython member
fperez commented Apr 16, 2012

Mmh, we could put the asker in a thread that dies with a timeout of say 5 seconds. The user can then either reply, or send a second SIGINT to force a kill, but if 5 seconds go by, the server will assume it was an accidental SIGINT and resume. How does that sound?

@minrk
IPython member
minrk commented Apr 16, 2012

that could work. So the SIGINT handler would do:

  1. register secondary SIGINT handler which causes immediate exit
  2. spawns thread with ask y/n
    • if y: exit immediately
    • if n or timeout: restore original ask y/n handler

yes?

I forget how to do timeouts / interrupt a thread.

@fperez
IPython member
fperez commented Apr 16, 2012

Yup. I don't remember the timeout api either, let me know if you want me to tackle this one and I'll look into it. Otherwise, I'll continue writing the new magics stuff (I finally got to it after my blitz on PRs and issues :)

@minrk
IPython member
minrk commented Apr 16, 2012

By all means, work on magics. I'll peek at this one, and see if I can't get close enough.

@fperez
IPython member
fperez commented Apr 16, 2012

Got it, thx. I'm on IRC too if you need a quick ping on anything.

@minrk
IPython member
minrk commented Apr 16, 2012

confirmation dialog added on SIGINT.

  • confirmation happens in background thread, to avoid blocking app thread
  • confirmation times out after 5s before resuming
  • ^C^C counts as confirmation
@fperez
IPython member
fperez commented Apr 16, 2012

This looks great, glad you found a non-thread solution! (and I learned how to do that with select today :) Let me know if it was a typo and we'll wrap it up. Awesome job.

@minrk
IPython member
minrk commented Apr 16, 2012

Yes, sorry - the 2 was because I didn't want to wait 5 seconds while testing. It should be back to 5 now.

It is a threaded solution, but by using select instead of raw_input, I could have a timeout without having to do any interrupting or communication between threads - it just always runs to completion, in at most 5s.

@fperez
IPython member
fperez commented Apr 16, 2012

Thanks for the quick fix. Never mind the thread comment, I didn't see the threading code further up :) Code looks good, but on testing I got this:

[NotebookApp] Use Control-C to stop this server and shut down all kernels.
^CShutdown Notebook Server (y/[n])? WARNING:root:Interrupted system call
Traceback (most recent call last):
  File "/home/fperez/usr/opt/lib/python2.7/site-packages/zmq/eventloop/ioloop.py", line 293, in start
    event_pairs = self._impl.poll(1000*poll_timeout)
  File "poll.pyx", line 189, in zmq.core.poll.Poller.poll (zmq/core/poll.c:2215)
  File "poll.pyx", line 101, in zmq.core.poll._poll (zmq/core/poll.c:1460)
ZMQError: Interrupted system call
no answer for 5s resuming

Any idea why? It seems to work fine, but the ZMQ errors on-screen are a bit nasty.

ps - I'd adjust the message to: "No answer for 5s: resuming operation...", a bit more readable.

@minrk minrk confirm notebook shutdown on SIGINT
confirmation in bg thread, to avoid blocking
5s timeout before restoring original state if no response

^C^C == confirmation
988b044
@minrk
IPython member
minrk commented Apr 16, 2012

Thanks, message updated.

The interrupt message you mentioned is inherited from tornado, which logged interrupts. I discovered that it's actually worse than you mentioned, as ^C actually goes unhandled, and the IOLoop raises if you have pyzmq ≤ 2.1.7, invoking the crash-handler. This one I can fix, though the prompt never shows up, and IPython just exits on ^C with old pyzmq.

In 2.1.9, you will get the message as you see it, but Tornado changed their behavior to be quiet about it (and pyzmq followed suit), so there is no such message in 2.1.11 and above.

If we want to suppress this message, we would have to silence the root logger used by tornado/the IOLoop, or require/recommend pyzmq ≥ 2.1.11.

@minrk
IPython member
minrk commented Apr 16, 2012

Another alternative could be to only enable this confirmation dialog when using pyzmq ≥ 2.1.11, where it works without any ugliness.

@fperez
IPython member
fperez commented Apr 16, 2012

Nah, I'd say just apply the fix you can to prevent the crash handler from kicking in, and we'll leave it at that. The message is a bit scary looking but things actually work fine in practice, and as people gradually update the relevance of this will quickly die down, so let's not bog our own code with unnecessary special-casing.

@minrk
IPython member
minrk commented Apr 16, 2012

hm, the log message tends to print after the prompt, so it's pretty bad on 2.1.9. Should I add a tiny delay to avoid this, or make 2.1.11 the minimum version for the confirmation?

@fperez
IPython member
fperez commented Apr 16, 2012

Since this is going for human response, a small delay is fine IMO.

@minrk minrk handle old pyzmq in notebook exit confirmation
* 2.1.7 (earliest dep) will simply not work, so skip it
* 2.1.9-10 log the interrupts, so add a short delay to ensure the log
  messages don't come after the prompt.
65ec49c
@minrk
IPython member
minrk commented Apr 16, 2012

okay, short delay added - 2.1.7 will get no dialog, 2.1.9-10 will get the log messages, and 2.1.11 will be peachy.

@fperez
IPython member
fperez commented Apr 16, 2012

Great, merging now (had been reviewed and tested).

@fperez fperez merged commit 4f9b234 into ipython:master Apr 16, 2012
@stevengj stevengj referenced this pull request in JuliaLang/IJulia.jl Mar 3, 2015
Closed

killing notebook is unreliable #270

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.