Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exit notebook cleanly on SIGINT, SIGTERM #1609

Merged
merged 3 commits into from Apr 16, 2012
Merged

Conversation

minrk
Copy link
Member

@minrk minrk commented Apr 15, 2012

makes it a bit more likely security files, etc. will be cleaned up.

We already handled SIGINT, but now SIGTERM results in clean exit as well.

Closes #1601

makes it a bit more likely security files, etc. will be cleaned up.
@fperez
Copy link
Member

fperez commented Apr 16, 2012

Actually, I've been wondering, given that currently the only way to stop the notebook server is to use Ctrl-C, if we shouldn't at least do a 'are you sure you wan to exit? [Y/n]' kind of thing on SIGINT... Since accidentally killing the nb server right now can potentially kill multiple computations for a user, a minimal check might not be a bad idea. Thoughts?

@minrk
Copy link
Member Author

minrk commented Apr 16, 2012

That makes good sense, though there is one problem: asking y/n blocks, so while it is waiting for the response, the notebook server is actually unresponsive. If you do kill -INT <nbserver>, you wouldn't cause a shutdown, but you would render the notebook useless, unless you can get back to the shell where the notebook was started (may not exist anymore). Any idea to get around that?

@fperez
Copy link
Member

fperez commented Apr 16, 2012

Mmh, we could put the asker in a thread that dies with a timeout of say 5 seconds. The user can then either reply, or send a second SIGINT to force a kill, but if 5 seconds go by, the server will assume it was an accidental SIGINT and resume. How does that sound?

@minrk
Copy link
Member Author

minrk commented Apr 16, 2012

that could work. So the SIGINT handler would do:

  1. register secondary SIGINT handler which causes immediate exit
  2. spawns thread with ask y/n
    • if y: exit immediately
    • if n or timeout: restore original ask y/n handler

yes?

I forget how to do timeouts / interrupt a thread.

@fperez
Copy link
Member

fperez commented Apr 16, 2012

Yup. I don't remember the timeout api either, let me know if you want me to tackle this one and I'll look into it. Otherwise, I'll continue writing the new magics stuff (I finally got to it after my blitz on PRs and issues :)

@minrk
Copy link
Member Author

minrk commented Apr 16, 2012

By all means, work on magics. I'll peek at this one, and see if I can't get close enough.

@fperez
Copy link
Member

fperez commented Apr 16, 2012

Got it, thx. I'm on IRC too if you need a quick ping on anything.

@minrk
Copy link
Member Author

minrk commented Apr 16, 2012

confirmation dialog added on SIGINT.

  • confirmation happens in background thread, to avoid blocking app thread
  • confirmation times out after 5s before resuming
  • ^C^C counts as confirmation

@fperez
Copy link
Member

fperez commented Apr 16, 2012

This looks great, glad you found a non-thread solution! (and I learned how to do that with select today :) Let me know if it was a typo and we'll wrap it up. Awesome job.

@minrk
Copy link
Member Author

minrk commented Apr 16, 2012

Yes, sorry - the 2 was because I didn't want to wait 5 seconds while testing. It should be back to 5 now.

It is a threaded solution, but by using select instead of raw_input, I could have a timeout without having to do any interrupting or communication between threads - it just always runs to completion, in at most 5s.

@fperez
Copy link
Member

fperez commented Apr 16, 2012

Thanks for the quick fix. Never mind the thread comment, I didn't see the threading code further up :) Code looks good, but on testing I got this:

[NotebookApp] Use Control-C to stop this server and shut down all kernels.
^CShutdown Notebook Server (y/[n])? WARNING:root:Interrupted system call
Traceback (most recent call last):
  File "/home/fperez/usr/opt/lib/python2.7/site-packages/zmq/eventloop/ioloop.py", line 293, in start
    event_pairs = self._impl.poll(1000*poll_timeout)
  File "poll.pyx", line 189, in zmq.core.poll.Poller.poll (zmq/core/poll.c:2215)
  File "poll.pyx", line 101, in zmq.core.poll._poll (zmq/core/poll.c:1460)
ZMQError: Interrupted system call
no answer for 5s resuming

Any idea why? It seems to work fine, but the ZMQ errors on-screen are a bit nasty.

ps - I'd adjust the message to: "No answer for 5s: resuming operation...", a bit more readable.

confirmation in bg thread, to avoid blocking
5s timeout before restoring original state if no response

^C^C == confirmation
@minrk
Copy link
Member Author

minrk commented Apr 16, 2012

Thanks, message updated.

The interrupt message you mentioned is inherited from tornado, which logged interrupts. I discovered that it's actually worse than you mentioned, as ^C actually goes unhandled, and the IOLoop raises if you have pyzmq ≤ 2.1.7, invoking the crash-handler. This one I can fix, though the prompt never shows up, and IPython just exits on ^C with old pyzmq.

In 2.1.9, you will get the message as you see it, but Tornado changed their behavior to be quiet about it (and pyzmq followed suit), so there is no such message in 2.1.11 and above.

If we want to suppress this message, we would have to silence the root logger used by tornado/the IOLoop, or require/recommend pyzmq ≥ 2.1.11.

@minrk
Copy link
Member Author

minrk commented Apr 16, 2012

Another alternative could be to only enable this confirmation dialog when using pyzmq ≥ 2.1.11, where it works without any ugliness.

@fperez
Copy link
Member

fperez commented Apr 16, 2012

Nah, I'd say just apply the fix you can to prevent the crash handler from kicking in, and we'll leave it at that. The message is a bit scary looking but things actually work fine in practice, and as people gradually update the relevance of this will quickly die down, so let's not bog our own code with unnecessary special-casing.

@minrk
Copy link
Member Author

minrk commented Apr 16, 2012

hm, the log message tends to print after the prompt, so it's pretty bad on 2.1.9. Should I add a tiny delay to avoid this, or make 2.1.11 the minimum version for the confirmation?

@fperez
Copy link
Member

fperez commented Apr 16, 2012

Since this is going for human response, a small delay is fine IMO.

* 2.1.7 (earliest dep) will simply not work, so skip it
* 2.1.9-10 log the interrupts, so add a short delay to ensure the log
  messages don't come after the prompt.
@minrk
Copy link
Member Author

minrk commented Apr 16, 2012

okay, short delay added - 2.1.7 will get no dialog, 2.1.9-10 will get the log messages, and 2.1.11 will be peachy.

@fperez
Copy link
Member

fperez commented Apr 16, 2012

Great, merging now (had been reviewed and tested).

fperez added a commit that referenced this pull request Apr 16, 2012
exit notebook cleanly on SIGINT, SIGTERM, and add a safety (text) dialog to ask for exit confirmation.  This prevents an accidental Ctrl-C in the wrong window from destroying a user's potentially large set of notebooks that could have taken a long time to create.

Closes #1601
@fperez fperez merged commit 4f9b234 into ipython:master Apr 16, 2012
mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014
exit notebook cleanly on SIGINT, SIGTERM, and add a safety (text) dialog to ask for exit confirmation.  This prevents an accidental Ctrl-C in the wrong window from destroying a user's potentially large set of notebooks that could have taken a long time to create.

Closes ipython#1601
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Security file is not removed after shutdown by Ctrl+C or kill -INT
2 participants