exit notebook cleanly on SIGINT, SIGTERM #1609


@minrk
Owner

makes it a bit more likely security files, etc. will be cleaned up.

We already handled SIGINT, but now SIGTERM results in clean exit as well.

Closes #1601
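As a rough illustration of the idea (not the code in this PR), a single handler can be registered for both signals so that the event loop stops and any cleanup after it still runs. `StoppableLoop` and `install_stop_handlers` are hypothetical names invented for this sketch, and it is written for current Python rather than the Python 2 of the PR:

```python
import signal
import time


class StoppableLoop:
    """Hypothetical stand-in for the server's event loop."""

    def __init__(self):
        self.running = False
        self.stop_signal = None

    def start(self):
        # Spin until some signal handler calls stop().
        self.running = True
        while self.running:
            time.sleep(0.01)

    def stop(self):
        self.running = False


def install_stop_handlers(loop):
    """Make SIGINT and SIGTERM stop the loop instead of killing the process.

    Must be called from the main thread, because signal.signal requires it.
    """
    def _stop(sig, frame):
        loop.stop_signal = sig
        loop.stop()

    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, _stop)
```

Once `loop.start()` returns, the caller can remove security files and shut down kernels before exiting.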

@minrk minrk exit notebook cleanly on SIGINT, SIGTERM
makes it a bit more likely security files, etc. will be cleaned up.
b2135e7
@fperez
Owner

Actually, I've been wondering: given that the only way to stop the notebook server right now is Ctrl-C, shouldn't we at least do an 'are you sure you want to exit? [Y/n]' kind of thing on SIGINT? Since accidentally killing the nb server can kill multiple computations for a user, a minimal check might not be a bad idea. Thoughts?

@minrk
Owner

That makes good sense, though there is one problem: asking y/n blocks, so while it is waiting for the response, the notebook server is actually unresponsive. If you do kill -INT <nbserver>, you wouldn't cause a shutdown, but you would render the notebook useless, unless you can get back to the shell where the notebook was started (may not exist anymore). Any idea to get around that?

@fperez
Owner

Mmh, we could put the asker in a thread that dies with a timeout of say 5 seconds. The user can then either reply, or send a second SIGINT to force a kill, but if 5 seconds go by, the server will assume it was an accidental SIGINT and resume. How does that sound?

@minrk
Owner

that could work. So the SIGINT handler would do:

  1. register secondary SIGINT handler which causes immediate exit
  2. spawns thread with ask y/n
    • if y: exit immediately
    • if n or timeout: restore original ask y/n handler

yes?

I forget how to do timeouts / interrupt a thread.
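The plan above amounts to a small two-state machine. A minimal sketch of just that decision logic, with no real signals or threads involved, might look like the following. `InterruptPolicy` and its method names are hypothetical and are not part of IPython:

```python
class InterruptPolicy:
    """Hypothetical model of the two-stage SIGINT handling described above."""

    def __init__(self):
        # True while the y/n prompt is waiting for an answer.
        self.awaiting_answer = False

    def on_sigint(self):
        """Return the action to take for an incoming SIGINT."""
        if self.awaiting_answer:
            # Second ^C while the prompt is up: exit immediately.
            return "exit"
        self.awaiting_answer = True
        return "ask"

    def on_answer(self, answer):
        """Handle the prompt result; `answer` is None on timeout."""
        # In every case the prompt is over, so go back to the "ask" state.
        self.awaiting_answer = False
        if answer is not None and answer.strip().lower().startswith("y"):
            return "exit"
        return "resume"
```

A timeout and an explicit "n" are treated the same way here: both restore the original behaviour, so the next ^C asks again.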

@fperez
Owner

Yup. I don't remember the timeout api either, let me know if you want me to tackle this one and I'll look into it. Otherwise, I'll continue writing the new magics stuff (I finally got to it after my blitz on PRs and issues :)

@minrk
Owner

By all means, work on magics. I'll peek at this one, and see if I can't get close enough.

@fperez
Owner

Got it, thx. I'm on IRC too if you need a quick ping on anything.

@minrk
Owner

confirmation dialog added on SIGINT.

  • confirmation happens in background thread, to avoid blocking app thread
  • confirmation times out after 5s before resuming
  • ^C^C counts as confirmation
@fperez
Owner

This looks great, glad you found a non-thread solution! (and I learned how to do that with select today :) Let me know if it was a typo and we'll wrap it up. Awesome job.

@minrk
Owner

Yes, sorry - the 2 was because I didn't want to wait 5 seconds while testing. It should be back to 5 now.

It is a threaded solution, but by using select instead of raw_input, I could have a timeout without having to do any interrupting or communication between threads - it just always runs to completion, in at most 5s.
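For readers unfamiliar with the trick, here is a minimal sketch of reading a line with a timeout via `select`, written for current Python. The helper name `read_line_with_timeout` is made up for this example, and it assumes a Unix-like system, since `select` on Windows only accepts sockets:

```python
import select


def read_line_with_timeout(stream, timeout):
    """Return one line from `stream`, or None if no input arrives in time.

    `stream` is any object with fileno() and readline(), such as sys.stdin.
    """
    # select blocks for at most `timeout` seconds and reports which of the
    # given streams have data ready to read.
    readable, _, _ = select.select([stream], [], [], timeout)
    if not readable:
        return None
    return stream.readline()
```

Calling `read_line_with_timeout(sys.stdin, 5)` from the background thread would give the behaviour described: the call returns on its own within 5 seconds, so nothing has to interrupt the thread.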

@fperez
Owner

Thanks for the quick fix. Never mind the thread comment, I didn't see the threading code further up :) Code looks good, but on testing I got this:

[NotebookApp] Use Control-C to stop this server and shut down all kernels.
^CShutdown Notebook Server (y/[n])? WARNING:root:Interrupted system call
Traceback (most recent call last):
  File "/home/fperez/usr/opt/lib/python2.7/site-packages/zmq/eventloop/ioloop.py", line 293, in start
    event_pairs = self._impl.poll(1000*poll_timeout)
  File "poll.pyx", line 189, in zmq.core.poll.Poller.poll (zmq/core/poll.c:2215)
  File "poll.pyx", line 101, in zmq.core.poll._poll (zmq/core/poll.c:1460)
ZMQError: Interrupted system call
no answer for 5s resuming

Any idea why? It seems to work fine, but the ZMQ errors on-screen are a bit nasty.

ps - I'd adjust the message to: "No answer for 5s: resuming operation...", a bit more readable.

@minrk minrk confirm notebook shutdown on SIGINT
confirmation in bg thread, to avoid blocking
5s timeout before restoring original state if no response

^C^C == confirmation
988b044
@minrk
Owner

Thanks, message updated.

The interrupt message you mentioned is inherited from tornado, which logged interrupts. I discovered that it's actually worse than you mentioned, as ^C actually goes unhandled, and the IOLoop raises if you have pyzmq ≤ 2.1.7, invoking the crash-handler. This one I can fix, though the prompt never shows up, and IPython just exits on ^C with old pyzmq.

In 2.1.9, you will get the message as you see it, but Tornado changed their behavior to be quiet about it (and pyzmq followed suit), so there is no such message in 2.1.11 and above.

If we want to suppress this message, we would have to silence the root logger used by tornado/the IOLoop, or require/recommend pyzmq ≥ 2.1.11.

@minrk
Owner

Another alternative could be to only enable this confirmation dialog when using pyzmq ≥ 2.1.11, where it works without any ugliness.

@fperez
Owner

Nah, I'd say just apply the fix you can to prevent the crash handler from kicking in, and we'll leave it at that. The message is a bit scary-looking, but things actually work fine in practice, and as people gradually update, the relevance of this will quickly die down, so let's not bog down our own code with unnecessary special-casing.

@minrk
Owner

hm, the log message tends to print after the prompt, so it's pretty bad on 2.1.9. Should I add a tiny delay to avoid this, or make 2.1.11 the minimum version for the confirmation?

@fperez
Owner

Since this is going for human response, a small delay is fine IMO.

@minrk minrk handle old pyzmq in notebook exit confirmation
* 2.1.7 (earliest dep) will simply not work, so skip it
* 2.1.9-10 log the interrupts, so add a short delay to ensure the log
  messages don't come after the prompt.
65ec49c
@minrk
Owner

okay, short delay added - 2.1.7 will get no dialog, 2.1.9-10 will get the log messages, and 2.1.11 will be peachy.
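The version gating summarized above can be sketched as a pure function on the version string, using the same digit-extraction approach as the diff in this PR. The function names and the returned labels are hypothetical and only serve to illustrate the three cases:

```python
import re


def version_tuple(version_string):
    """Turn a string such as '2.1.11' or '2.1.10dev' into a comparable tuple."""
    parts = [int(n) for n in re.findall(r"\d+", version_string)]
    if "dev" in version_string:
        # Sort development builds after the release they are based on.
        parts.append(999)
    return tuple(parts)


def confirmation_mode(version_string):
    """Classify pyzmq versions by how the SIGINT confirmation behaves."""
    v = version_tuple(version_string)
    if v < (2, 1, 9):
        return "no dialog"
    if v < (2, 1, 11):
        return "dialog, with logged interrupts"
    return "dialog"
```

Comparing tuples rather than strings matters here: as strings, '2.1.9' would sort after '2.1.11'.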

@fperez
Owner

Great, merging now (had been reviewed and tested).

@fperez fperez merged commit 4f9b234 into ipython:master
@stevengj stevengj referenced this pull request in JuliaLang/IJulia.jl
Open

killing notebook is unreliable #270

Commits on Apr 15, 2012
  1. @minrk

    exit notebook cleanly on SIGINT, SIGTERM

    minrk authored
    makes it a bit more likely security files, etc. will be cleaned up.
Commits on Apr 16, 2012
  1. @minrk

    confirm notebook shutdown on SIGINT

    minrk authored
    confirmation in bg thread, to avoid blocking
    5s timeout before restoring original state if no response
    
    ^C^C == confirmation
  2. @minrk

    handle old pyzmq in notebook exit confirmation

    minrk authored
    * 2.1.7 (earliest dep) will simply not work, so skip it
    * 2.1.9-10 log the interrupts, so add a short delay to ensure the log
      messages don't come after the prompt.
Showing with 65 additions and 0 deletions.
  1. +65 −0 IPython/frontend/html/notebook/notebookapp.py
65 IPython/frontend/html/notebook/notebookapp.py
@@ -20,10 +20,13 @@
import errno
import logging
import os
+import re
+import select
import signal
import socket
import sys
import threading
+import time
import webbrowser
# Third party
@@ -449,11 +452,73 @@ def init_webapp(self):
self.port = port
break

+    def init_signal(self):
+        # FIXME: remove this check when pyzmq dependency is >= 2.1.11
+        # safely extract zmq version info:
+        try:
+            zmq_v = zmq.pyzmq_version_info()
+        except AttributeError:
+            zmq_v = [ int(n) for n in re.findall(r'\d+', zmq.__version__) ]
+            if 'dev' in zmq.__version__:
+                zmq_v.append(999)
+            zmq_v = tuple(zmq_v)
+        if zmq_v >= (2,1,9):
+            # This won't work with 2.1.7 and
+            # 2.1.9-10 will log ugly 'Interrupted system call' messages,
+            # but it will work
+            signal.signal(signal.SIGINT, self._handle_sigint)
+        signal.signal(signal.SIGTERM, self._signal_stop)
+
+    def _handle_sigint(self, sig, frame):
+        """SIGINT handler spawns confirmation dialog"""
+        # register more forceful signal handler for ^C^C case
+        signal.signal(signal.SIGINT, self._signal_stop)
+        # request confirmation dialog in bg thread, to avoid
+        # blocking the App
+        thread = threading.Thread(target=self._confirm_exit)
+        thread.daemon = True
+        thread.start()
+
+    def _restore_sigint_handler(self):
+        """callback for restoring original SIGINT handler"""
+        signal.signal(signal.SIGINT, self._handle_sigint)
+
+    def _confirm_exit(self):
+        """confirm shutdown on ^C
+
+        A second ^C, or answering 'y' within 5s will cause shutdown,
+        otherwise original SIGINT handler will be restored.
+        """
+        # FIXME: remove this delay when pyzmq dependency is >= 2.1.11
+        time.sleep(0.1)
+        sys.stdout.write("Shutdown Notebook Server (y/[n])? ")
+        sys.stdout.flush()
+        r,w,x = select.select([sys.stdin], [], [], 5)
+        if r:
+            line = sys.stdin.readline()
+            if line.lower().startswith('y'):
+                self.log.critical("Shutdown confirmed")
+                ioloop.IOLoop.instance().stop()
+                return
+        else:
+            print "No answer for 5s:",
+        print "resuming operation..."
+        # no answer, or answer is no:
+        # set it back to original SIGINT handler
+        # use IOLoop.add_callback because signal.signal must be called
+        # from main thread
+        ioloop.IOLoop.instance().add_callback(self._restore_sigint_handler)
+
+    def _signal_stop(self, sig, frame):
+        self.log.critical("received signal %s, stopping", sig)
+        ioloop.IOLoop.instance().stop()
+
     @catch_config_error
     def initialize(self, argv=None):
         super(NotebookApp, self).initialize(argv)
         self.init_configurables()
         self.init_webapp()
+        self.init_signal()

     def cleanup_kernels(self):
         """shutdown all kernels