exit notebook cleanly on SIGINT, SIGTERM #1609

Merged
merged 3 commits into from about 2 years ago

2 participants

Min RK Fernando Perez
Min RK
Owner
minrk commented April 15, 2012

makes it a bit more likely security files, etc. will be cleaned up.

We already handled SIGINT, but now SIGTERM results in clean exit as well.

Closes #1601

Min RK exit notebook cleanly on SIGINT, SIGTERM
makes it a bit more likely security files, etc. will be cleaned up.
b2135e7
Fernando Perez
Owner

Actually, I've been wondering, given that currently the only way to stop the notebook server is to use Ctrl-C, if we shouldn't at least do an 'are you sure you want to exit? [Y/n]' kind of thing on SIGINT... Since accidentally killing the nb server right now can potentially kill multiple computations for a user, a minimal check might not be a bad idea. Thoughts?

Min RK
Owner
minrk commented April 15, 2012

That makes good sense, though there is one problem: asking y/n blocks, so while it is waiting for the response, the notebook server is actually unresponsive. If you do kill -INT <nbserver>, you wouldn't cause a shutdown, but you would render the notebook useless, unless you can get back to the shell where the notebook was started (may not exist anymore). Any idea to get around that?

Fernando Perez
Owner

Mmh, we could put the asker in a thread that dies with a timeout of say 5 seconds. The user can then either reply, or send a second SIGINT to force a kill, but if 5 seconds go by, the server will assume it was an accidental SIGINT and resume. How does that sound?

Min RK
Owner
minrk commented April 15, 2012

that could work. So the SIGINT handler would do:

  1. register a secondary SIGINT handler which causes immediate exit
  2. spawn a thread that asks y/n
    • if y: exit immediately
    • if n or timeout: restore the original (ask y/n) SIGINT handler

yes?

I forget how to do timeouts / interrupt a thread.
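
The plan above can be sketched without touching real signals. This is a minimal illustration in Python 3, not the code from this PR: `SigintConfirmer`, `ask`, and `stop` are hypothetical names, and `ask` is assumed to return True for "y" and False for "n" or a timeout.

```python
import threading


class SigintConfirmer:
    """Hypothetical model of the two-stage SIGINT plan discussed above."""

    def __init__(self, ask, stop):
        self.ask = ask    # assumed: returns True for 'y', False for 'n' or timeout
        self.stop = stop  # assumed: shuts the server down
        self.handler = self.first_interrupt
        self.thread = None

    def first_interrupt(self):
        # 1. From now on, another ^C means immediate exit.
        self.handler = self.second_interrupt
        # 2. Ask in a background thread so the caller is not blocked.
        self.thread = threading.Thread(target=self._confirm, daemon=True)
        self.thread.start()

    def second_interrupt(self):
        self.stop()

    def _confirm(self):
        if self.ask():
            self.stop()
        else:
            # 'n' or timeout: go back to asking on the next ^C.
            self.handler = self.first_interrupt
```

With real signals one would install a single handler such as `signal.signal(signal.SIGINT, lambda sig, frame: confirmer.handler())` from the main thread. Swapping an attribute, as done here, sidesteps the restriction that `signal.signal` may only be called from the main thread.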

Fernando Perez
Owner

Yup. I don't remember the timeout api either, let me know if you want me to tackle this one and I'll look into it. Otherwise, I'll continue writing the new magics stuff (I finally got to it after my blitz on PRs and issues :)

Min RK
Owner
minrk commented April 15, 2012

By all means, work on magics. I'll peek at this one, and see if I can't get close enough.

Fernando Perez
Owner

Got it, thx. I'm on IRC too if you need a quick ping on anything.

Min RK
Owner
minrk commented April 15, 2012

confirmation dialog added on SIGINT.

  • confirmation happens in background thread, to avoid blocking app thread
  • confirmation times out after 5s before resuming
  • ^C^C counts as confirmation
Fernando Perez
Owner

This looks great, glad you found a non-thread solution! (and I learned how to do that with select today :) Let me know if it was a typo and we'll wrap it up. Awesome job.

Min RK
Owner
minrk commented April 15, 2012

Yes, sorry - the 2 was because I didn't want to wait 5 seconds while testing. It should be back to 5 now.

It is a threaded solution, but by using select instead of raw_input, I could have a timeout without having to do any interrupting or communication between threads - it just always runs to completion, in at most 5s.
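
The select-based timeout described above can be shown in isolation. This is a rough Python 3 sketch; `read_line_with_timeout` is a hypothetical helper name, not something from the PR. It assumes a platform where `select.select` accepts file objects such as stdin, which holds on Unix but not on Windows.

```python
import select
import sys


def read_line_with_timeout(stream=sys.stdin, timeout=5.0):
    """Return one line from `stream`, or None if nothing arrives in time."""
    # select blocks for at most `timeout` seconds, so the calling thread
    # always finishes on its own; nobody has to interrupt it.
    readable, _, _ = select.select([stream], [], [], timeout)
    if not readable:
        return None
    return stream.readline()
```

A caller would treat `None` as "no answer" and any line starting with `y` as confirmation.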

Fernando Perez
Owner

Thanks for the quick fix. Never mind the thread comment, I didn't see the threading code further up :) Code looks good, but on testing I got this:

[NotebookApp] Use Control-C to stop this server and shut down all kernels.
^CShutdown Notebook Server (y/[n])? WARNING:root:Interrupted system call
Traceback (most recent call last):
  File "/home/fperez/usr/opt/lib/python2.7/site-packages/zmq/eventloop/ioloop.py", line 293, in start
    event_pairs = self._impl.poll(1000*poll_timeout)
  File "poll.pyx", line 189, in zmq.core.poll.Poller.poll (zmq/core/poll.c:2215)
  File "poll.pyx", line 101, in zmq.core.poll._poll (zmq/core/poll.c:1460)
ZMQError: Interrupted system call
no answer for 5s resuming

Any idea why? It seems to work fine, but the ZMQ errors on-screen are a bit nasty.

ps - I'd adjust the message to: "No answer for 5s: resuming operation...", a bit more readable.

Min RK confirm notebook shutdown on SIGINT
confirmation in bg thread, to avoid blocking
5s timeout before restoring original state if no response

^C^C == confirmation
988b044
Min RK
Owner
minrk commented April 15, 2012

Thanks, message updated.

The interrupt message you mentioned is inherited from tornado, which logged interrupts. I discovered that it's actually worse than you mentioned, as ^C actually goes unhandled, and the IOLoop raises if you have pyzmq ≤ 2.1.7, invoking the crash-handler. This one I can fix, though the prompt never shows up, and IPython just exits on ^C with old pyzmq.

In 2.1.9, you will get the message as you see it, but Tornado changed their behavior to be quiet about it (and pyzmq followed suit), so there is no such message in 2.1.11 and above.

If we want to suppress this message, we would have to silence the root logger used by tornado/the IOLoop, or require/recommend pyzmq ≥ 2.1.11.

Min RK
Owner
minrk commented April 15, 2012

Another alternative could be to only enable this confirmation dialog when using pyzmq ≥ 2.1.11, where it works without any ugliness.

Fernando Perez
Owner

Nah, I'd say just apply the fix you can to prevent the crash handler from kicking in, and we'll leave it at that. The message is a bit scary looking but things actually work fine in practice, and as people gradually update, the relevance of this will quickly die down, so let's not bog down our own code with unnecessary special-casing.

Min RK
Owner
minrk commented April 16, 2012

hm, the log message tends to print after the prompt, so it's pretty bad on 2.1.9. Should I add a tiny delay to avoid this, or make 2.1.11 the minimum version for the confirmation?

Fernando Perez
Owner

Since this is going for human response, a small delay is fine IMO.

Min RK handle old pyzmq in notebook exit confirmation
* 2.1.7 (earliest dep) will simply not work, so skip it
* 2.1.9-10 log the interrupts, so add a short delay to ensure the log
  messages don't come after the prompt.
65ec49c
Min RK
Owner
minrk commented April 16, 2012

okay, short delay added - 2.1.7 will get no dialog, 2.1.9-10 will get the log messages, and 2.1.11 will be peachy.
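
The version gating summarised above can be written as a small table-like function. This is a sketch in Python 3: `parse_pyzmq_version` mirrors the regex fallback used in the diff, while `sigint_behaviour` and its return strings are hypothetical labels for the three cases described in this thread.

```python
import re


def parse_pyzmq_version(version_string):
    """Turn a version string such as '2.1.11' into a comparable tuple."""
    parts = [int(n) for n in re.findall(r"\d+", version_string)]
    if "dev" in version_string:
        # Treat development builds as newer than any release of that series.
        parts.append(999)
    return tuple(parts)


def sigint_behaviour(version):
    """Hypothetical summary of what each pyzmq range gets on ^C."""
    if version < (2, 1, 9):
        return "no dialog"
    if version < (2, 1, 11):
        return "dialog with logged interrupts"
    return "clean dialog"
```

For instance, `sigint_behaviour(parse_pyzmq_version("2.1.10"))` falls into the middle case, where the short delay keeps the log message from landing after the prompt.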

Fernando Perez
Owner

Great, merging now (had been reviewed and tested).

Fernando Perez fperez merged commit 4f9b234 into from April 16, 2012
Fernando Perez fperez closed this April 16, 2012

Showing 3 unique commits by 1 author.

Apr 15, 2012
Min RK exit notebook cleanly on SIGINT, SIGTERM
makes it a bit more likely security files, etc. will be cleaned up.
b2135e7
Min RK confirm notebook shutdown on SIGINT
confirmation in bg thread, to avoid blocking
5s timeout before restoring original state if no response

^C^C == confirmation
988b044
Apr 16, 2012
Min RK handle old pyzmq in notebook exit confirmation
* 2.1.7 (earliest dep) will simply not work, so skip it
* 2.1.9-10 log the interrupts, so add a short delay to ensure the log
  messages don't come after the prompt.
65ec49c
65  IPython/frontend/html/notebook/notebookapp.py
@@ -20,10 +20,13 @@
 import errno
 import logging
 import os
+import re
+import select
 import signal
 import socket
 import sys
 import threading
+import time
 import webbrowser
 
 # Third party
@@ -449,11 +452,73 @@ def init_webapp(self):
                 self.port = port
                 break
     
+    def init_signal(self):
+        # FIXME: remove this check when pyzmq dependency is >= 2.1.11
+        # safely extract zmq version info:
+        try:
+            zmq_v = zmq.pyzmq_version_info()
+        except AttributeError:
+            zmq_v = [ int(n) for n in re.findall(r'\d+', zmq.__version__) ]
+            if 'dev' in zmq.__version__:
+                zmq_v.append(999)
+            zmq_v = tuple(zmq_v)
+        if zmq_v >= (2,1,9):
+            # This won't work with 2.1.7 and
+            # 2.1.9-10 will log ugly 'Interrupted system call' messages,
+            # but it will work
+            signal.signal(signal.SIGINT, self._handle_sigint)
+        signal.signal(signal.SIGTERM, self._signal_stop)
+    
+    def _handle_sigint(self, sig, frame):
+        """SIGINT handler spawns confirmation dialog"""
+        # register more forceful signal handler for ^C^C case
+        signal.signal(signal.SIGINT, self._signal_stop)
+        # request confirmation dialog in bg thread, to avoid
+        # blocking the App
+        thread = threading.Thread(target=self._confirm_exit)
+        thread.daemon = True
+        thread.start()
+    
+    def _restore_sigint_handler(self):
+        """callback for restoring original SIGINT handler"""
+        signal.signal(signal.SIGINT, self._handle_sigint)
+    
+    def _confirm_exit(self):
+        """confirm shutdown on ^C
+        
+        A second ^C, or answering 'y' within 5s will cause shutdown,
+        otherwise original SIGINT handler will be restored.
+        """
+        # FIXME: remove this delay when pyzmq dependency is >= 2.1.11
+        time.sleep(0.1)
+        sys.stdout.write("Shutdown Notebook Server (y/[n])? ")
+        sys.stdout.flush()
+        r,w,x = select.select([sys.stdin], [], [], 5)
+        if r:
+            line = sys.stdin.readline()
+            if line.lower().startswith('y'):
+                self.log.critical("Shutdown confirmed")
+                ioloop.IOLoop.instance().stop()
+                return
+        else:
+            print "No answer for 5s:",
+        print "resuming operation..."
+        # no answer, or answer is no:
+        # set it back to original SIGINT handler
+        # use IOLoop.add_callback because signal.signal must be called
+        # from main thread
+        ioloop.IOLoop.instance().add_callback(self._restore_sigint_handler)
+    
+    def _signal_stop(self, sig, frame):
+        self.log.critical("received signal %s, stopping", sig)
+        ioloop.IOLoop.instance().stop()
+    
     @catch_config_error
     def initialize(self, argv=None):
         super(NotebookApp, self).initialize(argv)
         self.init_configurables()
         self.init_webapp()
+        self.init_signal()
 
     def cleanup_kernels(self):
         """shutdown all kernels