Skip to content

Refactoring notebook managers and adding Azure backed storage. #2045

Merged
merged 6 commits into from Aug 11, 2012

5 participants

@ellisonbg
IPython member

I have created a base class for all notebook managers. Our existing, file-based store, is now in filenbmanager.py. I have also created a new Azure Blob based backed notebook manager.

@fperez
IPython member
fperez commented Jun 26, 2012

This looks good on first view, but it's obviously 0.14 material, tagging as such.

@ellisonbg
IPython member

Yep, strongly so. One thing that I am running into is that we have hardcoded logic about the notebook_dir everywhere. But that notion only makes sense for the file based notebook manager. I have going to commit some fixes that work around this fact, but we are going to have to rethink this.

@fperez
IPython member
fperez commented Jun 27, 2012

Furthermore, the notebook_dir support is kind of borked now: it refers to the storage dir, but the cwd of the actual kernels is still where the server starts. Ideally it would do both, so one can (in file-based storage) configure the notebook dir in a config profile and simply start things from wherever, knowing that all kernels will come up in the right place.

I'd actually like to see that fixed for 0.13, but I don't see it happening.

@fperez
IPython member
fperez commented Jun 27, 2012

Scratch that! It was #1958, and Min already fixed it in #1951 :) So all clear for 0.13 :)

Gee, we're merging so much code so fast, that I don't even remember issues I opened only a week ago. The problems we have :)

@ellisonbg
IPython member

@fperez do you think this is ready to go in?

@fperez fperez commented on the diff Jul 18, 2012
IPython/frontend/html/notebook/azurenbmanager.py
+"""
+
+#-----------------------------------------------------------------------------
+# Copyright (C) 2012 The IPython Development Team
+#
+# Distributed under the terms of the BSD License. The full license is in
+# the file COPYING, distributed as part of this software.
+#-----------------------------------------------------------------------------
+
+#-----------------------------------------------------------------------------
+# Imports
+#-----------------------------------------------------------------------------
+
+import datetime
+
+import azure
@fperez
IPython member
fperez added a note Jul 18, 2012

Is this guaranteed to be available by default on all azure images, or do users need to manually install the necessary sdk on their images? If so, do we need to update our setup.py to declare this as an optional dependency?

@ellisonbg
IPython member
ellisonbg added a note Jul 18, 2012

Users have to install this by hand and we should add it to our setup.py.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@fperez
IPython member
fperez commented Jul 18, 2012

Overall the code looks clean, but there are a few things that worry me:

  • untested/untestable: since the only way to test this stuff for real is to run it on azure, it's not exactly test suite material. But I am becoming increasingly uncomfortable with the amount of new code we're putting in without proper test coverage. I'm wondering if at least using a mock object that mocks azure would be viable...

  • docs: I think it needs somewhere a bit of mention in the documentation. I know we have very little in the way of docs about the notebook architecture and possibilities, but we need to start somewhere. Might as well be here :)

The former I'm not really sure what to do about, the latter is eminently addressable... Otherwise the code does look great, and I'm thrilled to see it in. Good job!

@ellisonbg
IPython member

I will add some docs for this.

@minrk
IPython member
minrk commented Jul 18, 2012

untested/untestable: since the only way to test this stuff for real is to run it on azure, it's not exactly test suite material. But I am becoming increasingly uncomfortable with the amount of new code we're putting in without proper test coverage. I'm wondering if at least using a mock object that mocks azure would be viable...

This is like the PBS/etc. Launchers, which simply cannot be automatically tested. We can do some amount of testing with dummy objects, but I'm not sure how useful that is. Perhaps what is more useful, if more cumbersome, is manual tests a la docs/examples/tests, to make sure things work. When changing this code, run the test script / step through the test process, etc.

For instance, we managed to cut 0.13 with PBS/SGE support in ipcluster that totally doesn't work, so we need to figure something out.

re: notebook_dir

We use notebook_dir for three technically unrelated values, that are all the same by coincidence right now:

  1. the place where notebooks are stored (only relevant for file-based nbserver)
  2. the place where kernels are started (no reason to force these to be the same)
  3. the path used for serving local files from the notebook area (again, easily decoupled)

Is there more to decouple than that?

@ellisonbg
IPython member
@minrk
IPython member
minrk commented Jul 21, 2012

The main difficulty with testing this is that it requires the Azure
credentials, which we definitely can't version control. The test
script in this case would be a config file that has those credentials
and sets the right notebook manager.

Precisely - just like the launchers, it could only be run by users with an appropriate environment, and cannot be involved in the automatic test suite.

3 is the hardest to decouple. The reason is that eventually the
kernel will be run on a different host than the notebook server.

Is it possible that local files would simply be unavailable in cases where the kernel's filesystem is not visible to the nbserver? That honestly seems like a reasonable answer to me, since this is not a 100% critical feature.

@minrk
IPython member
minrk commented Jul 21, 2012

This one is only waiting for docs, right? And a rebase, now.

@ellisonbg
IPython member
@minrk
IPython member
minrk commented Jul 23, 2012

Is there a reason for the base class to be called BaseNotebookManager rather than NotebookManager? This change would mean that config for 0.13 will be broken in 0.14, that would work perfectly fine otherwise.

Since we have coupled class names to config files, I think we need a pretty high bar for renaming configurable classes.

@minrk minrk commented on an outdated diff Jul 23, 2012
IPython/frontend/html/notebook/filenbmanager.py
- def copy_notebook(self, notebook_id):
- """Copy an existing notebook and return its notebook_id."""
- last_mod, nb = self.get_notebook_object(notebook_id)
- name = nb.metadata.name + '-Copy'
- path, name = self.increment_filename(name)
- nb.metadata.name = name
- notebook_id = self.new_notebook_id(name)
- self.save_notebook_object(notebook_id, nb)
- return notebook_id
+ def log_info(self):
+ self.log.info("Serving notebooks from local directory: %s", self.notebook_manager.notebook_dir)
@minrk
IPython member
minrk added a note Jul 23, 2012

This should be self.notebook_dir, not self.notebook_manager.notebook_dir

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@Carreau Carreau commented on an outdated diff Jul 25, 2012
IPython/frontend/html/notebook/tests/test_nbmanager.py
self.assertEquals(km.notebook_dir, td)
def test_create_nb_dir(self):
with TemporaryDirectory() as td:
nbdir = os.path.join(td, 'notebooks')
- km = NotebookManager(notebook_dir=nbdir)
+ km = FileNotebookManager(notebook_dir=nbdir)
self.assertEquals(km.notebook_dir, nbdir)
@Carreau
IPython member
Carreau added a note Jul 25, 2012

FYI, this is conflicting just because of the 2 above assertEquals that have drop s in 4a6836c ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@benjiec
benjiec commented Aug 8, 2012

I closed pull request #2155 and adjusted my notebook manager to use Brian's work. Thanks Brian.

Noted the following bug in the azurenb branch, causing the default file system based nb manager to fail

diff --git a/IPython/frontend/html/notebook/filenbmanager.py b/IPython/frontend/html/notebook/filenbmanager.py
index 6389ec1..34d7d26 100644
--- a/IPython/frontend/html/notebook/filenbmanager.py
+++ b/IPython/frontend/html/notebook/filenbmanager.py
@@ -193,4 +193,4 @@ class FileNotebookManager(BaseNotebookManager):
         return name

     def log_info(self):
-        self.log.info("Serving notebooks from local directory: %s", self.notebook_manager.notebook_dir)
+        self.log.info("Serving notebooks from local directory: %s", self.notebook_dir)
@benjiec
benjiec commented Aug 8, 2012

Here is another bug fix to the azurenb branch, it was still referring to NotebookManager


diff --git a/IPython/frontend/html/notebook/notebookapp.py b/IPython/frontend/html/notebook/notebookapp.py
index 167965d..f867155 100644
--- a/IPython/frontend/html/notebook/notebookapp.py
+++ b/IPython/frontend/html/notebook/notebookapp.py
@@ -50,7 +50,7 @@ from .handlers import (LoginHandler, LogoutHandler,
     RSTHandler, AuthenticatedFileHandler, PrintNotebookHandler,
     MainClusterHandler, ClusterProfileHandler, ClusterActionHandler
 )
-from .notebookmanager import NotebookManager
+from .basenbmanager import BaseNotebookManager
 from .clustermanager import ClusterManager

 from IPython.config.application import catch_config_error, boolean_flag
@@ -215,7 +215,7 @@ flags['read-only'] = (
 )

 # Add notebook manager flags
-flags.update(boolean_flag('script', 'NotebookManager.save_script',
+flags.update(boolean_flag('script', 'BaseNotebookManager.save_script',
                'Auto-save a .py script everytime the .ipynb notebook is saved',
                'Do not auto-save .py scripts for every notebook'))

@@ -232,7 +232,7 @@ aliases.update({
     'port-retries': 'NotebookApp.port_retries',
     'keyfile': 'NotebookApp.keyfile',
     'certfile': 'NotebookApp.certfile',
-    'notebook-dir': 'NotebookManager.notebook_dir',
+    'notebook-dir': 'BaseNotebookManager.notebook_dir',
     'browser': 'NotebookApp.browser',
 })

@@ -260,7 +260,7 @@ class NotebookApp(BaseIPythonApplication):
     """
     examples = _examples

-    classes = IPythonConsoleApp.classes + [MappingKernelManager, NotebookManager]
+    classes = IPythonConsoleApp.classes + [MappingKernelManager, BaseNotebookManager]
     flags = Dict(flags)
     aliases = Dict(aliases)

@@ -410,7 +410,7 @@ class NotebookApp(BaseIPythonApplication):
             else:
                 self.file_to_run = f
                 nbdir = os.path.dirname(f)
-            self.config.NotebookManager.notebook_dir = nbdir
+            self.config.BaseNotebookManager.notebook_dir = nbdir

     def init_configurables(self):
         # force Session default to be secure
@benjiec
benjiec commented Aug 8, 2012

My patches are in https://github.com/benjiec/ipython/tree/azurenb

I will create a pull request to Brian's branch

@ellisonbg
IPython member

OK, I have addressed all of the comments and the test suite passes without the azure package installed. I have also added some minimal docs about using a custom notebook manager. Merging.

@ellisonbg ellisonbg merged commit c8c254c into ipython:master Aug 11, 2012
@minrk
IPython member
minrk commented Aug 11, 2012

I wish you had pinged, since you did not actually address all comments above.

Unaddressed:

  • you, yourself mentioned adding something about azure to setup.py, but nothing came of it
  • The question about renaming NotebookManager to BaseNotebookManager, which breaks config unnecessarily went unanswered.

And the rebased commits re-introduced use of the deprecated assertEquals name, which was otherwise removed.

@fperez
IPython member
fperez commented Aug 11, 2012

@ellisonbg, in general I think it's a good idea to give one last ping on PRs that have had good back-and-forth (such as this one). Typically what we've been doing is that then the reviewer, if all looks good, ends up merging right away. That way it doesn't really add any unnecessary lag to the process, it helps catch situations like this one, and it means that when we see the merge logs, the main reviewer tends to stand out right away without having to come back to read the PR process itself.

@ellisonbg
IPython member
@fperez
IPython member
@ellisonbg
IPython member
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.