Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring notebook managers and adding Azure backed storage. #2045

Merged
merged 6 commits into from Aug 11, 2012

Conversation

ellisonbg
Copy link
Member

I have created a base class for all notebook managers. Our existing, file-based store, is now in filenbmanager.py. I have also created a new Azure Blob based backed notebook manager.

@fperez
Copy link
Member

fperez commented Jun 26, 2012

This looks good on first view, but it's obviously 0.14 material, tagging as such.

@ellisonbg
Copy link
Member Author

Yep, strongly so. One thing that I am running into is that we have hardcoded logic about the notebook_dir everywhere. But that notion only makes sense for the file based notebook manager. I have going to commit some fixes that work around this fact, but we are going to have to rethink this.

@fperez
Copy link
Member

fperez commented Jun 27, 2012

Furthermore, the notebook_dir support is kind of borked now: it refers to the storage dir, but the cwd of the actual kernels is still where the server starts. Ideally it would do both, so one can (in file-based storage) configure the notebook dir in a config profile and simply start things from wherever, knowing that all kernels will come up in the right place.

I'd actually like to see that fixed for 0.13, but I don't see it happening.

@fperez
Copy link
Member

fperez commented Jun 27, 2012

Scratch that! It was #1958, and Min already fixed it in #1951 :) So all clear for 0.13 :)

Gee, we're merging so much code so fast, that I don't even remember issues I opened only a week ago. The problems we have :)

@ellisonbg
Copy link
Member Author

@fperez do you think this is ready to go in?


import datetime

import azure
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this guaranteed to be available by default on all azure images, or do users need to manually install the necessary sdk on their images? If so, do we need to update our setup.py to declare this as an optional dependency?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users have to install this by hand and we should add it to our setup.py.

@fperez
Copy link
Member

fperez commented Jul 18, 2012

Overall the code looks clean, but there are a few things that worry me:

  • untested/untestable: since the only way to test this stuff for real is to run it on azure, it's not exactly test suite material. But I am becoming increasingly uncomfortable with the amount of new code we're putting in without proper test coverage. I'm wondering if at least using a mock object that mocks azure would be viable...
  • docs: I think it needs somewhere a bit of mention in the documentation. I know we have very little in the way of docs about the notebook architecture and possibilities, but we need to start somewhere. Might as well be here :)

The former I'm not really sure what to do about, the latter is eminently addressable... Otherwise the code does look great, and I'm thrilled to see it in. Good job!

@ellisonbg
Copy link
Member Author

I will add some docs for this.

@minrk
Copy link
Member

minrk commented Jul 18, 2012

untested/untestable: since the only way to test this stuff for real is to run it on azure, it's not exactly test suite material. But I am becoming increasingly uncomfortable with the amount of new code we're putting in without proper test coverage. I'm wondering if at least using a mock object that mocks azure would be viable...

This is like the PBS/etc. Launchers, which simply cannot be automatically tested. We can do some amount of testing with dummy objects, but I'm not sure how useful that is. Perhaps what is more useful, if more cumbersome, is manual tests a la docs/examples/tests, to make sure things work. When changing this code, run the test script / step through the test process, etc.

For instance, we managed to cut 0.13 with PBS/SGE support in ipcluster that totally doesn't work, so we need to figure something out.

re: notebook_dir

We use notebook_dir for three technically unrelated values, that are all the same by coincidence right now:

  1. the place where notebooks are stored (only relevant for file-based nbserver)
  2. the place where kernels are started (no reason to force these to be the same)
  3. the path used for serving local files from the notebook area (again, easily decoupled)

Is there more to decouple than that?

@ellisonbg
Copy link
Member Author

On Wed, Jul 18, 2012 at 7:31 AM, Min RK
reply@reply.github.com
wrote:

untested/untestable: since the only way to test this stuff for real is to run it on azure, it's not exactly test suite material. But I am becoming increasingly uncomfortable with the amount of new code we're putting in without proper test coverage. I'm wondering if at least using a mock object that mocks azure would be viable...

This is like the PBS/etc. Launchers, which simply cannot be automatically tested. We can do some amount of testing with dummy objects, but I'm not sure how useful that is. Perhaps what is more useful, if more cumbersome, is manual tests a la docs/examples/tests, to make sure things work. When changing this code, run the test script / step through the test process, etc.

The main difficulty with testing this is that it requires the Azure
credentials, which we definitely can't version control. The test
script in this case would be a config file that has those credentials
and sets the right notebook manager.

For instance, we managed to cut 0.13 with PBS/SGE support in ipcluster that totally doesn't work, so we need to figure something out.

re: notebook_dir

We use notebook_dir for three technically unrelated values, that are all the same by coincidence right now:

  1. the place where notebooks are stored (only relevant for file-based nbserver)
  2. the place where kernels are started (no reason to force these to be the same)
  3. the path used for serving local files from the notebook area (again, easily decoupled)

Is there more to decouple than that?

3 is the hardest to decouple. The reason is that eventually the
kernel will be run on a different host than the notebook server. The
notebook server simply won't have access to the local files of the
kernel. Don't know how to handle that. The other two should be easy
to decouple.


Reply to this email directly or view it on GitHub:
#2045 (comment)

Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

@minrk
Copy link
Member

minrk commented Jul 21, 2012

The main difficulty with testing this is that it requires the Azure
credentials, which we definitely can't version control. The test
script in this case would be a config file that has those credentials
and sets the right notebook manager.

Precisely - just like the launchers, it could only be run by users with an appropriate environment, and cannot be involved in the automatic test suite.

3 is the hardest to decouple. The reason is that eventually the
kernel will be run on a different host than the notebook server.

Is it possible that local files would simply be unavailable in cases where the kernel's filesystem is not visible to the nbserver? That honestly seems like a reasonable answer to me, since this is not a 100% critical feature.

@minrk
Copy link
Member

minrk commented Jul 21, 2012

This one is only waiting for docs, right? And a rebase, now.

@ellisonbg
Copy link
Member Author

Yes, I just need to finish the docs.

On Fri, Jul 20, 2012 at 10:35 PM, Min RK
reply@reply.github.com
wrote:

This one is only waiting for docs, right? And a rebase, now.


Reply to this email directly or view it on GitHub:
#2045 (comment)

Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

@minrk
Copy link
Member

minrk commented Jul 23, 2012

Is there a reason for the base class to be called BaseNotebookManager rather than NotebookManager? This change would mean that config for 0.13 will be broken in 0.14, that would work perfectly fine otherwise.

Since we have coupled class names to config files, I think we need a pretty high bar for renaming configurable classes.

self.save_notebook_object(notebook_id, nb)
return notebook_id
def log_info(self):
self.log.info("Serving notebooks from local directory: %s", self.notebook_manager.notebook_dir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be self.notebook_dir, not self.notebook_manager.notebook_dir

@benjiec
Copy link
Contributor

benjiec commented Aug 8, 2012

I closed pull request #2155 and adjusted my notebook manager to use Brian's work. Thanks Brian.

Noted the following bug in the azurenb branch, causing the default file system based nb manager to fail

diff --git a/IPython/frontend/html/notebook/filenbmanager.py b/IPython/frontend/html/notebook/filenbmanager.py
index 6389ec1..34d7d26 100644
--- a/IPython/frontend/html/notebook/filenbmanager.py
+++ b/IPython/frontend/html/notebook/filenbmanager.py
@@ -193,4 +193,4 @@ class FileNotebookManager(BaseNotebookManager):
         return name

     def log_info(self):
-        self.log.info("Serving notebooks from local directory: %s", self.notebook_manager.notebook_dir)
+        self.log.info("Serving notebooks from local directory: %s", self.notebook_dir)

@benjiec
Copy link
Contributor

benjiec commented Aug 8, 2012

Here is another bug fix to the azurenb branch, it was still referring to NotebookManager


diff --git a/IPython/frontend/html/notebook/notebookapp.py b/IPython/frontend/html/notebook/notebookapp.py
index 167965d..f867155 100644
--- a/IPython/frontend/html/notebook/notebookapp.py
+++ b/IPython/frontend/html/notebook/notebookapp.py
@@ -50,7 +50,7 @@ from .handlers import (LoginHandler, LogoutHandler,
     RSTHandler, AuthenticatedFileHandler, PrintNotebookHandler,
     MainClusterHandler, ClusterProfileHandler, ClusterActionHandler
 )
-from .notebookmanager import NotebookManager
+from .basenbmanager import BaseNotebookManager
 from .clustermanager import ClusterManager

 from IPython.config.application import catch_config_error, boolean_flag
@@ -215,7 +215,7 @@ flags['read-only'] = (
 )

 # Add notebook manager flags
-flags.update(boolean_flag('script', 'NotebookManager.save_script',
+flags.update(boolean_flag('script', 'BaseNotebookManager.save_script',
                'Auto-save a .py script everytime the .ipynb notebook is saved',
                'Do not auto-save .py scripts for every notebook'))

@@ -232,7 +232,7 @@ aliases.update({
     'port-retries': 'NotebookApp.port_retries',
     'keyfile': 'NotebookApp.keyfile',
     'certfile': 'NotebookApp.certfile',
-    'notebook-dir': 'NotebookManager.notebook_dir',
+    'notebook-dir': 'BaseNotebookManager.notebook_dir',
     'browser': 'NotebookApp.browser',
 })

@@ -260,7 +260,7 @@ class NotebookApp(BaseIPythonApplication):
     """
     examples = _examples

-    classes = IPythonConsoleApp.classes + [MappingKernelManager, NotebookManager]
+    classes = IPythonConsoleApp.classes + [MappingKernelManager, BaseNotebookManager]
     flags = Dict(flags)
     aliases = Dict(aliases)

@@ -410,7 +410,7 @@ class NotebookApp(BaseIPythonApplication):
             else:
                 self.file_to_run = f
                 nbdir = os.path.dirname(f)
-            self.config.NotebookManager.notebook_dir = nbdir
+            self.config.BaseNotebookManager.notebook_dir = nbdir

     def init_configurables(self):
         # force Session default to be secure

@benjiec
Copy link
Contributor

benjiec commented Aug 8, 2012

My patches are in https://github.com/benjiec/ipython/tree/azurenb

I will create a pull request to Brian's branch

@ellisonbg
Copy link
Member Author

OK, I have addressed all of the comments and the test suite passes without the azure package installed. I have also added some minimal docs about using a custom notebook manager. Merging.

ellisonbg added a commit that referenced this pull request Aug 11, 2012
Refactoring notebook managers and adding Azure backed storage.
@ellisonbg ellisonbg merged commit c8c254c into ipython:master Aug 11, 2012
@minrk
Copy link
Member

minrk commented Aug 11, 2012

I wish you had pinged, since you did not actually address all comments above.

Unaddressed:

  • you, yourself mentioned adding something about azure to setup.py, but nothing came of it
  • The question about renaming NotebookManager to BaseNotebookManager, which breaks config unnecessarily went unanswered.

And the rebased commits re-introduced use of the deprecated assertEquals name, which was otherwise removed.

@fperez
Copy link
Member

fperez commented Aug 11, 2012

@ellisonbg, in general I think it's a good idea to give one last ping on PRs that have had good back-and-forth (such as this one). Typically what we've been doing is that then the reviewer, if all looks good, ends up merging right away. That way it doesn't really add any unnecessary lag to the process, it helps catch situations like this one, and it means that when we see the merge logs, the main reviewer tends to stand out right away without having to come back to read the PR process itself.

@ellisonbg
Copy link
Member Author

  • I decided that it probably doesn't make sense to add the azure package to
    setup.py.
  • I completely missed the NotebookManager naming thing. I will submit a PR
    for that.
  • I was mostly sick of this one hanging over my head, but points taken to
    heart...

On Fri, Aug 10, 2012 at 7:13 PM, Fernando Perez notifications@github.comwrote:

@ellisonbg https://github.com/ellisonbg, in general I think it's a good
idea to give one last ping on PRs that have had good back-and-forth (such
as this one). Typically what we've been doing is that then the reviewer, if
all looks good, ends up merging right away. That way it doesn't really add
any unnecessary lag to the process, it helps catch situations like this
one, and it means that when we see the merge logs, the main reviewer tends
to stand out right away without having to come back to read the PR process
itself.


Reply to this email directly or view it on GitHubhttps://github.com//pull/2045#issuecomment-7663107.

Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

@fperez
Copy link
Member

fperez commented Aug 11, 2012

On Fri, Aug 10, 2012 at 8:03 PM, Brian E. Granger
notifications@github.comwrote:

  • I decided that it probably doesn't make sense to add the azure package
    to
    setup.py.
  • I completely missed the NotebookManager naming thing. I will submit a PR
    for that.
  • I was mostly sick of this one hanging over my head, but points taken to
    heart...

No worries :) Min quickly updated the deprecated asserts and I reviewed it
and merged it, so no biggie.

@ellisonbg
Copy link
Member Author

OK new PR submitted.

On Fri, Aug 10, 2012 at 8:08 PM, Fernando Perez notifications@github.comwrote:

On Fri, Aug 10, 2012 at 8:03 PM, Brian E. Granger
notifications@github.comwrote:

  • I decided that it probably doesn't make sense to add the azure package
    to
    setup.py.
  • I completely missed the NotebookManager naming thing. I will submit a
    PR
    for that.
  • I was mostly sick of this one hanging over my head, but points taken to
    heart...

No worries :) Min quickly updated the deprecated asserts and I reviewed it
and merged it, so no biggie.


Reply to this email directly or view it on GitHubhttps://github.com//pull/2045#issuecomment-7663431.

Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014
Refactoring notebook managers and adding Azure backed storage.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants