autoexported notebooks: only export explicitly marked cells #3295

Closed
JanSchulz opened this Issue · 13 comments

5 participants

@JanSchulz

I try to do my data munching and statistics in IPython notebooks. I start the notebook server with "--script", and my notebooks (up to now...) mix imports, functions and data munching (loading data, applying the functions, saving the result). I tried to reuse parts of a notebook in other notebooks but ran into two problems / annoyances:

  • magic commands (%%time in this case) are not commented out on export, so doing a "from import function" results in errors when magic commands are used (#2922)
  • it is tiresome to guard every cell against execution when loaded as a library (see help)

So, it would be nice if the notebook could add metadata to a cell, which would basically instruct the exporter to "export this cell" or "do not export this cell". This could then also show an error when a magic command is encountered in a "marked for export" cell. Another idea would be to simply comment out any encountered magic command.

Additionally it would be nice if the notebook server could be instructed to a) export the notebook if such metadata is found, even if not run with "--script" and b) use a specified name for the python file.
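
A minimal sketch of the "export only marked cells" idea, written against plain dictionaries so it runs without IPython. The metadata key "export", the "source" field name and the function name are assumptions made for illustration; the thread does not define them. Lines that start with a magic command are commented out rather than exported verbatim.

```python
import re

# Hypothetical metadata key; the thread does not name one.
EXPORT_KEY = "export"

# A line that starts with % or %% followed by a name is treated as a magic.
MAGIC_LINE = re.compile(r"^\s*%{1,2}\w")


def export_marked_cells(cells):
    """Build Python source from the code cells explicitly flagged for export."""
    chunks = []
    for cell in cells:
        if cell.get("cell_type") != "code":
            continue
        if not cell.get("metadata", {}).get(EXPORT_KEY, False):
            continue
        lines = []
        for line in cell["source"].splitlines():
            # Comment out magics so the exported file stays importable.
            if MAGIC_LINE.match(line):
                line = "# " + line
            lines.append(line)
        chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"


cells = [
    {"cell_type": "code", "metadata": {"export": True},
     "source": "%%time\ndef f(x):\n    return x + 1"},
    {"cell_type": "code", "metadata": {}, "source": "data = load()"},
]
print(export_marked_cells(cells))
```

Commenting out a cell magic only helps when the cell body is still Python (as with %%time); a cell whose body is another language would need to be skipped or rejected instead.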

@minrk
Owner

--script is a terrible hack (that is also totally broken, since it saves untransformed IPython input in a .py file). A more generic post-save hook should allow people to do much more sensible and sophisticated things than the current status quo.

@JanSchulz

So what should a generic post-save hook look like?
Also relevant for #3291

@minrk
Owner

--script should at the very least be killed in favor of something using nbconvert once it is merged. If we do replace it with a proper hook, it should probably be a callable that takes some/all of: the notebook ID, notebook manager, and destination filename. It may also require hooks to be called on rename and deletion to match the cleanup we currently do.

@JanSchulz

Ok, so something along these lines?

def add_save_hook(self, hook):
    self._hooks.append(hook)

def remove_save_hook(self, hook):
    self._hooks.remove(hook)

[... in FileNotebookManager.write_notebook_object()...]
# run every registered hook after the notebook has been written
for hook in self._hooks:
    hook.postsave(notebook_id, new_name, path, self)

How would a plugin then register itself?
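
For illustration, the registration idea above can be sketched as a small standalone class. SaveHookRegistry and run_post_save are hypothetical names, not part of IPython, and the hooks here are plain callables (as suggested earlier in the thread) rather than objects with a postsave method.

```python
class SaveHookRegistry:
    """Hypothetical stand-in for the hook bookkeeping on a notebook manager."""

    def __init__(self):
        self._hooks = []

    def add_save_hook(self, hook):
        self._hooks.append(hook)

    def remove_save_hook(self, hook):
        self._hooks.remove(hook)

    def run_post_save(self, notebook_id, new_name, path):
        """Call every hook; collect failures instead of aborting the save."""
        errors = []
        # Iterate over a copy so a hook may unregister itself while running.
        for hook in list(self._hooks):
            try:
                hook(notebook_id, new_name, path, self)
            except Exception as exc:
                errors.append((hook, exc))
        return errors


def log_save(notebook_id, new_name, path, manager):
    print("saved", notebook_id, "as", new_name, "in", path)


registry = SaveHookRegistry()
registry.add_save_hook(log_save)
registry.run_post_save("abc123", "analysis", "/tmp/notebooks")
```

Collecting exceptions is one possible design choice: a broken export hook then cannot prevent the notebook itself from being saved.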

@ellisonbg
Owner

I don't think the notebook manager is where these things will live. We are quickly moving to a model that supports multiple directories. In that context, each directory might want to have a different configuration of these types of hooks. However the notebook manager would be the same object for each directory. IOW, we won't instantiate a new notebook manager for each directory. This represents an entirely different type of configuration abstraction than we currently have. We will need to think carefully about how to represent that well. But, these things are very much in flux and I think we just need to wait for some of this other work to land first.

@JanSchulz

Ok, nothing to be done about this issue until that refactoring is done? If yes: when will this refactoring land in IPython?

Other things:

  • How to add plugins/hooks ("get_ipython()" has, AFAIS no reference to the notebookapp)
  • How to inject JS into the notebook from python code (just display a js link/script?)
@filmor

@JanSchulz Regarding the injection part you can use the functions in IPython.core.display.

@JanSchulz

I'm currently thinking about using the IPython.core.hooks mechanism:

  • Add a CommandListDispatcher which calls every command instead of stopping as soon as one command returns successfully (see IPython.core.hooks.CommandChainDispatcher)
  • Remove the check whether the hook is in IPython.core.hooks.__all__ in IPython.core.interactiveshell.InteractiveShell.set_hook()
  • add code to instantiate a CommandListDispatcher and add it to the hooks under a name like 'notebookmanager_post_save' during init of the FileNotebookManager
  • Add code which replaces the FileNotebookManager.save_script cases with a call to the above hook like get_ipython().hooks.notebookmanager_post_save(nb, notebook_id, old_name, new_name, path)

An extension could then simply add a function to the hook via ip.set_hook(...) (or get_ipython().set_hook()) and be done.
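
A rough sketch of the "call every command" dispatcher proposed above. It does not import IPython: TryNext is a local stand-in for IPython's exception of the same name, and the chain/add layout is only loosely modelled on CommandChainDispatcher, so the details are assumptions.

```python
class TryNext(Exception):
    """Local stand-in for IPython's TryNext so the sketch runs without IPython."""


class CommandListDispatcher:
    """Call every registered command in priority order and collect the results."""

    def __init__(self):
        self.chain = []  # list of (priority, callable) pairs

    def add(self, func, priority=50):
        self.chain.append((priority, func))
        # Lower numbers run first; the stable sort keeps insertion order on ties.
        self.chain.sort(key=lambda entry: entry[0])

    def __call__(self, *args, **kwargs):
        results = []
        for _priority, func in self.chain:
            try:
                results.append(func(*args, **kwargs))
            except TryNext:
                # A command may opt out; the remaining commands still run.
                continue
        return results


post_save = CommandListDispatcher()
post_save.add(lambda name: "exported " + name)
post_save.add(lambda name: "indexed " + name, priority=10)
print(post_save("analysis"))
```

Unlike a chain that returns after the first successful command, this version always walks the whole list, which is what a post-save hook with several independent listeners needs.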

The --script case would then become a simple extension which would copy the IPython.nbformat.v3.nbpy.PyWriter.writes() function and the copy/rename/delete logic from the hack in FileNotebookManager.write_notebook_object()

So, before I start coding: would such a thing be considered for merging at all and before the refactoring mentioned above?

@JanSchulz

So, the idea is implemented in https://github.com/JanSchulz/ipython/commits/define_hooks (last 4 commits). It still needs a --script parameter to the notebook startup and the js part needs to be loaded via
%load_ext saveasscript (and before that you need to copy the file to the extension directory).

Unfortunately the %load_ext directly for both parts (hook and js) does not work: the ip object in the extension is a different one than the get_ipython() one in the filenbmanager :-( I'm not sure how to get an extension into the notebook application...

@JanSchulz

And a cleaner implementation, which uses the notebook metadata to define a postsavehook:
https://github.com/JanSchulz/ipython/commits/notebook_save_hooks

Workflow:

  • Activate the extension via %reload_ext IPython.frontend.html.notebook.examples.saveasscript
  • choose 'save as .py: enable'
  • save :-)

To check which cells will be written to the .py file choose 'Cell Toolbar: auto export hint' and change the setting for each cell.

@JanSchulz

My current thinking is to define a cell level magic %%save_and_run -f "file" -i "identifier", which would save the transformed cell into the file and surround it with lines which include the identifier and afterwards would run the code. The identifier can then be used to change only that portion of the file.

  • Does that sound better? The code could either be a new extension (in a different repository) or a patch against magics/code.py (there is already a %save and a %pastebin magic).
  • What is the correct way to simply run (transformed -> no %magics,...) code? Just as %%time does it (using transform, ast.parse, transform_ast, compile and exec the code)?
  • does a magic have access to cell metadata? This would be nice so that a hash of the transformed code can be saved in the cell metadata and in case the code was already saved, not saved again (no change to the file on rerun).
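
The identifier idea can be sketched without any magic machinery. The version below assumes the code has already been transformed to plain Python, and every name and the marker format are hypothetical. Because a magic may not see cell metadata, the hash is stored in the begin marker inside the file, so rerunning an unchanged cell leaves the file untouched.

```python
import hashlib
import os
import re


def write_block(path, identifier, code):
    """Insert or replace the block tagged `identifier`; return True if the file changed."""
    body = code.rstrip("\n")
    digest = hashlib.sha1(body.encode("utf-8")).hexdigest()
    begin = "# -- begin {} sha1:{} --".format(identifier, digest)
    end = "# -- end {} --".format(identifier)
    block = "{}\n{}\n{}\n".format(begin, body, end)

    text = ""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            text = handle.read()

    pattern = re.compile(
        r"^# -- begin {0} sha1:(\w+) --\n.*?^# -- end {0} --\n".format(
            re.escape(identifier)),
        re.DOTALL | re.MULTILINE,
    )
    match = pattern.search(text)
    if match and match.group(1) == digest:
        return False  # same code as last time: do not touch the file
    if match:
        text = text[:match.start()] + block + text[match.end():]
    else:
        if text and not text.endswith("\n"):
            text += "\n"
        text += block
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return True


def save_and_run(path, identifier, code, namespace):
    """Save the code under its identifier, then execute it in `namespace`."""
    write_block(path, identifier, code)
    exec(compile(code, "<{}>".format(identifier), "exec"), namespace)
```

Calling save_and_run("lib.py", "helpers", source, globals()) twice with the same source would write the file once and execute the code both times.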
@Carreau
Owner
@JanSchulz JanSchulz referenced this issue in minrk/ipython_extensions
Merged

Add writeandexecute magic #11

@JanSchulz

My use case would be fixed by minrk/ipython_extensions#11

This could also be included in ipython itself by adding these methods to IPython/core/magic/code.py (there is already a %%save <linenumbers> and a magic to save to a gist...).
