autoexported notebooks: only export explicitly marked cells #3295

Closed
jankatins opened this issue May 9, 2013 · 14 comments · Fixed by minrk/ipython_extensions#11

@jankatins
Contributor

I try to do my data munching and statistics in IPython notebooks. I start the notebook server with "--script", and my notebooks (up to now...) mix imports, functions, and data munching (loading, applying the functions, saving the result). I tried to reuse parts of one notebook in other notebooks but ran into two problems / annoyances:

  • magic commands (%%time in this case) are not commented out on export, so doing a "from import function" results in errors when magic commands are used (File => Save as '.py' saves magic as code #2922)
  • it is tiresome to guard every cell against execution when loaded as a library (see help)

So, it would be nice if the notebook could add metadata to a cell that would basically instruct the exporter to "export this cell" or "do not export this cell". The exporter could then also show an error when a magic command is encountered in a "marked for export" cell. Another idea would be to simply comment out any magic command it encounters.

Additionally it would be nice if the notebook server could be instructed to a) export the notebook if such metadata is found, even if not run with "--script" and b) use a specified name for the python file.
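A rough sketch of what such an exporter could look like, not an existing IPython feature. It assumes the notebook is already loaded as a dict with a flat "cells" list, and it uses a hypothetical metadata key named "export"; neither the key name nor the helper names come from IPython. Magic lines are detected with a simple prefix heuristic and commented out.

```python
import re

# Hypothetical metadata key; the issue does not name one.
EXPORT_KEY = "export"

# Heuristic: lines starting with % or %% (magics) or ! (shell escapes).
MAGIC_LINE = re.compile(r"^\s*[%!]")


def comment_out_magics(source):
    """Prefix magic and shell-escape lines with '# ' so plain Python remains."""
    out = []
    for line in source.splitlines():
        out.append("# " + line if MAGIC_LINE.match(line) else line)
    return "\n".join(out)


def export_marked_cells(notebook):
    """Return Python source built only from code cells flagged for export."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        if not cell.get("metadata", {}).get(EXPORT_KEY, False):
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be stored as a list of lines
            source = "".join(source)
        chunks.append(comment_out_magics(source))
    return "\n\n".join(chunks) + "\n"


# Assumed notebook layout, for illustration only.
notebook = {
    "cells": [
        {"cell_type": "code", "metadata": {"export": True},
         "source": "%%time\ndef double(x):\n    return 2 * x"},
        {"cell_type": "code", "metadata": {},
         "source": "print(double(21))"},
    ]
}
exported = export_marked_cells(notebook)
```

Only the first cell ends up in `exported`, with its `%%time` line commented out, so importing the resulting file neither fails on the magic nor runs the unmarked data-munching cell.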

@minrk
Member

minrk commented May 9, 2013

--script is a terrible hack (that is also totally broken, since it saves untransformed IPython input in a .py file). A more generic post-save hook should allow people to do much more sensible and sophisticated things than the current status quo.

@jankatins
Contributor Author

So what should a generic post-save hook look like?
Also relevant for #3291

@minrk
Member

minrk commented May 9, 2013

--script should at the very least be killed in favor of something using nbconvert once it is merged. If we do replace it with a proper hook, it should probably be a callable that takes some/all of: the notebook ID, notebook manager, and destination filename. It may also require hooks to be called on rename and deletion to match the cleanup we currently do.
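To make the shape of such a hook concrete, here is a minimal sketch under stated assumptions. `HookRegistry` and `StubManager` are hypothetical stand-ins, not IPython classes; the hooks are plain callables receiving the notebook ID, the manager, and the destination path. Only save and delete are shown; a rename hook would follow the same pattern.

```python
import os
import tempfile


class HookRegistry:
    """Hypothetical container for save and delete hooks."""

    def __init__(self):
        self._hooks = {"save": [], "delete": []}

    def register(self, event, hook):
        self._hooks[event].append(hook)

    def fire(self, event, **info):
        for hook in list(self._hooks[event]):
            hook(**info)


class StubManager:
    """Stand-in for the notebook manager; only what the demo needs."""

    def __init__(self):
        self.sources = {}
        self.hooks = HookRegistry()

    def save(self, notebook_id, path, source):
        self.sources[notebook_id] = source
        with open(path, "w") as f:
            f.write("{}")  # placeholder for the real notebook JSON
        self.hooks.fire("save", notebook_id=notebook_id, manager=self, path=path)

    def delete(self, notebook_id, path):
        os.remove(path)
        del self.sources[notebook_id]
        self.hooks.fire("delete", notebook_id=notebook_id, manager=self, path=path)


def script_path(path):
    """Name of the .py file that sits next to the notebook."""
    return os.path.splitext(path)[0] + ".py"


def write_script(notebook_id, manager, path):
    """Post-save hook: write the notebook's code next to the notebook."""
    with open(script_path(path), "w") as f:
        f.write(manager.sources[notebook_id])


def remove_script(notebook_id, manager, path):
    """Delete hook: clean up the exported script, if any."""
    if os.path.exists(script_path(path)):
        os.remove(script_path(path))


manager = StubManager()
manager.hooks.register("save", write_script)
manager.hooks.register("delete", remove_script)

nb_path = os.path.join(tempfile.mkdtemp(), "analysis.ipynb")
manager.save("nb-1", nb_path, "x = 1\n")
script_exists_after_save = os.path.exists(script_path(nb_path))
manager.delete("nb-1", nb_path)
script_exists_after_delete = os.path.exists(script_path(nb_path))
```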

@jankatins
Contributor Author

Ok, so something along these lines?

def add_save_hook(self, hook):
    self._hooks.append(hook)

def remove_save_hook(self, hook):
    self._hooks.remove(hook)

[... in FileNotebookManager.write_notebook_object()...]
# call every registered hook after the notebook has been written
for hook in self._hooks:
    hook.postsave(notebook_id, new_name, path, self)

How would a plugin then register itself?

@ellisonbg
Member

I don't think the notebook manager is where these things will live. We are quickly moving to a model that supports multiple directories. In that context, each directory might want to have a different configuration of these types of hooks. However, the notebook manager would be the same object for each directory. IOW, we won't instantiate a new notebook manager for each directory. This represents an entirely different type of configuration abstraction than we currently have. We will need to think carefully about how to represent that well. But these things are very much in flux, and I think we just need to wait for some of this other work to land first.

@jankatins
Contributor Author

Ok, nothing to be done about this issue until that refactoring is done? If yes: when will this refactoring land in IPython?

Other things:

  • How to add plugins/hooks ("get_ipython()" has, AFAIS, no reference to the notebookapp)
  • How to inject JS into the notebook from Python code (just display a JS link/script?)

@filmor
Contributor

filmor commented May 10, 2013

@JanSchulz Regarding the injection part you can use the functions in IPython.core.display.

@jankatins
Contributor Author

I'm currently thinking about using the IPython.core.hooks mechanism:

  • Add a CommandListDispatcher which calls every command instead of stopping as soon as one command returns successfully (see IPython.core.hooks.CommandChainDispatcher)
  • Remove the check whether the hook is in IPython.core.hooks.__all__ in IPython.core.interactiveshell.InteractiveShell.set_hook()
  • Add code to instantiate a CommandListDispatcher and add it to the hooks under a name like 'notebookmanager_post_save' during init of the FileNotebookManager
  • Add code which replaces the FileNotebookManager.save_script cases with a call to the above hook, like get_ipython().hooks.notebookmanager_post_save(nb, notebook_id, old_name, new_name, path)

An extension could then simply add a function to the hook via ip.set_hook(...) (or get_ipython().set_hook()) and be done.
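The proposed dispatcher might look roughly like the sketch below. It is self-contained and does not import IPython; the class body is an assumption about how a "call every command" dispatcher could work, loosely following the add/priority interface of CommandChainDispatcher. The two hook functions are hypothetical examples.

```python
class CommandListDispatcher:
    """Call every registered command in priority order (lower number first).

    Unlike a chain dispatcher, it does not stop at the first command that
    succeeds; it runs all of them and collects their return values.
    """

    def __init__(self):
        self._commands = []  # entries are (priority, insertion_index, callable)

    def add(self, func, priority=0):
        """Register func; ties in priority keep registration order."""
        self._commands.append((priority, len(self._commands), func))
        self._commands.sort(key=lambda item: item[:2])

    def __call__(self, *args, **kwargs):
        results = []
        for _, _, func in self._commands:
            results.append(func(*args, **kwargs))
        return results


calls = []


def export_script(nb, notebook_id, old_name, new_name, path):
    # Hypothetical hook: pretend to write a script, return its file name.
    calls.append(("script", new_name))
    return new_name + ".py"


def log_save(nb, notebook_id, old_name, new_name, path):
    # Hypothetical hook: record that a save happened.
    calls.append(("log", new_name))


post_save = CommandListDispatcher()
post_save.add(log_save, priority=10)
post_save.add(export_script, priority=0)
results = post_save({}, "nb-1", None, "analysis", "/tmp")
```

In this sketch an exception in one hook aborts the remaining ones. Whether a failing hook should block the others is a design decision the proposal leaves open.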

The --script case would then become a simple extension which would copy the IPython.nbformat.v3.nbpy.PyWriter.writes() function and the copy/rename/delete logic from the hack in FileNotebookManager.write_notebook_object()

So, before I start coding: would such a thing be considered for merging at all and before the refactoring mentioned above?

@jankatins
Contributor Author

So, the idea is implemented in https://github.com/JanSchulz/ipython/commits/define_hooks (last 4 commits). It still needs a --script parameter at notebook startup, and the JS part needs to be loaded via
%load_ext saveasscript (and before that you need to copy the file to the extension directory)

Unfortunately the %load_ext directly for both parts (hook and js) does not work: the ip object in the extension is a different one than the get_ipython() one in the filenbmanager :-( I'm not sure how to get an extension into the notebook application...

@jankatins
Contributor Author

And a cleaner implementation, which uses the notebook metadata to define a post-save hook:
https://github.com/JanSchulz/ipython/commits/notebook_save_hooks

Workflow:

  • Activate the extension via %reload_ext IPython.frontend.html.notebook.examples.saveasscript
  • choose 'save as .py: enable'
  • save :-)

To check which cells will be written to the .py file, choose 'Cell Toolbar: auto export hint' and change the setting for each cell.

@jankatins
Contributor Author

My current thinking is to define a cell level magic %%save_and_run -f "file" -i "identifier", which would save the transformed cell into the file and surround it with lines which include the identifier and afterwards would run the code. The identifier can then be used to change only that portion of the file.

  • Does that sound better? The code could either be a new extension (in a different repository) or a patch against magics/code.py (there is already a %save and a %pastebin magic).
  • What is the correct way to simply run (transformed -> no %magics,...) code? Just as %%time does it (using transform, ast.parse, transform_ast, compile and exec the code)?
  • Does a magic have access to cell metadata? This would be nice so that a hash of the transformed code can be saved in the cell metadata and, in case the code was already saved, it is not saved again (no change to the file on rerun).
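The save-and-run idea can be sketched without any magic machinery. The code below is an illustration under assumptions, not the eventual extension: the function names and the marker format are made up, and the cell code is assumed to be plain Python already (no magics left). Instead of a hash stored in cell metadata, it compares the new content with what is on disk and skips the write when nothing changed.

```python
import ast
import os
import tempfile


def _markers(identifier):
    # Hypothetical marker format; any unique comment lines would do.
    return "# <<< {} >>>".format(identifier), "# <<< end {} >>>".format(identifier)


def save_block(path, identifier, code):
    """Insert or replace the block tagged identifier; return True if the file changed."""
    begin, end = _markers(identifier)
    block = [begin] + code.rstrip("\n").splitlines() + [end]
    lines = []
    if os.path.exists(path):
        with open(path) as f:
            lines = f.read().splitlines()
    if begin in lines and end in lines[lines.index(begin):]:
        start = lines.index(begin)
        stop = lines.index(end, start)
        new_lines = lines[:start] + block + lines[stop + 1:]
    else:
        new_lines = lines + block  # first save: append the block
    if new_lines == lines:
        return False  # nothing changed, leave the file untouched
    with open(path, "w") as f:
        f.write("\n".join(new_lines) + "\n")
    return True


def save_and_run(path, identifier, code, namespace):
    """Save the block, then parse, compile and execute it in namespace."""
    changed = save_block(path, identifier, code)
    tree = ast.parse(code, filename=path)
    exec(compile(tree, path, "exec"), namespace)
    return changed


path = os.path.join(tempfile.mkdtemp(), "lib.py")
ns = {}
first = save_and_run(path, "helpers", "def double(x):\n    return 2 * x\n", ns)
second = save_and_run(path, "helpers", "def double(x):\n    return 2 * x\n", ns)
third = save_and_run(path, "helpers", "def double(x):\n    return x + x\n", ns)
with open(path) as f:
    saved = f.read()
```

Rerunning an unchanged cell leaves the file alone (`second` is False), while an edited cell replaces only its own marked region.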

@Carreau
Member

Carreau commented Jun 17, 2013

I'm not sure I understand what you mean by (transformed -> no magics...).

No, a cell magic does not have access to cell metadata. That is not possible, because cell magics also work in non-notebook environments.

Saving only when something changed would be too complicated for its goal if the goal is just to avoid sending data over the wire, since it would need bidirectional communication.

The cell "identifier" would be the cell ID, which we will tackle later.

@jankatins
Contributor Author

My usecase would be fixed by minrk/ipython_extensions/pull/11

This could also be included in IPython itself by adding these methods to IPython/core/magics/code.py (there is a %%save <linenumbers> and a magic to save to a gist...).

@paulochf

I beg your pardon, but could you please explain how to get this done? I was looking for a way to suppress some cells from the exported results, but I couldn't find a quick way to do this.

Thanks!
