autoexported notebooks: only export explicitly marked cells #3295

Closed
jankatins opened this Issue May 9, 2013 · 14 comments

6 participants
@jankatins
Contributor

jankatins commented May 9, 2013

I try to do my data munching and statistics in IPython notebooks. I start the notebook server with "--script", and my notebooks (up to now...) mix imports, functions and data munching (loading, applying the functions, saving the result). I tried to reuse parts of a notebook in other notebooks but ran into two problems/annoyances:

  • magic commands (%%time in this case) are not commented out on export, so doing a "from import function" results in errors when magic commands are used (#2922)
  • it is tiresome to guard every cell against execution when loaded as a library (see help)

So, it would be nice if the notebook could add metadata to a cell, which would basically instruct the exporter to "export this cell" or "do not export this cell". This could then also show an error when a magic command is encountered in a "marked for export" cell. Another idea would be to simply comment out any encountered magic command.

Additionally it would be nice if the notebook server could be instructed to a) export the notebook if such metadata is found, even if not run with "--script" and b) use a specified name for the python file.
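
A minimal sketch of what such an exporter could do, assuming the notebook is available as a plain dict with a top-level "cells" list (the layout of later nbformat versions) and a hypothetical per-cell metadata key "export". Neither the key nor the function names come from IPython; they are made up for illustration.

```python
EXPORT_KEY = "export"  # hypothetical metadata key, not an IPython convention


def is_magic(line):
    """Return True for lines that start with a magic (%) or shell escape (!)."""
    stripped = line.lstrip()
    return stripped.startswith("%") or stripped.startswith("!")


def export_marked_cells(notebook):
    """Build .py source from the code cells whose metadata opts in to export."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        if not cell.get("metadata", {}).get(EXPORT_KEY, False):
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a string or a list of lines
            source = "".join(source)
        # Comment out magics so that importing the exported file does not fail.
        lines = ["# " + line if is_magic(line) else line
                 for line in source.splitlines()]
        chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"
```

Commenting out a cell magic such as %%time leaves the cell body as ordinary Python, which is fine here, but it would be wrong for magics like %%bash whose body is not Python. The prefix check is also a simplification: a continuation line that starts with the modulo operator would be treated as a magic.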

Owner

minrk commented May 9, 2013

--script is a terrible hack (that is also totally broken, since it saves untransformed IPython input in a .py file). A more generic post-save hook should allow people to do much more sensible and sophisticated things than the current status quo.

Contributor

jankatins commented May 9, 2013

So what should a generic post-save hook look like?
Also relevant for #3291

Owner

minrk commented May 9, 2013

--script should at the very least be killed in favor of something using nbconvert once it is merged. If we do replace it with a proper hook, it should probably be a callable that takes some/all of: the notebook ID, notebook manager, and destination filename. It may also require hooks to be called on rename and deletion to match the cleanup we currently do.
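
A rough sketch of such callables, assuming each hook receives the notebook ID, the notebook manager and the destination filename. The function names and the manager method export_source() are hypothetical and only stand in for whatever nbconvert-based export would eventually be used.

```python
import os


def script_path(notebook_path):
    """Map 'foo.ipynb' to the sibling file 'foo.py'."""
    return os.path.splitext(notebook_path)[0] + ".py"


def on_save(notebook_id, manager, dest):
    """Post-save hook: write the exported source next to the notebook."""
    with open(script_path(dest), "w") as fh:
        # export_source() is a hypothetical manager method, not IPython API.
        fh.write(manager.export_source(notebook_id))


def on_rename(notebook_id, manager, old_dest, new_dest):
    """Rename hook: move the script along with the notebook."""
    old = script_path(old_dest)
    if os.path.exists(old):
        os.rename(old, script_path(new_dest))


def on_delete(notebook_id, manager, dest):
    """Delete hook: remove the script so no stale file is left behind."""
    path = script_path(dest)
    if os.path.exists(path):
        os.remove(path)
```

The rename and delete hooks mirror the cleanup mentioned above: without them, a renamed or deleted notebook would leave an orphaned .py file.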

Contributor

jankatins commented May 10, 2013

OK, so something along these lines?

def add_save_hook(self, hook):
    self._hooks.append(hook)

def remove_save_hook(self, hook):
    self._hooks.remove(hook)

[... in FileNotebookManager.write_notebook_object()...]
if self._hooks:
    for hook in self._hooks:
        hook.postsave(notebook_id, new_name, path, self)

How would a plugin then register itself?

Owner

ellisonbg commented May 10, 2013

I don't think the notebook manager is where these things will live. We are quickly moving to a model that supports multiple directories. In that context, each directory might want to have a different configuration of these types of hooks. However, the notebook manager would be the same object for each directory. IOW, we won't instantiate a new notebook manager for each directory. This represents an entirely different type of configuration abstraction than we currently have. We will need to think carefully about how to represent that well. But these things are very much in flux, and I think we just need to wait for some of this other work to land first.
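
To illustrate the problem only (this is not a proposed design), here is a sketch of a single shared object that resolves hooks per directory. The class DirectoryHooks and its methods are hypothetical, and it assumes POSIX-style notebook paths.

```python
import posixpath


class DirectoryHooks:
    """Hypothetical registry: one shared object, hooks looked up per directory."""

    def __init__(self):
        self._hooks = {}  # normalized directory path -> list of callables

    def register(self, directory, hook):
        """Attach a hook to a directory."""
        key = posixpath.normpath(directory)
        self._hooks.setdefault(key, []).append(hook)

    def hooks_for(self, notebook_path):
        """Return the hooks of the nearest configured ancestor directory."""
        directory = posixpath.dirname(posixpath.normpath(notebook_path))
        while True:
            if directory in self._hooks:
                return list(self._hooks[directory])
            parent = posixpath.dirname(directory)
            if parent == directory:  # reached the top, nothing configured
                return []
            directory = parent
```

The lookup walks upward from the notebook's directory, so a subdirectory without its own configuration inherits the hooks of the closest configured parent.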

Contributor

jankatins commented May 10, 2013

OK, so nothing can be done about this issue until that refactoring is done? If so, when will this refactoring land in IPython?

Other things:

  • How to add plugins/hooks ("get_ipython()" has, AFAICS, no reference to the notebookapp)
  • How to inject JS into the notebook from Python code (just display a JS link/script?)

Contributor

filmor commented May 10, 2013

@janschulz Regarding the injection part, you can use the functions in IPython.core.display.

Contributor

jankatins commented May 10, 2013

I'm currently thinking about using the IPython.core.hooks mechanism:

  • Add a CommandListDispatcher which calls every command instead of only until one command returns successfully (see IPython.core.hooks.CommandChainDispatcher)
  • Remove the check whether the hook is in IPython.core.hooks.__all__ in IPython.core.interactiveshell.InteractiveShell.set_hook()
  • Add code to instantiate a CommandListDispatcher and add it to the hooks under a name like 'notebookmanager_post_save' during init of the FileNotebookManager
  • Add code which replaces the FileNotebookManager.save_script cases with a call to the above hook like get_ipython().hooks.notebookmanager_post_save(nb, notebook_id, old_name, new_name, path)

An extension could then simply add a function to the hook via ip.set_hook(...) (or get_ipython().set_hook()) and be done.

The --script case would then become a simple extension which would copy the IPython.nbformat.v3.nbpy.PyWriter.writes() function and the copy/rename/delete logic from the hack in FileNotebookManager.write_notebook_object()

So, before I start coding: would such a thing be considered for merging at all and before the refactoring mentioned above?
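
A minimal sketch of the proposed dispatcher, written as a standalone class so it runs without IPython. CommandListDispatcher does not exist in IPython; the add() signature with a priority argument is an assumption modeled loosely on CommandChainDispatcher.

```python
class CommandListDispatcher:
    """Hypothetical dispatcher that calls every registered command.

    Unlike a chain dispatcher, it does not stop after the first command
    that succeeds; all commands run, in priority order.
    """

    def __init__(self):
        self.chain = []  # list of (priority, callable) pairs

    def add(self, func, priority=50):
        """Register a command; lower priority values run first."""
        self.chain.append((priority, func))
        # Sort by priority only, so the callables are never compared.
        self.chain.sort(key=lambda item: item[0])

    def __call__(self, *args, **kwargs):
        """Call all commands with the same arguments and collect the results."""
        return [func(*args, **kwargs) for _priority, func in self.chain]
```

An extension would then call add() with its post-save function, and the notebook manager would invoke the dispatcher once per save. An exception in one command aborts the remaining ones in this sketch; whether that is desirable is a design decision left open here.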

Contributor

jankatins commented May 11, 2013

So, the idea is implemented in https://github.com/JanSchulz/ipython/commits/define_hooks (last 4 commits). It still needs the --script parameter at notebook startup, and the JS part needs to be loaded via
%load_ext saveasscript (before that, you need to copy the file to the extension directory).

Unfortunately, using %load_ext directly for both parts (hook and JS) does not work: the ip object in the extension is a different one than the get_ipython() one in the filenbmanager :-( I'm not sure how to get an extension into the notebook application...

Contributor

jankatins commented May 12, 2013

And a cleaner implementation, which uses the notebook metadata to define a post-save hook:
https://github.com/JanSchulz/ipython/commits/notebook_save_hooks

Workflow:

  • Activate the extension via %reload_ext IPython.frontend.html.notebook.examples.saveasscript
  • choose 'save as .py: enable'
  • save :-)

To check which cells will be written to the .py file choose 'Cell Toolbar: auto export hin' and change the setting for each cell.

Contributor

jankatins commented Jun 16, 2013

My current thinking is to define a cell level magic %%save_and_run -f "file" -i "identifier", which would save the transformed cell into the file and surround it with lines which include the identifier and afterwards would run the code. The identifier can then be used to change only that portion of the file.

  • Does that sound better? The code could either be a new extension (in a different repository) or a patch against magics/code.py (there is already a %save and a %pastebin magic).
  • What is the correct way to simply run (transformed -> no %magics,...) code? Just as %%time does it (using transform, ast.parse, transform_ast, compile and exec the code)?
  • Does a magic have access to cell metadata? This would be nice so that a hash of the transformed code can be saved in the cell metadata and, if the code was already saved, it is not saved again (no change to the file on rerun).
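
The file-handling half of that idea can be sketched in plain Python. This is an illustration under assumptions: the marker comment format and the function name write_block are made up here, and running the code afterwards is left out.

```python
import os


def write_block(path, identifier, code):
    """Insert or replace the region of `path` delimited by marker comments."""
    start = "# -- start: %s --" % identifier  # hypothetical marker format
    end = "# -- end: %s --" % identifier
    block = [start] + code.rstrip("\n").split("\n") + [end]

    lines = []
    if os.path.exists(path):
        with open(path) as fh:
            lines = fh.read().splitlines()

    if start in lines and end in lines[lines.index(start):]:
        # Replace only the existing region that belongs to this identifier.
        first = lines.index(start)
        last = lines.index(end, first)
        lines[first:last + 1] = block
    else:
        # First time this identifier is written: append at the end.
        lines.extend(block)

    with open(path, "w") as fh:
        fh.write("\n".join(lines) + "\n")
```

Rerunning a cell with the same identifier rewrites its region in place, so the order of blocks in the file stays stable and other regions are untouched.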

Owner

Carreau commented Jun 17, 2013

I'm not sure I understand what you meant by (transformed -> no magics...)

No, a cell magic does not have access to cell metadata. That is not possible,
because cell magics also work in non-notebook environments.

Skipping the save when nothing has changed would be too complicated for its goal
if it is just to avoid data on the wire, as it would need bidirectional communication.

The cell "identifier" would be the cell ID, which we will tackle later.


@jankatins jankatins referenced this issue in minrk/ipython_extensions Jun 18, 2013

Merged

Add writeandexecute magic #11

Contributor

jankatins commented Jun 18, 2013

My use case would be fixed by minrk/ipython_extensions#11

This could also be included in IPython itself by adding these methods to IPython/core/magics/code.py (there is a %save <linenumbers> magic and a magic to save to a gist).

paulochf commented Jul 29, 2015

I beg your pardon, but could you please explain how to get this done? I was looking for a way to suppress some cells from the exported results, but I didn't find a quick way to do this.

Thanks!