Add the option to skip output cells when saving. #3291

Closed
wants to merge 2 commits into from

6 participants

Benedikt Sauer Matthias Bussonnier Min RK Takafumi Arakaki Brian E. Granger Jan Schulz
Benedikt Sauer

To allow notebooks to be put under proper version control it is very sensible to remove the variable but “non-source” parts, the output cells (i.e. massive base64-encoded pngs) and prompt numbers.

Matthias Bussonnier
Owner

Wouldn't this be more appropriate as a git commit hook or something along those lines?

The goal of the ipynb format is to capture output and prompt numbers, for example for conversion with nbconvert or for sharing with nbviewer (notebooks shared with nbviewer without output are really disappointing, IMHO).

But I do understand the wish to have it. Maybe the ability to have a post save hook is what you would like.
It could, for example, run a script that stores the stripped version in another directory, and could also cover the need for the --script option.
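
A minimal sketch of what such a post-save step might look like, assuming the v3 notebook layout (a list of worksheets, each with a list of cells). The names `strip_code_cells` and `save_stripped_copy` are hypothetical, and no hook API of this shape exists in the code under discussion; the function would have to be wired to whatever save event eventually becomes available.

```python
import json
import os


def strip_code_cells(nb):
    """Clear outputs and prompt numbers in place.

    Assumes the v3 layout: nb["worksheets"][i]["cells"][j].
    """
    for worksheet in nb.get("worksheets", []):
        for cell in worksheet.get("cells", []):
            if cell.get("cell_type") == "code":
                cell["outputs"] = []
                cell.pop("prompt_number", None)
    return nb


def save_stripped_copy(notebook_path, stripped_dir):
    """Hypothetical post-save hook: mirror the saved notebook, minus outputs."""
    with open(notebook_path) as f:
        nb = json.load(f)
    if not os.path.isdir(stripped_dir):
        os.makedirs(stripped_dir)
    target = os.path.join(stripped_dir, os.path.basename(notebook_path))
    with open(target, "w") as f:
        # Stable key order and indentation keep later diffs small.
        json.dump(strip_code_cells(nb), f, indent=1, sort_keys=True)
    return target
```

The original notebook file is left untouched, so the full version with outputs stays available for nbviewer or nbconvert while the stripped copy goes under version control.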

Min RK
Owner

A generic hook is probably a better approach. I don't like the way --script works, and I don't think adding a skip-outputs flag to nbformat is something we should do.

Takafumi Arakaki
tkf commented

@filmor FYI, you can strip the output parts using the nbstripout.py script in the nbconvert repo:
https://github.com/ipython/nbconvert/blob/master/nbstripout.py

Benedikt Sauer

@minrk I'm fully for a more generic approach, I'll prepare a proposal.

@tkf Stripping out the output data automatically is not the problem, but the whole reason I want this behaviour is that I want to be able to diff notebook files properly before committing them :) (i.e. the normal git workflow)

Brian E. Granger
Owner

A few comments:

  • git related things really should be done through git hooks.
  • In general notebook transformations should be done using nbconvert. Once it is merged, the .py format should be removed from nbformat entirely.

I think we need to wait until nbconvert is in good shape and merged into IPython before we think about solving these issues. I would like to close this PR and open an issue to track these things. How do people feel about that?

Takafumi Arakaki

@filmor You can write a few lines of script to copy notebooks to a git repository and strip outputs whenever there is a change in the notebook directory. This is what I do. But I do agree that if users could hook functions on the save event, it would simplify this kind of thing.

@ellisonbg I think what @filmor wants is to run some action even before committing the notebook, meaning that there isn't any git event to hook into at that point (or is there?)

Benedikt Sauer

The filter attribute seems more suitable. I'll close this now.

Benedikt Sauer filmor closed this
Benedikt Sauer filmor deleted the branch
Takafumi Arakaki

@ellisonbg I mean even before the commit occurs. For example, what can you use if you want to strip off output before running git diff?

Jan Schulz

@filmor Thanks for the idea about clean/smudge filters! That works perfectly for this!
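
For readers unfamiliar with clean filters: git pipes the working-tree file to the filter's stdin and stages whatever the filter writes to stdout. The following is a sketch of such a filter, not the nbstripout.py script itself. The script name `strip_filter.py` and the filter name `stripoutput` are assumptions made for illustration.

```python
import json
import sys

# Assumed setup (names are illustrative):
#   .gitattributes:  *.ipynb filter=stripoutput
#   git config filter.stripoutput.clean "python strip_filter.py"


def _drop_outputs(obj):
    """json object_hook: called once for every decoded JSON object."""
    if "outputs" in obj:
        obj["outputs"] = []
    obj.pop("prompt_number", None)
    return obj


def clean(infile, outfile):
    """Read a notebook from infile and write the stripped version to outfile."""
    nb = json.load(infile, object_hook=_drop_outputs)
    json.dump(nb, outfile, indent=1, sort_keys=True)
    outfile.write("\n")

# As a filter script, finish with: clean(sys.stdin, sys.stdout)
```

Because the clean filter also runs when git compares the working tree against the index, git diff would see only the stripped content, which addresses the "before commit" concern raised above. No smudge filter is needed if the outputs are simply regenerated after checkout.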

Jan Schulz JanSchulz referenced this pull request in ipython/nbconvert
Closed

Use STDIN in nbstripout if no input is given #142

Commits on May 8, 2013
  1. Benedikt Sauer
  2. Benedikt Sauer

    Removed debug output.

    filmor authored
10 IPython/frontend/html/notebook/filenbmanager.py
@@ -43,6 +43,14 @@ class FileNotebookManager(NotebookManager):
         short `--script` flag.
         """
     )
+
+    skip_outputs = Bool(False, config=True,
+        help="""Don't save output cells and prompt numbers.
+
+        This is useful when the notebook files are to be put under version
+        control and may also make them significantly smaller, with the obvious
+        drawback that all output needs to be regenerated at runtime."""
+    )
     checkpoint_dir = Unicode(config=True,
         help="""The location in which to keep notebook checkpoints
@@ -178,7 +186,7 @@ def write_notebook_object(self, nb, notebook_id=None):
         try:
             self.log.debug("Autosaving notebook %s", path)
             with open(path,'w') as f:
-                current.write(nb, f, u'json')
+                current.write(nb, f, u'json', skip_outputs=self.skip_outputs)
         except Exception as e:
             raise web.HTTPError(400, u'Unexpected error while autosaving notebook: %s' % e)
20 IPython/nbformat/v3/nbjson.py
@@ -49,6 +49,23 @@ def to_notebook(self, d, **kwargs):
         return restore_bytes(rejoin_lines(from_dict(d)))
+def _remove_outputs(d):
+    if isinstance(d, dict):
+        res = {}
+        for key, value in d.items():
+            if key == u"outputs":
+                res["outputs"] = []
+            elif key == u"prompt_number":
+                pass
+            else:
+                res[key] = _remove_outputs(value)
+        return res
+    elif isinstance(d, list):
+        return [_remove_outputs(i) for i in d]
+    else:
+        return d
+
+
 class JSONWriter(NotebookWriter):
     def writes(self, nb, **kwargs):
@@ -58,6 +75,9 @@ def writes(self, nb, **kwargs):
         kwargs['separators'] = (',',': ')
         if kwargs.pop('split_lines', True):
             nb = split_lines(copy.deepcopy(nb))
+        if kwargs.pop('skip_outputs', False):
+            nb = _remove_outputs(nb)
+
         return py3compat.str_to_unicode(json.dumps(nb, **kwargs), 'utf-8')