New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: add pre/post-save hooks #6896

Merged
merged 4 commits into from Dec 12, 2014

Conversation

Projects
None yet
4 participants
@minrk
Copy link
Member

minrk commented Nov 11, 2014

We may not want to do this so close to 3.0, but it's a mitigation of the removed --script flag, so I thought it was worth discussing. Hooks are Python functions that are passed the model, path, and content_manager. I started with two hooks:

  • ContentsManager.pre_save_hook runs on the path and model with content. This one can be used for things like stripping output that people don't like adding to VCS noise.
  • FileContentsManager.post_save_hook runs on the filesystem path and model without content. This one can be used to replace --script with a call to nbconvert.

An example ipython_notebook_config.py that does both of these things:

A pre-save hook for stripping output:

def scrub_output_pre_save(model, **kwargs):
    """scrub output before saving notebooks"""
    # only run on notebooks
    if model['type'] != 'notebook':
        return
    # only run on nbformat v4
    if model['content']['nbformat'] != 4:
        return

    model['content']['metadata'].pop('signature', None)
    for cell in model['content']['cells']:
        if cell['cell_type'] != 'code':
            continue
        cell['outputs'] = []
        cell['execution_count'] = None

c.FileContentsManager.pre_save_hook = scrub_output_pre_save

A post-save hook to replace --script (this could be a simple call to ipython nbconvert --to script, but spawning the subprocess every time is quite slow):

import io
import os

_script_exporter = None

def script_post_save(model, os_path, contents_manager, **kwargs):
    """convert notebooks to Python script after save with nbconvert

    replaces `ipython notebook --script`
    """
    from IPython.nbconvert.exporters.script import ScriptExporter

    if model['type'] != 'notebook':
        return

    global _script_exporter
    if _script_exporter is None:
        _script_exporter = ScriptExporter(parent=contents_manager)
    log = contents_manager.log

    base, ext = os.path.splitext(os_path)
    py_fname = base + '.py'
    script, resources = _script_exporter.from_filename(os_path)
    script_fname = base + resources.get('output_extension', '.txt')
    log.info("Saving script /%s", to_api_path(script_fname, contents_manager.root_dir))
    with io.open(script_fname, 'w', encoding='utf-8') as f:
        f.write(script)
c.FileContentsManager.post_save_hook = script_post_save

closes #3111
closes #1280

@minrk minrk added the -Needs review label Nov 11, 2014

@ssanderson

This comment has been minimized.

Copy link
Member

ssanderson commented Nov 11, 2014

@minrk I don't have a great sense of how technically risky this feature is (I assume it is somewhat since you mention not wanting to merge it pre 3.0), but I would likely use this in for quantopian's notebook implementation, so 👍 for the feature concept at least.

@minrk

This comment has been minimized.

Copy link
Member

minrk commented Nov 11, 2014

I have a history of adding poorly thought-out features that scratch an itch, but ultimately end up a maintenance problem (--script, for example), so things like this make me nervous. That said, This seems pretty simple, and solves the same problem --script did in a more general way.

Other ways to implement hooks:

  • scripts on-disk, such as git hooks
  • dispatched via an event system, like we do in core

Of course, the scripts-on-disk convention could be implemented as a single Python function that finds and executes scripts based on the directory structure.

@minrk minrk added this to the 3.0 milestone Nov 11, 2014

@ssanderson

This comment has been minimized.

Copy link
Member

ssanderson commented Nov 11, 2014

The specific problem this solves that's useful to us is the ability to save code but not output. That makes storing notebooks for a large population of people (especially people working with large datasets) a lot less arduous.

@takluyver

This comment has been minimized.

Copy link
Member

takluyver commented Nov 11, 2014

I think the idea is reasonable, but it brings up a couple of questions:

  • There's no reason you couldn't have several independent hooks doing things on save, but the current interface only appears to allow one pre- and one post-save hook. I think the API in IPython.core.events can be used for this. Not sure how best to make it configurable, though.
  • We have kernel extensions and JS nbextensions; should we have notebook server extensions?

@minrk minrk force-pushed the minrk:save-hooks branch from f1fa0bd to cfd600d Nov 18, 2014

@Carreau

This comment has been minimized.

Copy link
Member

Carreau commented Nov 18, 2014

As long as this a stopgap I'm fine with only one hook as one can make a function that make several call in order if needed.

But I think the hook should not be assumed to mutate their argument. the pre_ one would return a model if I had to do it.

@Carreau

This comment has been minimized.

Copy link
Member

Carreau commented Nov 18, 2014

And I kicked travis.

@minrk

This comment has been minimized.

Copy link
Member

minrk commented Nov 18, 2014

There's no reason you couldn't have several independent hooks doing things on save

That's true. We could either allow chaining callbacks, or just require people to define one function that calls all their other functions. I don't think there's anything that can't actually be done if it's a Python function.

I think the API in IPython.core.events can be used for this.

I looked into that, and there are a some assumptions that tie the event system in core to InteractiveShell (mainly for traceback rendering, so not hard to remove), so we would have to refactor that code a little bit to be able to use it here. It does make it a bit harder to configure than just assigning a single function.

But I think the hook should not be assumed to mutate their argument. the pre_ one would return a model if I had to do it.

I disagree, actually. I think if the hook is to modify the model, it should do it in-place. Some hooks might only notify external services, such as git, and have no effect on the model, so it seems a little odd to make them return the model unchanged.

@takluyver

This comment has been minimized.

Copy link
Member

takluyver commented Nov 18, 2014

or just require people to define one function that calls all their other functions

This assumes that all the callbacks are being set up directly by the user; it would make it very awkward for extensions to use the hook as an integration point without interfering with each other. Since we don't yet have a notion of server extensions, however, it's somewhat moot, and when that is a possibility, I would just make these config settings add the assigned function to the callback system.

@minrk minrk force-pushed the minrk:save-hooks branch from a5ee333 to 8b77d6f Nov 21, 2014

@minrk minrk force-pushed the minrk:save-hooks branch from 8b77d6f to ea8ebc7 Dec 5, 2014

@minrk

This comment has been minimized.

Copy link
Member

minrk commented Dec 5, 2014

This PR is now based on #7118, but once that's in, I think it should be ready to go.

@Carreau

This comment has been minimized.

Copy link
Member

Carreau commented Dec 7, 2014

Need rebase.

minrk added some commits Nov 11, 2014

add pre/post-save hooks
- `ContentsManager.pre_save_hook` runs on the path and model with content
- `FileContentsManager.post_save_hook` runs on the filesystem path and model without content

- use pre_save_hook for things like stripping output
- use post_save_hook for things like nbconvert --to python

@minrk minrk force-pushed the minrk:save-hooks branch from ea8ebc7 to 278cf07 Dec 7, 2014

@minrk

This comment has been minimized.

Copy link
Member

minrk commented Dec 7, 2014

rebased

@minrk

This comment has been minimized.

Copy link
Member

minrk commented Dec 11, 2014

I'll merge this soon, unless someone wants to review further.

minrk added a commit that referenced this pull request Dec 12, 2014

Merge pull request #6896 from minrk/save-hooks
Proposal: add pre/post-save hooks

@minrk minrk merged commit 9a28a05 into ipython:master Dec 12, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed
Details

@minrk minrk deleted the minrk:save-hooks branch Dec 12, 2014

@minrk minrk referenced this pull request Dec 12, 2014

Closed

Clear-all on save option #1280

@dalejung dalejung referenced this pull request Jan 21, 2015

Open

remove middleware? #23

takluyver added a commit to takluyver/notebook that referenced this pull request Aug 24, 2015

takluyver added a commit to takluyver/notebook that referenced this pull request Aug 24, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment