Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Git diff ignore outputs #303

Closed
tfiers opened this issue Oct 8, 2017 · 12 comments
Closed

Git diff ignore outputs #303

tfiers opened this issue Oct 8, 2017 · 12 comments

Comments

@tfiers
Copy link

tfiers commented Oct 8, 2017

Hi,

Is it possible to ignore outputs (like with nbdiff --ignore-outputs) when using this tool with git diff (i.e. after doing nbdime config-git --enable --global) ?

I have tried a few options (like editing the git global config files), but haven't gotten anything to work yet.

BTW thanks for making this tool, it is dearly needed.

@vidartf
Copy link
Collaborator

vidartf commented Oct 9, 2017

If you enabled the config with --global, then you should be able to manually edit your git config using the command git config -e --global (if not, look up how to edit your git config). Once you have the git config open, find the nbdiff command (should be git-nbdiffdriver diff [...] by default). You should then be able to add arguments to this command (call git-nbdiffdriver diff --help, or see http://nbdime.readthedocs.io/en/latest/cli.html#common-diff-options for a reference to which arguments).

@tfiers
Copy link
Author

tfiers commented Oct 9, 2017

Hi @vidartf, thank you for the reply.
I did edit my global git config (see below), but my git diff output still contains things like:

## inserted before /cells/37/outputs/0:
+  output:
+    output_type: execute_result
+    execution_count: 85
+    data:
+      text/plain:
+        array([    0. ,   106.9,   213.7,   320.6,   427.5,   534.4,   641. 000 
[...]

## deleted /cells/37/outputs/0:
-  output:
-    output_type: execute_result
-    execution_count: 82
-    data:
-      text/plain:
-        array([    0. ,   106.3,   212.7,   319. ,   425.4,   531.7,   638. 
[...]

Appendix: my current global git config file:

[diff "jupyternotebook"]
        command = git-nbdiffdriver diff --ignore-outputs
[merge "jupyternotebook"]
        driver = git-nbmergedriver merge %O %A %B %L %P
        name = jupyter notebook merge driver
[difftool "nbdime"]
        cmd = git-nbdifftool diff \"$LOCAL\" \"$REMOTE\"
[difftool]
        prompt = false
[mergetool "nbdime"]
        cmd = git-nbmergetool merge \"$BASE\" \"$LOCAL\" \"$REMOTE\" \"$MERGED\"
[mergetool]
        prompt = false

Putting the --ignore-outputs after git-nbdifftool diff instead of git-nbdiffdriver diff did not make any difference.

@vidartf
Copy link
Collaborator

vidartf commented Oct 9, 2017

Hmm. That's weird. Can you run git config --show-origin --get diff.jupyternotebook.command? This should show both the setting that is being enforced, and its source.

@vidartf
Copy link
Collaborator

vidartf commented Oct 9, 2017

Oh. I think this was actually fixed in #290. I guess a new release is overdue :) I'll see if we can push a new release this week!

@tfiers
Copy link
Author

tfiers commented Oct 9, 2017

Oh that's good to hear! Looking forward to it.

(By the way git config --show-origin --get diff.jupyternotebook.command does correctly yield
"file:/root/.gitconfig git-nbdiffdriver diff --ignore-outputs")

Thanks for the help!

@tfiers
Copy link
Author

tfiers commented Oct 10, 2017

I installed the development version of nbdime (see below for setup), and I still seem not be able to --ignore-outputs with git diff. Sample output:

> git diff downsample.ipynb

fatal: cannot run less: No such file or directory
nbdiff /tmp/ThNYjG_downsample.ipynb downsample.ipynb
--- /tmp/ThNYjG_downsample.ipynb  2017-10-10 14:07:31.415010
+++ downsample.ipynb  2017-10-10 13:59:42.686333
## inserted before /cells/9:
+  code cell:
+    execution_count: 7
+    source:
+      list(range(8))
+    outputs:
+      output 0:
+        output_type: execute_result
+        execution_count: 7
+        data:
+          text/plain: [0, 1, 2, 3, 4, 5, 6, 7]

## deleted /cells/10:
-  code cell:
-    execution_count: 4
-    source:
-      list(range(5))
-    outputs:
-      output 0:
-        output_type: execute_result
-        execution_count: 4
-        data:
-          text/plain: [0, 1, 2, 3, 4]
> git config --show-origin --get diff.jupyternotebook.command

file:.git/config        git-nbdiffdriver diff --ignore-outputs

Appendix: Dockerfile I used to install development version of nbdime:

FROM node:6

USER root

RUN apt-get update
RUN apt-get install -y python-pip
RUN pip install --upgrade pip 

RUN pip install -e git+https://github.com/jupyter/nbdime#egg=nbdime
RUN nbdime config-git --enable --global

RUN apt-get install -y vim

# Build git from source to get latest version
RUN apt-get install -y libcurl4-gnutls-dev libexpat1-dev gettext libz-dev libssl-dev build-essential
RUN wget -O gitt https://www.kernel.org/pub/software/scm/git/git-2.14.2.tar.gz
RUN tar -zxf gitt
WORKDIR git-2.14.2
RUN make prefix=/usr/local install

CMD bash

Note: this is using Python 2.7.

@vidartf
Copy link
Collaborator

vidartf commented Oct 10, 2017

You are right (I can reproduce). The reason is that the ignore filters are not currently applied to cell add/deletes. There are two places that need to be changed to fix this, the diff algorithm and the diff presentation. Rough outline for a plan:

For added cells, the diff algorithm could replace the ignored parts with a sentinel value, which tells the diff printer to skip that part. This allows all deciding logic to remain within the diff algorithm, only adding the additional logic of the sentinel value in the presentation step.

@vidartf
Copy link
Collaborator

vidartf commented Oct 10, 2017

Hmm. I guess the ignore settings will need to be passed to the presentation step in order to correctly handle the delete step. As such, it might be better to skip that sentinel value plan altogether.

@tfiers
Copy link
Author

tfiers commented Oct 10, 2017

Ah, seems like my last example was a special case (it got parsed as "delete cell + add cell" instead of just "modify cell"). If I am careful to only modify a cell, I get the desired behaviour:

> git diff downsample.ipynb

nbdiff /tmp/aLHXFf_downsample.ipynb downsample.ipynb
--- /tmp/aLHXFf_downsample.ipynb  2017-10-10 15:30:31.925010
+++ downsample.ipynb  2017-10-10 15:30:29.352820
## replaced /cells/17/execution_count:
-  70
+  71

## modified /cells/17/source:
-  range(5)
+  range(8)

without a mention of the changed output. Nice!

Thanks again for the help.

@vidartf
Copy link
Collaborator

vidartf commented Oct 24, 2017

@tfiers A suggested fix/enhancement is available in #305. This should now also handle filtering on add/remove of cells.

@tfiers
Copy link
Author

tfiers commented Oct 24, 2017

Awesome, I just tested it out, with success.
Thanks!

@gladwig2
Copy link

Should I understand from the above that there is no way to pass command line arguments from
git diff...
to nbdime?
I find I want to use git diff .... because I need git command line options such as --cached, but also want to sometimes (not always) --disable-outputs
thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants