Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nbstripout installation causes incorrect 'git status' results #65

Open
jykim opened this issue Aug 7, 2017 · 19 comments
Open

nbstripout installation causes incorrect 'git status' results #65

jykim opened this issue Aug 7, 2017 · 19 comments
Assignees
Labels
help wanted state:needs follow up This issue has interesting suggestions / ideas worth following up on type:bug

Comments

@jykim
Copy link

jykim commented Aug 7, 2017

I seem to see changes for notebooks I haven't touched since I installed nbstripout. Interestingly, when I uninstall nbstripout and install it again, the changes are gone. Besides general performance issue, this prevents us from adopting nbstripout as a team.

 $ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)
... list of notebooks I haven't touched ...

----(after uninstall nbstripout and install it again)---

 $ git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working tree clean
@kynan kynan self-assigned this Aug 7, 2017
@kynan kynan added the type:bug label Aug 7, 2017
@kynan
Copy link
Owner

kynan commented Aug 7, 2017

What do you get as git diff when you hit the case of allegedly modified notebook files?

@kynan kynan added the state:waiting Waiting for response for reporter label Aug 9, 2017
@jykim
Copy link
Author

jykim commented Aug 11, 2017

I generally see nothing when I run get diff on those files, but I haven't tested them all. (can't share them since it's my work notebooks anyways) As I mentioned, It's especially interesting that these files are gone when I uninstall nbstripout and install it again, which I weird.

@jykim
Copy link
Author

jykim commented Aug 11, 2017

Today I found that these changes show up after 'git pull', and I looked at a few other diffs and found that they are diffs between before) notebooks with output after) notebook with output stripped. So maybe the git maybe telling me to strip the output out from some notebooks which contains outputs from previous commits.

Does this make sense? But I still don't understand why these notebooks show up after 'git pull' and why they're gone after uninstall/reinstall.

@kynan
Copy link
Owner

kynan commented Aug 21, 2017

So what exactly does git status report in this situation? Are those notebooks modified? Or is it a mode change?

Also could you run a nbstripout --status before and after the reinstall and report if there's a difference?

@kynan
Copy link
Owner

kynan commented Aug 4, 2018

@jykim Any update?

@stas00
Copy link

stas00 commented Aug 17, 2018

This might be related to an issue I posted about here. I still haven't figured it out, so any help is welcome.

@melaanya
Copy link

I have the same problem: when nbstripout is enabled and I checkout several files, the omitting of changes in these files is not reflected by git status (git status shows their presence, the changes are actually omitted). If I do 'nbstripout --uninstall' for the repo and then do 'nbstripout --install' again then the changes are reflected by git status.

@stas00
Copy link

stas00 commented Sep 13, 2018

@melaanya, try to change the configuration of the content filters as explained here and it'll help you debugging the issue. You will be able to tell when nbstripout is activated.

@kynan
Copy link
Owner

kynan commented Oct 16, 2018

@stas00 I've looked at your issue and am about as puzzled as you are. Thanks for documenting it so thoroughly. Guess we'll have to find a better git plumber than I am to explain this behavior.

@stas00
Copy link

stas00 commented Oct 16, 2018

Currently, pretty much any time one of our developers forgets to enable the filter and commits w/o stripping the notebooks out, it breaks git pull for anybody who already had a checked out version. And other than doing a new clone the only solution seems to be is to disable the filter, git pull and reenable the filter.

What we really need is a way to enforce content filters repo-wide, except git doesn't support that due to security reasons (overriding user's config is unsafe). I'm not sure what to do about it. It's a constant problem for us. Perhaps there is a need to re-design content filters so that they aren't configured via a private .gitconfig and can be automatically installed for everybody (note that our repo carries its own version of the nbstripout filter, so the culprit is just not being able to enforce the configuration). Unless perhaps I'm missing something and you have some insights of another way of configuring it, so that the user doesn't need to do anything.

And of course, there are server-side hooks which could do the policing, but it doesn't sound like the best solution. In particular since last I looked github doesn't support those.

@kynan
Copy link
Owner

kynan commented Oct 28, 2018

Yeah, to me this sounds like a Git issue rather than an nbstripout issue and I'm not sure what (if anything) we could fix in nbstripout to help with this. Have you considered raising the issue on the Git mailing list?

@stas00
Copy link

stas00 commented Oct 28, 2018

I already did so to no avail.

If it weren't for github and we were running our own git backend then server-side hooks could do the policing, but since github doesn't support that, there is nothing one can do to enforce that.

The best "enforcing" solution I found is to run a CI job that flags this problem as it appears, and hopefully the developers watch for [x]'s in their commit status on github.

echo "Check we are starting with clean git checkout"
if [ -n "$(git status -uno -s)" ]; then echo "git status is not clean"; false; fi
echo "Trying to strip out notebooks"
tools/fastai-nbstripout -d dev_nb/*ipynb dev_nb/*/*ipynb docs_src/*ipynb docs_src/*/*ipynb
echo "Check that strip out was unnecessary"
git status -s # display the status to see which nbs need cleaning up
if [ -n "$(git status -uno -s)" ]; then echo -e "!!! Detected unstripped out notebooks\n!!!Remember to run tools/run-after-git-clone"; false; fi

and the best resolution I found is:

disable the filters
git pull
fix the problems
enable the filters
git commit -a
git push

which in the fastai project is done with a script (-d is for disable):

tools/trust-origin-git-config -d
git pull
tools/trust-origin-git-config

@kynan
Copy link
Owner

kynan commented Nov 10, 2019

FYI, we're discussing adding a dedicated nbstripout command to help with this flow in #108. Could be worth taking inspiration from the fastai scripts.

@kynan kynan added the state:needs follow up This issue has interesting suggestions / ideas worth following up on label Nov 10, 2019
ooiM added a commit to ooiM/nbstripout that referenced this issue Jun 2, 2020
@DanielRudnicki
Copy link

I see the same problem, stopped using the tool after 2 days.

@kynan
Copy link
Owner

kynan commented Apr 5, 2021

@DanielRudnicki Reproduction steps for the failures you're seeing would help.

@rhstanton
Copy link

I think I'm seeing the same issue and may have found a fix. I've been getting (erroneous) complaints for ages about modified ipynb files in (just cloned or pulled) git repositories, and finally decided to track it down. I found that the error occurs when file ~/.config/git/attributes contains the lines

*.ipynb filter=nbstripout
*.zpln filter=nbstripout
*.ipynb diff=ipynb

If I comment out all these lines, the error goes away.

Note that ~/.gitconfig contains the lines

[filter "nbstripout"]
        clean = \"/opt/anaconda3/bin/python3\" -m nbstripout
        smudge = cat
[diff "ipynb"]
      textconv = \"/opt/anaconda3/bin/python3\" -m nbstripout -t

So nbstripout is still installed and runs fine.

Also note that I've encountered the same issue (and the same fix works) on both MacOS and Ubuntu machines.

@kynan
Copy link
Owner

kynan commented Jul 1, 2024

@rhstanton Are you sure nbstripout is still active when you comment out the attributes? I'd find that very surprising. How would git know to apply the nbstripout filter to *.ipynb files?

@rhstanton
Copy link

No, I'm no longer sure! I just tried a small test where I reran some code in a notebook file, and without the attributes, git diff reports that the file has changed, while with the attributes, git diff reports no changes. I guess my "fix" was just to turn off nbstripout - not as interesting as I'd hoped...

@kynan
Copy link
Owner

kynan commented Jul 2, 2024

OK, thanks for confirming! If you have any further insights on this issue please send them along - this one is still a mystery!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted state:needs follow up This issue has interesting suggestions / ideas worth following up on type:bug
Projects
None yet
Development

No branches or pull requests

6 participants