Graphical tool for diffing notebooks #3355
Comments
I took a shot at this some time ago. I think it would require cell IDs at some point; cf. #2342, and an example of a diff'd notebook.
I've got a vague idea for a kind of 'rich diff protocol', because I find there's a lot of filetypes where standard diff isn't much use (e.g. word processor documents - even if you use a text-based format like lyx or flat ODT, the noise makes the diffs all but unusable). This is kind of a long-term thing: it might be interesting to build the prototype framework with a notebook diff tool, but you're equally welcome to just solve the problem at hand directly if you prefer. This has actually been going around in my head this morning, so here's a few notes on what I envisage:
This also draws a bit from lesspipe, which I just learned about this morning.
@Carreau Ah, yes, sorry, I should have gone into more detail here. For a first pass, I'd like to build a tool that would just diff entire notebooks, not cell contents. Imagine a 3-way merge where, instead of lines of code, you have complete notebook cells. This way you can sidestep a lot of the Very Hard problems and get something that will be immediately (well, with only a few hours of work) somewhat useful. Again, at first pass: I imagine building it as a standalone tool which can be called in place of Of course, from there, it would be straightforward to diff only two notebooks.
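To make the "whole cells instead of lines" idea concrete, here is a minimal sketch of the simpler two-notebook case, not the 3-way merge and not any tool mentioned in this thread. It assumes each notebook is already loaded as a dict with a top-level `cells` list whose entries carry `cell_type` and `source` (the nbformat 4 layout, which is an assumption here). The names `cell_key` and `diff_cells` are hypothetical, and outputs and metadata are deliberately ignored.

```python
import difflib


def cell_key(cell):
    """Reduce a cell to a hashable key: its type plus its full source text."""
    source = cell.get("source", "")
    if isinstance(source, list):  # source may be stored as a list of lines
        source = "".join(source)
    return (cell.get("cell_type", ""), source)


def diff_cells(old_nb, new_nb):
    """Diff two notebooks, treating each cell as one indivisible token.

    Returns a list of (tag, old_cells, new_cells) tuples, where tag is one of
    'equal', 'insert', 'delete' or 'replace' (difflib's opcode names).
    """
    old_cells = old_nb.get("cells", [])  # assumed nbformat-4-style layout
    new_cells = new_nb.get("cells", [])
    matcher = difflib.SequenceMatcher(
        None,
        [cell_key(c) for c in old_cells],
        [cell_key(c) for c in new_cells],
        autojunk=False,  # never discard frequently repeated cells as junk
    )
    return [
        (tag, old_cells[i1:i2], new_cells[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]
```

Because cells are matched purely by type and content, this sketch needs no cell IDs, but any edit inside a cell shows up as a whole-cell `replace` rather than a line-level change.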
Glad to see you're taking a shot at this! Needless to say, this should be done as a purely standalone experiment for now, so you have the freedom to control development without worrying too much about integration with the core. While I agree with @takluyver that this problem fits into the larger context of complex format diffing, I also think that you should start by focusing on one specific thing, namely IPython notebooks, for a first prototype. It can be generalized later once you have something that works, but this is exactly the kind of problem where trying to build a completely generic tool from the outset is likely to lead to an abstraction monstrosity that's both unmanageable and sub-optimal in any specific case. There's a ton of room for interesting experimentation here on what will be good output. I personally really like LaTeXdiff as a tool for diffing LaTeX-sourced files in a rich context. I'd encourage you to have a look at it for inspiration.
I've done a few toy attempts of this kind of thing, and I don't think that a cell ID needs to, or even should be, a part of it.
Are there any plans to incorporate something like this? I just found this tool:
Yes, we are aware of nbdiff (which also has a website: http://nbdiff.org/). We would need a full-time person to actually work on that.
FWIW, the more I use notebooks, the more I run into the need for a visual diff, as have many others: nbdiff, https://github.com/csiro-scientific-computing/NotebookDiff, and who knows what else. But unfortunately none of those seem to be able to fully survive on their own, partially due to the rapid pace of IPython development and the lack of dedicated funding for their development.
Great, thank you for the pointer @takluyver
Now ReviewNb is another option for visual diff'ing of notebooks stored on GitHub. Disclaimer: I built ReviewNb.
As per the discussion leading from: https://twitter.com/swcarpentry/status/337611439382593537
I'll be taking a crack at this some time the week of May 27th.