Test jupytext against a large collection of notebooks #99

Closed
mwouts opened this issue Oct 11, 2018 · 1 comment

mwouts commented Oct 11, 2018

It would be great to test jupytext's round trip conversion on a large collection of notebooks, for instance Jake VanderPlas' Python Data Science Handbook.

mwouts added a commit that referenced this issue Oct 14, 2018
Compare cell content up to blank lines, and restore markdown cell metadata from ipynb
mwouts added a commit that referenced this issue Oct 14, 2018

mwouts commented Oct 14, 2018

That was an interesting exercise! Along the way I encountered a YAML RepresenterError for one of the notebooks (fixed with f1285db). I also had to significantly improve the reporting on the round trip conversion.

Now the results:

  • Round trip conversion is consistent with Jupytext's expectations. Each of the commands below returns with no error:
jupytext --test *.ipynb --to py:light
jupytext --test *.ipynb --to py:percent
jupytext --test *.ipynb --to py:sphinx
jupytext --test *.ipynb --to md
jupytext --test *.ipynb --to Rmd
  • Round trip conversion is not strictly the identity. Conversion to Python scripts removes one final blank line in cell 9 of 03.06-Concat-And-Append.ipynb, and the cell metadata 'collapsed' and 'scrolled' are absent from the script. Conversion to Markdown or R Markdown removes cell metadata on markdown cells, so the cell metadata 'collapsed', 'deletable' and 'editable' are missing. In addition, markdown cell 35 of notebook 01.07-Timing-and-Profiling.ipynb is split into multiple cells, because the cell contains two consecutive blank lines.
jupytext --test-strict *.ipynb --to py:light # missing cell metadata, one final blank line removed in one code cell
jupytext --test-strict *.ipynb --to md       # missing cell metadata, one markdown cell split
  • Round trip conversion of paired notebooks is closer to the identity: cell metadata not represented in the scripts is taken from the ipynb file.
jupytext --test-strict *.ipynb --to py:light # one final blank line removed in cell 9 of 03.06-Concat-And-Append.ipynb
jupytext --test-strict *.ipynb --to md       # last markdown cell split in 01.07-Timing-and-Profiling.ipynb
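
To make the difference between the two kinds of check concrete, here is a minimal sketch of a lenient comparison (cell content compared up to trailing blank lines, metadata ignored) next to a strict one. This is not Jupytext's actual implementation: the function names and the plain-dict cell representation are assumptions made for illustration only.

```python
def normalize_source(source):
    """Drop trailing blank lines so a lost final blank line is not a difference."""
    lines = source.splitlines()
    while lines and not lines[-1].strip():
        lines.pop()
    return "\n".join(lines)


def cells_match(original, round_tripped, strict=False):
    """Compare two cells given as dicts with 'cell_type', 'source', 'metadata'.

    Hypothetical helper: strict mode requires identical source and metadata,
    lenient mode only compares the source up to trailing blank lines.
    """
    if original["cell_type"] != round_tripped["cell_type"]:
        return False
    if strict:
        same_source = original["source"] == round_tripped["source"]
        same_metadata = original.get("metadata", {}) == round_tripped.get("metadata", {})
        return same_source and same_metadata
    return normalize_source(original["source"]) == normalize_source(round_tripped["source"])


def notebooks_match(original_cells, round_tripped_cells, strict=False):
    """Two cell lists match when they have equal length and every pair matches."""
    if len(original_cells) != len(round_tripped_cells):
        return False
    return all(
        cells_match(a, b, strict)
        for a, b in zip(original_cells, round_tripped_cells)
    )


# A code cell that lost its final blank line and its 'collapsed' metadata.
before = [{"cell_type": "code", "source": "x = 1\n\n", "metadata": {"collapsed": True}}]
after = [{"cell_type": "code", "source": "x = 1", "metadata": {}}]

lenient_ok = notebooks_match(before, after)
strict_ok = notebooks_match(before, after, strict=True)
```

With this example the lenient check passes while the strict one fails, which mirrors the kind of difference reported above for the Python script formats.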

I have also documented Jupytext's definition of an unchanged notebook for the round trip conversion, as well as the --test, --test-strict and -x arguments, in the README.
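
For anyone who wants to repeat the exercise on another collection, the commands quoted in this thread could be scripted roughly as follows. This is only a sketch: it assumes the jupytext command is available on the PATH, and the function names are hypothetical rather than part of Jupytext.

```python
import subprocess
from pathlib import Path

# Target formats taken from the commands quoted in this thread.
FORMATS = ["py:light", "py:percent", "py:sphinx", "md", "Rmd"]


def build_commands(notebook_dir, strict=False):
    """Return one jupytext command (as an argument list) per notebook and format."""
    flag = "--test-strict" if strict else "--test"
    notebooks = sorted(Path(notebook_dir).glob("*.ipynb"))
    return [
        ["jupytext", flag, str(nb), "--to", fmt]
        for nb in notebooks
        for fmt in FORMATS
    ]


def run_round_trips(notebook_dir, strict=False):
    """Run every command and collect those that exit with a non-zero status."""
    failures = []
    for cmd in build_commands(notebook_dir, strict):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append((cmd, result.stderr.strip()))
    return failures
```

Calling run_round_trips on a directory of notebooks returns a list of (command, error output) pairs for the conversions that did not pass, so an empty list means every notebook passed for every format.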
