Proposal : lossless markdown cells in ipynb #5408

Open
jgm opened this issue Mar 29, 2019 · 5 comments

@jgm
Owner

commented Mar 29, 2019

Pandoc currently stores markdown cells in its native format, so when they're rendered back to ipynb, the formatting can change (in semantically insignificant ways). This might be undesirable for some purposes.

We could have an option that causes the ipynb reader to store the original markdown text in a special attribute of the markdown cell div. The ipynb writer and markdown writer could check this attribute, and if it's present, render the stored text as raw markdown, exactly as in the source, rather than rendering the contents of the cell in the normal way.

I guess the attribute would be fairly harmless in other output formats, but we could automatically run a filter to strip it out for formats other than markdown or ipynb, I suppose.
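
A minimal sketch of the attribute idea, written in Python over pandoc's JSON shape for a Div (`{"t": "Div", "c": [[id, classes, key-values], blocks]}`). The attribute name `original-markdown`, the helper names, and the `render_blocks` callback are all invented for illustration; none of this is an existing pandoc option or API.

```python
# Hypothetical attribute name; the proposal does not name one.
ORIGINAL_ATTR = "original-markdown"


def make_cell_div(blocks, original_text=None):
    """Build a pandoc-JSON-style Div for a markdown cell.

    If original_text is given, it is stored as a key-value attribute.
    """
    kvs = [[ORIGINAL_ATTR, original_text]] if original_text is not None else []
    return {"t": "Div", "c": [["", ["cell", "markdown"], kvs], blocks]}


def stored_markdown(div):
    """Return the stored original text, or None if the attribute is absent."""
    _ident, _classes, kvs = div["c"][0]
    for key, value in kvs:
        if key == ORIGINAL_ATTR:
            return value
    return None


def write_markdown_cell(div, render_blocks):
    """Writer side: prefer the stored text, else render the parsed blocks."""
    original = stored_markdown(div)
    if original is not None:
        return original
    return render_blocks(div["c"][1])


def strip_original(div):
    """Filter for other output formats: drop the attribute, keep the rest."""
    ident, classes, kvs = div["c"][0]
    kept = [[k, v] for k, v in kvs if k != ORIGINAL_ATTR]
    return {"t": "Div", "c": [[ident, classes, kept], div["c"][1]]}
```

With the attribute present, `write_markdown_cell` returns the source text untouched; after `strip_original`, it falls back to whatever the normal renderer produces.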

Alternatively, the option could cause the ipynb reader to parse markdown cells into RawBlock (Format "markdown"). With this option, no output would appear for markdown cells when writing to formats other than ipynb or markdown; but that may not be an issue if it's triggered by a particular command-line option.
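
The RawBlock alternative could look roughly like this, again as a Python sketch over pandoc's JSON shape (`{"t": "RawBlock", "c": [format, text]}`). The function names and the `PASSTHROUGH_FORMATS` set are assumptions made for the example, not pandoc behavior.

```python
def markdown_cell_to_rawblock(cell):
    """Turn an ipynb markdown cell into a pandoc-JSON-style RawBlock."""
    source = cell["source"]
    # nbformat allows "source" to be a single string or a list of line strings.
    text = source if isinstance(source, str) else "".join(source)
    return {"t": "RawBlock", "c": ["markdown", text]}


# Assumed set of output formats that pass raw markdown through unchanged.
PASSTHROUGH_FORMATS = {"markdown", "ipynb"}


def write_rawblock(block, output_format):
    """Emit the raw text for matching formats, and nothing otherwise."""
    fmt, text = block["c"]
    if fmt == "markdown" and output_format in PASSTHROUGH_FORMATS:
        return text
    return ""
```

The empty string returned for other formats is the "no output would appear" behavior mentioned above.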

This would not work with pandoc-citeproc or filters in general, since these modify the AST, and the modifications would be ignored if we just substituted the original markdown.
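
To make the filter problem concrete, here is a toy Python filter over pandoc-JSON-style nodes. It is only an illustration (the `uppercase_filter` name and the sample blocks are made up): it rewrites parsed `Str` elements, but the text held in a `RawBlock` is an opaque string, so it comes out unchanged.

```python
def uppercase_filter(node):
    """Toy filter: uppercase every Str element, recursing through the tree."""
    if isinstance(node, list):
        return [uppercase_filter(n) for n in node]
    if isinstance(node, dict):
        if node.get("t") == "Str":
            return {"t": "Str", "c": node["c"].upper()}
        return {k: uppercase_filter(v) for k, v in node.items()}
    # Plain strings and numbers (including raw markdown text) are left alone.
    return node


blocks = [
    # A normally parsed paragraph: the filter can see and change its Str.
    {"t": "Para", "c": [{"t": "Str", "c": "hello"}]},
    # A markdown cell kept as raw text: there is no Str inside to change.
    {"t": "RawBlock", "c": ["markdown", "hello"]},
]
filtered = uppercase_filter(blocks)
```

The same applies to the stored-attribute variant: a writer that substitutes the stored text would discard whatever the filter did to the parsed blocks.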

Thoughts on this proposal?
@mwouts
@choldgraf

@mwouts


commented Mar 29, 2019

Well, that's a good question! I agree that having the markdown cell reformatted was a bit surprising the first time I saw it. But as you explained, this preserves the text syntax and therefore also the HTML rendering. And that can also be a feature, as the long lines of markdown content can be automatically reformatted into shorter lines, and headers get a consistent style.

Now, I think the --wrap=preserve option already reduces the reformatting considerably. Maybe that's a good start already? Do you think you need to go further than that?

Regarding the implementation, I do not think I would recommend copying the original text in the metadata, for two reasons:

  • Duplicating the original text would also make file history more noisy in version control
  • And it would prevent the user from being able to edit the content of the cell (since on the next conversion it would be overwritten with the original text)
@jgm

Owner Author

commented Mar 30, 2019

@mwouts


commented Mar 30, 2019

I see! Sure, if you implement an option that preserves the content of the cells when converting between Markdown and Jupyter Notebook, I will make that the default for pandoc from Jupytext, as it will make users more confident with the round trip.

@choldgraf


commented Mar 30, 2019

Just want to say I love where this conversation is heading :-)

I'm happy to play around with things and give feedback whenever there are prototypes (on either side)

@ickc

Contributor

commented May 21, 2019

It seems to me that parsing the cells into a RawBlock of markdown (as a command line option) is the cleaner approach. People specifically looking for round trip identity would be using the correct output format anyway (and, as a side benefit, you get a free filter to remove any markdown cells).

Reading elsewhere, it seems that only particular combinations of command line arguments provide round trip identity. In that case, similar to the markdown variants (e.g. multimarkdown as a shortcut that combines multiple behaviors), a particular ipynb variant could be defined to provide round trip identity.
