Proposal: lossless markdown cells in ipynb #5408
Pandoc currently stores markdown cells in its native format, so when they're rendered back to ipynb, the formatting can change (in semantically insignificant ways). This might be undesirable for some purposes.
We could have an option that causes the ipynb reader to store the original markdown text in a special attribute of the markdown cell div. The ipynb writer and markdown writer could check this attribute, and if it's present, render the stored text as raw markdown, exactly as in the source, rather than rendering the contents of the cell in the normal way.
I guess the attribute would be fairly harmless in other output formats, but we could automatically run a filter to strip it out for formats other than markdown or ipynb, I suppose.
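To make the attribute idea concrete, here is a minimal sketch in Python that works on pandoc's JSON AST as plain dicts. It assumes markdown cells are Divs with the classes `cell` and `markdown`; the attribute name `source` and the helper names are hypothetical, chosen only for illustration, and nothing here is an existing pandoc option.

```python
# Sketch only: "source" is a hypothetical attribute name, not something pandoc defines.
SOURCE_KEY = "source"


def is_markdown_cell(block):
    """True for a Div carrying both the "cell" and "markdown" classes."""
    if block.get("t") != "Div":
        return False
    _ident, classes, _kvs = block["c"][0]
    return "cell" in classes and "markdown" in classes


def stored_source(block):
    """Return the stored original markdown of a cell Div, or None if absent."""
    if not is_markdown_cell(block):
        return None
    for key, value in block["c"][0][2]:
        if key == SOURCE_KEY:
            return value
    return None


def strip_source(blocks):
    """Return a copy of top-level `blocks` with the stored-source attribute
    removed from markdown cells (what a clean-up filter for other formats
    might do). The input blocks are not mutated."""
    out = []
    for block in blocks:
        if is_markdown_cell(block):
            (ident, classes, kvs), content = block["c"]
            kvs = [[k, v] for k, v in kvs if k != SOURCE_KEY]
            block = {"t": "Div", "c": [[ident, classes, kvs], content]}
        out.append(block)
    return out


# A markdown cell whose original text used a setext-style header.
cell = {
    "t": "Div",
    "c": [
        ["", ["cell", "markdown"], [[SOURCE_KEY, "Title\n=====\n"]]],
        [{"t": "Header", "c": [1, ["title", [], []], [{"t": "Str", "c": "Title"}]]}],
    ],
}
```

A markdown or ipynb writer would call something like `stored_source(cell)` and emit the result verbatim when it is not `None`; for any other output format, `strip_source` would run first so the attribute never reaches the output.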
Alternatively, the option could cause the ipynb reader to parse markdown cells into raw markdown blocks (a RawBlock of markdown).
This would not work with pandoc-citeproc or filters in general, since these modify the AST, and the modifications would be ignored if we just substituted the original markdown.
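The conflict with filters can be shown with a toy example. This is only a sketch: the filter and the writer below are hypothetical stand-ins written for illustration, not pandoc's own code, and the "writer" only understands `Para`, `Str` and `Space`.

```python
def render_inlines(inlines):
    """Render Str and Space inlines to plain text (other inlines are ignored)."""
    parts = []
    for inline in inlines:
        if inline["t"] == "Str":
            parts.append(inline["c"])
        elif inline["t"] == "Space":
            parts.append(" ")
    return "".join(parts)


def uppercase_filter(paras):
    """Toy filter: return new Para blocks with every Str uppercased."""
    out = []
    for para in paras:
        inlines = [
            {"t": "Str", "c": i["c"].upper()} if i["t"] == "Str" else i
            for i in para["c"]
        ]
        out.append({"t": "Para", "c": inlines})
    return out


def write_cell(paras, original=None):
    """Toy writer: emit the stored original text verbatim if there is one,
    otherwise render the (possibly filtered) AST."""
    if original is not None:
        return original
    return "\n\n".join(render_inlines(p["c"]) for p in paras)


original = "hello   world"  # source text with irregular spacing
paras = [
    {"t": "Para", "c": [{"t": "Str", "c": "hello"}, {"t": "Space"}, {"t": "Str", "c": "world"}]}
]
filtered = uppercase_filter(paras)

lossless = write_cell(filtered, original)  # original text wins, filter's change is lost
normal = write_cell(filtered)              # filter applied, but spacing is normalized
```

Here `lossless` keeps the source byte for byte but silently discards what the filter did, while `normal` reflects the filter but reformats the text. Any lossless mode has to pick one of the two for each cell.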
Well, that's a good question! I agree that having the markdown cell reformatted was a bit surprising the first time I saw it. But as you explained, this preserves the text syntax and therefore also the HTML rendering. And that can also be a feature, as the long lines of markdown content can be automatically reformatted into shorter lines, and headers get a consistent style.
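Now, I think the `--wrap=preserve` option already reduces the reformatting considerably. Maybe that's a good start already? Do you think you need to go further than that?
Regarding the implementation, I do not think I would recommend copying the original text in the metadata, for two reasons:
- Duplicating the original text would also make file history noisier in version control
- And it would prevent the user from being able to edit the content of the cell (since on the next conversion it would be overwritten with the original text)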
Marc Wouts <firstname.lastname@example.org> writes:
Now, I think the `--wrap=preserve` option already reduces the reformatting considerably. Maybe that's a good start already? Do you think you need to go further than that?
I guess that's my question for you. I can imagine that someone who goes back and forth a lot might like more consistency in the markdown cells, to avoid spurious changes. But maybe it's not an issue in practice.
Regarding the implementation, I do not think I would recommend copying the original text in the metadata, for two reasons:
- Duplicating the original text would also make file history noisier in version control
- And it would prevent the user from being able to edit the content of the cell (since on the next conversion it would be overwritten with the original text)
To be clear, my idea was to store this in the metadata in pandoc's AST, but not to render this metadata in markdown or ipynb.
It seems to me that setting it as a RawBlock of markdown is a cleaner approach (as a command-line option). People specifically looking for round-trip identity would be using the correct output format anyway (and, as a side benefit, they get a free filter to remove any markdown cells).
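As a rough sketch of the "free filter" side effect, again in Python over the JSON AST: a raw block is only emitted by a writer whose format matches, so raw markdown cells vanish from other outputs. The helper names are hypothetical, and `drop_foreign_raw` only imitates that behavior for illustration; it is not how pandoc implements it.

```python
def raw_markdown_cell(text):
    """Build a markdown cell Div whose only content is a raw markdown block."""
    return {
        "t": "Div",
        "c": [["", ["cell", "markdown"], []], [{"t": "RawBlock", "c": ["markdown", text]}]],
    }


def drop_foreign_raw(blocks, target_format):
    """Keep RawBlocks only when their format equals `target_format`,
    recursing into Divs. Other block types pass through unchanged."""
    kept = []
    for block in blocks:
        if block["t"] == "RawBlock":
            if block["c"][0] == target_format:
                kept.append(block)
        elif block["t"] == "Div":
            attr, content = block["c"]
            kept.append({"t": "Div", "c": [attr, drop_foreign_raw(content, target_format)]})
        else:
            kept.append(block)
    return kept


doc = [raw_markdown_cell("# Title\n")]
for_markdown = drop_foreign_raw(doc, "markdown")  # cell text survives verbatim
for_html = drop_foreign_raw(doc, "html")          # cell is left empty
```

With a markdown target the cell text comes through untouched; with any other target only an empty cell Div is left, which is the cell-removal effect mentioned above.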
Reading elsewhere, it seems that only particular combinations of command-line arguments would provide round-trip identity. In that case, similar to the markdown variants (e.g. multimarkdown as a shortcut that combines multiple behaviors), a particular ipynb variant could be defined to provide round-trip identity.