Discussion: support round trips? #892
Comments
If your preprocessor received and emitted an AST instead of markdown, would that make it easier to make transformations? Also, output from the preprocessor could have AST fragments that are markdown, to lean on mdbook to do more of the rendering.
The preprocessing itself is actually fairly straightforward. An AST would be nice in some ways, but I have been doing transformations with streams of events from pulldown-cmark for a long time, so that part isn’t really the issue. It’s more that the stream of

A concrete example: I am taking input like this:

Normal text, yay.
> Note: This is a callout, more or less.
> It can run across lines.
Back to normal text!

and rewriting it into this, so that it has the correct HTML semantics:

Normal text, yay.
<section class="note" aria-role="note">
Note: This is a callout, more or less.
It can run across lines.
</section>
Back to normal text!

If I don’t take care to insert newlines when generating the HTML for the

Normal text, yay.
<section class="note" aria-role="note">
Note: This is a callout, more or less.
It can run across lines.
</section>
Back to normal text!

But because of the rules around block elements, Markdown treats that as plain text within the

This is all quite doable, but after dealing with it a couple of times (lots of tests for edge cases now!) and reading through the issue tracker on
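A minimal Rust sketch of the newline handling described above, assuming the CommonMark rule that an HTML block opened by a tag such as <section> only ends at a blank line. The helper name render_note is hypothetical and is not part of pulldown-cmark or mdBook, and exactly which blank lines a given preprocessor needs is an assumption here.

```rust
/// Hypothetical helper (not a pulldown-cmark or mdBook API): wrap callout
/// lines in a <section>, separating the tags from the content with blank
/// lines. Assumption: the HTML block ends at the first blank line, so the
/// body and the following paragraph are parsed as Markdown again.
fn render_note(lines: &[&str]) -> String {
    // Blank line after the opening tag ends the HTML block right away.
    let mut out = String::from("<section class=\"note\" aria-role=\"note\">\n\n");
    for line in lines {
        out.push_str(line);
        out.push('\n');
    }
    // Blank line before the closing tag ends the paragraph; the one after
    // it keeps the next paragraph out of the closing tag's HTML block.
    out.push_str("\n</section>\n\n");
    out
}
```

Splicing the returned string between "Normal text, yay.\n\n" and "Back to normal text!\n" keeps each piece in its own block. Dropping the trailing blank line is the case where the following text is absorbed into the HTML block.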
There is a discussion about roundtripping in the context of fuzz testing at Byron/pulldown-cmark-to-cmark#55 (cc @mgeisler). The last time it was attempted, the fuzzer was failing due to the lossiness of the text->markdown->text conversion. It has probably improved a lot with the latest changes in both libraries. I just attempted to patch the PR and run it locally, and the fuzzer failed again.

A useful feature of this roundtripping would be finding implementation bugs in both libraries. Running the PR locally, for instance, the fuzzer failed with input:
It does not seem to conform to the spec due to an extra newline by
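The round-trip property that such a fuzzer checks can be sketched in Rust as follows. This is a generic illustration only: round_trips, toy_parse, toy_render and lossy_render are hypothetical names, and the toy parser (paragraphs split on blank lines) stands in for the real libraries, whose APIs are not used here. The check compares the events from the first and second parse rather than the raw text, which is one reasonable reading of the property and an assumption on my part.

```rust
/// Parse, render, parse again, and compare the two event lists.
fn round_trips<E, P, R>(input: &str, parse: P, render: R) -> bool
where
    E: PartialEq,
    P: Fn(&str) -> Vec<E>,
    R: Fn(&[E]) -> String,
{
    let first = parse(input);
    let rendered = render(&first);
    let second = parse(&rendered);
    first == second
}

/// Toy stand-in parser: one "event" per blank-line-separated paragraph.
fn toy_parse(src: &str) -> Vec<String> {
    src.split("\n\n")
        .map(|p| p.trim().to_string())
        .filter(|p| !p.is_empty())
        .collect()
}

/// Toy renderer that preserves paragraph breaks.
fn toy_render(events: &[String]) -> String {
    events.join("\n\n")
}

/// Toy renderer that drops the blank line between paragraphs.
fn lossy_render(events: &[String]) -> String {
    events.join("\n")
}
```

With these stand-ins, round_trips("one\n\ntwo\n", toy_parse, toy_render) holds, while swapping in lossy_render fails the check because the two paragraphs merge on the second parse.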
+1 – the parser and its ability to modify events is so good. There is a nice opportunity here.
pulldown-cmark-to-cmark is a really handy crate for doing things like writing mdBook preprocessors, which I have been doing a bit of while working on The Rust Programming Language, but I have noticed that it is challenging for that crate to match the input precisely, and there are a fair number of subtle bugs that are (a) difficult for them to fix and therefore (b) require special handling in preprocessors using it, e.g. to manually re-insert newlines.
This is not a bug report, though, as I don’t think that kind of round-tripping was part of the design here! Rather, it is intended to start a discussion, on two axes:
Events into a stream of Events (e.g. when rewriting in a preprocessor, extending behavior, etc.)?

I think there are probably a bunch of open design questions there beyond just what would have to change, so, as I said: opening this for discussion and not assuming it is something the library should do.