Enabling mmd_title_block causes YAML title blocks to be mis-parsed #2026
This involves a fairly complicated dance with a Pandoc "filter" module in order to get all of the metadata to be visible in the output, but means that all metadata formats supported by Pandoc are available without the need for any additional Python modules. It also means strings in metadata will be processed as Markdown. NOTE: Thanks to jgm/pandoc#2026 and backward compatibility constraints, this change defaults to enabling 'mmd_title_block' and *disabling* 'pandoc_title_block' and 'yaml_metadata_block'. Moreover, putting either +pandoc_title_block or +yaml_metadata_block in PANDOC_EXTENSIONS will cause mmd_title_block to be disabled.
I can reproduce this with latest pandoc:
No idea why offhand, but I will have to look into it.
Well, I think I know why that happens. `echo -e "http://google.com" | pandoc -f markdown+mmd_title_block -t json` produces `[{"unMeta":{"http":{"t":"MetaBlocks","c":[{"t":"Plain","c":[{"t":"Str","c":"//google.com"}]}]}}},[]]`, i.e. "http" is interpreted as a metadata key, and the rest as the metadata value. Since metadata values are parsed with the same parser... yeah. No idea how to fix this properly, though. How does multimarkdown handle this, I wonder?
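To make the failure mode above concrete, here is a minimal Python sketch. It is not pandoc's parser (which is written in Haskell); `naive_title_block` and `NAIVE_LINE` are hypothetical names, and the sketch simply assumes that a title-block line is anything of the form `key:value`, with no whitespace required after the colon.

```python
import re

# Hypothetical, simplified model of a MultiMarkdown-style title block line:
# a key, a colon, then the value. No whitespace is required after the colon,
# which is the assumption being illustrated here.
NAIVE_LINE = re.compile(r"^(\w[\w -]*):(.*)$")


def naive_title_block(text):
    """Collect leading `key:value` lines until the first non-matching line."""
    meta = {}
    for line in text.splitlines():
        match = NAIVE_LINE.match(line)
        if match is None:
            break
        meta[match.group(1).strip().lower()] = match.group(2).strip()
    return meta


# Text that starts with a URL is read as a metadata block with key "http".
print(naive_title_block("http://google.com"))  # {'http': '//google.com'}
```

If metadata values are themselves run through the same kind of parser, a value that is a bare URL would be consumed in the same way, which matches the JSON output quoted above.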
Possibly related, then:
If I'm reading the YAML spec correctly,
@zackw, not sure about the YAML spec, but it's parsed by yaml, so that's not directly related to pandoc itself. The Multimarkdown title block is parsed by pandoc, however. Fix options would include:
In all honesty, I think both should be implemented...
+++ Nikolay Yakimov [Mar 27 15 16:18 ]:
Multimarkdown does not parse metadata values at all - so you can't e.g. have italics in a metadata value. (Not the greatest feature.)
Yes, good point: try putting the whole URL in single quotes. Colons in YAML values normally need to be escaped.
+++ Zack Weinberg [Mar 27 15 16:24 ]:
Disable all metadata block extensions when parsing metadata field values. Issue jgm#2026
Require space after key-value delimiter colon in mmd title block. Issue jgm#2026
@jgm, I mean, how does mmd handle markdown that begins with a URI? Does it interpret it as a title block? Anyway, I've pushed commits for my fix proposals. You can cherry-pick one or both, or I can create a PR.
Oh, and by the way, single/double quotes won't work, since those are stripped away by the YAML parser.
+++ Nikolay Yakimov [Mar 27 15 17:01 ]:
It definitely requires a space after the colon.
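As a rough illustration of why requiring whitespace after the colon helps, the following Python sketch uses a stricter line pattern. It is an assumption-laden model, not pandoc's or MultiMarkdown's actual code; `STRICT_LINE` and `strict_title_block` are hypothetical names.

```python
import re

# Hypothetical stricter rule: the colon must be followed by at least one
# space or tab before the value starts.
STRICT_LINE = re.compile(r"^(\w[\w -]*):[ \t]+(.*)$")


def strict_title_block(text):
    """Collect leading `key: value` lines until the first non-matching line."""
    meta = {}
    for line in text.splitlines():
        match = STRICT_LINE.match(line)
        if match is None:
            break
        meta[match.group(1).strip().lower()] = match.group(2).strip()
    return meta


# A bare URL no longer looks like a metadata line...
print(strict_title_block("http://google.com"))   # {}
# ...while an ordinary key/value pair still does.
print(strict_title_block("Title: My document"))  # {'title': 'My document'}
```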
+++ Nikolay Yakimov [Mar 27 15 17:01 ]:
I think both fixes are needed. If you can create a PR, that would be helpful. (Btw, I've confirmed experimentally that multimarkdown does not allow a
Ok, I've created #2030. I'll see about requiring a value in a minute.
Require space after key-value delimiter colon in mmd title block. Issue jgm#2026 Amend: parsec's `spaces` include newlines, but we don't want that. Had to make custom `spaceNoNewline` parser here
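The newline problem mentioned in this commit message can be sketched with regular expressions in Python. This is only an analogy, not the Haskell code from the commit: `\s` stands in for Parsec's `spaces` (which skips any whitespace, including newlines), and `[ \t]` stands in for a `spaceNoNewline`-style parser. The pattern names and the sample block are made up for illustration.

```python
import re

# Analogue of skipping with Parsec's `spaces`: \s also matches "\n".
GREEDY = re.compile(r"(\w+):\s+(.*)")
# Analogue of a `spaceNoNewline`-style parser: only spaces and tabs are skipped.
NO_NEWLINE = re.compile(r"(\w+):[ \t]+(.*)")

# A key with an empty value, followed by a second key on the next line.
block = "title:\nauthor: Jane"

# The newline is swallowed as "space after the colon", so the next line
# is mistaken for the value of `title`.
print(GREEDY.match(block).groups())  # ('title', 'author: Jane')

# With newlines excluded, the first line is not treated as `key: value`.
print(NO_NEWLINE.match(block))       # None
```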
Skipping spaces is more convoluted than I initially thought, when considering empty value, since Parsec's
@lierdakil, I probably would have just used
Closed by 86a4442
Consider
This is unambiguously a YAML header, and the manual says YAML headers take precedence over MMD headers. But watch what happens:
Not all values are affected. URLs consistently seem to get eaten, and strings containing no punctuation consistently seem to survive, but don't quote me on that.
I have pandoc 1.12.4.2, in case that matters.