Issues with non-breaking space and non-standard end of line #9959

Open
mgajda opened this issue Jul 7, 2024 · 1 comment

mgajda commented Jul 7, 2024

Explain the problem.

Users will increasingly use non-breaking spaces and non-standard end-of-line characters.

These currently break Markdown parsing.

For example, this apparently valid pair of chapter headings is parsed as a single heading followed by a paragraph:

# Good chapter

# Bad chapter

This is because there is a non-breaking space (U+00A0) after # and before "Bad chapter".

The produced output in Try Pandoc from Markdown to HTML:

<h1 id="good-chapter">Good chapter</h1>
<p># Bad chapter</p>

The same issue may occur when creating a list.
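Because a non-breaking space looks identical to a regular space in most editors, it can help to scan a file for it. Here is a minimal Python sketch, not part of pandoc: the helper name `find_nbsp_after_marker` and the marker pattern (ATX heading, bullet, or ordered-list markers with up to three leading spaces) are assumptions made for illustration.

```python
import re

NBSP = "\u00a0"  # non-breaking space

# Hypothetical helper, not part of pandoc. The pattern is an assumption:
# up to three leading spaces, then a heading marker (#..######), a bullet
# (-, *, +) or an ordered-list marker (1. or 1)), followed directly by U+00A0.
_MARKER_THEN_NBSP = re.compile(r"^ {0,3}(?:#{1,6}|[-*+]|\d+[.)])" + NBSP)


def find_nbsp_after_marker(text: str) -> list[int]:
    """Return 1-based numbers of lines whose marker is followed by U+00A0."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if _MARKER_THEN_NBSP.match(line)
    ]


if __name__ == "__main__":
    sample = "# Good chapter\n\n#\u00a0Bad chapter\n-\u00a0item\n"
    print(find_nbsp_after_marker(sample))
```

The reported line numbers point at the headings or list items that would be rendered as plain paragraphs.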

We propose patching the detection of Markdown syntax so that any space character is treated as a space, and any line-break sequence (whether CRLF or plain CR) is treated as an end of line within Markdown syntax.
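Until such a change exists, a similar effect could be approximated by preprocessing the input before it reaches the Markdown reader. The following Python sketch is a workaround under stated assumptions, not pandoc's implementation: the function name `normalize_markdown_source` is hypothetical, only U+00A0 is handled (not other Unicode space characters), and non-breaking spaces are replaced only directly after a heading or list marker so that intentional ones inside prose survive.

```python
import re

NBSP = "\u00a0"  # non-breaking space

# Assumed marker pattern: up to three leading spaces, then a heading marker,
# a bullet, or an ordered-list marker, followed directly by U+00A0.
_MARKER_THEN_NBSP = re.compile(
    r"^( {0,3}(?:#{1,6}|[-*+]|\d+[.)]))" + NBSP,
    flags=re.MULTILINE,
)


def normalize_markdown_source(text: str) -> str:
    """Hypothetical preprocessing step applied before Markdown parsing."""
    # Convert CRLF first, then any remaining bare CR, to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Replace the non-breaking space after a marker with a regular space.
    return _MARKER_THEN_NBSP.sub(r"\1 ", text)


if __name__ == "__main__":
    raw = "# Good chapter\r\r#\u00a0Bad chapter\r\n-\u00a0a\u00a0b"
    print(repr(normalize_markdown_source(raw)))
```

Restricting the replacement to the position after a marker is a deliberate design choice: a blanket replacement of every U+00A0 would also alter non-breaking spaces that authors placed in running text on purpose.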

Pandoc version?

Pandoc 3.2.1 in https://pandoc.org/try, with arguments: pandoc --from markdown --to html5 --no-highlight.

@mgajda mgajda added the bug label Jul 7, 2024
Owner

jgm commented Jul 7, 2024

I think it makes sense to require a standard space after #. I'm not sure why someone would insert a nonbreaking space there.
