Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fenced code block parsing is not CommonMark compliant #410

Closed
pjeby opened this issue Nov 27, 2017 · 4 comments
Closed

Fenced code block parsing is not CommonMark compliant #410

pjeby opened this issue Nov 27, 2017 · 4 comments

Comments

@pjeby
Copy link

pjeby commented Nov 27, 2017

blackfriday rejects opening code fences that contain more than one word, unless the words are enclosed in { }. But CommonMark explicitly allows multiple words in opening fences, as can be seen at e.g. this example in the CommonMark spec. (The only character not allowed in a code fence "info string" is a backquote.)

See also this related issue on gitea, and sample renderings via Gitea and Github of a markdown document containing such multi-word opening fences. Gitea's rendering breaks because blackfriday does not recognize them as fences, while Github's rendering uses the first word to select the language for highlighting (as shown in the example from the CommonMark spec).

As far as I can tell, the issue applies to both v1 and v2.

@pjeby
Copy link
Author

pjeby commented Nov 27, 2017

Quick question -- if I were going to do a PR to fix this, should I submit it against v1 or v2? I notice that both v1 and v2 test suites lack coverage for the case of a multi-word opening fence, so in principle at least, it isn't a backward-incompatible change. 😄

@rtfb
Copy link
Collaborator

rtfb commented Nov 27, 2017

v2. v1 is only kept for backward compatibility, all new stuff should go to v2.

CommonMark compliance is still nowhere near halfway started (I only added some tests on this branch). I was thinking to reap some low-hanging whitespace incompatibility fruits on rendering side, then introduce a flag for CommonMark compatibility and start working on parse-side incompatibilities like this one.

So yeah, if you're interested in getting started with it, you're very welcome :-). Submit a PR against v2 and let's hash out the details there. (Or here, if you feel like there's more to discuss before diving into code).

@pjeby
Copy link
Author

pjeby commented Nov 27, 2017

My main question is whether this should be considered a bug fix -- i.e., safe to introduce with or without CommonMark, since there isn't any sane markdown source I can think of where the current fence rejection is a feature.

If that's the case, the fix would be pretty easy and backportable to v1 as well. (Gitea uses v1 AFAICT, and they're my reason for investigating this in the first place.)

The specifics of the fix would be to add tests for the desired behavior, and change this chunk of code to just skip to the end of the line instead of failing when non-whitespace is detected.

But yeah, what I'm proposing is to make those changes, without putting them behind a CommonMark flag or the test suite, just making it forward compatible. And I'd also ideally like to backport the bug fix to v1 so that Gitea can benefit.

Thoughts?

garfieldnate added a commit to garfieldnate/blackfriday that referenced this issue Apr 21, 2018
According to the common mark standard, code fence info strings can be anything,
not just single words. Update the tests and parser accordingly.

The formatter already expected an info string with a language and HTML classes,
so this does not need to change. Update the LaTeX formatter to take the first
word of the info string as the language.

Fixes russross#410 (in v1).
@garfieldnate
Copy link
Contributor

This is a pretty important feature to me. I'm migrating from Jekyll to Hugo and the code fence plugin for Jekyll accepted longer info strings, so I have to go through my posts and delete lots of information.

garfieldnate added a commit to garfieldnate/blackfriday that referenced this issue Apr 21, 2018
According to common mark, the info string for a fenced code block can be any
non-whitespace string, so adjust the code to read a full string instead of
just the syntax name.

Fixes russross#410 in v2.
garfieldnate added a commit to garfieldnate/blackfriday that referenced this issue Apr 21, 2018
According to common mark, the info string for a fenced code block can be any
non-whitespace string, so adjust the code to read a full string instead of
just the syntax name.

Fixes russross#410 in v2.
@rtfb rtfb closed this as completed in #448 Apr 28, 2018
rtfb pushed a commit that referenced this issue Apr 28, 2018
* Accept info strings in code fences

According to the common mark standard, code fence info strings can be anything,
not just single words. Update the tests and parser accordingly.

The formatter already expected an info string with a language and HTML classes,
so this does not need to change. Update the LaTeX formatter to take the first
word of the info string as the language.

Fixes #410 (in v1).

* Don't output whole info string as code classes

This follows the common mark specification.

* run go fmt
rtfb pushed a commit that referenced this issue Apr 28, 2018
* Add full info string in fenced code blocks

According to common mark, the info string for a fenced code block can be any
non-whitespace string, so adjust the code to read a full string instead of
just the syntax name.

Fixes #410 in v2.

* run go fmt
willdollman pushed a commit to willdollman/blackfriday that referenced this issue Feb 27, 2021
* Accept info strings in code fences

According to the common mark standard, code fence info strings can be anything,
not just single words. Update the tests and parser accordingly.

The formatter already expected an info string with a language and HTML classes,
so this does not need to change. Update the LaTeX formatter to take the first
word of the info string as the language.

Fixes russross#410 (in v1).

* Don't output whole info string as code classes

This follows the common mark specification.

* run go fmt
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants