Spaces around info string? #60

Closed
z0al opened this issue Oct 22, 2017 · 5 comments

Comments

@z0al

commented Oct 22, 2017

The spec says:

The line with the opening code fence may optionally contain some text following the code fence; this is trimmed of leading and trailing spaces and called the info string.

What does "spaces" actually mean in this context? Spaces only, or spaces and tabs?

I usually check my assumptions using the GitHub Markdown API, but this time I got strange output!

Output:

<pre lang="hey"><code>print 'Hello world!'
</code></pre>

Input:

```·→hey
print 'Hello world!'
```

Could someone explain what just happened?


NOTE
· represents space
→ represents tab

@kivikakk

Member

commented Oct 30, 2017

We use an option called CMARK_OPT_GITHUB_PRE_LANG in this fork to output code blocks in a style that github.com expects:

https://github.com/github/cmark/blob/07fe00f38e78846bc0760cb02a3b1441dffc9b6d/src/html.c#L186-L192
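To make the difference in output style concrete, here is a minimal sketch in C of the two shapes of opening tag. It is not the code from html.c: `render_code_open` and its parameters are hypothetical names, and the language string is assumed to be already trimmed and HTML-escaped.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of cmark: writes the opening tags for a
 * fenced code block into `out`. `lang` is assumed to be the first word of
 * the info string, already trimmed and HTML-escaped.
 * Returns what snprintf returns. */
static int render_code_open(char *out, size_t cap, const char *lang,
                            int github_pre_lang) {
    if (lang == NULL || lang[0] == '\0') {
        /* No info string: both styles emit the same tags. */
        return snprintf(out, cap, "<pre><code>");
    }
    if (github_pre_lang) {
        /* Style seen in the output above: language on the <pre> element. */
        return snprintf(out, cap, "<pre lang=\"%s\"><code>", lang);
    }
    /* Style used in the CommonMark spec examples: class on <code>. */
    return snprintf(out, cap, "<pre><code class=\"language-%s\">", lang);
}
```

Calling `render_code_open(buf, sizeof buf, "hey", 1)` yields the `<pre lang="hey"><code>` form shown in the issue, while passing `0` yields `<pre><code class="language-hey">`.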

That aside, though, you're right that the spec is inconsistent; it says "trimmed of leading and trailing spaces" but the code calls cmark_strbuf_trim:

https://github.com/github/cmark/blob/07fe00f38e78846bc0760cb02a3b1441dffc9b6d/src/blocks.c#L312-L316

which trims anything matching cmark_isspace. I'll open a PR to the spec upstream to change this to "trimmed of leading and trailing whitespace", so that the spec matches what the code does.
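The gap between the two readings can be sketched in C as follows. This is an illustration, not the cmark source: `trim_if`, `is_space_only` and `is_whitespace` are hypothetical names, and the standard `isspace` in the default C locale is used as a stand-in for cmark_isspace, on the assumption that both accept space, tab, newline, vertical tab, form feed and carriage return.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Trim `s` in place: drop leading and trailing bytes for which `pred`
 * returns nonzero. Returns the new length. */
static size_t trim_if(char *s, int (*pred)(int)) {
    assert(s != NULL);
    size_t end = strlen(s);
    size_t start = 0;
    while (start < end && pred((unsigned char)s[start])) start++;
    while (end > start && pred((unsigned char)s[end - 1])) end--;
    memmove(s, s + start, end - start);
    s[end - start] = '\0';
    return end - start;
}

/* "Spaces" read literally: only U+0020. */
static int is_space_only(int c) { return c == ' '; }

/* Stand-in for cmark_isspace (assumption): the C locale's isspace. */
static int is_whitespace(int c) { return isspace(c); }
```

With the info string from the issue, a space followed by a tab followed by `hey`, `trim_if(s, is_space_only)` leaves the tab in place, whereas `trim_if(s, is_whitespace)` leaves just `hey`, which is consistent with the `lang="hey"` in the reported output.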

kivikakk added a commit to kivikakk/commonmark-spec that referenced this issue Oct 31, 2017
Info string is trimmed of all whitespace
As noted in github/cmark-gfm#60, the info string is not
only trimmed of "spaces" (U+0020) but indeed all whitespace.  Update the spec to
reflect this.
@kivikakk

Member

commented Oct 31, 2017

@z0al

Author

commented Oct 31, 2017

@kivikakk Thanks

Well, another space-related thing is this:
https://github.com/github/cmark/blob/07fe00f38e78846bc0760cb02a3b1441dffc9b6d/test/spec.txt#L542-L546
I think that "each followed optionally by any number of spaces" part isn't accurate, because AFAIK the source code actually accepts both spaces and tabs (I don't understand how re2c works, though 😄 )
https://github.com/github/cmark/blob/07fe00f38e78846bc0760cb02a3b1441dffc9b6d/src/scanners.re#L274-L278

P.S: should I open issues like this one in the original cmark repo?

@kivikakk

Member

commented Nov 1, 2017

You're right, this should say "spaces or tabs" specifically, not "spaces" (which is only U+0020 SPACE) nor "whitespace" (which would also include newlines and other kinds of tabs/feeds). I'll add this to my PR at commonmark/commonmark-spec#505.
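For illustration, the three character classes and a hand-written thematic break check might look like this in C. This is only a sketch of the rule as worded with "spaces or tabs", not the re2c scanner from scanners.re; all function names are hypothetical, and the input is assumed to be a single line with its line ending already removed.

```c
#include <stdbool.h>
#include <stddef.h>

/* "Spaces": U+0020 only. */
static bool is_space(char c) { return c == ' '; }

/* "Spaces or tabs": U+0020 and U+0009. */
static bool is_space_or_tab(char c) { return c == ' ' || c == '\t'; }

/* "Whitespace": also newline, vertical tab, form feed, carriage return. */
static bool is_whitespace(char c) {
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\v' || c == '\f' || c == '\r';
}

/* Hypothetical check: up to three spaces of indentation, then three or
 * more of the same marker (*, - or _), each optionally followed by any
 * number of spaces or tabs. `line` has no trailing line ending. */
static bool is_thematic_break(const char *line) {
    size_t i = 0;
    while (i < 3 && is_space(line[i])) i++;
    char marker = line[i];
    if (marker != '*' && marker != '-' && marker != '_') return false;

    size_t count = 0;
    for (; line[i] != '\0'; i++) {
        if (line[i] == marker) count++;
        else if (!is_space_or_tab(line[i])) return false;
    }
    return count >= 3;
}
```

Under this sketch, markers separated by tabs are accepted just like markers separated by spaces, while a form feed between markers is rejected even though `is_whitespace` would accept it.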

I think you definitely could open these on the upstream cmark repo, or directly on the CommonMark repo itself like I'm doing if you wanted to call it out as a problem with the spec.

jgm added a commit to commonmark/commonmark-spec that referenced this issue Nov 5, 2017
Info string is trimmed of all whitespace (#505)
* Info string is trimmed of all whitespace

As noted in github/cmark-gfm#60, the info string is not
only trimmed of "spaces" (U+0020) but indeed all whitespace.  Update the spec to
reflect this.

* "spaces" => "spaces or tabs" after thematic breaks
@kivikakk

Member

commented Nov 9, 2017

commonmark/commonmark-spec#505 got merged, so these issues are hopefully fixed 👍

@kivikakk kivikakk closed this Nov 9, 2017
