Language names, again #318

Closed
cabo opened this issue Feb 24, 2016 · 5 comments
@cabo
Contributor

cabo commented Feb 24, 2016

I'm sorry this is coming up again and again, but language names need to be even more flexible.

In the end, they turn up in an HTML5 class name, and the syntax for class names is extremely flexible:
https://www.w3.org/TR/2014/REC-html5-20141028/infrastructure.html#set-of-space-separated-tokens

So language-русский is just as valid as language-c++ or language-asn.1.
This is not theoretical: It comes up in
https://tools.ietf.org/html/draft-iab-xml2rfc-03#section-2.48.4
which will soon be a standard that kramdown-rfc2629 is targeting.

Is there a reason the syntax for languages in fenced code blocks is restricted?
Shouldn't it be [^\s]+?
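
To make the question concrete, here is a minimal Python sketch (not kramdown's actual Ruby code). `class_tokens` splits a class attribute on the HTML5 space characters, and `RESTRICTED` is a hypothetical identifier-style pattern standing in for the current rule, which is not quoted in this thread. Only `[^\s]+` comes from the proposal above.

```python
import re

# HTML5 "space characters": space, tab, line feed, form feed, carriage return.
HTML_SPACE = " \t\n\f\r"

def class_tokens(value):
    """Split a class attribute value into its space-separated tokens."""
    return [tok for tok in re.split(f"[{HTML_SPACE}]+", value) if tok]

# Hypothetical restricted rule (an assumption, not kramdown's real pattern).
RESTRICTED = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")
# The rule proposed above: any run of non-whitespace characters.
PERMISSIVE = re.compile(r"[^\s]+")

def accepted(pattern, name):
    """Return True if the whole language name matches the pattern."""
    return pattern.fullmatch(name) is not None

if __name__ == "__main__":
    print(class_tokens("language-c++ highlight"))
    for name in ["ruby", "c++", "asn.1", "русский"]:
        print(name, accepted(RESTRICTED, name), accepted(PERMISSIVE, name))
```

Under these assumptions, only `ruby` passes the restricted pattern, while all four names pass the permissive one and remain single class tokens.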

@cabo
Contributor Author

cabo commented Feb 24, 2016

CommonMark seems to agree with русский:
http://johnmacfarlane.net/babelmark2/?normalize=1&text=~~~+русский%0Aeasel%0A~~~%0A
and with asn.1:
http://johnmacfarlane.net/babelmark2/?normalize=1&text=~~~+asn.1%0Aeasel%0A~~~%0A

But indeed few other implementations do. Hmm.

(Note that I would require a space before the [^\s]+ language name; the more restricted current one maybe is useful when attached to the ~~~.)

@gettalong gettalong self-assigned this Feb 26, 2016
@gettalong
Owner

@cabo Thanks for bringing this up.

In the kramdown specification I specified exactly which characters an ID name may consist of, but didn't do the same for class names.

Keeping that in mind, not allowing arbitrary characters in class names would actually be a bug. However, the two characters I would not include are the dot and the hash sign, so that chaining class and ID names in attribute lists remains possible.

Any thoughts on this?

And what do you mean by this:

(Note that I would require a space before the [^\s]+ language name; the more restricted current one maybe is useful when attached to the ~~~.)

@gettalong
Owner

I have fixed this in a way that is hopefully futureproof!

gettalong added a commit that referenced this issue Mar 2, 2016
…ntax

This change modifies how kramdown handles class names that are defined
by using the special shortcut syntax in attribute lists or when adding a
language name to a fenced code block.

* When defining a language name for a fenced code block, any
  non-whitespace character except the question mark is accepted. The
  question mark separates the language name itself from URL-like
  options.

* In attribute lists a class name may be defined by using any
  non-whitespace character except the dot and hash characters since
  those are used for chaining together ID and class names without
  whitespace.

Fixes #318
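
The two rules in the commit message can be sketched roughly as follows. This is an illustrative Python approximation, not the kramdown implementation: the function names, the regular expressions, and the example option string `line_numbers=true` are all assumptions, and ID names are treated as permissively as class names here, although the kramdown specification restricts them further.

```python
import re

# Rule 1 (assumed form): a language name is any run of non-whitespace
# characters except "?"; an optional "?" introduces URL-like options.
LANGUAGE_RE = re.compile(r"(?P<name>[^\s?]+)(?:\?(?P<options>\S*))?")

def split_language(info):
    """Return (name, options) for a fence info string, or None if invalid."""
    m = LANGUAGE_RE.fullmatch(info)
    if m is None:
        return None
    return m.group("name"), m.group("options")

# Rule 2 (assumed form): in an attribute list, "." starts a class name and
# "#" starts an ID name; neither character may appear inside a name.
SHORTCUT_RE = re.compile(r"(?P<kind>[.#])(?P<name>[^\s.#]+)")

def parse_shortcuts(token):
    """Split a chained token like '.a#b.c' into (ids, classes), or None."""
    matches = list(SHORTCUT_RE.finditer(token))
    # Reject tokens that are not made up entirely of shortcuts.
    if "".join(m.group(0) for m in matches) != token:
        return None
    ids = [m.group("name") for m in matches if m.group("kind") == "#"]
    classes = [m.group("name") for m in matches if m.group("kind") == "."]
    return ids, classes

if __name__ == "__main__":
    print(split_language("asn.1"))
    print(split_language("ruby?line_numbers=true"))
    print(parse_shortcuts(".русский#intro.note"))
    print(parse_shortcuts(".asn.1"))
```

In this sketch `asn.1` survives intact as a fenced code block language, while `.asn.1` in an attribute list is read as two chained classes, `asn` and `1`, which is why the dot and hash stay reserved there.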
@cabo
Contributor Author

cabo commented Mar 2, 2016

Just tried it -- great!

(My comment about requiring a space was that maybe

~~~@$)(@#)($*(*&

should not be recognized as a code block start. But then, the ~~~ is still pretty unique, so I'm happy with the way this is now.)

(If you really need a # or . in a class name specified in an IAL, use class=.... Allowing these in the language position of a code block is great for C# and ASN.1.)

Looking forward to a release, so I can start referencing the new version from kramdown-rfc2629.

@gettalong
Owner

@cabo Already released version 1.10.0 😄
