Character references in link definition labels #616

wooorm · 2019-10-01T18:59:52Z

Character references are allowed everywhere, except in fenced code, indented code, or code spans.
They represent their resolved character, not syntax.

There’s even an example (318) that uses them in link definition destinations and link definition titles.

But, the following does not resolve into a link:

[&copy;]: example.com

[©][]

I interpret the spec as saying that it should resolve, but the dingus doesn’t resolve it.
This may be a bug in the dingus implementation, rather than in the spec.

Crissov · 2019-10-02T07:05:58Z

I agree that, to meet author expectation or intuition, character references of all kinds should be normalized in link labels (and elsewhere), especially since letter case is being ignored. Unfortunately, only a single implementation, Maruku, does it this way, although most CM-conformant parsers (and Pandoc) will happily convert any HTML entities to plain characters on output.

One label matches another just in case their normalized forms are equal.
To normalize a label,
strip off the opening and closing brackets,
perform the Unicode case fold,
strip leading and trailing whitespace and
collapse consecutive internal whitespace to a single space.
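
As a rough illustration of the quoted normalization steps, here is a minimal Python sketch. The function name `normalize_label` and the constant `_WS` are hypothetical, and the set of whitespace characters is an assumption about what the spec means by "whitespace"; this is not code from any CommonMark implementation.

```python
import re

# Assumed whitespace set: space, tab, newline, line tabulation,
# form feed, carriage return.
_WS = " \t\n\v\f\r"


def normalize_label(label: str) -> str:
    """Normalize a link label following the four quoted steps."""
    # 1. Strip off the opening and closing brackets.
    if label.startswith("[") and label.endswith("]"):
        label = label[1:-1]
    # 2. Perform the Unicode case fold.
    label = label.casefold()
    # 3. Strip leading and trailing whitespace.
    label = label.strip(_WS)
    # 4. Collapse consecutive internal whitespace to a single space.
    return re.sub("[" + _WS + "]+", " ", label)


# Neither character references nor backslash escapes are resolved,
# so these pairs of labels stay distinct after normalization.
print(normalize_label("[&copy;]") == normalize_label("[©]"))     # False
print(normalize_label("[foo\\!]") == normalize_label("[foo!]"))  # False
```

Under these steps alone, `[&copy;]` and `[©]` never compare equal, which is consistent with the behaviour reported at the top of the thread.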

Note that matching is performed on normalized strings,
not parsed inline content.
So the following does not match,
even though the labels define equivalent inline content:

Example 541
[bar][foo\!]

[foo!]: /url

The rules for the link text are the same as with inline links.

An inline link […]
character references in the destination will be parsed into the corresponding Unicode code points, as usual.

character references are recognized in any context besides code spans or code blocks,
including URLs, link titles, […]

link label […]
The contents of the first link label are parsed as inlines, which are used as the link’s text.

The link text may contain inline content: [Example 526]

mgeier · 2019-10-02T08:01:03Z

This might be related to #572.

wooorm · 2020-07-04T17:50:13Z

Btw, I think this should be true for character escapes too:

[&copy;]: a.com
[\!]: b.com

Both should link: [©], [!]

Yields:

wooorm · 2020-07-04T18:01:22Z

@jgm Is this something you agree with? I can create a PR to clarify the docs.

vassudanagunta · 2020-07-06T20:27:27Z

I don't think this needs clarification of the docs so much as a bug report against CommonMark.js.

That said, every single Markdown implementation but one fails this test.
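
For reference, the behaviour wooorm proposes above (resolving character references and backslash escapes in labels before matching) could be sketched roughly as follows in Python. All names here (`resolve_label_text`, `labels_match`, `_normalize`) are hypothetical, the labels are passed without their brackets, and the reference syntax in the regular expression is a simplified assumption. Note that this deliberately differs from the quoted spec text, which matches on normalized strings and treats `[foo\!]` and `[foo!]` as different labels.

```python
import html
import re
from html.entities import html5

# ASCII punctuation characters that a backslash may escape.
_PUNCT = re.escape("!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~")

_TOKEN = re.compile(
    r"\\([" + _PUNCT + r"])"                   # group 1: backslash escape
    r"|&#([0-9]{1,7}|[xX][0-9a-fA-F]{1,6});"   # group 2: numeric reference
    r"|&([A-Za-z][A-Za-z0-9]*);"               # group 3: named reference
)


def _resolve(match: re.Match) -> str:
    escaped, _numeric, name = match.groups()
    if escaped is not None:
        return escaped                          # "\!" becomes "!"
    if name is not None:
        # Unknown names are left untouched.
        return html5.get(name + ";", match.group(0))
    return html.unescape(match.group(0))        # numeric reference


def resolve_label_text(text: str) -> str:
    """Replace escapes and character references with their characters."""
    return _TOKEN.sub(_resolve, text)


def _normalize(text: str) -> str:
    # Case fold, trim, and collapse whitespace (assumed whitespace set).
    ws = " \t\n\v\f\r"
    return re.sub("[" + ws + "]+", " ", text.casefold().strip(ws))


def labels_match(a: str, b: str) -> bool:
    """Compare two label contents after resolving and normalizing both."""
    return _normalize(resolve_label_text(a)) == _normalize(resolve_label_text(b))


print(labels_match("&copy;", "©"))  # True
print(labels_match("\\!", "!"))     # True
```

Resolving both kinds of token in a single pass keeps an escaped ampersand such as `\&copy;` literal instead of decoding it in a second step.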
