Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle Parenthesis in links properly #358

Closed
wants to merge 1 commit into from

Conversation

waseem
Copy link

@waseem waseem commented May 12, 2017

Links with parenthesis (like wikipedia links) are not correct when using autolinks

Properly handle parenthesis in links.

Fixes #291

Links with parenthesis (like wikipedia links) are not correct when using autolinks

Properly handle parenthesis in links.

Fixes russross#291
@@ -570,13 +570,16 @@ func TestInlineLink(t *testing.T) {

// no closing ), autolinking detects the url next
"[disambiguation](http://en.wikipedia.org/wiki/Disambiguation_(disambiguation) is the",
"<p>[disambiguation](<a href=\"http://en.wikipedia.org/wiki/Disambiguation_(disambiguation\">http://en.wikipedia.org/wiki/Disambiguation_(disambiguation</a>) is the</p>\n",
"<p>[disambiguation](<a href=\"http://en.wikipedia.org/wiki/Disambiguation_(disambiguation)\">http://en.wikipedia.org/wiki/Disambiguation_(disambiguation)</a> is the</p>\n",
Copy link
Author

@waseem waseem May 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It changes the way unbalanced parenthesis is handled in links. This results in forming a correct link by giving priority to the parenthesis in the link. This however, does not conform(like earlier) to common mark processing http://spec.commonmark.org/dingus/?text=foo%20%5Bdisambiguation%5D(http%3A%2F%2Fen.wikipedia.org%2Fwiki%2FDisambiguation_(disambiguation)%20is%20the.

I was trying to make it follow the common mark spec but I could get my way around the code. 🤔

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which section of Common Mark spec you had in mind here? It would be great to have this fix conform to Common Mark right away, let's try to see what it takes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See: http://spec.commonmark.org/0.27/#link-destination

A link destination consists of either

  • a sequence of zero or more characters between an opening < and a closing > that contains no spaces, line breaks, or unescaped < or > characters, or
  • a nonempty sequence of characters that does not include ASCII space or control characters, and includes parentheses only if (a) they are backslash-escaped or (b) they are part of a balanced pair of unescaped parentheses that is not itself inside a balanced pair of unescaped parentheses.

The spec does not talk about unescaped and unbalenced parenthesis in link destination. So I was just trying to make it behave the way http://spec.commonmark.org/dingus/ behaves for such links. It ignores the entire link processing in that case and renders it literally after processing. So:

foo [disambiguation](http://en.wikipedia.org/wiki/Disambiguation_(disambiguation) is the

becomes:

<p>foo [disambiguation](http://en.wikipedia.org/wiki/Disambiguation_(disambiguation) is the</p>.

Copy link
Author

@waseem waseem May 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By not talking about unescaped and unbalanced parenthesis, I suppose, the commonmark spec regards those as invalid links and hence does not render them as HTML links.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the section you're referring to talks about the "fully qualified" links, the ones with full link syntax.

And it seems that CommonMark only has autolinks with a hint: they require angle brackets around the link, which avoids the balancing problem altogether.


if openDelim == 0 {
linkEnd--
}
Copy link
Collaborator

@rtfb rtfb Jun 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this break this case?

foo (http://en.wikipedia.org/wiki/Disambiguation_(disambiguation)) bar

Autolink detector should recognize that the outer parentheses do not belong to the link. Can you add this case to the tests?

@ghost
Copy link

ghost commented Oct 28, 2019

Since this pull is being ignored, here is a workaround:

package main
import (
   "bytes"
   "github.com/yuin/goldmark"
   "github.com/yuin/goldmark/extension"
)
func main() {
   s1 := "https://en.wikipedia.org/wiki/The_Master_(2012_film)"
   o1 := goldmark.New(goldmark.WithExtensions(extension.Linkify))
   var b2 bytes.Buffer
   o1.Convert([]byte(s1), &b2)
   s2 := b2.String()
   print(s2)
}

https://github.com/yuin/goldmark

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Links with parenthesis (like wikipedia links) are not correct when using autolinks
2 participants