Should not link hyperlinks found in anchor text #56

Closed
isaacs opened this Issue Jun 20, 2012 · 5 comments

Comments

Projects
None yet
3 participants
Contributor

isaacs commented Jun 20, 2012

Test case:

> marked.parse('<p>Already linked: <a href="http://foo.com">http://foo.com</a>  </p>')
'<p>Already linked: <a href="http://foo.com"><a href="http://foo.com&lt;/a&gt;">http://foo.com&lt;/a&gt;</a>  </p>'

This is quite common when a bit of markdown contains some inline HTML.

A simple workaround that is probably not ideal or free of edge case failures:

result = result.replace(/<a href="([^"]+)&lt;\/a&gt;">\1&lt;\/a&gt;/, '$1');
Owner

chjj commented Jun 21, 2012

A simple workaround that is probably not ideal or free of edge case failures

I'd definitely be really hesitant to add that. If we're going to fix it, I would want to fix it the right way (but even that might be slightly awkward). I realize regular html links containing the url are pretty common markup, but the thing I don't like about this is that it requires marked to be too smart and break its own rules for a special case. ... But then again, markdown is essentially just a giant string of special cases. ;)

One possibility would be to change the inline.url regex to properly stop before the angle brackets. It would produce a redundant nested link, but wouldn't require any magic happening. The html would be malformed, but it would probably render in every browser.

Contributor

isaacs commented Jun 21, 2012

Yes, the workaround is working for me now, but it's not intended as a correct or general fix.

The inline.url parser should do two things, I think:

  1. Recognize that < is a delimiter, so stop parsing the URL if it sees that.
  2. Do not hyperlink any URLs that are immediately preceded by <a[^>]*>.

An even better option would be for the inline parser to keep track of whether or not it's inside of an <a> tag at any given point, and only hyperlink urls found outside of <a> tags.

Owner

chjj commented Jun 22, 2012

Recognize that < is a delimiter, so stop parsing the URL if it sees that.

Yeah, I think I'll add this no matter what.

An even better option would be for the inline parser to keep track of whether or not it's inside of an <a> tag at any given point, and only hyperlink urls found outside of <a> tags.

Yeah, that would be ideal. I've been meaning to rewrite the inline lexer in such a way that it can create an array of tokens (instead of compiling to html on the spot) without any performance hit. Being able to keep track of whether we're inside a link or not would also be nice. I'll try to figure out a clever way to implement it when I get some time. When I think about it, as long as nothing comes with a noticeable a performance hit, I think I'm okay with implementing it. I just don't want to stray from the original reason I created marked.

Contributor

isaacs commented Jun 22, 2012

When I think about it, as long as nothing comes with a noticeable a performance hit, I think I'm okay with implementing it. I just don't want to stray from the original reason I created marked.

It's pretty typical for parsers to give up a little bit of performance for the sake of correctness, and usually it's worthwhile. But I completely understand if you want to err on the side of speed in some cases. Part of the reason I prefer to use marked is because it's so fast and the lexed output is very simple. If I wasn't dealing with wordpress garbage pseudo-html, it wouldn't be an issue :)

@chjj chjj added a commit that referenced this issue Jan 9, 2013

@chjj chjj avoid nested links. see #56. 31c9b49

adam-p referenced this issue in adam-p/markdown-here Feb 18, 2013

Closed

Links with URL text won't render properly #51

Contributor

Mithgol commented Feb 19, 2013

Is this issue fixed or was the 31c9b49 fix partial?

If this issue is not fixed entirely yet, then what does remain to be done?

chjj closed this in bcf206e Feb 23, 2014

@Fogetti Fogetti pushed a commit to Fogetti/marked that referenced this issue May 13, 2016

@chjj chjj fix double gfm autolinks. fixes #56. 913181f
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment