In Markdown, URLs in <…> have the period moved outside the link #209

Closed
roryokane opened this Issue Jul 2, 2015 · 1 comment

roryokane commented Jul 2, 2015

The symptoms

Try previewing a comment containing this markup (e.g. on this story):

<https://en.wikipedia.org/wiki/UNIX_System_Laboratories,_Inc._v._Berkeley_Software_Design,_Inc.>

What should be rendered:

https://en.wikipedia.org/wiki/UNIX_System_Laboratories,_Inc._v._Berkeley_Software_Design,_Inc.

What Lobsters renders (the link stops before the final period, which is left outside it):

https://en.wikipedia.org/wiki/UNIX_System_Laboratories,_Inc._v._Berkeley_Software_Design,_Inc.

The period at the end is moved outside of the link, breaking it. This is useful when autolinking, but not with explicit <…> links.

I can work around this by using […](…) syntax instead, duplicating the URL:

[https://en.wikipedia.org/wiki/UNIX_System_Laboratories,_Inc._v._Berkeley_Software_Design,_Inc.](https://en.wikipedia.org/wiki/UNIX_System_Laboratories,_Inc._v._Berkeley_Software_Design,_Inc.).

The cause

The likely culprit is the part of extras/markdowner.rb labeled “fix links that got the trailing punctuation appended to move it outside the link”. The problem must come after line 15 of that file: I confirmed that RDiscount 2.1.7.1 (the version Lobsters uses) keeps the period inside the link when I parse <…> with RDiscount directly from irb, using the same arguments.
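To illustrate why a post-processing step like this cannot tell the two cases apart, here is a minimal sketch. It is a hypothetical reconstruction, not the actual code from extras/markdowner.rb; the method name `move_trailing_punctuation` and the regular expression are assumptions, as is the simplified HTML used as input. By the time the HTML exists, an autolinked URL and an explicit <…> link look the same, so the period is moved out of both.

```ruby
# Hypothetical reconstruction of a "move trailing punctuation outside
# the link" step. Not the real Lobsters code.
def move_trailing_punctuation(html)
  # Match an <a> element whose href and link text both end in the same
  # punctuation character, and move that character after the closing tag.
  html.gsub(%r{<a href="([^"]+)([.,;:!?])">([^<]+)\2</a>}) do
    %(<a href="#{$1}">#{$3}</a>#{$2})
  end
end

# Assumed HTML for an autolinked URL at the end of a sentence:
autolinked = '<a href="http://example.com/prices.">http://example.com/prices.</a>'
# Assumed HTML for an explicit <...> link whose URL really ends in a period:
explicit = '<a href="https://en.wikipedia.org/wiki/UNIX_System_Laboratories,_Inc._v._Berkeley_Software_Design,_Inc.">' \
           'https://en.wikipedia.org/wiki/UNIX_System_Laboratories,_Inc._v._Berkeley_Software_Design,_Inc.</a>'

puts move_trailing_punctuation(autolinked) # period moved out: desired
puts move_trailing_punctuation(explicit)   # period moved out: link broken
```

Both calls strip the period from the href, which is the desired result for the first input and the bug described here for the second.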

The fix

RDiscount still wrongly includes the trailing punctuation when it autolinks a bare URL at the end of a sentence, as in:

I don't like http://example.com/prices. They are too expensive.

So you can’t just delete that punctuation-removal code.

To fix this bug, I think you have to run the filter before the Markdown is processed. In the filter, look for raw URLs in the text that are not part of a link (a harder task than just searching for <a> tags), and then explicitly wrap them with <…>s that exclude ending periods.
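The pre-filter idea could be sketched roughly as follows. This is only an illustration under stated assumptions, not Lobsters code: the names `BARE_URL` and `wrap_bare_urls` are hypothetical, and the pattern is deliberately simple. It skips URLs that directly follow `<`, `(` or `[` (so explicit <…> and […](…) links are left alone), stops at parentheses and brackets, and does not try to detect code spans or code blocks, which a real filter would have to handle.

```ruby
# Hypothetical pre-filter: wrap bare URLs in <...> before the Markdown
# is parsed, leaving trailing sentence punctuation outside the brackets.
# The lookbehind skips URLs that already follow "<", "(" or "[".
BARE_URL = %r{(?<![<(\[])\bhttps?://[^\s<>()\[\]]+}

def wrap_bare_urls(markdown)
  markdown.gsub(BARE_URL) do |url|
    # Split off any run of trailing punctuation (may be empty).
    trailing = url[/[.,;:!?]+\z/].to_s
    core = url[0, url.length - trailing.length]
    "<#{core}>#{trailing}"
  end
end

puts wrap_bare_urls("I don't like http://example.com/prices. They are too expensive.")
# prints: I don't like <http://example.com/prices>. They are too expensive.

explicit = "<https://en.wikipedia.org/wiki/UNIX_System_Laboratories,_Inc._v._Berkeley_Software_Design,_Inc.>"
puts wrap_bare_urls(explicit) == explicit
# prints: true
```

With bare URLs wrapped this way before parsing, the Markdown processor would see only explicit links, and the punctuation-moving step on the HTML output would no longer be needed.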

jcs added the bug label Jul 20, 2015

jcs added a commit that referenced this issue Dec 3, 2015

add some tests for Markdowner
including failing ones for bug #209 and bug #242

jcs closed this Dec 3, 2015

jcs reopened this Dec 3, 2015

Contributor

jcs commented Dec 3, 2015

fixed by 6e6803f

jcs closed this Dec 3, 2015

jcs added a commit that referenced this issue Dec 7, 2015

Markdowner: use Nokogiri for html rewriting
brute forcing changes by regexps gets things wrong sometimes, so use
nokogiri to parse the html output of rdiscount and do changes on
individual nodes, then turn back into a string

we lose the ability to move punctuation inside of auto-generated
links, but i don't see any easy/definitive way to do this properly.

closes #242
closes #209

SiDevesh pushed a commit to RedCarpetUp/lobsters that referenced this issue Jul 11, 2016

add some tests for Markdowner
including failing ones for bug #209 and bug #242

SiDevesh pushed a commit to RedCarpetUp/lobsters that referenced this issue Jul 11, 2016

Markdowner: use Nokogiri for html rewriting

SiDevesh pushed a commit to RedCarpetUp/lobsters that referenced this issue Aug 1, 2016

add some tests for Markdowner
including failing ones for bug #209 and bug #242

SiDevesh pushed a commit to RedCarpetUp/lobsters that referenced this issue Aug 1, 2016

Markdowner: use Nokogiri for html rewriting