URLs with parenthesis are broken #46

Closed
fiatjaf opened this Issue Mar 24, 2015 · 6 comments


fiatjaf commented Mar 24, 2015

>>> import mistune
>>> import urllib
>>> import urlparse
>>> 
>>> url = 'https://trello-attachments.s3.amazonaws.com/550878b58559170febf8e69b/500x600/ff07160366332b3962320bdfb1693e3e/download_(2).jpg'
>>> mistune.markdown("here's a [broken URL](%s)" % url)
'<p>here\'s a <a href="https://trello-attachments.s3.amazonaws.com/550878b58559170febf8e69b/500x600/ff07160366332b3962320bdfb1693e3e/download_(2">broken URL</a>.jpg)</p>\n'
>>> urlp = urlparse.urlparse(url)
>>> mistune.markdown("now [it](%s) is not broken anymore" % (urlp.scheme + '://' + urlp.netloc + urllib.quote(urlp.path)))
'<p>now <a href="https://trello-attachments.s3.amazonaws.com/550878b58559170febf8e69b/500x600/ff07160366332b3962320bdfb1693e3e/download_%282%29.jpg">it</a> is not broken anymore</p>\n'
Owner

lepture commented Mar 25, 2015

I see. You have ( and ) in the URL, which are not safe in a URL.

lepture added the bug label Mar 25, 2015

tonyseek commented Apr 22, 2015

It seems the URL string needs to be normalized. Using werkzeug.urls may be better than concatenating scheme + '://' + netloc.

import mistune
from werkzeug.urls import url_parse, url_quote


def safe_url(url):
    parsed = url_parse(url)
    # Percent-encode the path, keeping "/" and existing "%" escapes intact.
    path = url_quote(parsed.path, safe='/%')
    # Percent-encode the query, keeping its separators intact.
    query = url_quote(parsed.query, safe='?=&')
    return parsed.replace(path=path, query=query).to_url()


url = 'https://trello-attachments.s3.amazonaws.com/550878b58559170febf8e69b/500x600/ff07160366332b3962320bdfb1693e3e/download_(2).jpg'
print(mistune.markdown("here's a [URL](%s)" % safe_url(url)))

There is a similar solution in douban/brownant.
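
For readers on Python 3 without Werkzeug, the same normalization can probably be sketched with the standard library alone. This is only an illustration of the idea above, not code from mistune or Werkzeug: the helper keeps the name `safe_url` from the snippet above, and the choice of safe characters is an assumption that may need adjusting for other URLs.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def safe_url(url):
    """Percent-encode unsafe characters in the path and query of a URL."""
    parts = urlsplit(url)
    # Keep "/" and existing "%XX" escapes; "(" and ")" become %28 and %29.
    path = quote(parts.path, safe="/%")
    # Keep the separators that give the query string its structure.
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


url = 'https://trello-attachments.s3.amazonaws.com/550878b58559170febf8e69b/500x600/ff07160366332b3962320bdfb1693e3e/download_(2).jpg'
print(safe_url(url))
```

The quoted result can then be interpolated into the Markdown link as before. Because "%" is treated as safe, a URL that is already percent-encoded should pass through without being encoded twice.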

lepture added a commit that referenced this issue May 29, 2015

A workaround for links that contain a ")"
If you need to render links with ")" like #46, you need to
wrap them in `<` and `>`.
Owner

lepture commented May 29, 2015

It is not easy to write a proper regex for this situation. I've fixed it for this case:

[foo](<http://foo.bar.(2).jpg>)

You need to wrap the link with < and >.
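
When the Markdown is generated from code, the wrapping can be applied at the point where the link is built. The helper below is a minimal sketch under that assumption; `md_link` is a hypothetical name, not part of mistune, and it only builds the link text without rendering it.

```python
def md_link(text, url):
    """Build an inline Markdown link with the destination wrapped in < and >."""
    # Angle brackets inside the URL would end the wrapped destination early,
    # so refuse them rather than emit a link that renders incorrectly.
    if "<" in url or ">" in url:
        raise ValueError("URL must not contain '<' or '>'")
    return "[{}](<{}>)".format(text, url)


print(md_link("foo", "http://foo.bar.(2).jpg"))
```

The resulting string can be passed to mistune.markdown like any other Markdown source.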

Owner

lepture commented Jun 17, 2015

Not really solved, but a workaround.

lepture closed this Jun 17, 2015

fiatjaf commented Jun 17, 2015

Can I wrap every link, not only the ones with )?

Owner

lepture commented Jun 17, 2015

@fiatjaf yes.

pyup-bot referenced this issue in rochacbruno/quokka Feb 6, 2018

Closed

Pin mistune to latest version 0.8.3 #547
