
Modify sanitize_url to accept IPv6 addresses

 * After the change, the function also accepts some invalid links. After
   discussion with spladug, this should not be a problem, since many
   invalid addresses were already accepted before the change.
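
To illustrate the trade-off described in the commit message, here is a minimal sketch that compares the old and new patterns on a few strings. The sample strings and the helper name `accepted` are hypothetical and do not come from the reddit code, and the sketch assumes the pattern is matched against one whole hostname component at a time.

```python
import re

# Patterns copied from the diff, written here as raw strings.
old_valid_dns = re.compile(r'\A[-a-zA-Z0-9]+\Z')
new_valid_dns = re.compile(r'\A[-a-zA-Z0-9:]+\Z')


def accepted(pattern, text):
    """Hypothetical helper: True if the whole string matches the pattern."""
    return pattern.match(text) is not None


samples = [
    'example',                # ordinary DNS label
    '3ffe:2a00:100:7031::1',  # IPv6 address from the review thread
    ':::::',                  # only colons, not a valid address
    '---',                    # only hyphens, not a valid label
]

# Print each sample with the verdict of the old and the new pattern.
for text in samples:
    print(text, accepted(old_valid_dns, text), accepted(new_valid_dns, text))
```

The new pattern accepts the IPv6 address but also a string such as `:::::`, while a string such as `---` was already accepted by the old pattern.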
1 parent 164fe49 commit b713eca226c21e3202d70d04e329defeb919e237 @k21 committed Nov 29, 2011
Showing with 1 addition and 1 deletion.
  1. +1 −1 r2/r2/lib/utils/utils.py
2 r2/r2/lib/utils/utils.py
@@ -266,7 +266,7 @@ def get_title(url):
return None
valid_schemes = ('http', 'https', 'ftp', 'mailto')
-valid_dns = re.compile('\A[-a-zA-Z0-9]+\Z')
+valid_dns = re.compile('\A[-a-zA-Z0-9:]+\Z')
@spladug
spladug added a note Nov 30, 2011

I believe [ and ] need to be allowed by the regex as well, right?

@k21
Owner
k21 added a note Nov 30, 2011

spladug: they do not have to be, because if the link contains an IPv6 address, urlparse reports the address as the hostname and removes the brackets automatically

@spladug
spladug added a note Nov 30, 2011

That doesn't seem to be the case when I test it. Am I doing it wrong?

Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53) 
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> urlparse.urlparse("http://[3ffe:2a00:100:7031::1]")
ParseResult(scheme='http', netloc='[3ffe:2a00:100:7031::1]', path='', params='', query='', fragment='')
@k21
Owner
k21 added a note Nov 30, 2011

sanitize_url uses hostname, not netloc. Demo:

Python 2.7.2+ (default, Aug 16 2011, 09:23:59) 
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from urlparse import urlparse
>>> u = urlparse('http://[3ffe:2a00:100:7031::1]')
>>> u
ParseResult(scheme='http', netloc='[3ffe:2a00:100:7031::1]', path='', params='', query='', fragment='')
>>> u.hostname
'3ffe:2a00:100:7031::1'
@spladug
spladug added a note Nov 30, 2011

Yay!
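
The point settled in this thread can be reproduced with the short sketch below. It is written for Python 3, where the Python 2 `urlparse` module used in the transcripts lives in `urllib.parse`; that choice, and the variable names, are assumptions made for illustration rather than part of the commit.

```python
import re
from urllib.parse import urlparse  # Python 3 location of Python 2's urlparse

# New pattern from the diff, written here as a raw string.
valid_dns = re.compile(r'\A[-a-zA-Z0-9:]+\Z')

u = urlparse('http://[3ffe:2a00:100:7031::1]')

# netloc keeps the square brackets, hostname drops them.
netloc_ok = valid_dns.match(u.netloc) is not None
hostname_ok = valid_dns.match(u.hostname) is not None

print(u.netloc, netloc_ok)
print(u.hostname, hostname_ok)
```

Because the check is made against `hostname` rather than `netloc`, the pattern never sees `[` or `]`, which is why the brackets do not need to be added to the character class.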

def sanitize_url(url, require_scheme = False):
"""Validates that the url is of the form
