Comparing changes

base fork: Pylons/pyramid
base: 02ad73df38
...
head fork: Pylons/pyramid
compare: 980f0a3be6
  • 2 commits
  • 10 files changed
  • 0 commit comments
  • 1 contributor
63 CHANGES.txt
@@ -1,3 +1,66 @@
+Next release
+============
+
+- Literal portions of route patterns were not URL-quoted when ``route_url``
+ or ``route_path`` was used to generate a URL or path.
+
+- The result of ``route_path`` or ``route_url`` might have been ``unicode``
+ or ``str`` depending on the input. It is now guaranteed to always be
+ ``str``.
+
+- URL matching when the pattern contained non-ASCII characters in literal
+ parts was indeterminate. Now the pattern supplied to ``add_route`` is
+ assumed to be either: a ``unicode`` value, or a ``str`` value that contains
+ only ASCII characters. If you now want to match the path info from a URL
+ that contains high order characters, you can pass the Unicode
+ representation of the decoded path portion in the pattern.
+
+- When using a ``traverse=`` route predicate, traversal would fail with a
+ URLDecodeError if there were any high-order characters in the traversal
+ pattern or in the matched dynamic segments.
+
+- Using a dynamic segment named ``traverse`` in a route pattern like this::
+
+ config.add_route('trav_route', 'traversal/{traverse:.*}')
+
+ Would cause a ``UnicodeDecodeError`` when the route was matched and the
+ matched portion of the URL contained any high-order characters. See
+ https://github.com/Pylons/pyramid/issues/385 .
+
+- When using a ``*traverse`` stararg in a route pattern, a matched URL that
+ contained ``@@`` in its name (signifying a view name) would be
+ inappropriately quoted by the traversal machinery during traversal,
+ resulting in the view not being found. See
+ https://github.com/Pylons/pyramid/issues/382 and
+ https://github.com/Pylons/pyramid/issues/375 .
+
+Backwards Incompatibilities
+---------------------------
+
+- String values passed to ``route_url`` or ``route_path`` that are meant to
+ replace "remainder" matches will now be URL-quoted except for embedded
+ slashes. For example::
+
+ config.add_route('remain', '/foo/*remainder')
+ request.route_path('remain', remainder='abc / def')
+ # -> '/foo/abc%20/%20def'
+
+ Previously string values passed as remainder replacements were tacked on
+ untouched, without any URL-quoting. But this doesn't really work logically
+ if the value passed is Unicode (raw unicode cannot be placed in a URL or in
+ a path) and it is inconsistent with the rest of the URL generation
+ machinery if the value is a string (it won't be quoted unless by the
+ caller).
+
+ Some folks will have been relying on the older behavior to tack on query
+ string elements and anchor portions of the URL; sorry, you'll need to
+ change your code to use the ``_query`` and/or ``_anchor`` arguments to
+ ``route_path`` or ``route_url`` to do this now.
+
+- If you pass a bytestring that contains non-ASCII characters to
+ ``add_route`` as a pattern, it will now fail at startup time. Use Unicode
+ instead.
+
1.2.5 (2011-12-14)
==================
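The remainder-quoting rule described in the changelog entry above can be sketched with the standard library alone. This is a minimal illustration, assuming Python 3 and `urllib.parse.quote` as a stand-in for Pyramid's own quoting helper; `quote_remainder` is a hypothetical name, not part of Pyramid.

```python
from urllib.parse import quote


def quote_remainder(value):
    """Percent-encode a remainder string, leaving embedded slashes intact.

    Hypothetical helper for illustration only; it is not Pyramid's API.
    """
    # safe='/' keeps path separators; spaces and other reserved
    # characters are percent-encoded.
    return quote(value, safe='/')


print(quote_remainder('abc / def'))   # abc%20/%20def
print(quote_remainder('a?b=1#frag'))  # a%3Fb%3D1%23frag
```

The second call shows why query strings and anchors can no longer be smuggled in through the remainder value: `?`, `=` and `#` are quoted like any other character, so they have to go through `_query` and `_anchor` instead.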
123 docs/narr/urldispatch.rst
@@ -233,9 +233,9 @@ following pattern:
When matching the following URL:
-.. code-block:: text
-
- foo/La%20Pe%C3%B1a
+ .. code-block:: text
+
+ http://example.com/foo/La%20Pe%C3%B1a
The matchdict will look like so (the value is URL-decoded / UTF-8 decoded):
@@ -243,6 +243,51 @@ The matchdict will look like so (the value is URL-decoded / UTF-8 decoded):
{'bar':u'La Pe\xf1a'}
+Literal strings in the path segment should represent the *decoded* value of
+the ``PATH_INFO`` provided to Pyramid. You don't want to use a URL-encoded
+value or a bytestring representing the literal's UTF-8 in the pattern. For
+example, rather than this:
+
+.. code-block:: text
+
+ /Foo%20Bar/{baz}
+
+You'll want to use something like this:
+
+.. code-block:: text
+
+ /Foo Bar/{baz}
+
+For patterns that contain "high-order" characters in their literals, you'll
+want to use a Unicode value as the pattern as opposed to any URL-encoded or
+UTF-8-encoded value. For example, you might be tempted to use a bytestring
+pattern like this:
+
+.. code-block:: text
+
+ /La Pe\xc3\xb1a/{x}
+
+But this will either cause an error at startup time or it won't match
+properly. You'll want to use a Unicode value as the pattern rather than
+raw bytestring escapes. You can use a high-order Unicode value as the
+pattern by using `Python source file encoding
+<http://www.python.org/dev/peps/pep-0263/>`_ plus the "real" character in the
+Unicode pattern in the source, like so:
+
+.. code-block:: text
+
+ /La Peña/{x}
+
+Or you can ignore source file encoding and use equivalent Unicode escape
+sequences in the pattern:
+
+.. code-block:: text
+
+ /La Pe\xf1a/{x}
+
+Dynamic segment names cannot contain high-order characters, so this applies
+only to literals in the pattern.
+
If the pattern has a ``*`` in it, the name which follows it is considered a
"remainder match". A remainder match *must* come at the end of the pattern.
Unlike segment replacement markers, it does not need to be preceded by a
@@ -612,7 +657,6 @@ Use the :meth:`pyramid.request.Request.route_url` method to generate URLs
based on route patterns. For example, if you've configured a route with the
``name`` "foo" and the ``pattern`` "{a}/{b}/{c}", you might do this.
-.. ignore-next-block
.. code-block:: python
:linenos:
@@ -620,8 +664,75 @@ based on route patterns. For example, if you've configured a route with the
This would return something like the string ``http://example.com/1/2/3`` (at
least if the current protocol and hostname implied ``http://example.com``).
-See the :meth:`~pyramid.request.Request.route_url` API documentation for more
-information.
+
+
+To generate only the *path* portion of a URL from a route, use the
+:meth:`pyramid.request.Request.route_path` API instead of
+:meth:`~pyramid.request.Request.route_url`.
+
+.. code-block:: python
+
+ url = request.route_path('foo', a='1', b='2', c='3')
+
+This will return the string ``/1/2/3`` rather than a full URL.
+
+Replacement values passed to ``route_url`` or ``route_path`` must be Unicode
+or bytestrings encoded in UTF-8. One exception to this rule exists: if
+you're trying to replace a "remainder" match value (a ``*stararg``
+replacement value), the value may be a tuple containing Unicode strings or
+UTF-8 strings.
+
+Note that URLs and paths generated by ``route_path`` and ``route_url`` are
+always URL-quoted string types (they contain no non-ASCII characters).
+Therefore, if you've added a route like so:
+
+.. code-block:: python
+
+ config.add_route('la', u'/La Peña/{city}')
+
+And you later generate a URL using ``route_path`` or ``route_url`` like so:
+
+.. code-block:: python
+
+ url = request.route_path('la', city=u'Québec')
+
+You will wind up with the path encoded to UTF-8 and URL quoted like so:
+
+.. code-block:: text
+
+ /La%20Pe%C3%B1a/Qu%C3%A9bec
+
+If you have a ``*stararg`` remainder dynamic part of your route pattern:
+
+.. code-block:: python
+
+ config.add_route('abc', 'a/b/c/*foo')
+
+And you later generate a URL using ``route_path`` or ``route_url`` with a
+*string* as the replacement value:
+
+.. code-block:: python
+
+ url = request.route_path('abc', foo=u'Québec/biz')
+
+The value you pass will be URL-quoted except for embedded slashes in the
+result:
+
+.. code-block:: text
+
+ /a/b/c/Qu%C3%A9bec/biz
+
+You can get a similar result by passing a tuple composed of path elements:
+
+.. code-block:: python
+
+ url = request.route_path('abc', foo=(u'Québec', u'biz'))
+
+Each value in the tuple will be URL-quoted and joined by slashes in this case:
+
+.. code-block:: text
+
+ /a/b/c/Qu%C3%A9bec/biz
.. index::
single: static routes
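The string and tuple forms of a remainder replacement documented above differ in one respect worth spelling out: a slash inside a tuple element is data, while a slash inside a string is a separator. The sketch below assumes Python 3 and uses `urllib.parse.quote` in place of Pyramid's quoting helper; `quote_segment`, `remainder_from_tuple` and `remainder_from_string` are hypothetical names, and the assumption that tuple elements are quoted with no safe characters is mine.

```python
from urllib.parse import quote


def quote_segment(segment):
    # Encode to UTF-8 and percent-encode everything, including '/'.
    return quote(segment, safe='')


def remainder_from_tuple(elements):
    # Each element is quoted on its own, then joined with slashes.
    return '/'.join(quote_segment(e) for e in elements)


def remainder_from_string(value):
    # The whole string is quoted, but embedded slashes survive.
    return quote(value, safe='/')


print(remainder_from_tuple(('Québec', 'biz')))  # Qu%C3%A9bec/biz
print(remainder_from_string('Québec/biz'))      # Qu%C3%A9bec/biz
print(remainder_from_tuple(('a/b', 'c')))       # a%2Fb/c
```

The first two calls agree, matching the documented output; the third shows that the tuple form is the one to reach for when a single path element may itself contain a slash.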
7 pyramid/config/testing.py
@@ -6,7 +6,8 @@
from pyramid.interfaces import IRendererFactory
from pyramid.renderers import RendererHelper
-from pyramid.traversal import traversal_path_info
+
+from pyramid.traversal import split_path_info
from pyramid.config.util import action_method
@@ -64,9 +65,9 @@ def __init__(self, context):
self.context = context
def __call__(self, request):
- path = request.environ['PATH_INFO']
+ path = request.environ['PATH_INFO'].decode('utf-8')
ob = resources[path]
- traversed = traversal_path_info(path)
+ traversed = split_path_info(path)
return {'context':ob, 'view_name':'','subpath':(),
'traversed':traversed, 'virtual_root':ob,
'virtual_root_path':(), 'root':ob}
4 pyramid/config/util.py
@@ -3,7 +3,7 @@
from pyramid.exceptions import ConfigurationError
from pyramid.traversal import find_interface
-from pyramid.traversal import traversal_path_info
+from pyramid.traversal import traversal_path
from hashlib import md5
@@ -237,7 +237,7 @@ def traverse_predicate(context, request):
return True
m = context['match']
tvalue = tgenerate(m)
- m['traverse'] = traversal_path_info(tvalue)
+ m['traverse'] = traversal_path(tvalue)
return True
# This isn't actually a predicate, it's just an infodict
# modifier that injects ``traverse`` into the matchdict. As a
15 pyramid/tests/test_config/test_util.py
@@ -227,6 +227,21 @@ def test_traverse_matches(self):
self.assertEqual(info, {'match':
{'a':'a', 'b':'b', 'traverse':('1', 'a', 'b')}})
+ def test_traverse_matches_with_highorder_chars(self):
+ order, predicates, phash = self._callFUT(
+ traverse=unicode('/La Pe\xc3\xb1a/{x}', 'utf-8'))
+ self.assertEqual(len(predicates), 1)
+ pred = predicates[0]
+ info = {'match':{'x':unicode('Qu\xc3\xa9bec', 'utf-8')}}
+ request = DummyRequest()
+ result = pred(info, request)
+ self.assertEqual(result, True)
+ self.assertEqual(
+ info['match']['traverse'],
+ (unicode('La Pe\xc3\xb1a', 'utf-8'),
+ unicode('Qu\xc3\xa9bec', 'utf-8'))
+ )
+
def test_custom_predicates_can_affect_traversal(self):
def custom(info, request):
m = info['match']
44 pyramid/tests/test_traversal.py
@@ -148,6 +148,26 @@ def test_call_with_no_pathinfo(self):
self.assertEqual(result['virtual_root'], policy.root)
self.assertEqual(result['virtual_root_path'], ())
+ def test_call_with_pathinfo_highorder(self):
+ foo = DummyContext(None, unicode('Qu\xc3\xa9bec', 'utf-8'))
+ root = DummyContext(foo, 'root')
+ policy = self._makeOne(root)
+ path_info = '/Qu\xc3\xa9bec'
+ environ = self._getEnviron(PATH_INFO=path_info)
+ request = DummyRequest(environ)
+ result = policy(request)
+ self.assertEqual(result['context'], foo)
+ self.assertEqual(result['view_name'], '')
+ self.assertEqual(result['subpath'], ())
+ self.assertEqual(
+ result['traversed'],
+ (unicode('Qu\xc3\xa9bec', 'utf-8'),)
+ )
+ self.assertEqual(result['root'], policy.root)
+ self.assertEqual(result['virtual_root'], policy.root)
+ self.assertEqual(result['virtual_root_path'], ())
+
+
def test_call_pathel_with_no_getitem(self):
policy = self._makeOne(None)
environ = self._getEnviron(PATH_INFO='/foo/bar')
@@ -306,6 +326,30 @@ def test_call_with_vh_root_path_root(self):
self.assertEqual(result['virtual_root'], policy.root)
self.assertEqual(result['virtual_root_path'], ())
+ def test_call_with_vh_root_highorder(self):
+ bar = DummyContext(None, 'bar')
+ foo = DummyContext(bar, unicode('Qu\xc3\xa9bec', 'utf-8'))
+ root = DummyContext(foo, 'root')
+ policy = self._makeOne(root)
+ vhm_root = '/Qu\xc3\xa9bec'
+ environ = self._getEnviron(HTTP_X_VHM_ROOT=vhm_root,
+ PATH_INFO='/bar')
+ request = DummyRequest(environ)
+ result = policy(request)
+ self.assertEqual(result['context'], bar)
+ self.assertEqual(result['view_name'], '')
+ self.assertEqual(result['subpath'], ())
+ self.assertEqual(
+ result['traversed'],
+ (unicode('Qu\xc3\xa9bec', 'utf-8'), u'bar')
+ )
+ self.assertEqual(result['root'], policy.root)
+ self.assertEqual(result['virtual_root'], foo)
+ self.assertEqual(
+ result['virtual_root_path'],
+ (unicode('Qu\xc3\xa9bec', 'utf-8'),)
+ )
+
def test_non_utf8_path_segment_unicode_path_segments_fails(self):
foo = DummyContext()
root = DummyContext(foo)
79 pyramid/tests/test_urldispatch.py
@@ -113,6 +113,12 @@ def test_connect_static_overridden(self):
self.assertEqual(mapper.routelist[0].pattern,
'archives/:action/:article2')
+ def test___call__pathinfo_cant_be_decoded(self):
+ from pyramid.exceptions import URLDecodeError
+ mapper = self._makeOne()
+ request = self._getRequest(PATH_INFO='\xff\xfe\xe6\x00')
+ self.assertRaises(URLDecodeError, mapper, request)
+
def test___call__route_matches(self):
mapper = self._makeOne()
mapper.connect('foo', 'archives/:action/:article')
@@ -290,11 +296,6 @@ def test_no_beginning_slash(self):
self.assertEqual(matcher('foo/baz/biz/buz/bar'), None)
self.assertEqual(generator({'baz':1, 'buz':2}), '/foo/1/biz/2/bar')
- def test_url_decode_error(self):
- from pyramid.exceptions import URLDecodeError
- matcher, generator = self._callFUT('/:foo')
- self.assertRaises(URLDecodeError, matcher, '/\xff\xfe\x8b\x00')
-
def test_custom_regex(self):
matcher, generator = self._callFUT('foo/{baz}/biz/{buz:[^/\.]+}.{bar}')
self.assertEqual(matcher('/foo/baz/biz/buz.bar'),
@@ -325,7 +326,8 @@ def test_custom_regex_with_embedded_squigglies2(self):
self.assertEqual(generator({'buz':2001}), '/2001')
def test_custom_regex_with_embedded_squigglies3(self):
- matcher, generator = self._callFUT('/{buz:(\d{2}|\d{4})-[a-zA-Z]{3,4}-\d{2}}')
+ matcher, generator = self._callFUT(
+ '/{buz:(\d{2}|\d{4})-[a-zA-Z]{3,4}-\d{2}}')
self.assertEqual(matcher('/2001-Nov-15'), {'buz':'2001-Nov-15'})
self.assertEqual(matcher('/99-June-10'), {'buz':'99-June-10'})
self.assertEqual(matcher('/2-Nov-15'), None)
@@ -334,6 +336,63 @@ def test_custom_regex_with_embedded_squigglies3(self):
self.assertEqual(generator({'buz':'2001-Nov-15'}), '/2001-Nov-15')
self.assertEqual(generator({'buz':'99-June-10'}), '/99-June-10')
+ def test_pattern_with_high_order_literal(self):
+ pattern = unicode('/La Pe\xc3\xb1a/{x}', 'utf-8')
+ matcher, generator = self._callFUT(pattern)
+ self.assertEqual(matcher(unicode('/La Pe\xc3\xb1a/x', 'utf-8')),
+ {'x':'x'})
+ self.assertEqual(generator({'x':'1'}), '/La%20Pe%C3%B1a/1')
+
+ def test_pattern_generate_with_high_order_dynamic(self):
+ pattern = '/{x}'
+ _, generator = self._callFUT(pattern)
+ self.assertEqual(
+ generator({'x':unicode('La Pe\xc3\xb1a', 'utf-8')}),
+ '/La%20Pe%C3%B1a')
+
+ def test_docs_sample_generate(self):
+ # sample from urldispatch.rst
+ pattern = unicode('/La Pe\xc3\xb1a/{city}', 'utf-8')
+ _, generator = self._callFUT(pattern)
+ self.assertEqual(
+ generator({'city':unicode('Qu\xc3\xa9bec', 'utf-8')}),
+ '/La%20Pe%C3%B1a/Qu%C3%A9bec')
+
+ def test_generate_with_mixedtype_values(self):
+ pattern = '/{city}/{state}'
+ _, generator = self._callFUT(pattern)
+ result = generator(
+ {'city': unicode('Qu\xc3\xa9bec', 'utf-8'),
+ 'state': 'La Pe\xc3\xb1a'}
+ )
+ self.assertEqual(result, '/Qu%C3%A9bec/La%20Pe%C3%B1a')
+ # should be a native string
+ self.assertEqual(type(result), str)
+
+ def test_highorder_pattern_utf8(self):
+ pattern = '/La Pe\xc3\xb1a/{city}'
+ self.assertRaises(ValueError, self._callFUT, pattern)
+
+ def test_generate_with_string_remainder_and_unicode_replacement(self):
+ pattern = u'/abc*remainder'
+ _, generator = self._callFUT(pattern)
+ result = generator(
+ {'remainder': unicode('/Qu\xc3\xa9bec/La Pe\xc3\xb1a', 'utf-8')}
+ )
+ self.assertEqual(result, '/abc/Qu%C3%A9bec/La%20Pe%C3%B1a')
+ # should be a native string
+ self.assertEqual(type(result), str)
+
+ def test_generate_with_string_remainder_and_nonstring_replacement(self):
+ pattern = unicode('/abc/*remainder', 'utf-8')
+ _, generator = self._callFUT(pattern)
+ result = generator(
+ {'remainder': None}
+ )
+ self.assertEqual(result, '/abc/None')
+ # should be a native string
+ self.assertEqual(type(result), str)
+
class TestCompileRouteFunctional(unittest.TestCase):
def matches(self, pattern, path, expected):
from pyramid.urldispatch import _compile_route
@@ -364,8 +423,8 @@ def test_matcher_functional_newstyle(self):
{'x':'abc', 'traverse':('def', 'g')})
self.matches('*traverse', '/zzz/abc', {'traverse':('zzz', 'abc')})
self.matches('*traverse', '/zzz/%20abc', {'traverse':('zzz', '%20abc')})
- self.matches('{x}', '/La Pe\xc3\xb1a', {'x': u'La Pe\xf1a'})
- self.matches('*traverse', '/La Pe\xc3\xb1a/x',
+ self.matches('{x}', u'/La Pe\xf1a', {'x': u'La Pe\xf1a'})
+ self.matches('*traverse', u'/La Pe\xf1a/x',
{'traverse': (u'La Pe\xf1a', u'x')})
self.matches('/foo/{id}.html', '/foo/bar.html', {'id':'bar'})
self.matches('/{num:[0-9]+}/*traverse', '/555/abc/def',
@@ -387,8 +446,8 @@ def test_matcher_functional_oldstyle(self):
{'x':'abc', 'traverse':('def', 'g')})
self.matches('*traverse', '/zzz/abc', {'traverse':('zzz', 'abc')})
self.matches('*traverse', '/zzz/%20abc', {'traverse':('zzz', '%20abc')})
- self.matches(':x', '/La Pe\xc3\xb1a', {'x': u'La Pe\xf1a'})
- self.matches('*traverse', '/La Pe\xc3\xb1a/x',
+ self.matches(':x', u'/La Pe\xf1a', {'x': u'La Pe\xf1a'})
+ self.matches('*traverse', u'/La Pe\xf1a/x',
{'traverse': (u'La Pe\xf1a', u'x')})
self.matches('/foo/:id.html', '/foo/bar.html', {'id':'bar'})
self.matches('/foo/:id_html', '/foo/bar_html', {'id_html':'bar_html'})
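The behaviour pinned down by `test_highorder_pattern_utf8` (a bytestring pattern with non-ASCII bytes is rejected with `ValueError`) can be sketched as follows. This is a Python 3 rendering under my own assumptions: `bytes` stands in for the Python 2 `str` type, and `normalize_pattern` is a hypothetical name rather than a function in the patch.

```python
def normalize_pattern(pattern):
    """Return the pattern as text, accepting bytes only if pure ASCII.

    Hypothetical helper mirroring the check at the top of the patched
    _compile_route; not part of Pyramid's API.
    """
    if isinstance(pattern, bytes):
        try:
            # ASCII is the only encoding we can assume without guessing.
            return pattern.decode('ascii')
        except UnicodeDecodeError:
            raise ValueError(
                'The pattern must be text or ASCII-only bytes '
                '(got %r).' % (pattern,)) from None
    return pattern


print(normalize_pattern(b'/foo/{x}'))         # /foo/{x}
print(normalize_pattern('/La Pe\xf1a/{x}'))   # /La Peña/{x}
```

Passing `b'/La Pe\xc3\xb1a/{city}'` raises `ValueError`, because the encoding of the high-order bytes cannot be known at configuration time.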
35 pyramid/traversal.py
@@ -419,9 +419,11 @@ def traversal_path(path):
raised if the Unicode cannot be encoded directly to ASCII.
"""
if isinstance(path, unicode):
+ # must not possess characters outside ascii
path = path.encode('ascii')
- path = urllib.unquote(path)
- return traversal_path_info(path)
+ # we unquote this path exactly like a PEP 3333 server would
+ path = urllib.unquote(path) # result will be a native string
+ return traversal_path_info(path) # result will be a tuple of unicode
@lru_cache(1000)
def traversal_path_info(path):
@@ -493,6 +495,12 @@ def traversal_path_info(path):
path = path.decode('utf-8')
except UnicodeDecodeError, e:
raise URLDecodeError(e.encoding, e.object, e.start, e.end, e.reason)
+ return split_path_info(path)
+
+@lru_cache(1000)
+def split_path_info(path):
+ # suitable for splitting an already-unquoted-already-decoded (unicode)
+ # path value
path = path.strip('/')
clean = []
for segment in path.split('/'):
@@ -581,25 +589,34 @@ def __call__(self, request):
path = matchdict.get('traverse', '/') or '/'
if hasattr(path, '__iter__'):
# this is a *traverse stararg (not a {traverse})
- path = '/'.join([quote_path_segment(x) for x in path]) or '/'
+ # routing has already decoded these elements, so we just
+ # need to join them
+ path = '/'.join(path) or '/'
subpath = matchdict.get('subpath', ())
if not hasattr(subpath, '__iter__'):
# this is not a *subpath stararg (just a {subpath})
- subpath = traversal_path_info(subpath)
+ # routing has already decoded this string, so we just need
+ # to split it
+ subpath = split_path_info(subpath)
else:
# this request did not match a route
subpath = ()
try:
- path = environ['PATH_INFO'] or '/'
+ # empty if mounted under a path in mod_wsgi, for example
+ path = (environ['PATH_INFO'] or '/').decode('utf-8')
except KeyError:
path = '/'
+ except UnicodeDecodeError, e:
+ raise URLDecodeError(e.encoding, e.object, e.start, e.end,
+ e.reason)
if VH_ROOT_KEY in environ:
- vroot_path = environ[VH_ROOT_KEY]
- vroot_tuple = traversal_path_info(vroot_path)
- vpath = vroot_path + path
+ # HTTP_X_VHM_ROOT
+ vroot_path = environ[VH_ROOT_KEY].decode('utf-8')
+ vroot_tuple = split_path_info(vroot_path)
+ vpath = vroot_path + path # both will (must) be unicode or asciistr
vroot_idx = len(vroot_tuple) -1
else:
vroot_tuple = ()
@@ -619,7 +636,7 @@ def __call__(self, request):
# and this hurts readability; apologies
i = 0
view_selector = self.VIEW_SELECTOR
- vpath_tuple = traversal_path_info(vpath)
+ vpath_tuple = split_path_info(vpath)
for segment in vpath_tuple:
if segment[:2] == view_selector:
return {'context':ob,
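The split between "decode once at the edge" and "split already-decoded text" that this file's changes introduce can be sketched like this. It assumes Python 3, with `bytes` standing in for the raw `PATH_INFO` value; `decode_path_info` and `split_decoded_path` are hypothetical names. The handling of empty, `.` and `..` segments is my assumption, since the diff does not show the loop body of `split_path_info`.

```python
def decode_path_info(raw):
    # Decode exactly once, at the boundary. Invalid UTF-8 raises
    # UnicodeDecodeError, which the patch wraps in URLDecodeError.
    return (raw or b'/').decode('utf-8')


def split_decoded_path(path):
    """Split an already-unquoted, already-decoded path into segments."""
    clean = []
    for segment in path.strip('/').split('/'):
        if not segment or segment == '.':
            continue  # assumed: skip empty and current-dir segments
        if segment == '..':
            if clean:
                clean.pop()  # assumed: step back one segment
        else:
            clean.append(segment)
    return tuple(clean)


path = decode_path_info(b'/Qu\xc3\xa9bec/bar')
print(split_decoded_path(path))  # ('Québec', 'bar')
print(split_decoded_path('/'))   # ()
```

Keeping the two steps apart is what lets the traverser reuse segments that routing has already decoded without quoting and unquoting them a second time.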
2 pyramid/url.py
@@ -58,7 +58,7 @@ def route_url(self, route_name, *elements, **kw):
encoded to UTF-8. The resulting strings are joined with slashes
and rendered into the URL. If a string is passed as a
``*remainder`` replacement value, it is tacked on to the URL
- untouched.
+ after being URL-quoted-except-for-embedded-slashes.
If a keyword argument ``_query`` is present, it will be used to
compose a query string that will be tacked on to the end of the
119 pyramid/urldispatch.py
@@ -1,13 +1,11 @@
import re
-from urllib import unquote
from zope.interface import implements
from pyramid.interfaces import IRoutesMapper
from pyramid.interfaces import IRoute
-from pyramid.encode import url_quote
from pyramid.exceptions import URLDecodeError
-from pyramid.traversal import traversal_path_info
+from pyramid.traversal import split_path_info
from pyramid.traversal import quote_path_segment
_marker = object()
@@ -58,9 +56,11 @@ def __call__(self, request):
environ = request.environ
try:
# empty if mounted under a path in mod_wsgi, for example
- path = environ['PATH_INFO'] or '/'
+ path = (environ['PATH_INFO'] or '/' ).decode('utf-8')
except KeyError:
path = '/'
+ except UnicodeDecodeError, e:
+ raise URLDecodeError(e.encoding, e.object, e.start, e.end, e.reason)
for route in self.routelist:
match = route.match(path)
@@ -88,62 +88,96 @@ def update_pattern(matchobj):
return '{%s}' % name[1:]
def _compile_route(route):
+ # This function really wants to consume Unicode patterns natively, but if
+ # someone passes us a bytestring, we allow it by converting it to Unicode
+ # using the ASCII codec. We decode it using ASCII because we don't
+ # want to accept bytestrings with high-order characters in them here as
+ # we have no idea what the encoding represents.
+ if route.__class__ is not unicode:
+ try:
+ route = unicode(route, 'ascii')
+ except UnicodeDecodeError:
+ raise ValueError(
+ 'The pattern value passed to add_route must be '
+ 'either a Unicode string or a plain string without '
+ 'any non-ASCII characters (you provided %r).' % route)
+
if old_route_re.search(route) and not route_re.search(route):
route = old_route_re.sub(update_pattern, route)
if not route.startswith('/'):
route = '/' + route
- star = None
+ remainder = None
if star_at_end.search(route):
- route, star = route.rsplit('*', 1)
+ route, remainder = route.rsplit('*', 1)
pat = route_re.split(route)
+
+ # every element in "pat" will be Unicode (regardless of whether the
+ # route_re regex pattern is itself Unicode or str)
pat.reverse()
rpat = []
gen = []
prefix = pat.pop() # invar: always at least one element (route='/'+route)
- rpat.append(re.escape(prefix))
- gen.append(prefix)
+
+
+ # We want to generate URL-encoded URLs, so we url-quote the prefix, being
+ # careful not to quote any embedded slashes. We have to replace '%' with
+ # '%%' afterwards, as the strings that go into "gen" are used as string
+ # replacement targets.
+ gen.append(quote_path_segment(prefix, safe='/').replace('%', '%%')) # native
+ rpat.append(re.escape(prefix)) # unicode
while pat:
- name = pat.pop()
+ name = pat.pop() # unicode
name = name[1:-1]
if ':' in name:
name, reg = name.split(':')
else:
reg = '[^/]+'
- gen.append('%%(%s)s' % name)
- name = '(?P<%s>%s)' % (name, reg)
+ gen.append('%%(%s)s' % name.encode('utf-8')) # native
+ name = '(?P<%s>%s)' % (name, reg) # unicode
rpat.append(name)
- s = pat.pop()
+ s = pat.pop() # unicode
if s:
- rpat.append(re.escape(s))
- gen.append(s)
+ rpat.append(re.escape(s)) # unicode
+ # We want to generate URL-encoded URLs, so we url-quote this
+ # literal in the pattern, being careful not to quote the embedded
+ # slashes. We have to replace '%' with '%%' afterwards, as the
+ # strings that go into "gen" are used as string replacement
+ # targets. What is appended to gen is a native string.
+ gen.append(quote_path_segment(s, safe='/').replace('%', '%%'))
- if star:
- rpat.append('(?P<%s>.*?)' % star)
- gen.append('%%(%s)s' % star)
+ if remainder:
+ rpat.append('(?P<%s>.*?)' % remainder) # unicode
+ gen.append('%%(%s)s' % remainder.encode('utf-8')) # native
- pattern = ''.join(rpat) + '$'
+ pattern = ''.join(rpat) + '$' # unicode
match = re.compile(pattern).match
def matcher(path):
+ # This function really wants to consume Unicode paths natively, but
+ # if someone passes us a bytestring, we allow it by converting it to
+ # Unicode using the ASCII codec. We decode it using ASCII because we
+ # don't want to accept bytestrings with high-order characters in them
+ # here, as we have no idea what the encoding represents.
+ if path.__class__ is not unicode:
+ path = unicode(path, 'ascii')
m = match(path)
if m is None:
- return m
+ return None
d = {}
for k, v in m.groupdict().iteritems():
- if k == star:
- d[k] = traversal_path_info(v)
+ # k and v will be Unicode; Python 2.6.4 and lower doesn't accept
+ # unicode kwargs as **kw, so we explicitly cast the keys to native
+ # strings in case someone wants to pass the result as **kw
+ nk = k.encode('ascii')
+ if k == remainder:
+ d[nk] = split_path_info(v)
else:
- try:
- d[k] = v.decode('utf-8')
- except UnicodeDecodeError, e:
- raise URLDecodeError(
- e.encoding, e.object, e.start, e.end, e.reason
- )
-
+ d[nk] = v
return d
@@ -152,16 +186,27 @@ def matcher(path):
def generator(dict):
newdict = {}
for k, v in dict.items():
- if isinstance(v, unicode):
+ if v.__class__ is unicode:
+ # url_quote below needs bytes, not unicode on Py2
v = v.encode('utf-8')
- if k == star and hasattr(v, '__iter__'):
- v = '/'.join([quote_path_segment(x) for x in v])
- elif k != star:
- try:
- v = url_quote(v)
- except TypeError:
- pass
+
+ if k == remainder:
+ # a stararg argument
+ if hasattr(v, '__iter__'):
+ v = '/'.join([quote_path_segment(x) for x in v]) # native
+ else:
+ if v.__class__ not in (str, unicode):
+ v = str(v)
+ v = quote_path_segment(v, safe='/')
+ else:
+ if v.__class__ not in (str, unicode):
+ v = str(v)
+ v = quote_path_segment(v)
+
+ # at this point, the value will be a native string
newdict[k] = v
- return gen % newdict
+
+ result = gen % newdict # native string result
+ return result
return matcher, generator
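The comments in `_compile_route` explain that quoted literals must have `%` doubled because they end up inside a `%`-style format template. A minimal sketch of that technique, assuming Python 3 and `urllib.parse.quote` in place of `quote_path_segment`; `literal_to_template` is a hypothetical name:

```python
from urllib.parse import quote


def literal_to_template(literal):
    # Quote the literal (keeping slashes), then double every '%' so the
    # percent-escapes survive the later "template % values" step.
    return quote(literal, safe='/').replace('%', '%%')


# Literal part of the pattern followed by one named placeholder.
template = literal_to_template('/La Peña/') + '%(city)s'

# Replacement values are quoted separately before substitution.
values = {'city': quote('Québec', safe='')}

print(template % values)  # /La%20Pe%C3%B1a/Qu%C3%A9bec
```

Without the `replace('%', '%%')` step, a sequence such as `%20P` in the quoted literal would be parsed as a format directive and the substitution would fail or produce garbage. Percent signs inside the substituted values are safe, since `%s` inserts them verbatim.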
