Problem with non-escaped chars in URL. #648

Closed
gabrielpjordao opened this Issue · 7 comments

@gabrielpjordao

Is this a design decision of the project, or a bug? Please check it out:

Gist:
https://gist.github.com/4259170

@soulseekah
from jinja2 import Environment

t = Environment().from_string( '{{ s }}' )
print t.render( s='tesét' )

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)

See Unicode in Jinja2.

print t.render( s='tesét'.decode( 'utf8' ) )

or

print t.render( s=u'tesét' )

will work. The question is: should flask.request.url be a unicode string?

@boris317

Per RFC 1738, characters in a URL must be ASCII; anything else must be percent-encoded ("%"). Even if request.url were a unicode string, wouldn't that just move the point of failure to wherever request.url is decoded to unicode?
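
To make the escaping rule concrete, here is a minimal sketch of percent-encoding the non-ASCII characters in a URL's path and query using only the Python 3 standard library. The helper name `escape_url` and the example URL are hypothetical, and UTF-8 is assumed as the byte encoding; this is not what Werkzeug or Flask does internally.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escape_url(url):
    """Percent-encode non-ASCII characters in the path and query of a URL.

    Hypothetical helper for illustration; assumes UTF-8 and an ASCII host.
    """
    parts = urlsplit(url)
    # Keep "%" safe so that already-escaped sequences are not escaped twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


escaped = escape_url("http://example.com/tesét?q=é")
print(escaped)
```

Because "%" is treated as safe, running the helper on its own output leaves the URL unchanged.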

@soulseekah

The client (IE) is not conforming to the RFC, so Werkzeug has to do the encoding. I think it does so in the latest trunk using iri_to_uri, apparently here: https://github.com/mitsuhiko/werkzeug/blob/master/werkzeug/wrappers.py#L980. So the request comes in without any issues; request.url is simply not a unicode string.

request.url comes from here, so perhaps something in there needs to be rethought. Do we want to force the returned URL to be a unicode IRI or an ASCII URL? And remember that the values are supplied by the actual web server, via the environ it passes through to the Werkzeug wrappers.

@boris317

I think I like the idea of Werkzeug trying to take my naughty URL and turn it into a properly escaped ASCII URL. Whether the resulting request.url value should be a unicode string, though, I do not know.

Now, if we do receive a non-ASCII URL, we can only make a best guess as to what the actual encoding is. UTF-8 is probably the best guess. The code responsible for decoding the URL would need to do it in a manner similar to _decode_unicode, though I think raising an HTTPUnicodeError on failure is probably better than just replacing the offending characters.
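
The trade-off between failing loudly and replacing bad characters can be sketched as follows with the Python 3 standard library. `URLDecodeError` and `decode_url_component` are hypothetical names made up for this illustration; they only stand in for the HTTPUnicodeError and _decode_unicode mentioned above and do not reproduce Werkzeug's actual code.

```python
from urllib.parse import unquote_to_bytes


class URLDecodeError(ValueError):
    """Hypothetical stand-in for an HTTP-level unicode error."""


def decode_url_component(raw, charset="utf-8", strict=True):
    """Decode a percent-encoded URL component, guessing `charset`.

    With strict=True a wrong guess raises URLDecodeError; with
    strict=False undecodable bytes become U+FFFD instead.
    """
    data = unquote_to_bytes(raw)
    try:
        return data.decode(charset, "strict" if strict else "replace")
    except UnicodeDecodeError as exc:
        raise URLDecodeError("cannot decode URL as %s" % charset) from exc


# UTF-8 input decodes cleanly under the UTF-8 guess.
print(decode_url_component(b"tes%C3%A9t"))

# Latin-1 input does not: lenient mode silently substitutes the bad byte.
print(repr(decode_url_component(b"tes%E9t", strict=False)))
```

In strict mode the same Latin-1 input raises `URLDecodeError`, which lets the caller turn the bad guess into an explicit error response rather than passing along mangled text.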

@gabrielpjordao

Thanks for replying to the issue so fast :+1:

You should decide what's better for Flask, but I really like the idea of having it handled internally.

Regardless of who sends the URL (IE, curl, whatever), I think this is something the end user of Flask should not have to worry about, as Flask is intended to be "simple".

@mitsuhiko
Owner

At the moment request.url is indeed not a unicode string. This might change in the future.

@mitsuhiko mitsuhiko closed this
@mitsuhiko
Owner

Actually, since that will only affect the QUERY_STRING, I could fix that. I might do so in a new Werkzeug release.
