Non-ascii characters in the URL result in malformed data #371

Closed
dcelasun opened this issue Sep 5, 2012 · 6 comments
@dcelasun

dcelasun commented Sep 5, 2012

Consider the hello world example:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from bottle import route, run
@route('/hello/<name>')
def index(name='World'):
    return '<b>Hello %s!</b>' % name

run(host='localhost', port=8080)

Here, requesting the URL http://localhost:8080/hello/ŞĞÜ results in:

Hello ���!

Any ideas?

@iurisilvio
Member

It works fine here; maybe your browser is just using the wrong encoding.

@dcelasun
Author

dcelasun commented Sep 5, 2012

I've tested with Firefox and Chrome (on Linux and Windows) and it gave me garbled text.

What's your environment?

@dcelasun
Author

dcelasun commented Sep 6, 2012

Found a fix! Encoding the string in latin-1 and then decoding in utf-8 works. Why is this necessary?

@bottle.route('/hello/<name>')
def index(name='World'):
    return '<b>Hello %s!</b>' % (name.encode('latin-1').decode('utf-8'))

@defnull
Member

defnull commented Sep 16, 2012

Because PEP 3333 (the Python 3 WSGI standard) requires all incoming data to be passed to the application as latin1-decoded unicode, and a framework has no way to guess the correct encoding.

Bottle assumes utf-8 in many places already, though. This should be cleaned up in a future release (it breaks backward compatibility, so it's not that easy).
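
To make the explanation above concrete, here is a minimal sketch of the round trip in plain Python 3, without Bottle. It assumes the browser sends the path as UTF-8 bytes and that the server decodes them as latin-1 before handing them to the application; the variable names are illustrative only.

```python
# Bytes the browser sends for "ŞĞÜ" once percent-decoding is done (assumed UTF-8).
raw = "ŞĞÜ".encode("utf-8")

# What a PEP 3333 server passes to the application: the same bytes decoded as latin-1.
# Each UTF-8 byte becomes one character, so the 3-character name turns into 6 characters.
as_seen_by_app = raw.decode("latin-1")

# The workaround from this thread: undo the latin-1 decode, then decode as UTF-8.
recovered = as_seen_by_app.encode("latin-1").decode("utf-8")

print(len(as_seen_by_app))  # 6
print(recovered)            # ŞĞÜ
```

The latin-1 step is lossless because latin-1 maps every byte value 0-255 to exactly one code point, which is why re-encoding gets the original bytes back.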

@coffeeowl

I guess I have the same problem. I am writing a webapp and unfortunately some URLs might contain non-latin characters. As a result I get garbage instead of normal URL parameters and cannot query a DB with them. I am not sure what the consequences of my hack are, but it was easy to fix: I changed the encoding to utf-8 on this line - https://github.com/defnull/bottle/blob/master/bottle.py#L82
And yes, I am using Python 3.

@blastrock

I had this same bug, and it took me a while to understand what was going on. I couldn't find anywhere in the documentation where this is mentioned. Could someone please document it and include that encode("latin1") trick until it is fixed?
To fix it and keep backward compatibility, maybe you could add a parameter to bottle.run() that sets the encoding to UTF-8, with latin1 as the default.

Btw, coffeeowl's fix does not work for me.
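
Until something like that exists, the configurable-encoding idea can be approximated in application code. The sketch below is an assumption, not part of Bottle: `recode` is a hypothetical helper name, and it simply wraps the encode("latin-1") trick with a selectable target encoding and a fallback to the unchanged value.

```python
def recode(value, encoding="utf-8"):
    """Hypothetical helper (not a Bottle API): reinterpret a latin-1-decoded
    WSGI string using the given encoding, or return it unchanged on failure."""
    try:
        return value.encode("latin-1").decode(encoding)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the value has characters outside latin-1 (it was already
        # decoded correctly) or its bytes are not valid in `encoding`.
        return value


# Simulate what the route handler receives for /hello/ŞĞÜ.
garbled = "ŞĞÜ".encode("utf-8").decode("latin-1")
print(recode(garbled))   # ŞĞÜ
print(recode("World"))   # World
```

One caveat with this approach: a value that really is latin-1 text and happens to form a valid UTF-8 byte sequence would be reinterpreted as well, so it only fits applications where the URLs are known to be UTF-8.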


5 participants