Parsed arguments in Python 3.x w/ Tornado are bytestrings instead of unicode #41

Closed
thomasboyt opened this issue May 12, 2015 · 3 comments

@thomasboyt
commented May 12, 2015

I ran into some unexpected behavior using this library with Tornado in Python 3.x. It's easy to see using the hello world example:

import tornado.ioloop
from tornado.web import RequestHandler
from webargs import Arg
from webargs.tornadoparser import use_args

class HelloHandler(RequestHandler):
    """A welcome page."""

    hello_args = {
        'name': Arg(str, default='Friend')
    }

    @use_args(hello_args)
    def get(self, args):
        response = {'message': 'Welcome, {}!'.format(args['name'])}
        self.write(response)


if __name__ == '__main__':
    app = tornado.web.Application([
        (r'/', HelloHandler),
    ], debug=True)
    app.listen(5001)
    tornado.ioloop.IOLoop.instance().start()

Requesting localhost:5001/?name=foo produces the output:

{"message": "Welcome, b'foo'!"}

This happens because Tornado exposes request arguments as bytestrings in Python 3.x, and calling str() on a bytes object returns its repr, b'<val>', rather than decoding it.
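
The effect can be reproduced without Tornado or webargs. A minimal sketch, assuming the raw argument arrives as the bytes object b'foo' (the variable names are illustrative only):

```python
# Raw query argument as it would arrive in Python 3 (assumed value).
raw = b'foo'

# str() on a bytes object without an encoding returns its repr.
as_str = str(raw)

# Decoding explicitly yields the text the handler actually wants.
decoded = raw.decode('utf-8')

print('Welcome, {}!'.format(as_str))   # Welcome, b'foo'!
print('Welcome, {}!'.format(decoded))  # Welcome, foo!
```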

There are a few ways to fix this behavior. The easiest, requiring no actual code change, is to just use tornado.escape.to_unicode:

from tornado.escape import to_unicode

hello_args = {
    'name': Arg(to_unicode, default='Friend')
}

This results in the output you'd expect, and is reasonable enough, though I believe this should be documented if it's the recommended way to use this library with Tornado in Python 3.x.

Alternatively, get_value in tornadoparser.py could be updated to use Tornado's get_argument methods, which automatically decode to unicode: http://tornado.readthedocs.org/en/latest/web.html#input. This would mean any custom converter method would be passed a regular (unicode) string instead of a bytestring, which may be preferred.
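
A rough sketch of that second option, written against a plain dictionary so it runs without Tornado. It assumes arguments are stored as a mapping from name to a list of bytestrings, which is how Tornado's request.arguments is laid out. decode_argument and this get_value are hypothetical helpers for illustration, not the actual code in tornadoparser.py:

```python
def decode_argument(value, encoding='utf-8'):
    """Return text for bytes input; pass other values through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def get_value(arguments, name, default=None):
    """Look up `name` and return its last value decoded to str.

    `arguments` maps names to lists of bytestrings (assumed layout).
    """
    values = arguments.get(name)
    if not values:
        return default
    # Like Tornado's get_argument, use the last value when a name repeats.
    return decode_argument(values[-1])


arguments = {'name': [b'foo']}
name = get_value(arguments, 'name', default='Friend')
missing = get_value(arguments, 'age', default='unknown')
print('Welcome, {}!'.format(name))
```

With decoding done at lookup time, a converter such as str would receive 'foo' rather than b'foo', so Arg(str) would produce the expected message.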

@sloria

Member

commented May 22, 2015

Thanks @thomasboyt. I think using get_argument is probably the right direction, as it meets the most common use case.

Pull requests welcome!

@sloria sloria added the help wanted label May 22, 2015

@sloria

Member

commented Sep 28, 2015

This should now be fixed by virtue of using marshmallow's fields.Str() rather than Arg(str). That said, it appears that get_query_argument and get_body_argument are the preferred way to access request input, so I still think it's a good idea to change the implementation in TornadoParser to be consistent with those.

@sloria sloria closed this in ede6075 Sep 29, 2015

@sloria

Member

commented Sep 29, 2015

Arguments are now decoded to unicode strings. Thanks again for the suggestion.
