Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

url parameters casting bytes to str #1502

Closed
fuhrysteve opened this issue Apr 2, 2019 · 8 comments

Comments

Projects
None yet
4 participants
@fuhrysteve
Copy link

commented Apr 2, 2019

Hi there! I noticed a change in behavior in werkzeug 0.15.x whereby url parameters used to be safe to send as bytes (on python 3.6).

I apologize for not showing a more direct example in werkzeug, but like most users of werkzeug - I don't use it directly - I use it through flask.

Demonstration (using Flask)

from flask import Flask, url_for

app = Flask(__name__)
app.config['SERVER_NAME'] = 'www.foo.com'


@app.route('/<string:foo>')
def hello(foo):
    return 'Hello, World!'


with app.app_context():
    print(url_for('hello', foo=b'bar'))

With werkzeug<0.15, this will output:

http://www.foo.com/bar

However as of werkzeug>=0.15, it now returns this malformed url:

http://www.foo.com/b%27bar%27

I would expect that if werkzeug isn't going to decode bytes for you anymore, that it ought to raise a TypeError indicating that bytes is no longer safe to pass in.

I hope this example makes sense! Thanks for reviewing my issue - I'd be happy to (attempt to) submit a pull request if y'all agree that this is not desirable behavior.

@edk0

This comment has been minimized.

Copy link
Member

commented Apr 3, 2019

I'll have a look at this. We (evidently) don't have tests for this, so I suspect accepting bytes was never intended behaviour.

@edk0

This comment has been minimized.

Copy link
Member

commented Apr 3, 2019

Seems like fixing this is as easy as catching it: edk0@794eb16 so I'm inclined that way.

@davidism

This comment has been minimized.

Copy link
Member

commented Apr 3, 2019

@edk0 I don't think you changed that code in your PR, was there a related change that cause this?

I'm not opposed to this, but it seems a little weird to be passing bytes directly in a URL. @fuhrysteve Under what circumstances do you have UTF-8 bytes and not the text? If you're really passing bytes, and not text, I'd expect to pass them as Base64, which would require a custom converter.

@fuhrysteve

This comment has been minimized.

Copy link
Author

commented Apr 3, 2019

@davidism in my case it was a json web token. PyJWT returns bytes by default (at least in py3).

In [1]: import jwt                                                                                                                                                                                           

In [2]: jwt.encode({'some': 'payload'}, 'secret', algorithm='HS256')                                                                                                                                         
Out[2]: b'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzb21lIjoicGF5bG9hZCJ9.Joh1R2dYzkRvDkqv3sygm5YyK8Gi4ShZqbhK2gxcs2U'
@edk0

This comment has been minimized.

Copy link
Member

commented Apr 3, 2019

@davidism I believe it was an unintended consequence—I swapped out the implementation of to_url for a faster one that only took bytes as input; the usual behaviour is to convert to text_type and then encode, so I did that, and didn't notice the change in behaviour for bytes.

I do wonder if it might be a better idea to add a bytes converter and have BaseConverter throw an error.

@fuhrysteve

This comment has been minimized.

Copy link
Author

commented Apr 3, 2019

@edk0 isn't everything passing through UnicodeConverter here, which is the "default converter"? At a glance, I'd suspect it ought to go there and have BaseConverter or something beneath it raise TypeError prior to any unsafe conversion to str (at least in py3).

I share your hesitation to make utf-8 assumptions in BaseConverter

@edk0

This comment has been minimized.

Copy link
Member

commented Apr 4, 2019

It's not actually making UTF-8 assumptions, just URL encoding the bytes as-is, but it does seem wrong.

I'm torn because we did special-case bytes in the past. Maybe we should add a converter that passes bytes as-is, and have BaseConverter do the same for now, but with a warning.

@greyli

This comment has been minimized.

Copy link
Member

commented Apr 22, 2019

Same problem here. I use itsdangerous to generate JWT token for user confirmation in this way:

from flask import current_app
from itsdangerous import TimedJSONWebSignatureSerializer as Serializer

def generate_token(user, operation, expire_in=None, **kwargs):
    s = Serializer(current_app.config['SECRET_KEY'], expire_in)

    data = {'id': user.id, 'operation': operation}
    data.update(**kwargs)
    return s.dumps(data)

The return value will be in bytes type, like b'eyJhbGciOiJ...YmQ'. Then I generate the confirmation URL with the token returned in email template:

{{ url_for('auth.confirm', token=token, _external=True) }}

With Werkzeug >= 0.15, It end up to something like this:

http://127.0.0.1:5000/confirm/b%27eyJhbGciOiJ...YmQ%27

Notice the unnecessary b%27 and %27 part in the URL, it indicates the b prefix and the quotes has been decoded into the URL.

Here is the view function that handles token confirmation:

@auth_bp.route('/confirm/<token>')
@login_required
def confirm(token):
    print(token)  # b%27eyJhbGciOiJ...YmQ%27

The correct token value should be just eyJhbGciOiJ...YmQ, so the token confirmation will fail finally.

Hope this made things a bit more clear.

edk0 added a commit to edk0/werkzeug that referenced this issue Apr 24, 2019

davidism added a commit to edk0/werkzeug that referenced this issue May 8, 2019

davidism added a commit to edk0/werkzeug that referenced this issue May 8, 2019

@davidism davidism added this to the 0.15.3 milestone May 8, 2019

davidism added a commit to edk0/werkzeug that referenced this issue May 8, 2019

@davidism davidism closed this May 8, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.