Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

url parameters casting bytes to str #1502

Closed
fuhrysteve opened this issue Apr 2, 2019 · 8 comments
Closed

url parameters casting bytes to str #1502

fuhrysteve opened this issue Apr 2, 2019 · 8 comments
Milestone

Comments

@fuhrysteve
Copy link

@fuhrysteve fuhrysteve commented Apr 2, 2019

Hi there! I noticed a change in behavior in werkzeug 0.15.x whereby url parameters used to be safe to send as bytes (on python 3.6).

I apologize for not showing a more direct example in werkzeug, but like most users of werkzeug - I don't use it directly - I use it through flask.

Demonstration (using Flask)

from flask import Flask, url_for

app = Flask(__name__)
app.config['SERVER_NAME'] = 'www.foo.com'


@app.route('/<string:foo>')
def hello(foo):
    return 'Hello, World!'


with app.app_context():
    print(url_for('hello', foo=b'bar'))

With werkzeug<0.15, this will output:

http://www.foo.com/bar

However as of werkzeug>=0.15, it now returns this malformed url:

http://www.foo.com/b%27bar%27

I would expect that if werkzeug isn't going to decode bytes for you anymore, that it ought to raise a TypeError indicating that bytes is no longer safe to pass in.

I hope this example makes sense! Thanks for reviewing my issue - I'd be happy to (attempt to) submit a pull request if y'all agree that this is not desirable behavior.

@edk0
Copy link
Member

@edk0 edk0 commented Apr 3, 2019

I'll have a look at this. We (evidently) don't have tests for this, so I suspect accepting bytes was never intended behaviour.

@edk0
Copy link
Member

@edk0 edk0 commented Apr 3, 2019

Seems like fixing this is as easy as catching it: edk0@794eb16 so I'm inclined that way.

@davidism
Copy link
Member

@davidism davidism commented Apr 3, 2019

@edk0 I don't think you changed that code in your PR, was there a related change that cause this?

I'm not opposed to this, but it seems a little weird to be passing bytes directly in a URL. @fuhrysteve Under what circumstances do you have UTF-8 bytes and not the text? If you're really passing bytes, and not text, I'd expect to pass them as Base64, which would require a custom converter.

@fuhrysteve
Copy link
Author

@fuhrysteve fuhrysteve commented Apr 3, 2019

@davidism in my case it was a json web token. PyJWT returns bytes by default (at least in py3).

In [1]: import jwt                                                                                                                                                                                           

In [2]: jwt.encode({'some': 'payload'}, 'secret', algorithm='HS256')                                                                                                                                         
Out[2]: b'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzb21lIjoicGF5bG9hZCJ9.Joh1R2dYzkRvDkqv3sygm5YyK8Gi4ShZqbhK2gxcs2U'

@edk0
Copy link
Member

@edk0 edk0 commented Apr 3, 2019

@davidism I believe it was an unintended consequence—I swapped out the implementation of to_url for a faster one that only took bytes as input; the usual behaviour is to convert to text_type and then encode, so I did that, and didn't notice the change in behaviour for bytes.

I do wonder if it might be a better idea to add a bytes converter and have BaseConverter throw an error.

@fuhrysteve
Copy link
Author

@fuhrysteve fuhrysteve commented Apr 3, 2019

@edk0 isn't everything passing through UnicodeConverter here, which is the "default converter"? At a glance, I'd suspect it ought to go there and have BaseConverter or something beneath it raise TypeError prior to any unsafe conversion to str (at least in py3).

I share your hesitation to make utf-8 assumptions in BaseConverter

@edk0
Copy link
Member

@edk0 edk0 commented Apr 4, 2019

It's not actually making UTF-8 assumptions, just URL encoding the bytes as-is, but it does seem wrong.

I'm torn because we did special-case bytes in the past. Maybe we should add a converter that passes bytes as-is, and have BaseConverter do the same for now, but with a warning.

@greyli
Copy link
Member

@greyli greyli commented Apr 22, 2019

Same problem here. I use itsdangerous to generate JWT token for user confirmation in this way:

from flask import current_app
from itsdangerous import TimedJSONWebSignatureSerializer as Serializer

def generate_token(user, operation, expire_in=None, **kwargs):
    s = Serializer(current_app.config['SECRET_KEY'], expire_in)

    data = {'id': user.id, 'operation': operation}
    data.update(**kwargs)
    return s.dumps(data)

The return value will be in bytes type, like b'eyJhbGciOiJ...YmQ'. Then I generate the confirmation URL with the token returned in email template:

{{ url_for('auth.confirm', token=token, _external=True) }}

With Werkzeug >= 0.15, It end up to something like this:

http://127.0.0.1:5000/confirm/b%27eyJhbGciOiJ...YmQ%27

Notice the unnecessary b%27 and %27 part in the URL, it indicates the b prefix and the quotes has been decoded into the URL.

Here is the view function that handles token confirmation:

@auth_bp.route('/confirm/<token>')
@login_required
def confirm(token):
    print(token)  # b%27eyJhbGciOiJ...YmQ%27

The correct token value should be just eyJhbGciOiJ...YmQ, so the token confirmation will fail finally.

Hope this made things a bit more clear.

edk0 added a commit to edk0/werkzeug that referenced this issue Apr 24, 2019
davidism added a commit to edk0/werkzeug that referenced this issue May 8, 2019
davidism added a commit to edk0/werkzeug that referenced this issue May 8, 2019
@davidism davidism added this to the 0.15.3 milestone May 8, 2019
davidism added a commit to edk0/werkzeug that referenced this issue May 8, 2019
@davidism davidism closed this May 8, 2019
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 13, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants