Waitress errors on curl request with non ASCII in URL. #127
Thanks for the report. A minor clarification: the waitress process itself doesn't die; it closes the connection without returning anything.
It looks like cURL is not percent encoding the URL, and is instead sending UTF-8 to the server, which is not valid for the HTTP specification which requires |
Why The standard seems to advocate UTF-8 rather than
https://tools.ietf.org/html/rfc3986 Percent-encoded URLs do not currently work either: Input - |
Percent encoding is latin-1 (ASCII).
I am not sure what you mean here... On a Pyramid application running locally on my machine (Python 3.5, waitress 1.0.1):
Same with:
Application output:
The issue is that cURL by default will NOT send the percent encoded request:
Which causes waitress to close the connection:
This behaviour should be improved upon, but the request is technically contra-spec because the sending entity should have percent encoded the URL before sending it to the server.
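For illustration, here is a minimal sketch of what percent encoding the path on the client side looks like, using Python's standard library rather than cURL. The path `/漢字` is taken from the example later in this thread; everything else is just an assumption for demonstration purposes.

```python
from urllib.parse import quote, unquote

# Path containing non-ASCII characters, as a client might type it.
path = "/漢字"

# quote() encodes the text as UTF-8 and percent-escapes each byte;
# "/" is left untouched by default.
encoded = quote(path)
print(encoded)

# The escaped form contains only ASCII and round-trips to the original.
round_tripped = unquote(encoded)
print(round_tripped)
```

A request line built from `encoded` contains only ASCII characters, which is the form the server expects to receive.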
I'm not sure how you got the above results, but the problematic behavior is demonstrated in existing unit tests:
whereas it "should" (should it?) be
This weird encoding ends up being stored in |
Example:
rr-@tornado:~$ curl 'localhost:1234/%E6%BC%A2%E5%AD%97'
/æ¼¢å% /漢字 The Edit: looks like pyramid does just that: https://github.com/Pylons/pyramid/blob/4acd85dc98fb2a43eae54d2116cc4bf383157269/pyramid/request.py#L283 In the test I see a reference to PEP 3333 https://www.python.org/dev/peps/pep-3333/#unicode-issues but the reason for |
Actually Pyramid uses WebOb which does the right thing here: https://github.com/Pylons/webob/blob/master/webob/request.py#L321 and https://github.com/Pylons/webob/blob/master/webob/request.py#L167. Which is similar to what Werkzeug does: https://github.com/pallets/werkzeug/blob/109dad4ac9e0a1690666b2d4f29d07d98a3701d9/werkzeug/wsgi.py#L233 That being said, the encode/decode spiel is indeed correct. Based upon the comments in the above bug reports linked by @GrahamDumpleton, it is expected that the The only way that waitress would fix this issue is for it to accept the UTF-8, encode it, and decode it as latin-1 and put it in |
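To make the encode/decode step concrete, here is a rough sketch of the round trip described in PEP 3333: the server exposes the unquoted path bytes as a latin-1 decoded string, and the application re-encodes as latin-1 and decodes as UTF-8 to recover the text. The helper names `wsgi_path_info` and `recover_path` are hypothetical and are not part of waitress, WebOb, or Werkzeug.

```python
from urllib.parse import unquote_to_bytes


def wsgi_path_info(raw_path: str) -> str:
    """Hypothetical helper: mimic how a PEP 3333 server fills PATH_INFO.

    The percent escapes are resolved to raw bytes, and those bytes are
    exposed as a native string by decoding them as latin-1.
    """
    return unquote_to_bytes(raw_path).decode("latin-1")


def recover_path(path_info: str) -> str:
    """Hypothetical helper: undo the latin-1 step on the application side."""
    return path_info.encode("latin-1").decode("utf-8")


path_info = wsgi_path_info("/%E6%BC%A2%E5%AD%97")
recovered = recover_path(path_info)

print(repr(path_info))  # the "weird" latin-1 form seen in the environ
print(recovered)        # the text the client meant
```

The latin-1 step is lossless because every byte value maps to exactly one code point, so the application can always get the original bytes back.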
Thanks for the confirmation, wish I had known sooner about that encoding gotcha (or at least thought about going to look for it in the WSGI ref.) Regarding the OP's issue I think curl is at fault for not encoding the URLs like the RFC linked earlier says to, and trying to parse such URLs seems like asking for trouble - for example, what if the user issues |
I agree with cURL being at fault. Trying UTF-8 and falling back to latin-1 might make sense. The other fix I am thinking about is having it actually return a 400 Bad Request instead of just closing the connection. Slamming the door in someone's face is not my idea of a good web citizen.
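As a sketch only, the "try UTF-8, fall back to latin-1" idea could look roughly like the following. The function name `decode_path_bytes` is hypothetical, and this is not the change that was eventually made in waitress; it only illustrates the fallback logic on raw path bytes.

```python
def decode_path_bytes(raw: bytes) -> str:
    """Hypothetical helper: decode raw path bytes, preferring UTF-8.

    If the bytes are not valid UTF-8, fall back to latin-1, which
    accepts any byte sequence and therefore never fails.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


utf8_result = decode_path_bytes("/漢字".encode("utf-8"))
latin1_result = decode_path_bytes(b"/caf\xe9")  # lone 0xE9 is invalid UTF-8

print(utf8_result)
print(latin1_result)
```

One caveat with this approach is that some latin-1 byte sequences happen to be valid UTF-8 as well, so the guess can be wrong for certain inputs.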
This issue is the same as #64.
Fixed by #162
If you issue a request with curl of:
Waitress server will die with:
This came up in discussion:
You may want to check that and related issues:
to check how Waitress behaves in cases of client sending non ASCII.
Right now Waitress fails. Both wsgiref and Gunicorn appear to get it wrong. But mod_wsgi appears to get the desired result.