Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken header parsing in logging #1607

Merged
merged 1 commit into from Oct 23, 2017

Conversation

Projects
None yet
4 participants
@mattbillenstein
Copy link
Contributor

mattbillenstein commented Sep 27, 2017

I found this by mixing flask-websocket in a flask app -- the parsing logic here leaves whitespace in the header values (including \r\n) so it's not suitable for extracting headers into logs.

@tilgovi

This comment has been minimized.

Copy link
Collaborator

tilgovi commented Sep 27, 2017

Can you explain how this fixes the problem? It's not at all obvious from the change.

@mattbillenstein

This comment has been minimized.

Copy link
Contributor Author

mattbillenstein commented Sep 27, 2017

Sorry, could have made that more clear -- pywsgi provides an object that is dict-like and wraps the headers -- the headers property on that object:

        @property
        def headers(self):
            for key, value in self._headers:
                yield '%s: %s\r\n' % (key, value)

adds whitespace for some reason, it's much cleaner to just call .items():

import sys
if sys.version_info.major == 2:
    from StringIO import StringIO
else:
    from io import BytesIO as StringIO

import gevent.pywsgi

raw_headers = b'Foo: bar\r\nBar: baz\r\nBaz: boo\r\n\r\n'
fp = StringIO(raw_headers)
headers = gevent.pywsgi.headers_factory(fp)

print([h.split(":", 1) for h in headers.headers])
# [['Foo', ' bar\r\n'], ['Bar', ' baz\r\n'], ['Baz', ' boo\r\n']]

print(headers.items())
# [('baz', 'boo'), ('foo', 'bar'), ('bar', 'baz')]
(tve)mattb@matt:~ $ python3 header_fix.py
[['Foo', ' bar\r\n'], ['Bar', ' baz\r\n'], ['Baz', ' boo\r\n']]
[('Foo', 'bar'), ('Bar', 'baz'), ('Baz', 'boo')]

(tve)mattb@matt:~ $ python2 header_fix.py
[['Foo', ' bar\r\n'], ['Bar', ' baz\r\n'], ['Baz', ' boo\r\n']]
[('baz', 'boo'), ('foo', 'bar'), ('bar', 'baz')]

I'm not sure what the actual interface for the headers attribute is supposed to be between the various servers and frameworks -- not sure how much test coverage you have around this; I think this code path isn't even exercised when using the gevent worker, I only have this problem when also using flask-sockets and geventwebsocket.

@tilgovi

This comment has been minimized.

Copy link
Collaborator

tilgovi commented Oct 23, 2017

I checked this out and it seems fine.

We don't have any coverage right now for anything around this, but in gevent 0.13.x up through the most recent versions this object is mimetools.Message or the wrapper on Python 3 that has the code you copy/pasted. In all cases, .items() works fine.

@berkerpeksag should we do anything more here or just merge this?

@@ -213,7 +213,7 @@ def log_request(self):
resp_headers = getattr(self, 'response_headers', {})
resp = GeventResponse(self.status, resp_headers, self.response_length)
if hasattr(self, 'headers'):
req_headers = [h.split(":", 1) for h in self.headers.headers]
req_headers = self.headers.items()
else:
req_headers = []

This comment has been minimized.

@berkerpeksag

berkerpeksag Oct 23, 2017

Collaborator

Shouldn't this be changed to {}?

This comment has been minimized.

@tilgovi

tilgovi Oct 23, 2017

Collaborator

Nope. It's been a list of tuples and that isn't being changed here.

@berkerpeksag

This comment has been minimized.

Copy link
Collaborator

berkerpeksag commented Oct 23, 2017

I don't know much about this part of Gunicorn, but this looks good to me. Thank you!

I think current behavior comes from the mimetools module (which is an ancient Python 2 module) so I don't know why mimetools behaved like that. In gevent/gevent@198f239, they make the behavior same under Python 3. There is also a check for ' ' and '\t' at https://github.com/gevent/gevent/blob/98c2c15fbbee0c6ab6fab0ba747fd0cdceca2ba8/src/gevent/pywsgi.py#L1036 but this detail is probably not important in our case.

I'll just ping @jamadden in case we missed something.

@jamadden

This comment has been minimized.

Copy link
Contributor

jamadden commented Oct 23, 2017

I'm not sure what you're asking, but calling items should be fine. I don't see that changing.

@berkerpeksag berkerpeksag merged commit 5d4f885 into benoitc:master Oct 23, 2017

1 of 8 checks passed

couverture-io/py26 Not all changes are covered
Details
couverture-io/py27 Not all changes are covered
Details
couverture-io/py34 Not all changes are covered
Details
couverture-io/py35 Not all changes are covered
Details
couverture-io/py36 Not all changes are covered
Details
couverture-io/py36-dev Not all changes are covered
Details
couverture-io/py37 Not all changes are covered
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

fofanov pushed a commit to fofanov/gunicorn that referenced this pull request Mar 16, 2018

[gevent worker] Fix broken header parsing in logging (benoitc#1607)
We have a http.client.HTTPMessage in Python 3
(mimetools.Message in Python 2), just take .items()
from that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.