Skip to content

Conversation

@tarnfeld
Copy link
Contributor

When using keep-alive connections, some HTTP implementations insert unexpected and extra CRLF tokens between each connection, which can result in the next request being prefixed with a CRLF.

Some other client/server implementations are tolerant of this, and for example this bug doesn't show up when using the Go net/http library.

@bdarnell
Copy link
Member

Just out of curiousity, do you know which HTTP implementations do this? For the record, RFC 7230 does endorse ignoring the extra blank line: http://tools.ietf.org/html/rfc7230#section-3.5

A change like this needs a test. I'm also a little nervous about the performance impact of a re.sub on the entire header block on every request; maybe it should be baked into _parse_headers when it's extracting start_line?

@tarnfeld
Copy link
Contributor Author

Thanks for looking into it @bdarnell. This bug actually came about while implementing a client for Apache Mesos here which makes extensive use of HTTP keep-alive connections. I opened an issue MESOS-1625 to fix it on their end, but since nobody has had this problem before over the last few years with a variety of languages, I thought i'd propose this patch anyway.

I agree about the performance, I felt the same. I'll do a little comparison between a few options to see if I can find a faster way of stripping (lstrip perhaps). Also, yeah let's put this in the _parse_headers method. Alternatively we could change the way this works a bit to make it cope better with the leading CRLF?

@tarnfeld
Copy link
Contributor Author

Did a quick test to be sure...

$ python -m timeit -c 'import re; re.sub(r"^\r?\n", "", "\r\nPOST /foobar HTTP/1.0\r\nHeader: bar\r\n\r\n")'
100000 loops, best of 3: 2.31 usec per loop
$ python -m timeit -c '"\r\nPOST /foobar HTTP/1.0\r\nHeader: bar\r\n\r\n".lstrip("\r\n")'
10000000 loops, best of 3: 0.188 usec per loop

I think i'll use .lstrip().

tarnfeld added 2 commits July 23, 2014 09:28
When using keep-alive connections, some HTTP implementations insert unexpected and extra CRLF tokens between each connection, which can result in the next request being prefixed with a CRLF.

Some other client/server implementations are tollerant of this, and for example this bug doesn't show up when using the Go net/http library.
@tarnfeld
Copy link
Contributor Author

@bdarnell I've rebased my commits and pushed a regression test that passes (and fails as expected without the fix applied). We'll see how the travis build goes...

bdarnell added a commit that referenced this pull request Jul 24, 2014
Support for non-RFC compliant header prefixes
@bdarnell bdarnell merged commit 8940bec into tornadoweb:master Jul 24, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants