Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feed_data is not safe for http header fragmentation #21

Closed
youknowone opened this issue Jul 5, 2017 · 11 comments
Closed

feed_data is not safe for http header fragmentation #21

youknowone opened this issue Jul 5, 2017 · 11 comments

Comments

@youknowone
Copy link
Contributor

if any broken data chunk is fed to feed_data, it will be parsed in incomplete form.

@yohanboniface
Copy link
Contributor

@youknowone do you have a test case at hand to reproduce the issue?

@youknowone
Copy link
Contributor Author

A request:

b'''GET /ping/ HTTP/1.1\r\nHost: github.com\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nUpgrade-Insecure-Requests: 1\r\nUser-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8\r\nAccept-Encoding: gzip, deflate\r\nAccept-Language: ko-KR,ko;q=0.8,en-US;q=0.6,en;q=0.4\r\n\r\n'''

I called feed_data each byte by byte and the headers are broken.

Related issue in sanic which uses httptools: sanic-org/sanic#755

They are implmenting their own header fragment buffer.

@1st1
Copy link
Member

1st1 commented Nov 24, 2017

I'll take a look as soon as I finish working on the next uvloop release.

@1st1
Copy link
Member

1st1 commented Nov 24, 2017

@yohanboniface feel free to work on this if you have time

@yohanboniface
Copy link
Contributor

yohanboniface commented Nov 26, 2017

@youknowone to make sure I understand the issue: it arises when a request is chunked in a middle of a header field?
So for example, to continue on your data, we'd have this as first chunk:

GET /ping/ HTTP/1.1\r\nHost: github.com\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nUpgrade-Insecure-Requests: 1\r\nTransfer-Encoding: chunked\r\nUser-Agent: Mozilla/5.0

And this as second chunk:

(Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8\r\nAccept-Encoding: gzip, deflate\r\nAccept-Language: ko-KR,ko;q=0.8,en-US;q=0.6,en;q=0.4\r\n\r\n

And then you'd have an incomplete value for User-Agent and an invalid header name with the rest of the value?

Is that correct or am I missing something?

edit bah, no, indeed a chunked request is only about the body.
So it's not about the request itself being split in the middle of a header field but that the code implementing httptools chunking it manually before calling feed_data?

@youknowone
Copy link
Contributor Author

@yohanboniface Yes, your description is correct. User-Agent will be Mozzle/5.0 in that case.

As I know, the chunked body is a spec about logical chunk, not about TCP packet fragment.

A question here: does httptools expect to feed the whole HTTP body (at least "a chunk") at same time? Then it can be a user fault - but still weird.

Because httptools is the parser, I think basically the users can't determine which part of http request is going to httptools or not. For the point of view of user, "end of chunk" of http body and any fragmented packet in http header is not recognizable before putting it into the parser.
I think httptools is the correct place to merge the fragmented tcp packets to avoid double-parsing http request both in httptools and the users.

@yohanboniface
Copy link
Contributor

@youknowone made a quick unittest to reproduce what I've understood of the issue, but… it passes ;)

See #26

Can you please check the unittest and tell me what I'm missing to properly reproduce the issue? thanks :)

@youknowone
Copy link
Contributor Author

Thanks, your test is really helpful.
It seems I need to look into both httptools and sanic.
Give me some time. I am new to both part.

@youknowone
Copy link
Contributor Author

I changed your test a little and it now starts to be broken: #27

@youknowone youknowone changed the title feed_data is not safe for http header feed_data is not safe for http header fragmentation Nov 27, 2017
@youknowone
Copy link
Contributor Author

I also added a patch to #27. Thanks @yohanboniface, I would never looked into it without your test.

@yohanboniface
Copy link
Contributor

Cool!
I'll let @1st1 do the final review :)

@1st1 1st1 closed this as completed in 772195c Dec 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants