Skip to content

Fix header fragmentation parsing (Fix #21)#27

Merged
1st1 merged 3 commits intoMagicStack:masterfrom
youknowone:framented-broken
Dec 15, 2017
Merged

Fix header fragmentation parsing (Fix #21)#27
1st1 merged 3 commits intoMagicStack:masterfrom
youknowone:framented-broken

Conversation

@youknowone
Copy link
Copy Markdown
Contributor

No description provided.

Comment thread tests/test_parser.py Outdated
p = httptools.HttpRequestParser(m)

REQUEST = (
b'PUT / HTTP/1.1\r\nHost: localhost:1234\r\nContent-',
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I slightly changed the fragmented point and it now starts to be broken

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was on purpose where the splitting was, especially the empty HTTP header.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another test which feeds byte-for-byte would also be nice.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the advice. I added a few more tests

@youknowone youknowone changed the title Broken fragmented header test (edit of #26) Broken fragmented header test (including #26) Nov 27, 2017
@youknowone youknowone changed the title Broken fragmented header test (including #26) Fix header fragmentation parsing (Fix #21) Nov 27, 2017
Comment thread httptools/parser/parser.pyx Outdated

cdef _on_headers_complete(self):
self._maybe_call_on_header()
if self._current_header_name is not None:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why you moved this check out of _maybe_call_on_header (so it's now duplicated).
Seems to me you could keep it in there, but just move the reset of _current_XXX out of if, something like:

def _maybe_call_on_header():
   if self._current_header_name is not None:
       # do the real call
   # do the reset

Thoughts?

cdef _on_header_field(self, bytes field):
self._maybe_call_on_header()
self._current_header_name = field
if self._current_header_name is None:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about the if/elif/else here.
Why not just calling _maybe_call_on_header if _current_header_name is None (which means we are starting a new header) ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Comment thread httptools/parser/parser.pyx Outdated
cdef _on_header_field(self, bytes field):
self._maybe_call_on_header()
self._current_header_name = field
if (self._current_header_value is not None or
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't get why we need to check _current_header_value here, given it's reset at each call to _maybe_call_on_header, but I'm certainly missing something :)

Seems to me this cover a case where both _current_header_name and _current_header_value would be set, but I fail to understand how this could happen, given they are reset all together when calling _maybe_call_on_header and any call to _on_header_field with an empty header name would call _maybe_call_on_header, and thus reset _current_header_value.

Can you elaborate a bit? :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One header field can trigger multiple _on_header_field call. (the else part)
I think it must be there unless you know better method to check it without _current_header_value.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm missing something, I understand that _on_header_field will be called (once or more) for the header name only, so it seems to me that the check on _current_header_value is useless.

Is there a test that fails if you remove it?

Copy link
Copy Markdown
Contributor Author

@youknowone youknowone Nov 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's the only part I fixed for the problem.
Yes, than the tests I added fail.

I am checking weather it is same header name or not by checking the header value. Once the value is fed, it must be a new header name.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, now I think I get it :)

Here is the fix I suggest (from master):

diff --git a/httptools/parser/parser.pyx b/httptools/parser/parser.pyx
index 064e55c..b9b7175 100644
--- a/httptools/parser/parser.pyx
+++ b/httptools/parser/parser.pyx
@@ -95,10 +95,9 @@ cdef class HttpParser:
         self._last_error = None
 
     cdef _maybe_call_on_header(self):
-        if self._current_header_name is not None:
+        if self._current_header_value is not None:
             current_header_name = self._current_header_name
             current_header_value = self._current_header_value
-
             self._current_header_name = self._current_header_value = None
 
             if self._proto_on_header is not None:
@@ -107,7 +106,10 @@ cdef class HttpParser:
 
     cdef _on_header_field(self, bytes field):
         self._maybe_call_on_header()
-        self._current_header_name = field
+        if self._current_header_name is None:
+            self._current_header_name = field
+        else:
+            self._current_header_name += field

You tests should still pass :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested it?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do :)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually like the fix @yohanboniface proposes. Can we use it in this PR?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry. I totally forgot about this PR due to other schedules. Let me try it again.

@youknowone
Copy link
Copy Markdown
Contributor Author

I replaced the fix to @yohanboniface's work

@1st1 1st1 merged commit 772195c into MagicStack:master Dec 15, 2017
@1st1
Copy link
Copy Markdown
Member

1st1 commented Dec 15, 2017

Merged. Guys, a wholehearted thank you for working on this PR!

@1st1
Copy link
Copy Markdown
Member

1st1 commented Dec 15, 2017

I've just released httptools 0.0.10 with this fix.

yohanboniface added a commit to pyrates/roll that referenced this pull request Dec 15, 2017
@youknowone
Copy link
Copy Markdown
Contributor Author

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants