Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a bug that the last CRLF of chunked body may be used in the next request #1812

Merged
merged 3 commits into from
Aug 3, 2019
Merged

Fix a bug that the last CRLF of chunked body may be used in the next request #1812

merged 3 commits into from
Aug 3, 2019

Conversation

kou
Copy link
Contributor

@kou kou commented Jun 8, 2019

The last CRLF of chunked body is checked by #1607. But it's
incomplete. If a client sends the last CRLF (or just LF) after Puma
processes "0\r\n" line, the last CRLF (or just LF) isn't dropped in
the "0\r\n" process:

puma/lib/puma/client.rb

Lines 183 to 192 in 675344e

if line.end_with?("\r\n")
len = line.strip.to_i(16)
if len == 0
@body.rewind
rest = io.read
rest = rest[2..-1] if rest.start_with?("\r\n")
@buffer = rest.empty? ? nil : rest
set_ready
return true
end

if line.end_with?("\r\n")
  len = line.strip.to_i(16)
  if len == 0
    @body.rewind
    rest = io.read
    # rest is "" with no the last CRLF case and
    # "\r" with no last LF case.
    # rest.start_with?("\r\n") returns false for
    # Both of these cases.
    rest = rest[2..-1] if rest.start_with?("\r\n")
    @buffer = rest.empty? ? nil : rest
    set_ready
    return true
  end

The unprocessed last CRLF (or LF) is used as the first data in the
next request. Because Puma::Client#reset sets @parsed_bytes
to 0.

puma/lib/puma/client.rb

Lines 100 to 109 in 675344e

def reset(fast_check=true)
@parser.reset
@read_header = true
@env = @proto_env.dup
@body = nil
@tempfile = nil
@parsed_bytes = 0
@ready = false
@body_remain = 0
@peerip = nil

def reset(fast_check=true)
  @parsed_bytes = 0

It means that data in @buffer (it's "\r" in no the last LF case) and
unread data in input socket (it's "\r\n" in no the last CRLF case and
"\n" in no the last LF case) are used used as the first data in the
next request.

This change fixes these cases by the followings:

  • Ensures reading the last CRLF by setting @partial_part_left when
    CRLF isn't read in processing "0\r\n" line.

  • Introduces a @in_last_chunk new state to detect whether the last
    CRLF is waiting or not. It's reset in Puma::Client#reset.

@@ -786,11 +786,23 @@ def test_chunked_keep_alive_two_back_to_back
@server.run

sock = TCPSocket.new @host, @server.connected_port
sock << "GET / HTTP/1.1\r\nConnection: Keep-Alive\r\nTransfer-Encoding: chunked\r\n\r\n1\r\nh\r\n4\r\nello\r\n0\r\n\r\n"
sock << "GET / HTTP/1.1\r\nConnection: Keep-Alive\r\nTransfer-Encoding: chunked\r\n\r\n1\r\nh\r\n4\r\nello\r\n0\r\n"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of this test. I think a test for this PR would ideally test for the bug that you found. The test should process 2 requests in a row, and the 2nd request should not start with a CRLF.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you imagine a test like the following?

sock = TCPSocket.new @host, @server.connected_port
sock << "GET / HTTP/1.1\r\nConnection: Keep-Alive\r\nTransfer-Encoding: chunked\r\n\r\n1\r\nh\r\n4\r\nello\r\n0\r\n\r\n"
sock << "GET / HTTP/1.1\r\nConnection: Keep-Alive\r\nTransfer-Encoding: chunked\r\n\r\n4\r\ngood\r\n3\r\nbye\r\n0\r\n\r\n"

h = header(sock)
assert_equal ["HTTP/1.1 200 OK", "Content-Length: 0"], h
assert_equal "hello", body

h = header(sock)
# Not "HTTP/1.1 400 Bad Request" (Request started with "\r\n" is bad request)
assert_equal ["HTTP/1.1 200 OK", "Content-Length: 0"], h
assert_equal "goodbye", body

This doesn't reproduce this case. Because this case should delay "\r\n" in the last chunk.

The following test reproduces this case but this expects that we don't require the last empty chunk to process chunked body. This is the current Puma behavior. Should we keep this behavior?

def test_chunked_last_empty_chunk_delay
  # ...
  sock = TCPSocket.new @host, @server.connected_port
  sock << "GET / HTTP/1.1\r\nConnection: Keep-Alive\r\nTransfer-Encoding: chunked\r\n\r\n1\r\nh\r\n4\r\nello\r\n0\r\n"

  h = header(sock)
  assert_equal ["HTTP/1.1 200 OK", "Content-Length: 0"], h
  assert_equal "hello", body # Processed without the last "\r\n"

  sock << "\r\n"
  sock << "GET / HTTP/1.1\r\nConnection: Keep-Alive\r\nTransfer-Encoding: chunked\r\n\r\n4\r\ngood\r\n3\r\nbye\r\n0\r\n\r\n"

  h = header(sock)
  assert_equal ["HTTP/1.1 200 OK", "Content-Length: 0"], h
  assert_equal "goodbye", body

  sock.close
end

return false
if @in_last_chunk
set_ready
return true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unneccessary return

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so too but I follow the current style:

There are no return cases too:

Which style is Puma style?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From here on out (I make the rules now!) no more unnecessary returns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I've removed unnecessary returns from this pull request.

@nateberkopec
Copy link
Member

Thanks for catching this.

I think this PR needs a clearer test that actually describes and tests the bug that you laid out in your very thorough and complete PR description. From there, we'll be able to write an implementation that catches it.

@kou
Copy link
Contributor Author

kou commented Jun 9, 2019

I've left a comment about test at #1812 (comment) .


h = header(sock)
assert_equal ["HTTP/1.1 200 OK", "Content-Length: 0"], h
assert_equal "hello", body
assert_equal true, last_crlf_written

last_crlf_writer.join
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this line have to be before assert last_crlf_written?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to put this here.
This tests that Puma must wait the last CRLF to complete HTTP body processing.
If we put this before assert_equal true, last_crlf_written, the assertion is passed even if Puma doesn't wait the last CRLF. (This is the current behavior. And this causes this problem.)

kou added 2 commits July 27, 2019 06:04
…request

The last CRLF of chunked body is checked by #1607. But it's
incomplete. If a client sends the last CRLF (or just LF) after Puma
processes "0\r\n" line, the last CRLF (or just LF) isn't dropped in
the "0\r\n" process:

https://github.com/puma/puma/blob/675344e8609509b0d767ae7680436b3b382d8394/lib/puma/client.rb#L183-L192

    if line.end_with?("\r\n")
      len = line.strip.to_i(16)
      if len == 0
        @body.rewind
        rest = io.read
        # rest is "" with no the last CRLF case and
        # "\r" with no last LF case.
        # rest.start_with?("\r\n") returns false for
        # Both of these cases.
        rest = rest[2..-1] if rest.start_with?("\r\n")
        @buffer = rest.empty? ? nil : rest
        set_ready
        return true
      end

The unprocessed last CRLF (or LF) is used as the first data in the
next request. Because Puma::Client#reset sets `@parsed_bytes`
to 0.

https://github.com/puma/puma/blob/675344e8609509b0d767ae7680436b3b382d8394/lib/puma/client.rb#L100-L109

    def reset(fast_check=true)
      @parsed_bytes = 0

It means that data in `@buffer` (it's "\r" in no the last LF case) and
unread data in input socket (it's "\r\n" in no the last CRLF case and
"\n" in no the last LF case) are used used as the first data in the
next request.

This change fixes these cases by the followings:

  * Ensures reading the last CRLF by setting `@partial_part_left` when
    CRLF isn't read in processing "0\r\n" line.

  * Introduces a `@in_last_chunk` new state to detect whether the last
    CRLF is waiting or not. It's reset in Puma::Client#reset.
#1812 (comment) is the
location where this rule is made.
@nateberkopec
Copy link
Member

Tests failing now.

@kou
Copy link
Contributor Author

kou commented Jul 27, 2019

I've added missing last CRLF for chunked request into existing tests.
Now CI is green.

@nateberkopec nateberkopec added this to the 4.1.0 milestone Jul 28, 2019
@nateberkopec
Copy link
Member

Thank you for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants