[BUGFIX] Fix issue #942 #943

Merged
merged 3 commits into from Mar 12, 2013

Projects

None yet

3 participants

@gerdr
Contributor
gerdr commented Mar 10, 2013

io_readline_encoded_string() used to wait for max_bytes_per_codepoint
bytes, which is 4 in case of UTF-8.

However, as UTF-8 is variable-length, there's no guarantee that we'll actually
receive that many bytes and we might hang until the stream gets closed.

@gerdr gerdr [BUGFIX] Fix issue #942
io_readline_encoded_string() used to wait for max_bytes_per_codepoint
bytes, which is 4 in case of UTF-8.

However, as UTF-8 is variable-length, there's no guarantee that we'll actually
receive that many bytes and we might hang until the stream gets closed.
2dcc1f4
@gerdr gerdr referenced this pull request in rakudo/rakudo Mar 10, 2013
Closed

Socket fix #112

Contributor
gerdr commented Mar 10, 2013

This needs some more work: there's another instance of the same problem a few lines down, and there needs to be some adjustments made to the termination logic...

Contributor
gerdr commented Mar 10, 2013

This should have done it

Owner
leto commented Mar 10, 2013

I would feel better about this pull request if it had some kind of test and/or documentation changes.

Contributor
gerdr commented Mar 11, 2013

A test is a good idea, but it's non-trivial: You have to set up a client and a server and fail the test if the server blocks on readline() even though the client sent a whole line (but with length less than 4). This probably can only be done via timeouts.

For now, you can test it via Rakudo's HTTP::Easy - a HTTP header ends with a blank line ("\r\n" which has length < 4) and thus will trigger the bug.

What do you want in terms of documentation? Just some comments describing the algorithm and why the original version was broken?

@gerdr gerdr Add notes about changes io_readline_encoded_string()
Also marks io_read_encoded_string() as FIXME
3c59862
Owner
leto commented Mar 11, 2013

@gerdr in terms of documentation, there should be some kind of developer-focused docs (not code comments, actual POD documentation) about how the relevant functions work, their edge cases, etc...

I understand that the test would need to rely on timeouts, but I still think it would be valuable. Otherwise, how do we know if we break this in the future, or change it's behavior accidentally?

Contributor

I will merge it as soon as rakudo works better. #942 can be closed until there is a test.

@zhuomingliang zhuomingliang merged commit fb2e957 into parrot:master Mar 12, 2013
Owner
leto commented Mar 12, 2013

This was merged prematurely. It breaks the master branch. It should have documentation and some kind of test.

https://travis-ci.org/parrot/parrot/builds/5428936

Contributor
gerdr commented Mar 12, 2013

I agree that it was merged prematurely. I'll see what I can do about the test failures later today. Also, I'd like readline() not to bail out early as outlined on parrot-dev.

Once that's done, there need to be 3 new socket-related tests added:

  • the client sends an empty line and then sleeps (but keeps the connection open): this blocks until the socket is closed right now, but should not
  • the client sends a large line (500 bytes was my test case): the lines will be returned in multiple chunks, which is in general not what the user wants
  • the client sends a multi-character terminator (eg \r\n) in two chunks: right now, readline will return this as two separate lines; I messed this up in a Rakudo fix, but removing the early bailout would handle this Parrot-side
Contributor
gerdr commented Mar 12, 2013

I believe I know what is wrong with my code: While previously, we tried to read too much (and thus could hang if not enough input is available). But now we don't read enough in case of multi-unit characters and loop endlessly instead.

@Util Util added a commit that referenced this pull request Mar 24, 2013
@Util Util Revert "Merge pull request #943 from gerdr/fix-socket-readline"
This reverts commit fb2e957, reversing
changes made to 83bb863.

This is needed for the duration of the 5.2.0 release process, because the
branch was merged prematurely, and causes test failures. When the branch is
in a completed state, and ready to merge, it is important that a
"revert-the-revert" is performed just before the merge:
    git revert ID-of-*this*-revert
For rationale, see:
https://www.kernel.org/pub/software/scm/git/docs/howto/revert-a-faulty-merge.txt
5120d02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment