Linux: Always use blocking receive for followup fragments #48
+119
−12
Rather than just sending a large block of the same byte, send an alternating sequence. This way we should be able to detect when parts of the message have been shuffled, even when the total length hasn't changed.
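To illustrate why a varying payload catches reordering that a constant one cannot, here is a minimal sketch. The helper names `patterned_payload` and `swap_chunks` and the period of 251 are assumptions made for this example, not taken from the commit itself.

```rust
/// Hypothetical helper: a payload whose byte values cycle with period 251,
/// so chunks at different offsets start with different values.
fn patterned_payload(len: usize) -> Vec<u8> {
    (0..len).map(|i| (i % 251) as u8).collect()
}

/// Simulate a reordering bug by swapping the first two chunks of `data`.
/// Assumes `data` holds at least two chunks of `chunk` bytes.
fn swap_chunks(data: &[u8], chunk: usize) -> Vec<u8> {
    let mut out = data.to_vec();
    let (head, tail) = out.split_at_mut(chunk);
    head.swap_with_slice(&mut tail[..chunk]);
    out
}

fn main() {
    let (len, chunk) = (4096, 1024);

    // A constant payload looks identical after the swap: the bug goes unnoticed.
    let constant = vec![0xba_u8; len];
    assert_eq!(swap_chunks(&constant, chunk), constant);

    // The patterned payload differs after the swap: the bug is detected.
    let patterned = patterned_payload(len);
    assert_ne!(swap_chunks(&patterned, chunk), patterned);
}
```

A period that does not divide typical fragment sizes helps here, since otherwise every fragment could start at the same point in the pattern and a swap would again go unnoticed.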
While large messages need to use fragmentation internally, from the user's point of view a receive should still be atomic. Thus, once we have received the initial fragment of a multi-fragment message, the receive operations for the followup fragments should always block -- even if the caller requested a non-blocking receive. Otherwise, a receive attempt on a followup fragment before it is available will cause the entire message receive invocation to abort -- so the message gets lost on the receiver side, while the sender blocks forever trying to send the remaining fragments.

Note that blocking for followup fragments shouldn't actually cause major delays in the receiver, as the sender thread doesn't do any other slow or blocking operations between the individual fragment sends.

The issue turns out to be rather tricky to demonstrate (at least on my Linux system): apparently, because of the way the scheduler works, once the sender starts pushing out fragments, it usually doesn't get preempted, but rather just sends packets until the queue gets full, causing the send call to block -- until the receiver picks up the pending fragment, thus making space, and consequently enabling the sender to resume and immediately send the next fragment.

The somewhat complex new try_recv_large_delayed() test (using multiple threads with artificial delays) manages to get around this on my system, but only with a release build -- I haven't found a sane way to reproduce the problem on non-release builds...
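The receive behaviour described above can be modelled roughly as follows. This is only a sketch under stated assumptions: it uses a bounded `std::sync::mpsc` channel in place of the real Linux socket backend, and `Fragment`, `BlockingMode`, `recv_message`, and `send_fragmented` are hypothetical names invented for illustration.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TryRecvError};
use std::thread;
use std::time::Duration;

/// Hypothetical wire format: the first fragment announces the total length.
enum Fragment {
    First { total_len: usize, data: Vec<u8> },
    Followup(Vec<u8>),
}

#[derive(Clone, Copy)]
enum BlockingMode {
    Blocking,
    Nonblocking,
}

/// Receive one whole message. Returns Ok(None) only if the caller asked for a
/// non-blocking receive and no message has started arriving yet.
fn recv_message(rx: &Receiver<Fragment>, mode: BlockingMode) -> Result<Option<Vec<u8>>, String> {
    // Only the first fragment honours the caller's blocking mode.
    let first = match mode {
        BlockingMode::Blocking => rx.recv().map_err(|e| e.to_string())?,
        BlockingMode::Nonblocking => match rx.try_recv() {
            Ok(fragment) => fragment,
            Err(TryRecvError::Empty) => return Ok(None),
            Err(TryRecvError::Disconnected) => return Err("sender disconnected".into()),
        },
    };
    let (total_len, mut buf) = match first {
        Fragment::First { total_len, data } => (total_len, data),
        Fragment::Followup(_) => return Err("unexpected followup fragment".into()),
    };
    // Once a message has started, always block for the remaining fragments,
    // so a slow sender cannot make us abandon a half-received message.
    while buf.len() < total_len {
        match rx.recv().map_err(|e| e.to_string())? {
            Fragment::Followup(data) => buf.extend_from_slice(&data),
            Fragment::First { .. } => return Err("unexpected first fragment".into()),
        }
    }
    Ok(Some(buf))
}

/// Send `payload` in fragments of `frag` bytes, sleeping before each followup
/// to imitate a sender that gets delayed between fragments.
fn send_fragmented(tx: &SyncSender<Fragment>, payload: &[u8], frag: usize, delay: Duration) {
    let mut chunks = payload.chunks(frag);
    let first = chunks.next().unwrap_or_default().to_vec();
    tx.send(Fragment::First { total_len: payload.len(), data: first }).unwrap();
    for chunk in chunks {
        thread::sleep(delay);
        tx.send(Fragment::Followup(chunk.to_vec())).unwrap();
    }
}

fn main() {
    let (tx, rx) = sync_channel(1); // small bounded queue, like a full socket buffer
    let payload: Vec<u8> = (0..1000).map(|i| (i % 251) as u8).collect();
    let expected = payload.clone();
    let sender = thread::spawn(move || {
        send_fragmented(&tx, &payload, 100, Duration::from_millis(5));
    });
    // Poll with non-blocking receives until a complete message arrives.
    let received = loop {
        if let Some(msg) = recv_message(&rx, BlockingMode::Nonblocking).unwrap() {
            break msg;
        }
        thread::yield_now();
    };
    sender.join().unwrap();
    assert_eq!(received, expected);
}
```

If the followup loop used `try_recv()` as well, the artificial delay in `send_fragmented` would make the receiver give up mid-message, which is the failure mode the commit message describes.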
@bors-servo: r+
bors-servo added a commit that referenced this pull request on Mar 9, 2016: Linux: Always use blocking receive for followup fragments
antrik commented Mar 8, 2016
Fixes an issue (see commit message for details) that I discovered in the code while working on other stuff. I don't know how likely it is to manifest in real life; but it seems entirely likely that it's causing some spurious failures now and again...
The PR also includes a commit with improvements to several unrelated test cases, which I didn't want to untangle from the main commit with the newly added tests. It should do no harm.