Fix notification handling tests and `unix` implementation #197
metajack commented May 17, 2018
Are there no other solutions? Sleeps feel like a bit of a hack.
`no_receiver_notification()`, and especially `no_senders_notification_try_recv()`, fail quite frequently on CI. The only reason I can think of is that apparently the kernel sometimes doesn't process the close event immediately, but rather continues running the user thread, so the next read or write call doesn't get the notification yet. Attempting to fix this by repeatedly polling for the notifications to arrive. Let's hope this doesn't have any undesired side effects... Note that `no_senders_notification()` should not be affected, since that one isn't async. (The regular `recv()` method waits for a result to arrive.)
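The polling approach described in this commit message can be sketched roughly as follows. This is only an illustration, not the code from the PR: `poll_until` is a hypothetical helper, and a `std::sync::mpsc` channel stands in for the `ipc-channel` receiver, with the sender dropped from another thread to imitate a close notification that arrives late.

```rust
use std::sync::mpsc::{channel, TryRecvError};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of ipc-channel): call `attempt` repeatedly
/// until it yields `Some`, or give up once `timeout` has elapsed.
fn poll_until<T>(timeout: Duration, mut attempt: impl FnMut() -> Option<T>) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(value) = attempt() {
            return Some(value);
        }
        if Instant::now() >= deadline {
            return None;
        }
        // Short pause between attempts so the loop does not spin at full speed.
        thread::sleep(Duration::from_millis(1));
    }
}

fn main() {
    // Stand-in for an IPC channel; the real tests use ipc-channel's own types.
    let (tx, rx) = channel::<u32>();

    // Drop the sender after a delay, imitating a close event that the
    // receiving side does not observe immediately.
    let dropper = thread::spawn(move || {
        thread::sleep(Duration::from_millis(20));
        drop(tx);
    });

    // Keep trying the non-blocking receive until the disconnect shows up.
    let seen = poll_until(Duration::from_secs(5), || match rx.try_recv() {
        Err(TryRecvError::Disconnected) => Some(()),
        _ => None,
    });

    dropper.join().unwrap();
    assert!(seen.is_some(), "disconnect notification never arrived");
}
```

The timeout keeps a genuinely missing notification from hanging the test forever, while a merely delayed one is still picked up on a later attempt.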
Turn `UnixError` into an enum with an explicit variant for the "connection closed" status -- like other back-ends do -- rather than overloading `ECONNRESET`. The overloading was causing misleading error reporting, especially when unwrapping a result: the panic message would report it like an error produced by the system -- while in fact Unix doesn't report this condition (attempting to receive from a socket with no senders left) as a system error. What's worse, Linux actually reports `ECONNRESET` in some cases when attempting a *send* after the *receiver* end was closed (specifically, when the receiver was closed while there were unreceived messages pending) -- so the overloaded error could cause serious confusion.
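A minimal sketch of the idea, assuming an enum shaped roughly like the one described here. The variant names, the `channel_is_closed` helper, and the `Display` text are illustrative guesses and may not match the actual commit; `ECONNRESET` is hard-coded to its Linux value only to keep the example free of a `libc` dependency.

```rust
use std::fmt;

/// Linux value of ECONNRESET, hard-coded for this sketch.
const ECONNRESET: i32 = 104;

/// Illustrative error type: names are assumptions, not the PR's exact code.
#[derive(Clone, Copy, Debug, PartialEq)]
enum UnixError {
    /// An error that the operating system actually reported.
    Errno(i32),
    /// No senders are left; Unix does not report this as a system error.
    ChannelClosed,
}

impl UnixError {
    /// True only for the explicit "closed" variant, never for an errno.
    fn channel_is_closed(&self) -> bool {
        *self == UnixError::ChannelClosed
    }
}

impl fmt::Display for UnixError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            UnixError::Errno(code) => write!(f, "system error (errno {})", code),
            UnixError::ChannelClosed => write!(f, "channel closed"),
        }
    }
}

fn main() {
    // A receive with no senders left maps to the dedicated variant...
    let recv_result: Result<u32, UnixError> = Err(UnixError::ChannelClosed);
    // ...while a send that really hit ECONNRESET stays a system error.
    let send_result: Result<(), UnixError> = Err(UnixError::Errno(ECONNRESET));

    for err in [recv_result.unwrap_err(), send_result.unwrap_err()].iter() {
        println!("{} (closed: {})", err, err.channel_is_closed());
    }
}
```

With separate variants, a panic from unwrapping a closed channel no longer looks like an OS failure, and a genuine `ECONNRESET` on send cannot be mistaken for the "no senders left" condition.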
@metajack my assumption was that if it's a thread scheduling issue, a sleep of any duration should make it reliable, as it explicitly hands off scheduling... But it seems my assumption was wrong, as the sleep didn't fix the intermittents. I have implemented polling in a loop now instead -- let's hope this works better. In order to verify that the loop indeed handles delayed notifications, I added separate tests forcing a delay -- and uncovered an actual issue in the notification handling on Linux in the process... So the PR grew in scope :-)
One of the TravisCI jobs seems displeased: https://travis-ci.org/servo/ipc-channel/jobs/385864860
@jdm that's a known intermittent (a problem in the test cases rather than the implementation), unrelated to this PR. (I have known about this problem, and what causes it, for quite some time -- but it only recently started getting triggered quite a lot on Travis... So I'm moving it way up on my list.)
@bors-servo r+
Fix notification handling tests and `unix` implementation
The first commit attempts to make notification tests more robust. (Hopefully fixing intermittent failures on Linux.) The other commits add some additional related tests, and fix a (minor) issue uncovered by them.
antrik commented May 17, 2018 (edited)
The first commit attempts to make notification tests more robust. (Hopefully fixing intermittent failures on Linux.) The other commits add some additional related tests, and fix a (minor) issue uncovered by them.