Test 1564 failing intermittently on illumos #5037
I did this
Upgrading to curl 7.69.0 on OmniOS (an illumos distribution)
I expected the following
No new failing tests. The
always in the block which is disabled for Windows - I wonder if illumos/Solaris has the same asynchronous socketpair feature?
@citrus-it any chance you can debug it?
The test case first calls wakeup
Then the test calls
Then it calls
I did a bit of digging last night and it looks like the
I added some sread() calls after the recv() loop and see this when the test works:
but when the test fails:
Interestingly, even in the loop not all calls to
Yes, it appears that EAGAIN indicates that no data is available just now, but not that the pipe is necessarily empty. It's as if there is some latency there. I will ask some other developers about this.
I wrote a small program to test this behaviour and tried it on a few different platforms that I have to hand.
On OmniOS, OpenIndiana and Solaris the read() loop does not always fully drain the pipe.
I assume from the comment in the code that Windows is maybe in the first category too.
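The behaviour described above can be probed with a small POSIX C sketch. This is not the test program from the thread, which is not reproduced here. The helper names (`set_nonblocking`, `drain_until_eagain`, `readable_after_drain`) are made up for illustration, and the sketch assumes an AF_UNIX stream socketpair, one byte per write and a 10 ms grace period. It queues writes, reads until EAGAIN, then uses poll() to see whether more data turns up afterwards.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success. */
static int set_nonblocking(int fd)
{
  int flags = fcntl(fd, F_GETFL, 0);
  if(flags < 0)
    return -1;
  return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Read until the descriptor reports EAGAIN/EWOULDBLOCK or EOF.
   Returns the number of bytes read, or -1 on any other error. */
static long drain_until_eagain(int fd)
{
  char buf[256];
  long total = 0;
  for(;;) {
    ssize_t rc = read(fd, buf, sizeof(buf));
    if(rc > 0) {
      total += (long)rc;
      continue;
    }
    if(rc == 0)
      return total;
    if(errno == EINTR)
      continue;
    if(errno == EAGAIN || errno == EWOULDBLOCK)
      return total;
    return -1;
  }
}

/* Hypothetical experiment: queue up to nwrites single bytes, drain the
   read end once, then poll it for 10 ms (an arbitrary grace period).
   Returns 1 if data is still (or again) readable after the drain,
   0 if the first EAGAIN really meant "empty", -1 on error. */
static int readable_after_drain(long nwrites)
{
  int sv[2];
  int result = -1;
  long i;
  char byte = 1;

  if(socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
    return -1;
  if(set_nonblocking(sv[0]) == 0 && set_nonblocking(sv[1]) == 0) {
    for(i = 0; i < nwrites; i++) {
      if(write(sv[1], &byte, 1) != 1)
        break; /* typically EAGAIN: the socket buffer is full */
    }
    if(drain_until_eagain(sv[0]) >= 0) {
      struct pollfd pfd;
      int rc;
      pfd.fd = sv[0];
      pfd.events = POLLIN;
      pfd.revents = 0;
      rc = poll(&pfd, 1, 10);
      if(rc >= 0)
        result = (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
    }
  }
  close(sv[0]);
  close(sv[1]);
  return result;
}
```

On a platform where the first EAGAIN means the buffer is empty, `readable_after_drain(10)` should return 0. Going by the reports above, a very large write count may return 1 on illumos or Solaris, though this sketch has not been run there.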
Thanks for this excellent input and data. So maybe we can change the test to just not be that excessive? Does the test work if you change
We could possibly even consider changing the documentation to say something about these new findings but it also seems like a rather extreme edge case when someone would call wakeup() on the handle this many times without it even being "active".
I did try with 8192 writes, since I noticed that was the number successfully written on FreeBSD and MacOSX before write started returning EAGAIN, whereas on the other platforms that number is 21504.
I just ran the attached test program 100 times in a loop like this:
It was also fine with 512, 1024, 2048 and 4096
With 8192, I got
So, yes, I don't think this is a problem with normal operation of the library. The test could be less aggressive :)
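The per-platform write counts mentioned above (8192 versus 21504) can be measured with a sketch along these lines. It is an illustration only, not the attached test program. `count_writes_until_eagain` and its `cap` parameter are hypothetical names, and it assumes single-byte writes to a non-blocking AF_UNIX stream socketpair. The result depends on the operating system and its socket buffer settings, so no expected value is given.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success. */
static int set_nonblocking(int fd)
{
  int flags = fcntl(fd, F_GETFL, 0);
  if(flags < 0)
    return -1;
  return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: write single bytes to one end of a non-blocking
   socketpair, never reading the other end, until write() reports
   EAGAIN/EWOULDBLOCK. Returns the number of successful writes, or -1
   on error or if `cap` writes succeed without hitting EAGAIN. */
static long count_writes_until_eagain(long cap)
{
  int sv[2];
  long n = 0;
  long result = -1;
  char byte = 1;

  if(socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
    return -1;
  if(set_nonblocking(sv[0]) == 0 && set_nonblocking(sv[1]) == 0) {
    while(n < cap) {
      ssize_t rc = write(sv[1], &byte, 1);
      if(rc == 1) {
        n++;
        continue;
      }
      if(rc < 0 && errno == EINTR)
        continue;
      if(rc < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        result = n; /* the buffer is full: this is the count we want */
      break;
    }
  }
  close(sv[0]);
  close(sv[1]);
  return result;
}
```

Calling `count_writes_until_eagain(10000000L)` from a small `main()` and printing the result gives a figure comparable to the ones quoted in the thread, under the assumptions above.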
This test does A LOT of *wakeup() calls and then calls curl_multi_poll() twice. The first *poll() is then expected to return early and the second not - as the first is supposed to drain the socketpair pipe.

It turns out however that when given "excessive" amounts of writes to the pipe, some operating systems (the Solaris based are known) will return EAGAIN before the pipe is drained, which in our test case causes the second *poll() call to also abort early.

This change attempts to avoid the OS-specific behaviors in the test by reducing the amount of wakeup calls from 1234567 to 10.

Reported-by: Andy Fiddaman
Fixes #5037