8268714: [macos-aarch64] 7 java/net/httpclient/websocket tests failed #79
Please find below a test-only change to fix some intermittent failures observed with the httpclient/websocket tests:
Some machines in our CI seem to allow a higher level of concurrency while being (maybe) configured with lower system resources (such as available buffer space for the TCP stack).
Some of the httpclient/websocket tests attempt to fill the socket buffers in order to assert some conditions when the buffers are full and writing is paused. When the test process terminates, this leaves behind TCP sockets in the TIME_WAIT state that still hold system buffer resources in case retransmission is needed. When several such tests are run, this ends up causing random "No buffer space available" errors in other tests (including these tests themselves) running concurrently or shortly afterwards on the same machine.
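To illustrate the buffer-filling pattern described above, here is a minimal sketch using plain NIO channels rather than the WebSocket API. It is not taken from the tests themselves: the class name `BufferFillSketch`, the method `fillUntilPaused`, the chunk size and the safety cap are all hypothetical, and treating the first zero-length write as "paused" is a simplification.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class BufferFillSketch {

    // Hypothetical safety cap so the loop always terminates.
    private static final long LIMIT = 256L * 1024 * 1024;

    // Writes to a peer that never reads until a non-blocking write
    // accepts no more bytes, then returns the number of bytes written.
    static long fillUntilPaused() throws IOException {
        InetSocketAddress loopback =
                new InetSocketAddress(InetAddress.getLoopbackAddress(), 0);
        try (ServerSocketChannel server = ServerSocketChannel.open().bind(loopback);
             SocketChannel client = SocketChannel.open(server.getLocalAddress());
             SocketChannel accepted = server.accept()) {
            // The accepted side is kept open but never read from.
            client.configureBlocking(false);
            ByteBuffer chunk = ByteBuffer.allocate(16 * 1024);
            long total = 0;
            while (total < LIMIT) {
                chunk.clear();
                int n = client.write(chunk);
                if (n == 0) {
                    break; // buffers are full: writing is paused
                }
                total += n;
            }
            return total;
        } // try-with-resources closes all three channels explicitly
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bytes buffered before pause: " + fillUntilPaused());
    }
}
```

The number of bytes accepted before the pause depends on the platform's socket buffer configuration, which is why machines with less buffer space are more exposed to the problem.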
This change implements a few tricks to alleviate the situation:
With these changes, I have run the HttpClient tests 200 times on the problematic machines without observing any failures (where previously there were at least a couple of failures per 50 runs). I also ran tier1 once and tier2 twice, and the results came back clean.
I am therefore claiming success (even if it might prove temporary ;-) )
If these failures come back to haunt the CI again after this fix, a further remediation policy could be to put the httpclient/websocket directory in exclusive test execution mode (in TEST.root) - this seems to work too - but cleaning up garbage in the tests themselves seems preferable.
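For reference, the fallback could look roughly like the fragment below. jtreg reads the `exclusiveAccess.dirs` property from `TEST.root`; the file location in the comment is an assumption, and the directory would have to be appended to whatever entries the property already lists rather than replacing them.

```properties
# TEST.root fragment (assumed to be test/jdk/TEST.root)
# Tests under the listed directories are not run concurrently with other tests.
exclusiveAccess.dirs=java/net/httpclient/websocket
```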
@dfuch This change now passes all automated pre-integration checks.
After integration, the commit message for the final commit will be:
At the time when this comment was updated there had been no new commits pushed to the