-
Notifications
You must be signed in to change notification settings - Fork 284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix @stresstest failure #5838
Fix @stresstest failure #5838
Conversation
5985db3
to
68682a7
Compare
Marking as draft there is another failure in the CI (which didn't happen locally):
|
Looks like the generated delay is too small, we can't perform delays <1us. |
d2d16b4
to
636dea6
Compare
The float argument gets converted to microseconds when making the system call. If it is <1us then it gets converted to 0, which has special meaning * for select/epoll it means timeout immediately * for setsockopt SO_RCVTIMEO it means never timeout So do not generate tests with such small timeout, instead replace them with no delays. We cannot call `QCheck2.assume` here (it can only be called from the body of a test), otherwise sometimes we might get a failure like this: ``` Test Dune__exe__Unixext_test.test_proxy failed: ERROR: uncaught exception in generator for test Dune__exe__Unixext_test.test_proxy after 100 steps: Exception: QCheck2.Failed_precondition Backtrace: Raised at QCheck2.assume in file "src/core/QCheck2.ml" (inlined), line 57, characters 29-54 Called from Xapi_fd_test__Generate.delay_of_size in file "ocaml/libs/xapi-stdext/lib/xapi-fd-test/generate.ml", line 66, characters 2-34 Called from QCheck2.Tree.bind in file "src/core/QCheck2.ml", line 207, characters 28-31 Called from QCheck2.Gen.bind in file "src/core/QCheck2.ml", line 271, characters 72-80 Called from QCheck2.Tree.bind in file "src/core/QCheck2.ml", line 207, characters 28-31 Called from QCheck2.Test.check_state in file "src/core/QCheck2.ml", line 1749, characters 12-32 ``` Signed-off-by: Edwin Török <edwin.torok@cloud.com>
… must be at least 1 Reject <1us timeouts, they are likely the result of a rounding down bug. If we try to use a timeout that is >0, but < 1e-6 it'd be truncated to 0. 0 has a special meaning of 'no timeout', rather than immediately timeout. The generator got already changed to avoid generating such short timeouts. Signed-off-by: Edwin Török <edwin.torok@cloud.com>
636dea6
to
3b9b7b4
Compare
Still fails, I wonder whether this is because this is the old code using Ah nevermind, I'm still testing the overall timeout, but even though the overall timeout wasn't hit, the per operation timeout might've been. |
…n call really_input takes a timeout parameter, however this is the timeout for each individual read call, not the entire function. So if we have a sender that sends a 64KiB buffer 1 byte at a time, and we set a timeout of 100ms then the operation can take up to 109 minutes. The timeout generated by quick-check is meant to be the total timeout, so divide it by the number of read/write calls that we are expected to make, to ensure that the overall timeout is not excessive. If the timeout ends up being too small then we tell quickcheck to generate another configuration with `assume`. Signed-off-by: Edwin Török <edwin.torok@cloud.com>
3b9b7b4
to
c70d3de
Compare
10 iterations of stresstest in
|
Noticed by @psafont in https://github.com/xapi-project/xen-api/actions/runs/9901545401/job/27354091939#step:7:149.
The bug is in the test code, I misinterpreted what the meaning of the timeout parameter in
Buf_io.really_input
means, it is not the overall timeout, but the timeout for each individual read call.So if the other end decides to send 1 byte at a time, then the overall timeout will be number of bytes * timeout.
This may be an unintentional bug in
really_input
, however it is unknown whether any callers rely on the current semantics or not, so fix the test instead to check for the actual behaviour ofreally_input
.There is also a bug in the use of
setsockopt
. It gets converted to astruct timeval
, and the microseconds value truncated.So if we end up asking for a too small timeout (which as a result of the above change we might!) then that would get truncated to 0 and instead we'd hang indefinetely, because 0 means "no timeout".