tests: Wait for shell for twice as long (10m) #53828
See NixOS#49441 for an earlier attempt, which was subsequently reverted. I am assuming that doubling the time will be sufficient if the machine is overloaded since so many of the tests already pass at 5 minutes, while still not holding back failures for needlessly long.
I re-ran the UEFI boot test 17 times in a row. The system was under load for many of these tries, and I didn't produce a single timeout. Yet Hydra has timed out twice in a row now. Seems fishy. Increasing the timeout is worth another try, I guess.
It would also be nice to have a little more detail in the error message when the timeout is reached, at least the actual timeout value.
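To make the suggestion about the error message concrete, here is a minimal sketch of a polling wait whose failure reports both the elapsed time and the configured timeout. It is not the NixOS test driver's code; `wait_until`, `ShellTimeout`, and all parameter names are hypothetical, and the 600-second default only mirrors the 10m value from this PR's title.

```python
import time


class ShellTimeout(Exception):
    """Raised when the condition is not met before the deadline (hypothetical name)."""


def wait_until(condition, timeout_s=600.0, poll_s=1.0, what="shell connection"):
    """Poll `condition` until it returns True or `timeout_s` seconds elapse.

    Returns the number of seconds waited. On timeout, the exception message
    contains the elapsed time and the timeout value that was in effect.
    """
    start = time.monotonic()
    while True:
        if condition():
            return time.monotonic() - start
        elapsed = time.monotonic() - start
        if elapsed >= timeout_s:
            raise ShellTimeout(
                f"timed out waiting for {what} after {elapsed:.1f}s "
                f"(timeout was {timeout_s:g}s)"
            )
        # Never sleep past the deadline.
        time.sleep(min(poll_s, timeout_s - elapsed))
```

With a message like this, a log from an overloaded builder would show which limit was actually applied, which helps when the value changes between branches.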
Let's try this out. The random test failures consistently block nixos-unstable and just make us inattentive to actually failing tests.
I ran the test 30 times now, most of the time under load, without reproducing any failure. The 31st attempt failed, but with a different error than usual ("No space left on device" while creating the ISO image, although there should've been plenty of space). I'm getting less optimistic that this will fix anything, but let's try.
Nevermind, I did actually somehow run out of space,
Maybe it was out of inodes (
Anyways, I think that is unrelated.
Well… I'd say let's try (and backport to 18.09, which has also been blocked for ~5 days). The failures appear to be quite localized to
Anyway, the only other option I can see (if it's not just slow to connect) is that it's blocking before connection, which should be easier to reproduce locally than a timeout, so…
Actually something else maybe to note for the underlying issue:
Ah, forgot to link this: That PR intends to add timing data to all tests, with an additional specific check for the "connecting" phase. This should help us find out where the hangup is.
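As a rough illustration of the per-phase timing idea described above, the following sketch records how long a named phase takes and logs a warning when a phase such as "connecting" exceeds a threshold. It is not taken from the linked PR; `timed_phase`, `phase_timings`, and `warn_after_s` are hypothetical names chosen for this example.

```python
import time
from contextlib import contextmanager

# Collected durations in seconds, keyed by phase name (hypothetical structure).
phase_timings = {}


@contextmanager
def timed_phase(name, warn_after_s=None, log=print):
    """Time the enclosed block and record the duration under `name`.

    If `warn_after_s` is given and the phase ran longer than that,
    an extra warning line is logged.
    """
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed = time.monotonic() - start
        phase_timings[name] = elapsed
        log(f"phase {name!r} took {elapsed:.2f}s")
        if warn_after_s is not None and elapsed > warn_after_s:
            log(f"warning: phase {name!r} exceeded {warn_after_s:g}s")
```

Wrapping the connection step as `with timed_phase("connecting", warn_after_s=60): ...` would then show in the log whether the time is spent connecting or earlier, assuming the test harness exposes that step as a separate call.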
[release-18.09 4d5935d] tests: Wait for shell for twice as long (10m)
See #49441 for an earlier attempt, which was subsequently reverted. I am
assuming that doubling the time will be sufficient if the machine is
overloaded since so many of the tests already pass at 5 minutes, while
still not holding back failures for needlessly long.
cc @Ekleog @timokau who were involved in discussions on #nixos-dev