Make DebugHook test more reliable on busy machines #7038
anastasiamac approved these changes Feb 27, 2017
In terms of getting the test to pass more frequently, increasing the timeout seems fine. As such, it's an improvement for getting more frequent Blesses :D Hence, LGTM \o/
However, I wonder if we are just kicking the can down the road and actually obscuring an underlying issue?
$$merge$$
Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju
jujubot merged commit d277b90 into juju:2.1 Feb 27, 2017
@anastasiamac yeah, I thought about that too - bumping up a timeout isn't a very satisfying "fix". I couldn't see a way of making it more inherently reliable in this case: since we're starting an external flock process running sleep, there's not much we can do to connect the timeout to something happening in this process. I guess the flock command could run something that listens on a socket and dies when it receives a message, and we could write to the socket once we know the goroutine has run? It feels like that's complicated enough to end up with a less reliable test, though.
babbageclunk commented Feb 27, 2017
Description of change
TestRunHook was regularly failing on test hosts with an error indicating
that the flock process was being killed by a timer before we managed to
get the debug directory. It passes reliably on a (relatively unloaded)
dev machine, but I could simulate a heavier load by adding a delay in
the goroutine the test starts and get similarly flaky behaviour.
Increase the timeout of the flock command to match the timeout of the
select loop below it - this should make the failures much less likely.
Bug reference
Hopefully fixes https://bugs.launchpad.net/juju/+bug/1612747