-
Notifications
You must be signed in to change notification settings - Fork 497
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use the newer testing clock. #9709
Conversation
The testing clock now iterates faster when given a long timeout, and gives stack traces when the waiters don't match expectations.
!!build!! |
It appears that we have quite a few places that expected being able to pass a timeout of '0s' to work correctly, which I apparently broke. The old code would always check 10 times, and call time.Sleep(0) each time. The new one notices the timeout immediately. |
Still needs the test suite updates, but we need this change for other reasons.
Several tests used a timeout of 0, but the new clock means that the waiter has to *already* be waiting.
InstancePoller was the only place using that syntax, I just change it to use ShortWait. |
21:08:31 make: go: Command not found (doesn't sound like my code) |
#9712 ## Description of change This changes how we handle unblocking and how we determine nextTick. Instead of checking blocks on *every* pass through the loop(), we only check when we get a tick and trigger potential expiration. Christian and I discussed why it was safe, which boils down to. If someone adds a claim that would expire to a different controller, we would only care about expiring it if we had someone blocked on it expiring. Which we can tell from our blocks (arguably we can immediately unblock a block if it isn't in our Leases, which we could easily check without looking at all leases.). I renamed tick() to expire() and retryingTick to retryingExpire, because that code is only the expiration path, while tick() is now tracking the timer and handling unblocking people. There is probably also a lot of extra logging in there, and we should audit what is actually useful in practice vs just what was immediately developer focused. This currently needs: #9709 Though we could update the tests if we decide not to do it. expire() now informs the main loop that it is a good time to process its blocks() once it has gone through the processing. Further, handleCheck() now notices if it saw stale data, and then causes the expire() to tick faster than it would otherwise, which will speed up things getting unblocked. ## QA steps The test suite should cover the important cases. The other thing you could do is run up the leadershipclaimer I wrote and try to abuse leases with it, and see if we are able to get into any bad conditions. ## Documentation changes None. ## Bug reference https://bugs.launchpad.net/juju/+bug/1814556
#9748 ## Description of change This just brings develop up-to-date with all the 2.5 patches: prdesc Merge pull request #9745 from jameinel/2.5-lease-invalid-retries-1815719 prdesc Merge pull request #9740 from wallyworld/cmr-multi-offer-fix prdesc Merge pull request #9742 from jameinel/2.5-worker-lease-1815468 prdesc Merge pull request #9736 from jameinel/2.5-leadership-client prdesc Merge pull request #9737 from jameinel/2.5-worker-lease-1815468 prdesc Merge pull request #9722 from achilleasa/fix-1812227 prdesc Merge pull request #9734 from babbageclunk/state-worker-dep-message prdesc Merge pull request #9733 from babbageclunk/raftlease-stop-global-clock prdesc Merge pull request #9735 from howbazaar/2.5-mongo-systemd-ulimit prdesc Merge pull request #9724 from babbageclunk/raftlease-upgrade-blank prdesc Merge pull request #9731 from howbazaar/2.5-status-close-error prdesc Merge pull request #9730 from howbazaar/2.5-lease-race prdesc Merge pull request #9727 from achilleasa/fix-1814638 prdesc Merge pull request #9728 from wallyworld/rename-delete-storage-pool prdesc Merge pull request #9712 from jameinel/2.5-leases-nextTick prdesc Merge pull request #9709 from jameinel/2.5-update-testing-clock ## QA steps See individual patches. ## Documentation changes See individual patches. ## Bug reference prdesc https://bugs.launchpad.net/juju/+bug/1815719 prdesc https://bugs.launchpad.net/juju/+bug/1813151 prdesc https://bugs.launchpad.net/juju/+bug/1815179 prdesc https://bugs.launchpad.net/juju/+bug/1815471 prdesc https://bugs.launchpad.net/juju/+bug/1815468 prdesc https://bugs.launchpad.net/juju/+bug/1812227 prdesc https://bugs.launchpad.net/juju/+bug/1815405 prdesc https://bugs.launchpad.net/juju/+bug/1813996 prdesc https://bugs.launchpad.net/juju/+bug/1813995 prdesc https://bugs.launchpad.net/juju/+bug/1815397 prdesc https://bugs.launchpad.net/juju/+bug/1814638 prdesc https://bugs.launchpad.net/juju/+bug/1814556
Description of change
The testing clock now iterates faster when given a long timeout, and
gives stack traces when the waiters don't match expectations.
This is just cleanup of the testing clock that helped when I was debugging something else, but we may as well get it into our core code. Only changes the testing clock, so it may change the test suite, but doesn't change production code.
juju/clock#3
juju/clock#4
We need (4) #9712
QA steps
None.
Documentation changes
None.
Bug reference
None.