fix(deisctl): use wait groups for stopping and uninstalling store #2307
LGTM. Clean shutdown is a definite improvement. However, I'd also like to do some more testing to make sure this fixes the reported issues.
Code LGTM, but we should test updating v0.14.1 -> this.
I have an 0.14.1 cluster. I'd be happy to test this if it's ready for testing.
The only issue I'm asserting this PR resolves is #2300, and that's mostly because of the added workaround to the documentation. Everything else is an optimistic improvement, although I'm fairly confident in the upgrade path moving forward based on my testing of this PR.
We really can't, as you'd need the logic from this branch to do the
Right. I guess I can stop and uninstall them one-by-one in that order to emulate this change and see if that approach works, then validate that your
That'd be a good test. 👍
Just pushed a new commit to start gateway before volume. Some users are seeing volume hang on new installs. I'd really like to avoid having to put sleeps back in the /bin/boot scripts.
I just deployed a fresh Deis v0.15 three times with this branch and haven't seen a single store issue.
I did the same steps as in #2300 again--although I did
Glad to hear it!
LGTM. My testing went fine, although it was tricky to do. But this should improve things when it lands in the next
fix(deisctl): use wait groups for stopping and uninstalling store
When we start deis-store components, we do so in a specific order to ensure they come up properly. Unfortunately, we don't do this for stop/uninstall; we bring them all down simultaneously. This seems to cause an issue when upgrading installations, where the components don't all come back properly (see #2300, #2301, #2294, #2286). This PR changes deisctl to use wait groups in the stop and uninstall commands so that these services are also brought down in an optimal order. This PR also updates the "upgrading Deis" instructions to tell users to stop services before uninstalling them. This gives components the opportunity to exit in a clean state rather than be forced down uncleanly.
Note that services should always be stopped before an uninstall to guarantee a safe shutdown, but we use a specific order for uninstall as well, just in case this helps when people don't stop first.
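For reviewers who want the idea in isolation, here is a rough sketch of staged shutdown with wait groups. It is not the code in this PR: `stopInStages`, the `stop` callback, and the stage contents are hypothetical names for illustration, and the order shown is only an assumption (roughly the reverse of a start order in which gateway comes up before volume). Units in the same stage stop concurrently; `wg.Wait()` acts as a barrier so the next stage begins only once the previous one is fully down.

```go
package main

import (
	"fmt"
	"sync"
)

// stopInStages stops every unit in a stage concurrently and waits for the
// whole stage to finish before moving on to the next one. It returns the
// units in the order in which their stop calls completed.
// NOTE: hypothetical helper for illustration, not deisctl's actual API.
func stopInStages(stages [][]string, stop func(unit string)) []string {
	var (
		mu    sync.Mutex
		order []string
	)
	for _, stage := range stages {
		var wg sync.WaitGroup
		for _, unit := range stage {
			wg.Add(1)
			go func(u string) {
				defer wg.Done()
				stop(u)
				mu.Lock()
				order = append(order, u)
				mu.Unlock()
			}(unit)
		}
		wg.Wait() // barrier: the next stage starts only after this one is down
	}
	return order
}

func main() {
	// Assumed stop order, for illustration only.
	stages := [][]string{
		{"store-volume"},
		{"store-gateway"},
		{"store-metadata", "store-daemon"},
		{"store-monitor"},
	}
	order := stopInStages(stages, func(unit string) {
		fmt.Println("stopping", unit) // stand-in for the real stop request
	})
	fmt.Println("stopped:", order)
}
```

Units within one stage may finish in either order; only the ordering between stages is guaranteed.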
Unfortunately this doesn't help people who are upgrading from v0.14.x to v0.15.0 (as the fix will only be in upcoming releases of deisctl), but a note was added to the "upgrading Deis" instructions with a workaround to address the issue.
TESTING: I tested this PR by installing v0.15.0 of Deis, performing a `deisctl install platform && deisctl start platform` followed by `deisctl stop platform && deisctl uninstall platform`, and then again a `deisctl install platform && deisctl start platform`. Everything came back happily.

closes #2300
refs #2301, refs #2294, refs #2286