Rolling node upgrade #7938
Conversation
+cc @roberthbailey
@@ -102,6 +100,10 @@ function wait-for-master() {
function prepare-upgrade() {
  ensure-temp-dir
  detect-project
  # TODO(mbforbes): Do we still need get-password?
  get-password
  # TODO(mbforbes): Do we need get-bearer-token here?
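For context on the TODOs above, here is a rough sketch of what get-password and get-bearer-token style helpers do; the bodies below are illustrative guesses, not the actual cluster/gce implementations (which may reuse credentials from an existing kubeconfig rather than regenerating them):

```bash
# Illustrative only: hypothetical stand-ins for the real helpers referenced
# in the TODOs above.
function get-password() {
  # Assumed behavior: pick a user and generate a random password.
  KUBE_USER="admin"
  KUBE_PASSWORD="$(head -c 32 /dev/urandom | base64 | tr -dc 'a-zA-Z0-9' | head -c 16)"
}

function get-bearer-token() {
  # Assumed behavior: generate a random token for API access.
  KUBE_BEARER_TOKEN="$(head -c 128 /dev/urandom | base64 | tr -d '=+/' | head -c 32)"
}
```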
I don't see why you'd need the kubectl (client) bearer token to upgrade nodes.
KUBE_BEARER_TOKEN is required by build-kube-env:
https://github.com/GoogleCloudPlatform/kubernetes/blob/9b2a0df811db637442daf93bcffefacaedea6f2a/cluster/gce/debian/helper.sh#L46

However, this comment in configure-vm.sh:
https://github.com/GoogleCloudPlatform/kubernetes/blob/9b2a0df811db637442daf93bcffefacaedea6f2a/cluster/gce/configure-vm.sh#L256

says that we don't need these data (KUBE_PASSWORD, KUBE_USER, KUBE_BEARER_TOKEN, KUBELET_TOKEN, KUBE_PROXY_TOKEN) except on init, and specifically not on an upgrade. However, it might be assuming that an upgrade happens in place and retains the data written at cluster init; with the upgrade process introduced in this PR, that isn't true, because we blow away the node.

The call to the function discussed in the previous paragraph, create-salt-auth, happens for both master and nodes:
https://github.com/GoogleCloudPlatform/kubernetes/blob/9b2a0df811db637442daf93bcffefacaedea6f2a/cluster/gce/configure-vm.sh#L456

so I don't know whether we can just remove these vars. @zmerlynn should probably chime in here.
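To make the distinction concrete, here is a rough sketch, under my own assumptions, of the kind of check being debated: strict on initial bring-up, relaxed on upgrade. The function name, flag, and YAML keys are illustrative; this is not the real build-kube-env.

```bash
# Hypothetical sketch, not the actual build-kube-env: require the auth vars
# only on initial cluster bring-up, and tolerate empty values on an upgrade,
# where the node is expected to re-read the real tokens from existing metadata.
function build-kube-env-sketch() {
  local is_initial_boot="$1"   # assumed flag: "true" on kube-up, "false" on upgrade

  if [[ "${is_initial_boot}" == "true" ]]; then
    # On init these must all be present, or we fail fast.
    for var in KUBE_USER KUBE_PASSWORD KUBE_BEARER_TOKEN KUBELET_TOKEN KUBE_PROXY_TOKEN; do
      [[ -n "${!var:-}" ]] || { echo "missing required ${var}" >&2; return 1; }
    done
  fi

  # Emit whatever we have; empty values are tolerated on upgrade.
  cat <<EOF
KUBE_USER: '${KUBE_USER:-}'
KUBE_PASSWORD: '${KUBE_PASSWORD:-}'
KUBE_BEARER_TOKEN: '${KUBE_BEARER_TOKEN:-}'
KUBELET_TOKEN: '${KUBELET_TOKEN:-}'
KUBE_PROXY_TOKEN: '${KUBE_PROXY_TOKEN:-}'
EOF
}
```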
We shouldn't need auth data when updating the master, since the master stores the auth data on a PD which survives rebuilding the VM (and if we lose that data PD we are pretty well screwed anyway since it contains the cluster state).
For the nodes, you are already scraping the kubelet token and kube proxy token from the existing metadata, so that can be used to fill in those two fields.
Unless I'm reading this incorrectly, this is within upgrade.sh so it shouldn't affect cluster provisioning to remove these here.
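As a hedged illustration of what "scraping the kubelet token and kube proxy token from the existing metadata" could look like: the node name, zone, kube-env metadata key, and the grep/cut parsing below are assumptions, not necessarily how this PR does it.

```bash
# Illustration only: read the instance's kube-env metadata (key name assumed)
# via the GCE metadata server from the node itself, then pick out the two
# token lines we want to preserve across the rebuild.
NODE="kubernetes-minion-abcd"   # hypothetical node name
ZONE="us-central1-b"            # hypothetical zone

kube_env="$(gcloud compute ssh "${NODE}" --zone "${ZONE}" --command \
  "curl -s -H 'Metadata-Flavor: Google' http://metadata.google.internal/computeMetadata/v1/instance/attributes/kube-env")"

# kube-env is a list of KEY: 'value' lines; extract the values.
KUBELET_TOKEN="$(echo "${kube_env}" | grep '^KUBELET_TOKEN' | cut -d"'" -f2)"
KUBE_PROXY_TOKEN="$(echo "${kube_env}" | grep '^KUBE_PROXY_TOKEN' | cut -d"'" -f2)"
```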
I've moved this to the master-only section since #8009 has been merged.
Can you explain what happens during the rolling update? Do the new VMs have different names than the original VMs? Does a new VM get created (and healthchecked in any way) before deleting an existing VM? How does the rolling update command know when to start updating the second VM?
@roberthbailey these answers are from my current understanding of MIG rolling-updates.

Do the new VMs have different names than the original VMs? Does a new VM get created (and healthchecked in any way) before deleting an existing VM?

A rolling-update ensures that the state of the new VM is "running," but it is not healthchecked in any other way, it does not support pluggable healthchecking, and it will not try to repair a VM if it is not "running"; it will simply pause the rolling-update.

How does the rolling update command know when to start updating the second VM?

It proceeds to the second as soon as the first is finished. It determines that the first instance is finished as soon as it is "running". It will wait up to …
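A minimal sketch of the "finished as soon as it is running" check described above; the instance name, zone, and polling interval are assumptions, not the rolling-update implementation itself.

```bash
# Poll an instance's status until it reports RUNNING, which is the only
# condition the rolling-update waits for before moving on (per the answer above).
INSTANCE="kubernetes-minion-abcd"   # hypothetical
ZONE="us-central1-b"                # hypothetical

until [[ "$(gcloud compute instances describe "${INSTANCE}" --zone "${ZONE}" \
            --format='value(status)')" == "RUNNING" ]]; do
  echo "waiting for ${INSTANCE} to reach RUNNING..."
  sleep 5
done
echo "${INSTANCE} is RUNNING; the rolling-update would move on to the next VM."
```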
LGTM
I think this might actually be ready to get in as a draft. I've addressed @roberthbailey's shell comments and pushed my newer version. There are still a handful of cleanup TODOs, but the core of what we care about is in. This also fixes master upgrade, which has drifted. @roberthbailey I've updated my comment response above that answers the questions you posed; let me know if it's adequate. @zmerlynn LGTY?
Well, that was fast.
Fixes #6088
After doing this and waiting a bit, kubectl get nodes reports nodes as Unknown and then back to Ready. Also, kubectl get pods reports the pods on the minions as Pending and then eventually Running.

I have some confidence this might actually be the true state of the world, because I screwed up the kube/kube-proxy tokens in an earlier implementation, and neither of the above were true.
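A rough way to watch the behavior described above after kicking off the upgrade; the loop and sleep interval are just an illustration, not part of the PR.

```bash
# Periodically check node and pod status while the rolling upgrade runs:
# nodes go Ready -> Unknown -> Ready, pods on replaced minions go Pending -> Running.
watch_upgrade() {
  for i in $(seq 1 30); do
    echo "--- check ${i} ---"
    kubectl get nodes
    kubectl get pods
    sleep 20
  done
}
```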
BTW, the validate-cluster step doesn't actually do what's intended right now, I think, because it returned successfully well before my cluster was working again.
Comments / feedback welcome. And yes, definitely want to bake this into a(n e2e) test ASAP.