Upgrade: Best effort drain when upgrading single node cluster #3447
Conversation
Since we always run (even during upgrades) with Salt on Python 3 and a recent version of Jinja, we can use `loop.previtem` to get the previous item of the loop and use it as a requisite for the Salt state.
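As a minimal sketch of the idea (the container list and the `crictl` command are illustrative, not the actual MetalK8s states), `loop.previtem` lets loop-generated states depend on the state generated for the previous item:

```sls
{#- Sketch: each generated state requires the one generated for the
    previous item of the loop. `loop.previtem` is only defined from the
    second iteration onwards, hence the `loop.first` guard. -#}
{%- set containers = ['etcd', 'kube-apiserver', 'kube-scheduler'] %}
{%- for container in containers %}
Check {{ container }} is running:
  cmd.run:
    - name: crictl ps --name {{ container }} --state running --quiet
{%- if not loop.first %}
    - require:
      - cmd: Check {{ loop.previtem }} is running
{%- endif %}
{%- endfor %}
```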
Even if the kube-apiserver container is ready, the apiserver may not actually be available for various reasons (e.g. etcd not yet fully ready). To prevent a failure in the next upgrade step, make sure the apiserver is healthy by querying it directly. Also wait for the etcd container before the apiserver, since the apiserver cannot work without etcd.
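A sketch of that ordering (the state IDs, the `metalk8s` grain and the endpoint are assumptions, not the exact states from this PR): wait for the etcd container, then poll the apiserver until it actually answers on `/healthz`:

```sls
Wait for etcd container to be running:
  cmd.run:
    - name: crictl ps --name etcd --state running --quiet

Make sure kube-apiserver is actually healthy:
  http.wait_for_successful_query:
    - name: https://{{ grains['metalk8s']['control_plane_ip'] }}:6443/healthz
    - status: 200
    - verify_ssl: False
    - require:
      - cmd: Wait for etcd container to be running
```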
By default with the kubeadm config, kube-apiserver binds on all addresses (`0.0.0.0`) even though it only needs to be reachable on a single one. This commit binds the apiserver to the Control Plane IP (which is the advertise IP), so the apiserver is only reachable on the Control Plane IP and no longer on `127.0.0.1:6443`.
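The change can be pictured as a kubeadm-style configuration extract like the following (the Jinja variable is an illustrative assumption, not the exact template from this PR):

```yaml
# Illustrative ClusterConfiguration extract: bind the apiserver to the
# Control Plane IP (the advertise address) instead of 0.0.0.0.
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  extraArgs:
    bind-address: {{ grains['metalk8s']['control_plane_ip'] }}
```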
The Salt drain module uses the MetalK8s Kubernetes Salt module to retrieve objects, and it tries to retrieve the controller of each Pod. It may happen that the controller kind is not known by the Kubernetes Salt module, so we are unable to get the controller object. Instead of raising an ugly traceback in this case, just consider that we do not manage the controller: if `force=True`, the Pod is simply evicted like any other "classic" Pod (see the sketch after the next paragraph).
In some cases we may want to do a "best effort" drain only, so that we do not hang if a specific Pod cannot be evicted for whatever reason.
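A hypothetical invocation sketch covering the two previous points (the module name, function name and keyword arguments are made up for illustration and are not the actual MetalK8s API):

```sls
Best effort drain of the node before upgrade:
  module.run:
    - name: metalk8s_kubernetes.node_drain   # hypothetical function name
    - kwargs:
        node_name: {{ node_name }}           # illustrative Jinja variable
        # force: also evict Pods whose controller is unknown/unmanaged
        force: True
        # best_effort: do not fail or hang if some Pods cannot be evicted
        best_effort: True
```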
When running a single node cluster, we want to drain the node during the upgrade so that as few Pods as possible are still running on it. Fixes: #3445
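In orchestration terms this can be sketched as follows (the pillar keys and the included SLS name are illustrative assumptions): only trigger the best-effort drain when the cluster has a single node, then let the regular upgrade steps run as usual:

```sls
{%- set nodes = pillar.get('metalk8s', {}).get('nodes', {}) %}
{%- if nodes | length == 1 %}
# Single node cluster: best effort drain before upgrading the node.
include:
  - .drain-node-best-effort
{%- endif %}
```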
Hello teddyandrieux,

My role is to assist you with the merge of this pull request. Status report is not available.

Waiting for approval

The following approvals are needed before I can proceed with the merge:

Peer approvals must include at least 1 approval from the following list:
lgtm
/approve
In the queue

The changeset has received all authorizations and has been added to the relevant queue(s). The changeset will be merged in:

The following branches will NOT be impacted:

There is no action required on your side. You will be notified here once the changeset has been merged.

IMPORTANT: Please do not attempt to modify this pull request.

If you need this pull request to be removed from the queue, please contact a member of the admin team.

The following options are set: approve
I have successfully merged the changeset of this pull request.

The following branches have NOT changed:

Please check the status of the associated issue None.

Goodbye teddyandrieux.
Component:
'lifecycle'
Context:
See: #3445
Summary:
When upgrading a cluster with a single control plane node, first do a "best effort" drain that evicts every Pod it can, but does not retry if some Pods cannot be evicted; instead, it continues with the "classic" upgrade process.
NOTE: We do not explicitly uncordon the node at the end, as it's automatically handled by the "deploy_node" orchestrate
Fixes: #3445