Upgrade Kubernetes Cluster

For a cluster built with AWS, Pentagon, and kops

Update the Kubernetes version and AMI

In cluster.yaml, update the Kubernetes version.
In nodes.yaml and masters.yaml, update the AMI image ID.
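
Before replacing anything, it can help to print the fields this step changes. The following is a minimal sketch, assuming the manifests use the standard kops field names `kubernetesVersion` (Cluster spec) and `image` (InstanceGroup spec); the function name `show_upgrade_fields` is made up for illustration.

```shell
# Print the fields this step changes so they can be checked by eye.
# Assumes cluster.yaml, nodes.yaml and masters.yaml live in the directory
# passed as $1 (defaults to the current directory).
show_upgrade_fields() {
  dir="${1:-.}"
  # Kubernetes version from the Cluster spec
  grep -H 'kubernetesVersion:' "$dir/cluster.yaml"
  # AMI reference from each InstanceGroup spec
  grep -H '^ *image:' "$dir/nodes.yaml" "$dir/masters.yaml"
}
```

Running `show_upgrade_fields .` from the directory that holds the manifests should print one line per file.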

Replace those files using kops

Look at the kops.sh script. Make sure the verb replace is present, and comment out the secrets script if you don't need it.
./kops.sh
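
The contents of kops.sh are not shown here, so the following is only a guess at what its replace step might look like. It assumes `KOPS_STATE_STORE` is already exported, and the function name `replace_manifests` is hypothetical.

```shell
# Hypothetical sketch of the replace step; the real kops.sh may differ.
# `kops replace -f FILE` overwrites the stored spec with the file's contents.
replace_manifests() {
  for f in "$@"; do
    kops replace -f "$f" || return 1
  done
}
# Example: replace_manifests cluster.yaml nodes.yaml masters.yaml
```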

Update the launch configuration

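Run kops update cluster. Without --yes the command only previews the changes; add --yes to apply them.
This updates the launch configuration, so any new nodes that launch will use the updated Kubernetes version and AMI. Existing nodes are not replaced by this step.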
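
As a sketch of the preview-then-apply flow, assuming the cluster name is passed explicitly with `--name` (the function names and the example name are placeholders, not part of this setup):

```shell
# Show what kops would change, without changing anything.
preview_update() {
  kops update cluster --name "$1"
}

# Apply the changes to the cloud resources (launch configuration included).
apply_update() {
  kops update cluster --name "$1" --yes
}
# Example (hypothetical cluster name):
#   preview_update k8s.example.com && apply_update k8s.example.com
```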

Upgrade just the masters

Run kops rolling-update cluster (without --yes) to list every instance group and whether it needs an update.
Take note of the name of each master instance group.
Run kops rolling-update cluster again, passing the --instance-group flag once for each master instance group, like this:
- kops rolling-update cluster --instance-group master-us-east-1a --instance-group master-us-east-1b --instance-group master-us-east-1c
Add --yes to actually perform the roll; without it, kops only prints what it would do.
This takes a while because the masters are terminated and replaced one at a time.
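
The master roll can be wrapped in a small helper. This is a sketch under the assumption that instance group names contain no spaces; `roll_instance_groups` is an illustrative name, and the `kops validate cluster` call at the end is an extra check, not part of the steps above.

```shell
# Roll only the named instance groups, then validate the cluster.
# Usage: roll_instance_groups CLUSTER_NAME IG [IG...]
roll_instance_groups() {
  name="$1"; shift
  flags=""
  for ig in "$@"; do
    flags="$flags --instance-group $ig"
  done
  # $flags is intentionally unquoted so it splits into separate arguments.
  # Drop --yes to preview the roll instead of performing it.
  kops rolling-update cluster --name "$name" $flags --yes || return 1
  kops validate cluster --name "$name"
}
```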

Bestest Cluster Upgrade

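Double the cluster size by raising minSize and maxSize on each node instance group

Edit each kops node instance group, for example:
- kops edit ig nodes-us-east-1a
Then apply the change with kops update cluster --yes and wait for the new nodes to join.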
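
To see which numbers to type into the editor, the current sizes can be read back and doubled. A minimal sketch, assuming `kops get ig NAME -o yaml` resolves the cluster the same way `kops edit ig` does in this setup; `doubled_sizes` is a made-up helper and it changes nothing.

```shell
# Print the current and doubled sizes for one instance group.
doubled_sizes() {
  kops get ig "$1" -o yaml | awk '
    $1 == "minSize:" { printf "minSize: %d -> %d\n", $2, $2 * 2 }
    $1 == "maxSize:" { printf "maxSize: %d -> %d\n", $2, $2 * 2 }
  '
}
# Example: doubled_sizes nodes-us-east-1a
```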

If you are using ELBs, take a moment here to validate that your new nodes are healthy in the ELB.
Check your load balancer services and change the externalTrafficPolicy from Local to Cluster. Here's an example of how to find and edit those services:
- kubectl get svc --all-namespaces | grep LoadBalancer
- kubectl -n infra edit svc nginx-ingress-external-controller
- kubectl -n infra edit svc nginx-ingress-internal-controller
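
If you would rather not open an editor, the same field can be set with `kubectl patch`. This is a sketch rather than the procedure used here; `set_traffic_policy` is a hypothetical helper name.

```shell
# Set externalTrafficPolicy on one service without opening an editor.
# Usage: set_traffic_policy NAMESPACE SERVICE Local|Cluster
set_traffic_policy() {
  kubectl -n "$1" patch svc "$2" \
    -p "{\"spec\":{\"externalTrafficPolicy\":\"$3\"}}"
}
# Example: set_traffic_policy infra nginx-ingress-external-controller Cluster
```

The same helper with `Local` as the last argument reverts the change after the upgrade.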
Cordon the old nodes
- kubectl get nodes | grep <old version> | awk '{print $1}' | xargs kubectl cordon
Drain the old nodes (on newer kubectl versions, --delete-local-data has been renamed to --delete-emptydir-data)
- kubectl get nodes | grep SchedulingDisabled | awk '{print $1}' | xargs kubectl drain --ignore-daemonsets --delete-local-data --force
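
The cordon and drain steps can also be scripted so that nodes are matched on the VERSION column exactly instead of with grep. A sketch, assuming the default `kubectl get nodes` column layout (NAME, STATUS, ROLES, AGE, VERSION); both function names are illustrative.

```shell
# Print the names of nodes whose VERSION column equals $1.
old_nodes() {
  kubectl get nodes --no-headers | awk -v v="$1" '$5 == v { print $1 }'
}

# Cordon every old node first, then drain them one at a time.
retire_old_nodes() {
  nodes="$(old_nodes "$1")"
  [ -n "$nodes" ] || { echo "no nodes at version $1"; return 0; }
  for n in $nodes; do
    kubectl cordon "$n" || return 1
  done
  for n in $nodes; do
    # Same flags as the drain command above
    kubectl drain "$n" --ignore-daemonsets --delete-local-data --force || return 1
  done
}
```

Cordoning all old nodes before draining any of them keeps evicted pods from landing on another old node.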
Revert the temporary changes
- externalTrafficPolicy: back to Local
- minSize and maxSize: back to their original values with kops edit ig

Note: instead of turning off the cluster-autoscaler during the upgrade, you can raise the minimum node count on the ASG. The autoscaler respects that minimum (especially since we are using automatic node-group detection), so it won't try to delete any nodes during the upgrade as long as the minimum is the doubled cluster size.
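
One way to confirm the minimum is in place before draining is to read it back with the AWS CLI. A sketch only: the ASG name must be supplied by you, and both helper names are hypothetical.

```shell
# Print the minimum size currently set on one Auto Scaling group.
asg_min_size() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query 'AutoScalingGroups[0].MinSize' --output text
}

# Succeed only if the ASG minimum is at least the expected doubled size.
# Usage: min_is_at_least ASG_NAME EXPECTED_MIN
min_is_at_least() {
  [ "$(asg_min_size "$1")" -ge "$2" ]
}
```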

Vocabulary

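A rolling update means that the deployment replaces pods one at a time (or at whatever cadence is configured), so there is no downtime.
Cordoning a node: marks the node unschedulable, so no new pods will be scheduled on it; pods already running there are left alone.
Draining a node: cordons the node and evicts the pods running on it (DaemonSet pods are skipped when --ignore-daemonsets is passed).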
