
Upgrading Kubernetes #75

Open
guiocavalcanti opened this issue Sep 29, 2016 · 20 comments

@guiocavalcanti

Do you have plans on using k8s 1.4.0? If not, how can I upgrade my version?

@wellsie
Member

wellsie commented Sep 29, 2016

Yes. Will release update later today.


@owenmorgan

Will we be able to update an existing cluster?

@adambom

adambom commented Oct 1, 2016

@owenmorgan upgrading the k8s version requires deleting the etcd cluster, where all the kubernetes state is stored on ephemeral disk. I have a forked version of tack where etcd state is persisted on an EBS volume, and that works beautifully. Would anybody be interested in a PR to contribute that back to tack (@wellsie)? Let me know and I will clean up and submit.

In the meantime, you can use a simple workaround to recover from losing the etcd cluster. Before upgrading, you would run this code snippet.

This will allow you to recover all cluster state (including PV's). ELB's will be regenerated, so update any DNS records accordingly.
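A rough sketch of that kind of pre-upgrade backup (assuming kubectl access; the resource list and the k8s-backup directory name are illustrative, not the original snippet):

# Sketch only: dump API objects to YAML so they can be re-applied
# once the new etcd cluster is up.
mkdir -p k8s-backup
kubectl get pv -o yaml > k8s-backup/persistentvolumes.yaml
kubectl get namespaces -o yaml > k8s-backup/namespaces.yaml
for r in deployments services pvc configmaps secrets; do
  kubectl get "$r" --all-namespaces -o yaml > "k8s-backup/$r.yaml"
done
# After the new cluster is up: kubectl apply -f k8s-backup/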

@owenmorgan

Thanks @adambom. How are we looking on an update, @wellsie?

@adambom

adambom commented Oct 4, 2016

@owenmorgan looks like it was patched in 8f2a62e

@adambom

adambom commented Oct 4, 2016

Oh, one other thing you'll need to do when you upgrade is taint or manually update the S3 bucket, so that the files in manifests/etc.tar point to the version of k8s you want to use. Otherwise the update won't actually take.
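If you take the taint route, a rough sketch would be something like the following; the resource address is a guess, so check your Terraform state for the actual name of the manifests object:

# Hypothetical resource address; confirm it with: terraform state list
terraform taint aws_s3_bucket_object.etc
terraform plan
terraform apply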

@owenmorgan

Great, I'll give it a shot. Thanks @wellsie @adambom

@owenmorgan

Is the backup/restore still necessary, @adambom?

@wellsie
Member

wellsie commented Oct 4, 2016

I recommend upgrading the cluster manually. I will write up the procedure later this week; in the meantime here is the basic process:

update kubelet.service on worker nodes

  • ssh into each node and update KUBELET_VERSION in /etc/systemd/system/kubelet.service

make instances (new with #77) will dump the IPs of all nodes, masters (etcd, apiserver) and workers. Do make ssh-bastion and then from there ssh into each box one at a time.
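For the edit itself, something along these lines should work on each worker (a sketch; the exact KUBELET_VERSION value in the unit file may differ, and the target tag below is just an example):

# Sketch: bump the kubelet version in the unit file and restart the kubelet.
sudo sed -i 's/KUBELET_VERSION=v[^ "]*/KUBELET_VERSION=v1.4.0_coreos.0/' /etc/systemd/system/kubelet.service
sudo systemctl daemon-reload
sudo systemctl restart kubelet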

update kubelet.service on etcd/apiserver nodes

repeat the above procedure for the master (etcd,apiserver) nodes.

update version in kubernetes manifests on etcd/apiserver nodes

grep 1.4 /etc/kubernetes/manifests/*
/etc/kubernetes/manifests/kube-apiserver.yml:    image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
/etc/kubernetes/manifests/kube-controller-manager.yml:    image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
/etc/kubernetes/manifests/kube-proxy.yml:    image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
/etc/kubernetes/manifests/kube-scheduler.yml:    image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
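A quick way to bump those on each master (a sketch; replace the target tag with whatever release you are upgrading to). The kubelet watches this directory and restarts the static pods when the files change:

# Sketch: rewrite the hyperkube image tag in every static manifest.
sudo sed -i 's/hyperkube:v[^ ]*/hyperkube:v1.4.1_coreos.0/' /etc/kubernetes/manifests/*.yml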

I'm looking into ways to automate this. It hasn't been a priority since the procedure is fairly straightforward. Note that running pods should continue to run during this procedure.

@wellsie wellsie self-assigned this Oct 12, 2016
@nkhine
Contributor

nkhine commented Oct 21, 2016

adambom added a commit to adambom/tack that referenced this issue Oct 24, 2016
If you ever lose your etcd cluster for whatever reason, or if you should ever need to restart it, you should be able to recover your state. Mentioned in this issue: kz8s#75
@rimusz

rimusz commented Oct 26, 2016

@wellsie any update on the automated Kubernetes upgrade? It is fine to do those ^^^ commands manually if you have a small cluster, but with a big one it would be a headache :)

@rimusz

rimusz commented Oct 26, 2016

OK, I have checked: if you update /etc/systemd/system/kubelet.service with the newer k8s version, the change does not survive a reboot. :(

@yagonobre
Contributor

@rimusz, that is because tack uses user-data, which runs every time the machine powers up.
You can stop the instance, edit the version in the user-data, and then start the instance.

I replaced user-data with cloud-init in my environment; if everything works fine I will submit a PR.
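For reference, a sketch of that flow with the AWS CLI (the instance ID is a placeholder; repeat per node, one at a time):

aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
# user-data comes back base64-encoded; decode it to find the k8s version
aws ec2 describe-instance-attribute --instance-id i-0123456789abcdef0 \
  --attribute userData --query 'UserData.Value' --output text | base64 --decode
# Edit the version, re-upload it (via the console, or with
# aws ec2 modify-instance-attribute; check its docs for the expected encoding),
# then start the instance again:
aws ec2 start-instances --instance-ids i-0123456789abcdef0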

@yagonobre
Contributor

You can use this procedure:

Update worker nodes

  1. Create a new launch configuration: you can clone the existing LC and edit the Kubernetes version in the user-data (there are 2 occurrences).
  2. Terminate all instances and create new ones with the new LC (be sure that you have no persistent volumes, e.g. databases, and that your pods are replicated).
    1. Detach the instances one at a time from the ASG, check the box to create a replacement instance, confirm with kubectl get nodes that the new node is running, then terminate the node you detached from the ASG (see the CLI sketch after this list).
    2. Do this for all nodes.
  • Updating the user-data for each instance is an alternative.
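A sketch of the detach step with the AWS CLI (the ASG name and instance ID are placeholders):

# Detach one worker and let the ASG launch a replacement.
aws autoscaling detach-instances \
  --auto-scaling-group-name my-worker-asg \
  --instance-ids i-0123456789abcdef0 \
  --no-should-decrement-desired-capacity
# Wait until the replacement shows up in kubectl get nodes, then:
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0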

Update master nodes

  1. Update the Kubernetes manifests in the S3 bucket
    1. Download the tar file

      aws s3 cp s3://[BUCKET-URL]/manifests/etc.tar .
      tar -xvf etc.tar
      
    2. Edit the k8s version in all files

      grep 1.4 *.yml
      kube-apiserver.yml:    image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
      kube-controller-manager.yml:    image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
      kube-proxy.yml:    image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
      kube-scheduler.yml:    image: quay.io/coreos/hyperkube:v1.4.0_coreos.0
      
    3. Compress and upload the file back to S3

      tar -cvf etc.tar *.yml
      aws s3 cp etc.tar s3://[BUCKET-URL]/manifests/etc.tar
      
  2. Update user-data for each node
    1. You need to stop the instances one at a time to edit the k8s version in the user-data (be sure not to stop more than one instance at a time).
    2. Start the instance.
    3. Check the health of the etcd cluster with etcdctl cluster-health; if all nodes are healthy, move on to the next instance (see the quick check below).
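As a quick sanity check between instances (a sketch; run it from a master node or anywhere with access to the etcd endpoints):

etcdctl cluster-health
kubectl get nodes
kubectl get componentstatuses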

@wellsie, please validate this.

@rimusz

rimusz commented Nov 7, 2016

@yagonobre thanks for your solution. It looks good, but it involves way too much manual fiddling, especially with the user-data for each instance; far too much hassle for production clusters.
I found that using global fleet units for the k8s services is a much better way to do k8s upgrades.

@rokka-n

rokka-n commented Jan 14, 2017

Why is it not possible to replace an etcd node and let it re-sync with the cluster?

@yagonobre
Contributor

@rokka-n That is what I do.

@fearphage

Are you open to incorporating automated Kubernetes upgrades? If not, is the purpose of this project a one-time setup, after which you don't need the project anymore?

@wellsie
Member

wellsie commented Feb 3, 2017 via email

@fearphage

Yes open to automated upgrades

Excellent! However, without automated upgrades, is this intended to be a single-use project?
