Skip to content
This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

Allow major Etcd upgrades with safe roll-back #1773

Conversation

davidmccormick
Copy link
Contributor

@davidmccormick davidmccormick commented Nov 13, 2019

Safer major etcd upgrades by spinning up new etcds and performing a migration (copy of all kubernetes data)

Similar to the approach used during the stack migration but this time the major/minor version of etcd is used to control migration e.g 3.2.x -> 3.3.x will cause a migration and the migration is handled within the same etcd stack. It is safer because should the CF roll fail the previous etcd's should still be available to roll-back to. It works by embedding the version into the etcd cloudformation logical names so that version upgrades generates a new set of etcds, e.g.

+00:00:58	CREATE_IN_PROGRESS      		Etcdv3dot3i2ENI
+00:00:58	CREATE_IN_PROGRESS      		Etcdv3dot3i2LC
+00:00:58	CREATE_IN_PROGRESS      		Etcdv3dot3i0LC
+00:00:58	CREATE_IN_PROGRESS      		Etcdv3dot3i1LC
+00:00:59	CREATE_IN_PROGRESS      		Etcdv3dot3i2ENI       	"Resource creation Initiated"
+00:00:59	CREATE_IN_PROGRESS      		Etcdv3dot3i1ENI
+00:00:59	CREATE_IN_PROGRESS      		Etcdv3dot3i2LC        	"Resource creation Initiated"
+00:00:59	CREATE_IN_PROGRESS      		Etcdv3dot3i0LC        	"Resource creation Initiated"
+00:00:59	CREATE_IN_PROGRESS      		Etcdv3dot3i1LC        	"Resource creation Initiated"
+00:00:59	CREATE_IN_PROGRESS      		Etcdv3dot3i0ENI
+00:00:59	UPDATE_IN_PROGRESS      		IAMRoleEtcd
+00:00:59	CREATE_IN_PROGRESS      		Etcdv3dot3i1ENI       	"Resource creation Initiated"
+00:00:59	CREATE_COMPLETE         		Etcdv3dot3i2LC
+00:00:59	CREATE_COMPLETE         		Etcdv3dot3i0LC
+00:00:59	CREATE_COMPLETE         		Etcdv3dot3i1LC
+00:00:59	CREATE_IN_PROGRESS      		Etcdv3dot3i0ENI       	"Resource creation Initiated"
+00:01:09	UPDATE_COMPLETE         		IAMRoleEtcd
+00:01:15	CREATE_COMPLETE         		Etcdv3dot3i2ENI
+00:01:15	CREATE_COMPLETE         		Etcdv3dot3i1ENI
+00:01:15	CREATE_COMPLETE         		Etcdv3dot3i0ENI
+00:01:17	CREATE_IN_PROGRESS      		Etcdv3dot3i2EBS
+00:01:17	CREATE_IN_PROGRESS      		Etcdv3dot3i2EBS       	"Resource creation Initiated"
+00:01:17	CREATE_IN_PROGRESS      		Etcdv3dot3i1EBS
+00:01:18	CREATE_IN_PROGRESS      		Etcdv3dot3i1EBS       	"Resource creation Initiated"
+00:01:18	CREATE_IN_PROGRESS      		Etcdv3dot3i0EBS
+00:01:18	CREATE_IN_PROGRESS      		Etcdv3dot3i0EBS       	"Resource creation Initiated"
+00:01:33	CREATE_COMPLETE         		Etcdv3dot3i2EBS
+00:01:34	CREATE_COMPLETE         		Etcdv3dot3i1EBS
+00:01:34	CREATE_COMPLETE         		Etcdv3dot3i0EBS
+00:01:35	CREATE_IN_PROGRESS      		Etcdv3dot3i2
+00:01:36	CREATE_IN_PROGRESS      		Etcdv3dot3i1
+00:01:36	CREATE_IN_PROGRESS      		Etcdv3dot3i2          	"Resource creation Initiated"
+00:01:36	CREATE_IN_PROGRESS      		Etcdv3dot3i0
+00:01:36	CREATE_IN_PROGRESS      		Etcdv3dot3i1          	"Resource creation Initiated"
+00:01:37	CREATE_IN_PROGRESS      		Etcdv3dot3i0          	"Resource creation Initiated"
+00:03:59	CREATE_IN_PROGRESS      		Etcdv3dot3i2          	"Received SUCCESS signal with UniqueId i-0797f12fe4b3d5840"
+00:04:00	CREATE_COMPLETE         		Etcdv3dot3i2
+00:04:00	CREATE_IN_PROGRESS      		Etcdv3dot3i0          	"Received SUCCESS signal with UniqueId i-09caf63ec80dd3f37"
+00:04:01	CREATE_COMPLETE         		Etcdv3dot3i0
+00:04:39	CREATE_IN_PROGRESS      		Etcdv3dot3i1          	"Received SUCCESS signal with UniqueId i-02b447cadb74b4265"
+00:04:40	CREATE_COMPLETE         		Etcdv3dot3i1

kube-aws uses a new tag on the etcds kube-aws:etcd_upgrade_group to look for etcds in the same upgrade group as the requested deployment; if the cluster exists (i.e. cluster etcd stack exists) but there are no matching etcd_upgrade_group instances then kube-aws will trigger the migration mode. In migration mode the leader of the new etcd cluster exports the kubernetes registry from the existing/old etcds and restores them into the new cluster.

Features

  • Brings all new etcds up at same time during a migration.
  • When an etcd has an attached NIC use that address rather than the machine's private dnsname
  • Update Etcd to 3.3.17 release
  • Fix etcdadm so that it can still detect cluster healthy now written to stderr
  • Update etcd migration to respect keys with leases
  • Fix building etcd endpoints where the interfaces are listed in different orders
  • Move to a two export process for retrieving keys and values using etcdctl 'json' export type for key/value data, and then again using its 'fields' export type in order to successfully extract key/lease data. * Process the two files back together with a nod to performance.

…igration (copy of all kubernetes data)

Simialr to the approach used during the stack migration but this time the major/minor version of etcd is used to control migration e.g 3.2.x -> 3.3.x will cause a migration.
It is safer because should the CF roll fail the previous etcd's should still be available to fall-back to.

Bring all new etcds up at same time during a migration.
Correct looking up of configsets now that the instance name has changed.
When an etcd has an attached NIC use that address rather than the machine's private dnsname

Update Etcd to 3.3.17 release
Fix etcdadm so that it can still detect cluster healthy now written to stderr

Update etcd migration to respect keys with leases

Fix building etcd endpoints where the interfaces are listed in different orders

Move to a two export process for retrieving keys and values using etcdctl 'json'  export type for key/value data, and then again using its 'fields' export type in order to successfully extract key/lease data.  Process the two files back together with a nod to performance.
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 13, 2019
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 13, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign davidmccormick
You can assign the PR to them by writing /assign @davidmccormick in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…ror if the default etcd version hadn't been correctly linked into the binary - which it isn't during testing).
@davidmccormick davidmccormick added this to the v0.15.0 milestone Nov 13, 2019
@codecov-io
Copy link

codecov-io commented Nov 13, 2019

Codecov Report

Merging #1773 into master will decrease coverage by 0.24%.
The diff coverage is 5.4%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1773      +/-   ##
==========================================
- Coverage   24.76%   24.52%   -0.25%     
==========================================
  Files          98       98              
  Lines        5023     5085      +62     
==========================================
+ Hits         1244     1247       +3     
- Misses       3640     3699      +59     
  Partials      139      139
Impacted Files Coverage Δ
pkg/api/etcd_cluster.go 18.51% <0%> (-7.8%) ⬇️
pkg/model/context.go 0% <0%> (ø) ⬆️
pkg/api/etcd.go 29.54% <0%> (-1.78%) ⬇️
pkg/model/etcd_node.go 6.95% <10%> (-0.26%) ⬇️
pkg/model/etcd_cluster.go 91.3% <100%> (+1.3%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 37d1ec7...8da1fcf. Read the comment docs.

Copy link
Contributor

@dominicgunn dominicgunn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of comments.

if state.EtcdMigrationEnabled {
logger.Warn("Performing a Major Etcd Version Upgrade: -")
logger.Warn("To do this we will spin up new etcd servers and then export the existing kubernetes state to them.")
logger.Warn("There will be cluster disruption until all of your existing controllers have rolled.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the window of disruption here? Are we talking service outages?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question - we are talking about a service outage of the kubernetes api of several minutes (good case) longer if a roll-back is triggered after api-servers have been rolled.

Values: []*string{aws.String("owned")},
},
{
Name: aws.String("tag:kube-aws:etcd_upgrade_group"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this backward compatible with older clusters? I guess it'd just return 0 instances right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes by returning 0 servers which match it knows it needs to perform a migration - so in effect even if we kept the version of etcd the same (3.2.26) it would still want to perform a migration the first time this code is merged and create Etcdv3dot2iX instances (removing the old Etcd[0-2] ones)


logger.Debugf("<- received %d instances from AWS", len(resp.Reservations))
if len(resp.Reservations) == 0 {
logger.Debugf("There are 0 instances matching major-minor version %s - this is an upgrade...", c.Etcd.Cluster.MajorMinorVersion())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this catches the backwards compatibility issues, cool.

logger.Warn("If the cloudformation update fails (at any point) then we will roll back to the original etcd servers.")
logger.Warn("You MAY lose/rollback changes that are made to the cluster AFTER the etcd export has been performed!")
logger.Warn("This operation is best scheduled for a quiet time or in an outage window.")
if state.EtcdMigrationExistingEndpoints, err = s.lookupExistingEtcdEndpoints(c); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to bomb out on lookupExistingEtcdEndpoints, should we try and do that before telling the user that we're Performing a Major Etcd Upgrade?

As a layman, I might be concerned that something may have changed before the thing blew up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea - I've swapped it around so we announce the migration only if the lookup is successful.

@dominicgunn
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 15, 2019
@davidmccormick davidmccormick merged commit ea799be into kubernetes-retired:master Nov 16, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants