Added user doc for GCE HA master #1810

jszczepkowski · 2016-11-29T12:30:15Z

Added user doc for GCE HA master.

jszczepkowski · 2016-11-29T12:47:10Z

Part of kubernetes/enhancements#48

jszczepkowski · 2016-11-29T13:33:38Z

CC @kubernetes/sig-cluster-lifecycle @kubernetes/sig-cluster-ops

roberthbailey · 2016-11-29T20:56:25Z

docs/admin/ha-master-gce.md

+
+The sample command to set up the HA-compatible cluster:
+
+```


If you mark this as ```shell does it format nicer?

roberthbailey · 2016-11-29T20:56:39Z

docs/admin/ha-master-gce.md

+$ MULTIZONE=true KUBE_GCE_ZONE=europe-west1-b  ENABLE_ETCD_QUORUM_READS=true ./cluster/kube-up.sh
+```
+
+Please note that execution of the comments above will create a cluster with one master,


s/comments/commands

roberthbailey · 2016-11-29T20:57:28Z

docs/admin/ha-master-gce.md

+master.
+
+* `KUBE_GCE_ZONE=zone` - zone where the master replica will run.
+Should be in the same region as other replicas' zones.


Does the script enforce this? Or is it just recommended?

We don't support different regions. I'll add a check to the kube-up script.
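
For illustration only, a same-region check along the lines discussed here might look like the sketch below. It is not the check that was later added to kube-up; the function names `region_of` and `same_region` are hypothetical, and the sketch assumes GCE zone names of the form `<region>-<letter>` (for example `europe-west1-b`).

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not the actual kube-up.sh check.
# Assumes GCE zone names have the form <region>-<letter>, e.g. europe-west1-b.

# Strip the trailing "-<letter>" suffix to obtain the region name.
region_of() {
  echo "${1%-*}"
}

# Succeed only if both zones belong to the same region.
same_region() {
  [ "$(region_of "$1")" = "$(region_of "$2")" ]
}

if same_region "europe-west1-b" "europe-west1-c"; then
  echo "same region"
fi
if ! same_region "europe-west1-b" "us-central1-a"; then
  echo "different regions"
fi
```

A script could call `same_region` with the new replica's `KUBE_GCE_ZONE` and the zone of an existing replica, and abort when the call fails.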

roberthbailey · 2016-11-29T20:59:05Z

docs/admin/ha-master-gce.md

+
+### Deployment best practices
+
+* Try to place master replicas in different zones. During a zone failure, all master placed inside the zone will fail.


s/master/masters

roberthbailey · 2016-11-29T20:59:15Z

docs/admin/ha-master-gce.md

+### Deployment best practices
+
+* Try to place master replicas in different zones. During a zone failure, all master placed inside the zone will fail.
+To prevent zone failure, also place nodes in multiple zones


roberthbailey · 2016-11-29T21:00:39Z

docs/admin/ha-master-gce.md

+So, both replicas are needed and a failure of any replica turns cluster into majority failure state.
+Such two replica setup is worse in terms of HA than a single replica setup.
+
+* During addition of a master replica, cluster state (etcd) is copied to a new instance.


Is there a way to track the current status of the data migration to see when it completes?

I don't know of any easy way to track it. Maybe it is somehow reflected in the etcd logs.
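
As background for the two-replica warning quoted in this hunk, here is a minimal sketch of the majority arithmetic. It assumes the usual majority-quorum rule, floor(n/2) + 1, for the etcd cluster backing the masters. The function names `quorum` and `tolerated_failures` are illustrative and are not part of kube-up.

```shell
#!/usr/bin/env bash
# Illustrative arithmetic only; assumes a majority quorum of floor(n/2) + 1.

# Number of replicas that must be running to change persistent state.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of replicas that can fail while a quorum is still available.
tolerated_failures() {
  echo $(( $1 - $(quorum "$1") ))
}

for n in 1 2 3; do
  echo "replicas=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With two replicas the quorum is two, so no failure is tolerated, the same as with one replica. The difference is that two instances can now fail instead of one, which is why the doc treats the two-replica setup as worse for HA than a single replica.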

roberthbailey · 2016-11-29T21:01:42Z

docs/admin/ha-master-gce.md

+
+When starting the second master replica, a load balancer containing the two replicas will be created
+and the IP address of the first replica will be promoted to IP address of load balancer.
+Similarly, when after removal of a master replica, only one replica will remain,


"... after removal of the penultimate master replica, the load balancer..."

roberthbailey · 2016-11-29T21:08:33Z

docs/admin/ha-master-gce.md

+
+* Master certificates
+
+Master TLS certificates will be generated for the external public IP (of the load balancer) and local IP of each replica.


Remove "of the load balancer". I think the external IP is described sufficiently well above.

roberthbailey · 2016-11-29T21:09:11Z

docs/admin/ha-master-gce.md

+
+Similarly, the external IP will be used by kubelets to communicate with master.
+
+* Master certificates


I think this should be a sub-heading instead of a bullet point.

roberthbailey · 2016-11-29T21:09:42Z

docs/admin/ha-master-gce.md

+There will be no certs for ephemeral public IP of replicas.
+So, accessing them using ephemeral public IP will be possible only when skipping TLS verification.
+
+* Clustering etcd


This should be a heading too.

davidopp · 2016-11-29T23:49:57Z

Should you add a pointer from
http://kubernetes.io/docs/admin/high-availability/
to this doc?

jszczepkowski · 2016-11-30T07:34:07Z

Should you add a pointer from
http://kubernetes.io/docs/admin/high-availability/
to this doc?

Yes, but I plan to update docs/admin/high-availability in another PR

jszczepkowski · 2016-11-30T10:04:09Z

Comments applied, PTAL

roberthbailey

A couple of minor points, after which you can self-apply the lgtm label.

roberthbailey · 2016-11-30T18:35:08Z

docs/admin/ha-master-gce.md

+To allow etcd clustering, ports needed to communicate between etcd instances will be opened (for inside cluster communication).
+To make such deployment secure, communication between etcd instances is authorized using SSL.
+
+## Future reading


s/Future/Additional

roberthbailey · 2016-11-30T18:35:56Z

docs/admin/ha-master-gce.md

+* `KUBE_GCE_ZONE=zone` - zone where the master replica will run.
+Must be in the same region as other replicas' zones.
+
+* you don't need to set `MULTIZONE` or `ENABLE_ETCD_QUORUM_READS` flags as they values will be inherited from already running clusters


This can be a paragraph instead of a bullet point, since it isn't a flag to set but rather guidance about flags not to set.

jszczepkowski · 2016-12-01T09:51:41Z

A couple of minor points, after which you can self-apply the lgtm label.

Applying LGTM

roberthbailey · 2016-12-01T16:01:47Z

I took off the lgtm because I'm not sure if we need a docs lgtm in addition to a technical lgtm (which is what the spreadsheet implies). We should clarify that and then get this merged.

devin-donnelly · 2016-12-01T19:04:47Z

We need both Tech LGTM and Docs LGTM. I usually interpret the regular "lgtm" label as Tech LGTM. Doing docs review now.

roberthbailey · 2016-12-01T19:11:56Z

thanks @devin-donnelly. i've added back the lgtm label.

devin-donnelly · 2016-12-01T19:05:58Z

docs/admin/ha-master-gce.md

+
+## Introduction
+
+In kubernetes version 1.5, we added alpha support for replication of kubernetes masters in kube-up/down scripts for GCE.


Avoid "we" constructs.

Suggested rephrase: "Kubernetes version 1.5 adds alpha support for replicating Kubernetes masters in kube-up or kube-down scripts for Google Compute Engine."

devin-donnelly · 2016-12-01T19:06:27Z

docs/admin/ha-master-gce.md

+## Introduction
+
+In kubernetes version 1.5, we added alpha support for replication of kubernetes masters in kube-up/down scripts for GCE.
+This document describes how to use kube-up/down scripts to manage highly available (HA) masters and how HA masters are implemented for GCE case.


"implemented for GCE case" -> "implemented for use with GCE."

devin-donnelly · 2016-12-01T19:07:42Z

docs/admin/ha-master-gce.md

+
+## Running HA cluster on GCE
+
+### Starting HA-compatible cluster


Avoid nesting headings directly beneath one another; that's usually indicative of structural problems.

In this case, "Running HA Cluster on GCE" is unnecessary and doesn't add anything. Move the subsequent headers one level up.

devin-donnelly · 2016-12-01T19:17:57Z

docs/admin/ha-master-gce.md

+
+### Starting HA-compatible cluster
+
+When creating a new HA cluster, two flags need to be set for kube-up script:


"To create a new HA cluster, you must set the following flags in your kube-up script:"

devin-donnelly · 2016-12-01T19:19:06Z

docs/admin/ha-master-gce.md

+If true, reads will be directed to leader etcd replica.
+Setting this value to true is optional: reads will be more reliable but will also be slower.
+
+In addition, we may specify in which GCE zone the first master replica will be created by setting:


"Optionally, you can specify a GCE zone where the first master replica is to be created. Set the following flag:"

devin-donnelly · 2016-12-01T19:33:25Z

docs/admin/ha-master-gce.md

+(see [multiple-zones](http://kubernetes.io/docs/admin/multiple-zones/) for details).  
+
+* Do not use cluster with two master replicas. Consensus on a two replica cluster requires both replicas running when changing persistent state.
+So, both replicas are needed and a failure of any replica turns cluster into majority failure state.


"So" -> "As a result,"

devin-donnelly · 2016-12-01T19:34:06Z

docs/admin/ha-master-gce.md

+
+* Do not use cluster with two master replicas. Consensus on a two replica cluster requires both replicas running when changing persistent state.
+So, both replicas are needed and a failure of any replica turns cluster into majority failure state.
+Such two replica setup is worse in terms of HA than a single replica setup.


"A two-replica cluster is thus inferior, in terms of HA, to a single replica cluster."

devin-donnelly · 2016-12-01T19:34:37Z

docs/admin/ha-master-gce.md

+So, both replicas are needed and a failure of any replica turns cluster into majority failure state.
+Such two replica setup is worse in terms of HA than a single replica setup.
+
+* During addition of a master replica, cluster state (etcd) is copied to a new instance.


"During addition of a master replica," -> "When you add a master replica,"

devin-donnelly · 2016-12-01T19:36:02Z

docs/admin/ha-master-gce.md

+
+### Master service & kubelets
+
+Instead of trying to keep up-to-date list of kubernetes apiserver in kubernetes service, we will direct all traffic to the external IP:


"We" construct. If "We" is in this case the Kubernetes system, say so.

"Instead of trying to keep an up-to-date list of Kubernetes apiserver in the Kubernetes service, the system directs all traffic to the external IP:"

devin-donnelly · 2016-12-01T19:39:15Z

docs/admin/ha-master-gce.md

+
+Master TLS certificates will be generated for the external public IP and local IP of each replica.
+There will be no certs for ephemeral public IP of replicas.
+So, accessing them using ephemeral public IP will be possible only when skipping TLS verification.


Avoid "so."
Avoid the "them" pronoun; as written, "them" refers to "ephemeral public IPs" when it looks like you mean "master replicas." Be explicit. Also try to avoid future tense and passive voice.

Example rephrasing:
"Kubernetes generates Master TLS certificates for the external public IP and local IP for each replica. There are no certificates for the ephemeral public IP for replicas; to access a replica via its ephemeral public IP, you must skip TLS verification."

jszczepkowski · 2016-12-02T08:57:03Z

@devin-donnelly
Comments applied, PTAL

devin-donnelly · 2016-12-02T18:54:53Z

docs/admin/ha-master-gce.md

+Kubernetes version 1.5 adds alpha support for replicating Kubernetes masters in kube-up or kube-down scripts for Google Compute Engine.
+This document describes how to use kube-up/down scripts to manage highly available (HA) masters and how HA masters are implemented for use with GCE.
+
+## Starting HA-compatible cluster


"Starting an HA-compatible cluster"

devin-donnelly · 2016-12-02T18:55:19Z

docs/admin/ha-master-gce.md

+
+## Starting HA-compatible cluster
+
+To create a new HA-compatible cluster, you must set the following flags in your kube-up script:


code format kube-up

devin-donnelly · 2016-12-02T18:55:40Z

docs/admin/ha-master-gce.md

+## Adding a new master replica
+
+After you have created an HA-compatible cluster, you can add master replicas to it.
+You add master replicas by using a kube-up script with the following flags:


code format kube-up

devin-donnelly · 2016-12-02T18:55:50Z

docs/admin/ha-master-gce.md

+$ KUBE_GCE_ZONE=europe-west1-c KUBE_REPLICATE_EXISTING_MASTER=true ./cluster/kube-up.sh
+```
+
+## Removing master replica


"Removing a master replica"

devin-donnelly · 2016-12-02T18:57:05Z

docs/admin/ha-master-gce.md

+* `KUBE_GCE_ZONE=zone` - zone where the master replica will run.
+Must be in the same region as other replicas' zones.
+
+You don't need to set the `MULTIZONE` or `ENABLE_ETCD_QUORUM_READS` flags, as those values are inherited from already running cluster,


Three clauses is probably one too many for this sentence. :)

"You don't need to set the MULTIZONE or ENABLE_ETCD_QUORUM_READS flags, as those are inherited from when you started your HA-compatible cluster."

devin-donnelly · 2016-12-02T18:58:49Z

Awesome, thanks. Just a few more changes.

roberthbailey · 2016-12-02T19:49:46Z

@devin-donnelly - has the content changed enough that I should take another pass once you are finished reviewing?

jszczepkowski · 2016-12-02T20:55:34Z

@devin-donnelly
comments applied, PTAL

devin-donnelly · 2016-12-02T21:13:34Z

@roberthbailey , all my comments are on doc organization and wording; the same things get said, but they may use fewer pronouns, active voice, or be said in a slightly different order. I think your LGTM should still stand.

roberthbailey · 2016-12-02T21:14:06Z

@devin-donnelly - thanks.

devin-donnelly · 2016-12-02T21:15:30Z

Thanks, @jszczepkowski ! This is good to go.

k8s-ci-robot added the cncf-cla: yes label (indicates the PR's author has signed the CNCF CLA) Nov 29, 2016

jszczepkowski assigned fgrzadkowski and roberthbailey Nov 29, 2016

jszczepkowski added the area/HA label Nov 29, 2016

jszczepkowski added this to the 1.5 milestone Nov 29, 2016

jszczepkowski mentioned this pull request Nov 29, 2016

Simplify HA Setup for Master kubernetes/enhancements#48

Closed

22 tasks

jszczepkowski added the area/cluster-lifecycle label Nov 29, 2016

roberthbailey reviewed Nov 29, 2016

jszczepkowski force-pushed the ha-doc branch 2 times, most recently from 89a58ef to a108a57 Compare November 30, 2016 10:03

jszczepkowski mentioned this pull request Nov 30, 2016

Umbrella issue for HA and HA upgrades #1733

Closed

roberthbailey reviewed Nov 30, 2016

jszczepkowski force-pushed the ha-doc branch from a108a57 to b2cecff Compare December 1, 2016 09:50

jszczepkowski added the lgtm label Dec 1, 2016

roberthbailey removed the lgtm label Dec 1, 2016

roberthbailey added the lgtm label Dec 1, 2016

devin-donnelly suggested changes Dec 1, 2016

devin-donnelly added the Docs Review: Open Issues label Dec 1, 2016

jszczepkowski force-pushed the ha-doc branch from b2cecff to 3a5a969 Compare December 2, 2016 08:55

devin-donnelly approved these changes Dec 2, 2016

devin-donnelly suggested changes Dec 2, 2016

Added user doc for GCE HA master.

3b572b5

jszczepkowski force-pushed the ha-doc branch from 3a5a969 to 3b572b5 Compare December 2, 2016 20:54

devin-donnelly approved these changes Dec 2, 2016

devin-donnelly added Docs LGTM and removed Docs Review: Open Issues labels Dec 2, 2016

devin-donnelly merged commit f9b4854 into kubernetes:release-1.5 Dec 2, 2016
