Kubernetes on AWS multi-AZ by default #13063

erulabs · 2015-08-22T00:46:13Z

Hello!

References issue #13056 - Kubernetes on AWS via "kube-up" should be more highly available.

Part one: Minions should scale across two Amazon Availability Zones
Part two: The kube-master should be highly available, with one instance per AZ

This is part one of two.

This has a fairly high potential to break things - but I've done quite a bit of testing and it seems to behave just fine. Since the two AZs exist in one VPC and networking is flat between them, I don't expect any issues.

Additionally, this replaces the subnet, which was a single /24, with two /24s inside the larger /16 of the VPC. The total number of minions on AWS without any modification therefore goes up from 254 to 508 (two /24s).
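For anyone who wants to check the address arithmetic, here is a minimal sketch using Python's standard `ipaddress` module. The `172.20.0.0/16` range is an assumption made for illustration (the description above only says "the larger /16 of the VPC"), and `usable_hosts` is a hypothetical helper. The second total applies the AWS rule that five addresses are reserved in every subnet, so the practical ceiling is likely a little lower than the classic count.

```python
import ipaddress

# Assumed VPC range, for illustration only; the PR text just says "/16".
vpc = ipaddress.ip_network("172.20.0.0/16")

# Take the first two /24 subnets out of the /16, one per Availability Zone.
subnets = list(vpc.subnets(new_prefix=24))[:2]


def usable_hosts(net, reserved=2):
    """Addresses left in `net` after subtracting `reserved` addresses."""
    return net.num_addresses - reserved


# Classic count: drop only the network and broadcast addresses per subnet.
classic_total = sum(usable_hosts(s) for s in subnets)

# AWS reserves five addresses per subnet (the first four and the last one).
aws_total = sum(usable_hosts(s, reserved=5) for s in subnets)

print([str(s) for s in subnets])
print("classic:", classic_total)
print("with AWS reservations:", aws_total)
```

Any other instance placed in the same subnets would also take an address out of that pool.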

@justinsb should probably take a peek at this one :)

Thanks everyone! We're absolutely loving Kubernetes on my team - Keep up the great work!!

k8s-bot · 2015-08-22T00:47:05Z

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

If this message is too spammy, please complain to ixdy.

erulabs · 2015-08-22T00:57:13Z

It occurs to me that where this will not work is EBS disks... Since an EBS volume cannot be moved between Availability Zones, this will be tricky.

Thoughts are welcome.. I wonder if this is something that has been dealt with on other providers. I'd be tempted to modify the scheduler such that it would prefer to keep a pod which relies on an awsElasticBlockStore volume in the same AZ unless it cannot fit, in which case it could perhaps snapshot the disk into the other AZ? That seems clunky (I wouldn't want extended pod relocation time due to a very very large snapshot)...
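To make the "prefer the same AZ unless it cannot fit" idea concrete, here is a rough sketch of the ranking rule. It is not the Kubernetes scheduler and does not use its API; `Node`, `pick_node`, the node names, and the zone names are all hypothetical, and CPU stands in for whatever "fit" would really mean. The snapshot-to-another-AZ step is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    zone: str
    free_cpu: float  # cores still unallocated on this node


def pick_node(nodes, cpu_request, volume_zone):
    """Pick a node for a pod that mounts a volume living in `volume_zone`.

    Nodes that cannot fit the request are filtered out first. Among the
    remaining nodes, any node in the volume's zone ranks above every node
    in another zone; free CPU only breaks ties within a zone group.
    """
    candidates = [n for n in nodes if n.free_cpu >= cpu_request]
    if not candidates:
        return None
    return max(candidates, key=lambda n: (n.zone == volume_zone, n.free_cpu))


nodes = [
    Node("minion-a1", "us-west-2a", free_cpu=0.5),
    Node("minion-b1", "us-west-2b", free_cpu=4.0),
]

# Fits in the volume's zone, so it stays there despite less free CPU.
same_zone = pick_node(nodes, cpu_request=0.25, volume_zone="us-west-2a")

# Does not fit in the volume's zone, so it spills over to the other AZ.
spill_over = pick_node(nodes, cpu_request=1.0, volume_zone="us-west-2a")

print(same_zone.name, spill_over.name)
```

In the spill-over case the caller would still have to get the volume's data into the other AZ, which is exactly the clunky part described above.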

davidopp · 2015-08-22T06:33:40Z

Assigned to @justinsb
cc/ @quinton-hoole

k8s-github-robot · 2015-08-27T23:05:51Z

Labelling this PR as size/L

zytek · 2015-10-03T00:00:10Z

IMO having a primary and a secondary AZ is Not Enough (tm). A proper multi-AZ highly available cluster should span >2 zones to enable quorum-type service deployments (Redis Sentinels, for example) and to handle network partitions properly. See also my comment on the linked issue.

We either have an HA setup (starting with the Kubernetes components themselves, like etcd nodes spanning >2 zones) or we stick to one cluster per zone, so that we don't give users false assumptions about availability and fault tolerance.
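The quorum argument can be checked with a few lines of arithmetic. This is a sketch assuming a simple majority quorum (more than half of all members), which is how etcd and Redis Sentinel style deployments are usually reasoned about; `survives_zone_loss` is a hypothetical helper, not part of any of those tools.

```python
def survives_zone_loss(members_per_zone):
    """True if a majority quorum survives the loss of any single zone.

    `members_per_zone` lists how many quorum members run in each zone.
    """
    total = sum(members_per_zone)
    quorum = total // 2 + 1  # strict majority of all members
    return all(total - lost >= quorum for lost in members_per_zone)


# Three members over two zones: losing the zone that holds two of them
# leaves one member, which is below the quorum of two.
two_zones = survives_zone_loss([2, 1])

# Three members over three zones: any single zone can be lost.
three_zones = survives_zone_loss([1, 1, 1])

print(two_zones, three_zones)
```

Under this assumption no two-zone layout passes the check, since at least one zone always holds half or more of the members.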

k8s-bot · 2015-10-10T15:45:06Z

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

If this message is too spammy, please complain to ixdy.

ghost · 2015-10-12T20:51:48Z

+1 to what @zytek said above.

mikedanese · 2015-12-21T19:13:59Z

cc @quinton-hoole @justinsb is this going to get a review?

k8s-bot · 2016-01-28T18:50:16Z

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

If this message is too spammy, please complain to ixdy.

k8s-bot · 2016-01-28T19:54:09Z

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

If this message is too spammy, please complain to ixdy.

k8s-bot · 2016-02-04T19:37:46Z

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

If this message is too spammy, please complain to ixdy.

mikedanese · 2016-02-04T19:46:45Z

This appears to be stalled

Kubernetes on AWS multi-AZ by default

f6d9b25

googlebot added the cla: yes label Aug 22, 2015

davidopp assigned justinsb Aug 22, 2015

justinsb added the area/platform/aws label Aug 22, 2015

k8s-github-robot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) Aug 27, 2015

k8s-github-robot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) Oct 6, 2015

kevin-wangzefeng mentioned this pull request Nov 16, 2015

Implement dedicated nodes using taints and tolerations #17190

Open

davidopp mentioned this pull request Nov 16, 2015

(Tracking Issue) Ubernetes (Federation) Lite #17059

Closed

7 tasks

mikedanese assigned ghost and unassigned justinsb Jan 28, 2016

mikedanese closed this Feb 4, 2016
