
AWS: We should run the master in an autoscaling group of size 1 #11934

Closed
justinsb opened this issue Jul 28, 2015 · 16 comments

Comments

justinsb (Member) commented Jul 28, 2015

This will provide automatic relaunch in case of failure.
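
For illustration, a minimal boto3 sketch of what "an autoscaling group of size 1" could look like — the launch configuration name, AMI, and subnet below are hypothetical placeholders, not what kube-up actually provisions:

```python
import boto3

# Sketch only: a launch configuration plus an auto-scaling group with
# min = max = desired = 1, so a failed master instance is replaced
# automatically. All names/IDs are placeholders.
autoscaling = boto3.client("autoscaling")

autoscaling.create_launch_configuration(
    LaunchConfigurationName="k8s-master-lc",  # placeholder name
    ImageId="ami-00000000",                   # placeholder master AMI
    InstanceType="m3.medium",
    # UserData would carry the bootstrap script that restores master state.
)

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="k8s-master-asg",    # placeholder name
    LaunchConfigurationName="k8s-master-lc",
    MinSize=1,
    MaxSize=1,
    DesiredCapacity=1,
    VPCZoneIdentifier="subnet-00000000",      # placeholder subnet
)
```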

roberthbailey (Member) commented Jul 28, 2015

How does AWS handle mounting persistent disks to instances in an autoscaling group? Also, what about health checks (you also want to re-launch the VM if the VM is running but the apiserver is down)?

jboelter commented Jul 29, 2015

Are the files/configuration that need to survive termination in a known location?

We could create an EBS volume and mount it in the master instance. Alternatively, I think the same idea would work, but the persistent volume would need to be the boot volume.

iterion (Contributor) commented Jul 29, 2015

@jboelter We put all of the config that needs to survive on an EBS volume that is mounted to the master when it is initially created (not the boot volume, but a second disk that has the essential info placed on it).

@roberthbailey We can mount a blank disk or a snapshot of a disk. But I don't think there is any way to have the ASG know to remount the disk that was used previously.

For this to come back up with the correct data, we could run a script when the instance starts. That script would make some AWS API calls to try to find an existing EBS volume for the master and remount it. @justinsb might have some better solution in mind though :)
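
As a sketch of that boot-time script (the tag key, device name, and region derivation are assumed conventions, not anything agreed in this thread):

```python
import boto3
import requests

# Sketch: at boot, look up the cluster's tagged master volume in this
# instance's availability zone and attach it. Tag key and device name
# are hypothetical.
METADATA = "http://169.254.169.254/latest/meta-data"
instance_id = requests.get(METADATA + "/instance-id").text
az = requests.get(METADATA + "/placement/availability-zone").text

ec2 = boto3.client("ec2", region_name=az[:-1])  # "us-east-1a" -> "us-east-1"

volumes = ec2.describe_volumes(Filters=[
    {"Name": "tag:k8s-master-volume", "Values": ["true"]},  # assumed tag
    {"Name": "availability-zone", "Values": [az]},          # EBS is AZ-bound
    {"Name": "status", "Values": ["available"]},            # not attached elsewhere
])["Volumes"]

if volumes:
    ec2.attach_volume(
        VolumeId=volumes[0]["VolumeId"],
        InstanceId=instance_id,
        Device="/dev/xvdf",  # assumed device name for the data disk
    )
```

One wrinkle: EBS volumes are tied to an availability zone, so the ASG would have to be pinned to the volume's AZ for this to work.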

jboelter commented Jul 29, 2015

@iterion perfect -- the ASG has an associated LaunchConfiguration that specifies the details. We should be able to reference a known volume id created beforehand. This assumes there are no race conditions with the volume still being in use after termination while a new instance is created.

Edit: It appears that the AutoScaling EBS type doesn't allow for a volume id (which would only make sense for an ASG of size 1) -- mounting with an init script may be the way to go. We should still be able to use a well-known volume id, though.

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-launchconfig-blockdev-template.html
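
For reference, the EBS block in a launch configuration only takes a snapshot id (boto3 form shown; all values are placeholders):

```python
import boto3

# Sketch: the LaunchConfiguration EBS mapping accepts SnapshotId (plus
# size/type options) but has no VolumeId field, so an existing volume
# can't be referenced here -- it has to be attached later by an init
# script. Names and ids are placeholders.
boto3.client("autoscaling").create_launch_configuration(
    LaunchConfigurationName="k8s-master-lc",   # placeholder
    ImageId="ami-00000000",                    # placeholder
    InstanceType="m3.medium",
    BlockDeviceMappings=[{
        "DeviceName": "/dev/xvdf",
        "Ebs": {
            "SnapshotId": "snap-00000000",     # snapshots only, no VolumeId
            "DeleteOnTermination": False,
        },
    }],
)
```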

iterion (Contributor) commented Jul 29, 2015

@jboelter Interesting -- I can't find where to specify the volume id when creating a launch configuration; perhaps I'm looking in the wrong place. It looks as if you can specify a BlockDeviceMapping, and on that mapping there is a way to configure an EBS volume, but it only lets you specify a snapshot id.

FYI - I'm looking here: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-launchconfig-blockdev-mapping.html#cfn-as-launchconfig-blockdev-mapping-ebs

jboelter commented Jul 29, 2015

@iterion yeah, just noticed the same and edited my note above as you posted

iterion (Contributor) commented Jul 29, 2015

Bummer, perhaps we could tag the ASG or launch configuration with the volume id that was used? Alternatively, we could tag the EBS with something that identifies it as the master disk for that cluster. We run the risk of having multiple disks with the same tags though.
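
A sketch of the tagging side (the tag keys here are made up; picking one convention and enforcing uniqueness is exactly the open question):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # placeholder region

# Sketch: tag the master's data volume with the cluster identity so a
# boot script can find it later. Tag keys/values are hypothetical.
ec2.create_tags(
    Resources=["vol-00000000"],  # placeholder volume id
    Tags=[
        {"Key": "KubernetesCluster", "Value": "mycluster"},
        {"Key": "k8s-master-volume", "Value": "true"},
    ],
)

# The risk noted above: nothing stops two volumes from carrying the same
# tags, so the boot script would need a tie-breaking rule.
```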

justinsb (Member, Author) commented Aug 15, 2015

I'm going to make an attempt at this.

I am planning on using the approach of tagging the volume and then trying to mount it as part of instance boot.

justinsb (Member, Author) commented Aug 17, 2015

Rather than have a separate process or script that discovers the volume, tries to mount it, and then starts our processes, I am experimenting with using the kubelet for this (a rough sketch follows the lists below):
justinsb@334ad49

Advantages:

  • we could easily have hot-failover machines (i.e. run an auto-scaling group with multiple machines). Mounting a volume is a simple way to do leader election on many clouds/environments.

Shortcomings:

  • this requires passing an explicit volume ID in, but I hope that in the future we will be able to specify volumes using something like k8s selectors & labels (#9712).
  • this requires a volume per process. This may not be a bad thing: better isolation, and volumes are pretty cheap (on AWS & GCE at least). We could implement volumes on volumes (a subdirectory on a volume, which k8s could copy/move around).
  • because of the above, there is no guarantee that we will launch everything on the same machine in a multi-machine environment. This may require some tweaks particularly during bootstrapping, and we would prefer minimal latency to etcd.
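
As a sketch of the kubelet-based approach (not the actual commit — see justinsb@334ad49 for that): a static pod manifest whose pod claims the EBS volume by explicit volume id, so whichever ASG instance successfully attaches the volume runs the process. The volume id, image, and paths are hypothetical placeholders.

```python
import json

# Sketch: write a static-pod manifest into the kubelet's manifest
# directory. The awsElasticBlockStore volume source makes the kubelet
# attach/mount the named EBS volume before starting the container, which
# doubles as crude leader election: only one instance can attach the
# volume at a time. All names and ids are placeholders.
manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "etcd-server"},
    "spec": {
        "hostNetwork": True,
        "containers": [{
            "name": "etcd",
            "image": "gcr.io/google_containers/etcd:2.0.12",  # placeholder
            "volumeMounts": [{"name": "data", "mountPath": "/var/etcd"}],
        }],
        "volumes": [{
            "name": "data",
            "awsElasticBlockStore": {
                "volumeID": "vol-00000000",  # explicit id (cf. #9712)
                "fsType": "ext4",
            },
        }],
    },
}

with open("/etc/kubernetes/manifests/etcd-server.json", "w") as f:
    json.dump(manifest, f, indent=2)
```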
pikeas commented Nov 12, 2015

+1. On AWS the master should come up in an ASG for self-healing (in conjunction with the master using an EIP; can't seem to find the issue # at the moment), or be configured with multiple masters (preferably still behind an ASG!).

justinsb (Member, Author) commented Nov 12, 2015

Good news is I have this working on a branch. Bad news is that the diff is pretty substantial. I am cherry-picking smaller PRs across so that the remaining changes become palatable!

justinsb self-assigned this Nov 12, 2015
jwerak commented May 12, 2016

Do you have a list of the things that need to be restored, besides etcd?

namliz commented Aug 18, 2016

Is it plausible to split out etcd into its own autoscaling group?
If so, you could just scale masters and the etcd cluster independently and there's no need to persist anything.

justinsb (Member, Author) commented Aug 18, 2016

This is implemented in kops. As kube-up is in maintenance mode, it won't be implemented there.

@zilman it's plausible, but then the etcd ASG becomes the challenging one!

justinsb closed this Aug 18, 2016
namliz commented Aug 18, 2016

@justinsb: well, if you have an etcd ASG of size 3, it seems to me you don't really need to persist anything, as at least one etcd instance is guaranteed to stay up.

erutherford commented Aug 18, 2016

A 3-node etcd cluster can't operate with fewer than 2 nodes running: quorum for an n-member cluster is floor(n/2) + 1, so a 3-member cluster tolerates exactly one failure. If you lose more than one node's data, you're restoring from backups.

Also, without running a runtime reconfiguration, your set of etcd members is fixed.
