Dramatically simplify Kubernetes deployment #2303

Closed
22 tasks
jbeda opened this issue Nov 11, 2014 · 23 comments
Labels
area/build-release priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.

Comments

@jbeda
Contributor

jbeda commented Nov 11, 2014

We need to dramatically simplify deploying Kubernetes.

Task list to reduce/eliminate the bash/salt necessary for getting a cluster up and running:

  • Provide an 'all-in-one' binary for server components. This was started in Create a standalone k8s binary, capable of running a full cluster #2121, but we should let users specify individual components (e.g. kube-proxy) or a batch of servers (node, master); see the sketch after this list.
    • Integrate etcd so that it isn't necessary to download/start separately. Users should be able to easily turn this off and use an existing etcd.
    • Integrate flannel so that there is a networking solution that works out of the box. This eliminates the need to statically assign container subnets to nodes.
    • Integrate flannel-route-manager so that more advanced configurations (GCE) are supported easily.
  • Allow for kubelets to securely nominate themselves to join the cluster
  • Implement the bootstrap API. (Defined below but should be broken out into a new issue).
    • Finish definition of the bootstrap API
    • Implement a new binary/mode for serving the bootstrap API
    • Host the bootstrap API on k8s.io
    • Have the unified binary race to be the master via the bootstrap API. Have it publish its location.
    • Have the minion use the bootstrap API to find the api server and self-register
    • Have kubectl use the bootstrap API to find the api server (and CA for the api server).
  • Reduce/eliminate command line flags. These don't work well in a clustered environment. Thoughts in Store master and node config in API #1627.
    • Have initial config of cluster come from bootstrap API
  • Recast add-ons as post-cluster deploy scripts/tools that run on top of kubernetes. Try to eliminate any node configuration that isn't done through the kubernetes API
    • Node monitoring
    • Cluster monitoring
    • Node logging
    • DNS
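
As a sketch of what the 'all-in-one' binary item above could look like: a single dispatcher that maps a role argument to either one component or a bundle. The role names and the embedded etcd/flannel behavior here are assumptions for illustration, not an existing CLI.

// Hypothetical sketch only: one "kubernetes" binary that can run a single
// component or a bundle of them. Role names and messages are illustrative.
package main

import (
	"fmt"
	"os"
)

func main() {
	role := "node"
	if len(os.Args) > 1 {
		role = os.Args[1]
	}
	switch role {
	case "apiserver", "controller-manager", "scheduler", "kubelet", "kube-proxy":
		// Run exactly one component, for users who want an 'exploded' cluster.
		fmt.Printf("starting single component: %s\n", role)
	case "master":
		// Bundle: apiserver + scheduler + controller-manager, with etcd
		// embedded unless the user points at an existing etcd.
		fmt.Println("starting master bundle (embedded etcd unless an external one is configured)")
	case "node":
		// Bundle: kubelet + kube-proxy, with flannel providing networking
		// out of the box.
		fmt.Println("starting node bundle (embedded flannel unless disabled)")
	default:
		fmt.Fprintf(os.Stderr, "unknown role %q\n", role)
		os.Exit(1)
	}
}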

cc: @eparis @smarterclayton

The end result is that we'd love for something like this to work:

Once per cluster start up:

$ kubernetes cluster-create
Creating new cluster via https://k8s.io/
Cluster ID: f95e695d6eac75b7
Admin certificate saved to ~/.k8s/f95e695d6eac75b7/admin.crt
Admin key saved to ~/.k8s/f95e695d6eac75b7/admin.key

Launch kubernetes on multiple machines with:
  kubernetes --bootstrap=https://k8s.io/k/f95e695d6eac75b7 --bootstrap-key=041fb4965cd31be8

And then on every node in the cluster:

kubernetes --bootstrap=https://k8s.io/k/f95e695d6eac75b7 --bootstrap-key=041fb4965cd31be8
docker -d $(cat /var/run/kube-docker-flags)

And then from the client, you can use the same URL to auth to the servers as an 'admin' user. It'll use the bootstrap url to get admin keys, certs and such. If you connect to a node and it isn't the master, it'll return a redirect to whoever is currently the master. This way updating the bootstrap URL can lag actual master election.

kubectl --bootstrap=https://k8s.io/k/f95e695d6eac75b7 list pods
@smarterclayton
Contributor


Obvious things to think about:

  • Reduce the number of server side binaries to 1 that can morph to take on
    (perhaps multiple) roles. @brendandburns already started this with Create a standalone k8s binary, capable of running a full cluster #2121.
    • Users could still run an 'exploded' set of servers (and we would test
      this to keep ourselves honest) but the common case for small clusters
      would be a single binary.
    • Consider binding in etcd and flannel into the combined binary also.

Run it in a Docker container - I think we can make everything we need work with:

$ docker run -v /var/run/docker.sock:/var/run/docker.sock --net=host --privileged openshift/origin start

(that's kubelet plus master all in one, I don't know what in the container would block us).

  • Reduce the amount of data that needs to be populated across the cluster to
    get stuff up and running. Probably define a 'cluster bootstrap' API that
    both provides for a secret and a simple API endpoint used to start the cluster.
    This has been called a 'cluster discovery API' or a 'rally point' in the
    past and is similar to what etcd does and what Docker is planning for
    cluster.
    • Open issue for how much logic we put there.
    • Could be used to simplify/bootstrap auth
    • Allocating new clusters should be doable via the API w/ no auth too
    • Implementation of the API should be part of k8s and runnable in a
      container
    • Host a version of this on https://bootstrap.k8s.io.
  • Rethink how nodes are registered with the master. In addition to the
    explicit registration that we have today (where some already trusted
    user/component/server does the registration) we could either:
    • Run in a 'wide open' mode where anyone coming from a whitelisted CIDR is
      accepted as a node.
    • Set up a pattern where we get enough auth to the nodes (via the bootstrap
      API or other means) so that the nodes have permission to 'self register'.

With scoped tokens we can do "this token gives you the right to identify yourself as X and set your own minion description, as well as watch for pods under X". Unfortunately you'd have to have one token per minion there, so it's not exactly easy. So some sort of back-and-forth bootstrapper seems inevitable: "client: hey, I'm X", "server: I see you at X, here's your token to create yourself".
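
A minimal sketch of that exchange, with made-up types and field names (this is not an existing Kubernetes API):

// Hypothetical request/response pair for the "hey, I'm X" handshake above.
package bootstrap

// RegistrationRequest is what the node sends: its claimed name plus the
// client cert it intends to use from then on.
type RegistrationRequest struct {
	NodeName   string `json:"nodeName"`
	ClientCert []byte `json:"clientCert"`
}

// RegistrationResponse is the server's "I see you at X" answer: the address
// it observed for the node, plus a token scoped to creating/updating that one
// node and watching pods bound to it.
type RegistrationResponse struct {
	ObservedAddress string `json:"observedAddress"`
	ScopedToken     string `json:"scopedToken"`
}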

As much as possible our ops team(s) wants to be able to push a button to provision new infra and have it self join, and are usually willing to embed whatever secrets are needed to make that easy into the image or image template.

The end result is that we'd love for something like this to work:

Once per cluster start up:

$ KUBE_BOOTSTRAP=$(curl -s https://bootstrap.k8s.io/v1/new)
$ echo $KUBE_BOOTSTRAP
https://bootstrap.k8s.io/v1/cluster/f95e695d6eac75b78743092a99fcc7de/

And then on every node in the cluster:

kube-cluster --bootstrap=${KUBE_BOOTSTRAP}
docker -d $(cat /var/run/kube-docker-flags)

And then from the client, you can use the same (or perhaps related, derived?)
URL to auth to the servers. It'll use the discovery url to get admin keys,
certs and such. Connecting to any node will forward to the master.

kubectl --bootstrap=${KUBE_BOOTSTRAP} list pods

If you are running on a different network you can override the host that
kubectl talks to yet still use the bootstrap to get auth data. This could
eliminate the hackery around the HTTP password and using SSH to get
certs and such.

@jbeda
Contributor Author

jbeda commented Nov 11, 2014

I think that we'll want to have a scalable set of choices -- from having a shared token across all minions that lets them join to allowing for per-minion tokens that are much trickier to distribute but also more secure.

Another option would be to borrow the salt model:

  • A node uses the bootstrap API to find the master.
  • The node then sends a request to the master with something like {client cert, name, IP}. This sits in an async queue.
  • The master then decides to either approve/reject that node. This could be done via policy (accept everyone!), be done by the cloud provider or done by hand. The cert could also be pre-approved before the node ever asks. If we want to get really fancy we could do some sort of PKI with signing and stuff based on IP but I'm not sure that is worth the complexity.
  • The node now has permission to register itself and watch/modify what it needs to do.

In this case, the bootstrap API is purely for discovery and not for auth bootstrapping.
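
A rough sketch of the approval queue in the salt-style flow above; the types and the "accept everyone" policy are purely illustrative, not existing Kubernetes code:

// Hypothetical types for the pending-node queue described above.
package approval

// PendingNode is the {client cert, name, IP} tuple sitting in the async queue.
type PendingNode struct {
	Name       string
	IP         string
	ClientCert []byte
}

// Policy decides whether a pending node may register itself. It could accept
// everyone, ask the cloud provider, check a pre-approved cert list, or wait
// for a human.
type Policy interface {
	Approve(n PendingNode) (bool, error)
}

// acceptEveryone is the "accept everyone!" policy from the list above.
type acceptEveryone struct{}

func (acceptEveryone) Approve(PendingNode) (bool, error) { return true, nil }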

@stp-ip
Member

stp-ip commented Nov 12, 2014

I dislike the non-timeout nature of the shared token. In my opinion it eases deployment considerably, but the attack vector of only needing this one secret to get into the cluster is not dealt with during the later lifetime of the cluster.
My suggestion would be to use time sensitive shared or per-minion tokens.
You could set up a token to be valid for 24 hours and bake it into the latest hardware to be deployed. After 24 hours you can be sure that at least the attack window is done and perhaps can verify that only your hardware was added.
Perhaps make it possible to list all registered nodes for a token + the total number, which could be used to check against hardware nodes deployed in these 24 hours.
Nothing fancy, but at least it's simple and in my opinion reduces the attack surface.
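
A sketch of the time-bounded token idea, assuming a simple shared secret with an expiry (the format is hypothetical):

// Hypothetical time-bounded join token: accepted only inside its window.
package jointoken

import (
	"crypto/subtle"
	"time"
)

type Token struct {
	Secret   string
	NotAfter time.Time // e.g. issued-at + 24h
}

// Valid reports whether the presented secret matches and the window is open.
func (t Token) Valid(presented string, now time.Time) bool {
	if now.After(t.NotAfter) {
		return false // the 24h attack window has closed
	}
	return subtle.ConstantTimeCompare([]byte(t.Secret), []byte(presented)) == 1
}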

@smarterclayton
Contributor

It seems like it would be fairly easy for an ops team to rotate that token every X hours via a script against their IaaS, and maybe have a window where they overlap (valid for 9 hours, regenerate every 8).
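
For that overlap scheme (valid for 9 hours, regenerated every 8), a sketch of a validator that still honors the previous token during the one-hour overlap; again, nothing here is an existing API:

// Hypothetical rotating shared secret with an overlap window.
package rotation

import "time"

type rotatingSecret struct {
	current, previous string
	rotatedAt         time.Time     // when "current" was minted
	period            time.Duration // e.g. 8 * time.Hour
	validity          time.Duration // e.g. 9 * time.Hour
}

func (r *rotatingSecret) accepts(presented string, now time.Time) bool {
	if presented == r.current {
		return true
	}
	// The previous secret expires validity-period after the last rotation,
	// which gives the one-hour overlap described above.
	return presented == r.previous && now.Before(r.rotatedAt.Add(r.validity-r.period))
}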


@jbeda
Contributor Author

jbeda commented Nov 12, 2014

I'm leaning more and more to having this bootstrap API not be used for widespread identity or auth but rather for cluster metadata and simple auth. We will need auth to the bootstrap API itself.

Quick spec that could work for the bootstrap service.

Kubernetes Cluster Bootstrap API

When bootstrapping up a cluster, there are two things that we need to do: Identify the master of the cluster (for both worker nodes and clients) and establish the 'admin' credentials for administering the cluster.

The Kubernetes Cluster Bootstrap API is optional -- it is possible to get clusters started without using this API. It is also possible to easily run this API inside of a constrained private network.

The basic flow -- some of these steps are optional to further lock down the security of the cluster.

  1. The user creates a new cluster. This is typically an unauthenticated operation. As a result of creating the cluster, the user gets back a set of information:
    • A cluster-id that is used as part of the cluster-bootstrap-url. This should be guarded but is not super sensitive.
    • A bootstrap-auth-token. This is a single use token. It can be used to establish a single writer into the bootstrap API. It is typically time bounded (1 hour?)
    • Credentials for the cluster admin account. This includes a public cert and a private key used for TLS client auth.
  2. The user starts up a set of servers and gives them the cluster-bootstrap-url along with the bootstrap-auth-token.
  3. The servers boot up and race to see who can claim to be master. They do this using the bootstrap-auth-token. Only one wins and writes its addresses to the bootstrap API.
  4. Other servers that don't win the race use the bootstrap API to find the master. They then register themselves with the master. Whether the master automatically accepts those workers is a matter of policy, out of scope for the bootstrap API.
  5. The server that won the master race can add more 'writer' keys to the bootstrap API. That way if the master dies a new master can be elected and update the bootstrap API.
  6. The user uses a Kubernetes client (kubectl) to talk to the cluster. They can either specify the appropriate config data directly or can simply specify the cluster-bootstrap-url. kubectl will use the admin key with client TLS to identify itself to the master.
  7. From here more scoped (password?) auth can be configured and the cluster can be further configured.

Further notes:

  • The user can create/supply their own admin public cert when creating the cluster. Or they can opt out of using the bootstrap API to set up the admin credentials altogether. The admin cert can be cleared at any time.
  • The IP range that the master claims can be restricted to a known good range.

API Definition

  • <base> must be over TLS. We would host a public one on https://k8s.io.
  • Auth:
    • All auth is done via client TLS certs. There are three levels of auth:
      • none -- this method/endpoint is publicly accessible. The random cluster-id is the only protection.
      • bootstrap-cert -- this method/endpoint is accessible to any cert in the bootstrap-cert list or the admin-cert.
      • admin-cert -- this method/endpoint is only accessible to the admin-cert. If there is no admin-cert or the admin-cert is lost, this can never be modified or accessed.
  • <base>/k/
    • POST: creates a new cluster bootstrap endpoint.
      • Auth: none
      • Post body:
        • admin client options -- NONE, CREATE, or specify a cert
        • bootstrap cert -- list of certificates that can be used to write to the bootstrap API.
        • bootstrap token request -- true (with time limit) if a bootstrap token should be returned.
        • master IP range -- the master IP must be confined to this range.
      • Response:
        • cluster id -- long random string
        • cluster url -- <base>/k/<cluster-id>
        • bootstrap token
        • The public cert and private key
  • <base>/k/<cluster-id>/bootstrap-certs This is the list of TLS client certs that are allowed to write to the bootstrap API endpoint.
    • GET: returns the list of bootstrap-certs along with the time and IP address that added them.
      • Auth: bootstrap-cert
    • PUT: updates the list of bootstrap-certs. Users must use an ETag to ensure that concurrent writes aren't clobbered. The admin-cert cannot be modified with this method.
      • Auth: bootstrap-cert
  • <base>/k/<cluster-id>/claim-token Used to add an entry to the bootstrap-cert list via a token.
    • POST
      • Auth: None, but must present valid one use token
      • Returns: Nothing, 200 if everything is good.
  • <base>/k/<cluster-id>/mint-token Create a new one use token that can be used to add a bootstrap-cert
    • POST
      • Auth: bootstrap-cert
      • Post Body: the time limit. Up to 24 hours.
      • Response: The token.
  • <base>/cluster-meta/<cluster-id>/admin-cert
    • GET: Gets the public key used for the admin account.
      • Auth: bootstrap-cert
    • PUT: Sets a new admin key. Can be cleared completely in which case there is no admin key.
      • Auth: admin-cert
  • <base>/cluster-meta/<cluster-id>/master. This is used to get/set the current master for the cluster.
    • This resource includes:
      • The current IP of the master. This is what other nodes in the cluster should use.
      • Preferred IPs or DNS names for clients. This is a priority list from most likely to least likely.
      • The CA cert used for server identification. If blank, then any system trusted CA would suffice.
    • GET: Gets the current master.
      • Auth: none
    • PUT: Sets the master.
      • Auth: bootstrap-cert
      • Optimistic concurrency here -- the user must use an ETag to make sure that they aren't clobbering others in a start-up race (see the sketch below).
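
A client-side sketch of steps 2 and 3 against the draft above: claim a bootstrap-cert slot with the one-use token, then race to set the master using ETag/If-Match so losers fail instead of clobbering the winner. A single base URL is assumed (the draft splits endpoints across /k/ and /cluster-meta/), the way the one-use token is presented is not specified in the draft, and client TLS setup is omitted:

// Hypothetical client for the draft bootstrap API above. Paths, the token
// header, and error handling are illustrative only.
package bootstrapclient

import (
	"bytes"
	"fmt"
	"net/http"
)

func claimAndRace(clusterURL, oneUseToken, myAddress string) error {
	// Claim a bootstrap-cert slot using the one-use token.
	claim, _ := http.NewRequest("POST", clusterURL+"/claim-token", nil)
	claim.Header.Set("Authorization", "Bearer "+oneUseToken) // assumption: bearer-style presentation
	resp, err := http.DefaultClient.Do(claim)
	if err != nil {
		return err
	}
	resp.Body.Close()

	// Read the current master record and remember its ETag.
	resp, err = http.Get(clusterURL + "/master")
	if err != nil {
		return err
	}
	etag := resp.Header.Get("ETag")
	resp.Body.Close()

	// Race to claim mastership: If-Match means losers get 412 Precondition
	// Failed instead of clobbering the winner.
	put, _ := http.NewRequest("PUT", clusterURL+"/master", bytes.NewBufferString(myAddress))
	put.Header.Set("If-Match", etag)
	resp, err = http.DefaultClient.Do(put)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode == http.StatusPreconditionFailed {
		fmt.Println("lost the master race; falling back to GET /master to find the winner")
		return nil
	}
	fmt.Println("won the master race")
	return nil
}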

@jbeda
Contributor Author

jbeda commented Nov 13, 2014

Other previous work here is the etcd discovery protocol: https://github.com/coreos/etcd/blob/master/Documentation/discovery-protocol.md

@bgrant0607
Member

Question: Are you thinking we'd run pods on the master node just like on any other node?

One issue users have raised is that in a small cluster the master has minimal requirements, whereas the other nodes need to accommodate the requirements of the application being run on the cluster, and even single-digit numbers of pods may have significant cpu, memory, and/or flash requirements.

@jbeda
Contributor Author

jbeda commented Nov 20, 2014

@bgrant0607 Yes -- I want to enable a mode where you have 1-5 machines and just want to kick the tires. It should be dead simple to get stuff running in that situation. Download/install one binary/package and copy/paste around a super minimal amount of info.

@goltermann goltermann added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Dec 3, 2014
@smarterclayton
Contributor

Has the discussion from the face-to-face been reflected in an issue? If so can we link it here?

@jbeda
Contributor Author

jbeda commented Dec 15, 2014

I'm going to pick this stuff up again this week and break it down into a bunch of sub-items. I think that @erictune and @kelseyhightower were going to write some stuff up. I'm happy to get on that though.

@kapilt

kapilt commented Dec 15, 2014

fwiw there's similar work to the etcd discovery protocol embedded in swarmd's discovery.


@jbeda
Contributor Author

jbeda commented Dec 15, 2014

@kapilt Yup -- I bugged them to document it. I'd love for us to do something that is a little bit more secure than that. Right now if you can steal the single token/cluster id, you can steal the cluster.

We also need to bootstrap the cluster parameters and the admin account.

@timothysc
Member

So I'm a huge +1 for this, with one caveat: a whitelist option on the master (#3103), with reverse DNS lookup on minions that want to join. That should narrow the vector.

jbeda added a commit to jbeda/kubernetes that referenced this issue Jan 7, 2015
This is related to kubernetes#2303 and steals from kubernetes#2435.
@davidopp davidopp added the sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. label Feb 17, 2015
@jdef
Contributor

jdef commented Mar 11, 2015

xref mesosphere/kubernetes-mesos#169

@alex-mohr alex-mohr removed this from the v1.0 milestone Mar 19, 2015
@alex-mohr alex-mohr added priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. and removed priority/backlog Higher priority than priority/awaiting-more-evidence. labels Mar 19, 2015
@alex-mohr
Contributor

We now separately track the items from this uber-proposal that are part of milestone v1, so removing that tag.

@alex-mohr
Contributor

Created label cluster/platform/mesos

On Thu, Mar 19, 2015 at 2:40 PM, Timothy St. Clair wrote:

  • @alex-mohr, @jdef: could we get a new label for tracking mesos-framework integration pieces?



@roberthbailey
Contributor

I've broken cluster bootstrapping out into #5754.

@roberthbailey
Contributor

I've broken the all-in-one binary out into #5755.

@roberthbailey
Contributor

For reference, "Allow for kubelets to securely nominate themselves to join the cluster" is being tracked in #3168.

@roberthbailey
Contributor

And "Recast add-ons as post-cluster deploy scripts/tools that run on top of kubernetes" is being tracked in #3579.

@roberthbailey
Contributor

Closing this umbrella issue now that all of the various parts have been split into separate issues.

@sandric
Contributor

sandric commented Apr 15, 2015

@jbeda, is this implemented or in progress? I mean the combined master/minion installation. I just didn't see it among the issues @roberthbailey split this monstrous issue into. Can you point me to the discussion? thx.
