Proposal: Dramatically Simplify Kubernetes Cluster Creation #30360

**File:** docs/proposals/dramatically-simplify-cluster-creation.md (258 additions, 0 deletions)
# Proposal: Dramatically Simplify Kubernetes Cluster Creation

Luke Marsden & many others in [SIG-cluster-lifecycle](https://github.com/kubernetes/community/tree/master/sig-cluster-lifecycle).

17th August 2016

*This proposal aims to capture the latest consensus and plan of action of SIG-cluster-lifecycle. It should satisfy the first bullet point [required by the feature description](https://github.com/kubernetes/features/issues/11).*

See also: [this presentation to community hangout on 4th August 2016](https://docs.google.com/presentation/d/17xrFxrTwqrK-MJk0f2XCjfUPagljG7togXHcC39p0sM/edit?ts=57a33e24#slide=id.g158d2ee41a_0_76)

## Motivation

Kubernetes is hard to install, and there are many different ways to do it today. None of them are excellent. We believe this is hindering adoption.

## Goals

Have one recommended, official, tested, "happy path" which will enable a majority of new and existing Kubernetes users to:

* Kick the tires and easily turn up a new cluster on infrastructure of their choice

* Get a reasonably secure, production-ready cluster, with reasonable defaults and a range of easily-installable add-ons

We plan to do so by improving and simplifying Kubernetes itself, rather than building lots of tooling which "wraps" Kubernetes by poking all the bits into the right place.

## Scope of project

There are logically 3 steps to deploying a Kubernetes cluster:

1. *Provisioning*: Getting some servers - these may be VMs on a developer's workstation, VMs in public clouds, or bare-metal servers in a user's data center.

2. *Install & Discovery*: Installing the Kubernetes core components on those servers (kubelet, etc.) and bootstrapping the cluster to a state of basic liveness, including allowing each server in the cluster to discover the others: for example, teaching etcd servers about their peers and provisioning TLS certificates.

3. *Add-ons*: Now that basic cluster functionality is working, installing add-ons such as DNS or a pod network (this should be possible using `kubectl apply`).

Notably, this project is *only* working on dramatically improving steps 2 and 3, from the perspective of users typing commands directly into root shells of servers. The reason for this is that there are a great many different ways of provisioning servers, and users will already have their own preferences.

What's more, once we've radically improved the user experience of 2 and 3, it will make the job of tools that want to do all three much easier.

## User stories

### Phase I

**_In time to be an alpha feature in Kubernetes 1.4._**

Note: the current plan is to deliver `kubeadm` which implements these stories as "alpha" packages built from master (after the 1.4 feature freeze), but which are capable of installing a Kubernetes 1.4 cluster.

* *Install*: As a potential Kubernetes user, I can deploy a Kubernetes 1.4 cluster on a handful of computers running Linux and Docker by typing two commands on each of those computers. The process is so simple that it becomes obvious to me how to easily automate it if I so wish.

* *Pre-flight check*: If any of the computers don't have working dependencies installed (e.g. bad version of Docker, too-old Linux kernel), I am informed early on and given clear instructions on how to fix it so that I can keep trying until it works.
> **Member:** Probably should cross-link the @kubernetes/sig-node feature here.


* *Control*: Having provisioned a cluster, I can gain user credentials which allow me to remotely control it using kubectl.

* *Install-addons*: I can select from a set of recommended add-ons to install directly after installing Kubernetes on my set of initial computers with kubectl apply.
> **Member:** Where will the recipes for the add-ons live?

* *Add-node*: I can add another computer to the cluster.

* *Secure*: As an attacker with (presumed) control of the network, I cannot add malicious nodes I control to the cluster created by the user. I also cannot remotely control the cluster.
> **Contributor:** @jbeda had suggested relaxing this point (at least at first) and operating in the mode that if you are on the network, you are trusted to be part of the cluster. That seems like a reasonable starting point if we can't get to fully secure in the first iteration.


### Phase II

**_In time for Kubernetes 1.5:_**
*Everything from Phase I as a beta/stable feature; everything else below as a beta feature in Kubernetes 1.5.*

* *Upgrade*: Later, when Kubernetes 1.4.1 or any newer release is published, I can upgrade to it by typing one other command on each computer.

* *HA*: If one of the computers in the cluster fails, the cluster carries on working. I can find out how to replace the failed computer, including if the computer was one of the masters.

## Top-down view: UX for Phase I items

We will introduce a new binary, kubeadm, which ships with the Kubernetes OS packages (and binary tarballs, for OSes without package managers).
> **Member:** I think we can put this as a subcommand of kubelet. What do people think of that?
>
> ```
> $ kubelet cluster join node
> $ kubelet cluster init master
> $ # or
> $ kubelet join node
> $ kubelet init master
> ```

> **Member:** Why would we want to do that? To save the user from downloading another binary?

> **Member:** I'm torn. The engineer in me says these are different things, so different names make sense. But this is all about streamlining and reducing moving parts, so I think less is more here.

> **Member:** Maybe `kubelet admin init master`, to namespace a little deeper without producing multiple CLIs?

> **@lukemarsden (Aug 11, 2016):** From a developer experience perspective, I would strongly encourage us not to make kubelet the top-level command that the user interacts with (even with subcommands). In anything like its current form, kubelet should be plumbing, not porcelain, IMO. It's highly likely that developers/users will see the docs/blog post/whatever instructions that mention kubelet and then type `kubelet --help`. They need to get less than a page of clear instructions on how to get started; what they get currently is anything but. That `--help` text is our canvas. Let's start from a clean one.
>
> The reason I don't think we can "clean" the kubelet "canvas" is that it would break compatibility with lots of things that already assume things about how kubelet works and which arguments it takes. We don't want to break everyone's existing Kubernetes-installation systems when we introduce this new happy path.
>
> We previously agreed in the SIG that the happy path includes "download OS packages" for most distros. Is there an issue with the OS packages including two binaries that I don't see?

> **@lukemarsden (Aug 12, 2016):** Thanks for the input, guys. Let's start with the simplest thing, a new kubeadm binary. I'm not averse to git-style subcommands; I just don't think we need them right now to solve the problem SIG-cluster-lifecycle is trying to solve.

> **Contributor:** We can always deprecate kubeadm if we decide otherwise. I'm partial to `kubelet init`, but I understand the concerns regarding it. A new binary name is a good place to experiment.

> **Member:** FYI, `docker daemon` is no longer a thing; as of 1.12 there is a separate `dockerd` binary...

> **Contributor:** Is there a reason we are suggesting kubeadm instead of the simpler `kube`?

> **Contributor Author:** Yeah, @jbeda was concerned that `kube` is possibly being taken by the hyperkube binary. It probably isn't, but we didn't want to risk it.


```
laptop$ kubeadm --help
kubeadm: bootstrap a secure kubernetes cluster easily.

/==========================================================\
| KUBEADM IS ALPHA, DO NOT USE IT FOR PRODUCTION CLUSTERS! |
|                                                          |
| But, please try it out! Give us feedback at:             |
| https://github.com/kubernetes/kubernetes/issues          |
| and at-mention @kubernetes/sig-cluster-lifecycle         |
\==========================================================/

Example usage:

Create a two-machine cluster with one master (which controls the cluster),
and one node (where workloads, like pods and containers run).

On the first machine
====================
master# kubeadm init master
Your token is: <token>

On the second machine
=====================
node# kubeadm join node --token=<token> <ip-of-master>

Usage:
kubeadm [command]

Available Commands:
init Run this on the first server you deploy onto.
join Run this on other servers to join an existing cluster.
user Get initial admin credentials for a cluster.
manual Advanced, less-automated functionality, for power users.

Use "kubeadm [command] --help" for more information about a command.
```

> **Member:** Hmmm, if building a universal bootstrapper, can we start moving away from the term "master"? One concrete reason: for HA, don't we worry about what a "master" is, with the scheduler and apiserver and so on possibly being segregated and moving around over time? And also, we know there may be many apiservers....

### Install

*On first machine:*

```
master# kubeadm init master
```

> **Contributor:** I don't quite understand the workflow. Presumably I have to do something before I run kubeadm, because initially I won't have that binary. To help me understand better:
>
> * What non-Kubernetes components do you expect to exist before you run this? (Just Docker? Systemd?)
> * What will the user download before running `kubeadm init`?
> * Will `kubeadm init` download anything, or will it assume that everything has been downloaded?
>
> I'm curious about the case where I might have limited Internet connectivity; some users may prefer to pre-download everything out of band.

> **@lukemarsden (Aug 11, 2016):** See above: "We will introduce a new binary, kubeadm, which ships with the Kubernetes OS packages (and binary tarballs, for OSes without package managers)." So a complete flow might be:
>
> `# apt-add-repository [official k8s deb repo]`
> `# apt-get install kubernetes`
> `# kubeadm init master`
>
> (See kubernetes/release#35 for more detail.) The OS packages can declare a dependency on Docker and configure the init system to start kubelet on boot. Does this answer your question?
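To make "configure the init system to start kubelet on boot" concrete, here is a minimal sketch of what the package's post-install step could do on a systemd distro; the unit contents and paths are assumptions for illustration, not the actual packaging:

```
# Hypothetical post-install for the kubernetes OS package (paths assumed).
# Install a unit that keeps kubelet running and restarts it on failure:
cat <<'EOF' >/etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes kubelet
After=docker.service
Requires=docker.service

[Service]
ExecStart=/usr/bin/kubelet
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

# Start kubelet now and on every boot:
systemctl daemon-reload
systemctl enable kubelet
systemctl start kubelet
```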

> **@lukemarsden (Aug 11, 2016):** "Will kubeadm init download anything, or will it assume that everything has been downloaded?" Great question. It will cause static pods to be written, which will download the other Kubernetes components and run them as pods. If a user wants a completely isolated, disconnected install, they would need to pre-fetch these container images, or bake them into their AMIs or what-have-you. Note that working when disconnected from the Internet is one of the key benefits of the approach advocated for in #30361.
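For the disconnected case, pre-fetching could look like the following sketch: pull the component images on a connected machine, then load them on the target. The image names and tags here are illustrative guesses, not the list kubeadm would actually use:

```
# On a machine with Internet access (image names/tags are illustrative):
docker pull gcr.io/google_containers/etcd:2.2.5
docker pull gcr.io/google_containers/kube-apiserver:v1.4.0
docker save gcr.io/google_containers/etcd:2.2.5 \
            gcr.io/google_containers/kube-apiserver:v1.4.0 > k8s-images.tar

# On the disconnected server, before running kubeadm init:
docker load < k8s-images.tar
```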

> **Member:** @AlainRoy woo, small world! Great to see you in kube-land.

```
Initializing kubernetes master... [done]
Cluster token: 73R2SIPM739TNZOA
Run the following command on machines you want to become nodes:
kubeadm join node --token=73R2SIPM739TNZOA <master-ip>
You can now run kubectl here.
```

*On N "node" machines:*

```
node# kubeadm join node --token=73R2SIPM739TNZOA <master-ip>
```

> **@aaronlevy (Aug 10, 2016):** What is actually occurring here? E.g., is this responsible for installing the kubelet/docker/etc., or is that left to the deployer?
>
> If we already have a kubelet available, why have a separate kubeadm tool vs. just `kubelet --token=xyz --master=foo`?

> **@lukemarsden (Aug 10, 2016):** #30361 tries to articulate an implementation strategy for "what is actually occurring here", which is that kubeadm just writes (basically) its CLI arguments to a file that kubelet is watching for.
>
> We should have a separate tool because they have different lifecycles: the kubelet needs to be started on boot (presumably it would be configured and started in e.g. systemd by the OS package that installed it, in the usual case), and it should run forever, whereas kubeadm needs to be runnable by a user and exit immediately. Also, `kubelet --help` is totally unusable for the UX we're aiming for. This proposal makes kubelet into plumbing underneath the nice kubeadm interface that's usable by mortals ;)
>
> If you really want them to be the same binary then we could do some argv[0] trickery and hardlink or symlink the binaries, but what would the motivation be for that?
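As a sketch of that handoff, the drop-file path and format below are hypothetical, not part of this proposal:

```
# What "kubeadm join" might reduce to internally (path/format hypothetical):
mkdir -p /etc/kubernetes
cat <<EOF >/etc/kubernetes/join-params
TOKEN=73R2SIPM739TNZOA
MASTER=<master-ip>
EOF
# The long-running kubelet, started at boot by the init system, watches for
# this file and uses it to begin bootstrapping against the master.
```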

> **Member:** The fewer distinct root CLIs we have, the better. The fewer binaries people need to know about, the better. I can live with kubeadm, but it feels a little like Debian: apt, apt-get, apt-cache, dpkg, etc.; I never know which one to run.
>
> Additionally, we have work underway to make `kube` a wrapper binary; does that eventually get subsumed, too?

> **Contributor Author:** I don't think of kubelet as a CLI; rather, it's a component of Kubernetes that gets installed and started when you install the Kubernetes OS package. That makes kubeadm and kubectl the only two pieces of UI surface area: kubeadm is for creating clusters and managing the servers in a cluster; kubectl is for deploying add-ons and workloads to running clusters. That'd be how I'd think about it.
>
> Regarding the hyperkube work, the conclusion offered in the last SIG-cluster-lifecycle meeting was that it would be 6-9 months before we can realistically package everything in the same binary.

```
Initializing kubernetes node... [done]
Bootstrapping certificates... [done]
Joined node to cluster, see 'kubectl get nodes' on master.
```

Note `[done]` would be colored green in all of the above.

### Install: alternative for automated deploy

*The user (or their config management system) creates a token and passes the same one to both init and join.*

```
master# kubeadm init master --token=73R2SIPM739TNZOA
Initializing kubernetes master... [done]
You can now run kubectl here.
```
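One way a config management system might generate that shared token up front; a sketch, with the token format being a guess rather than a spec:

```
# Generate one token and reuse it on every machine (format illustrative):
TOKEN=$(openssl rand -hex 8)

# Then, from the automation tool:
#   master: kubeadm init master --token=$TOKEN
#   nodes:  kubeadm join node --token=$TOKEN <master-ip>
```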

### Pre-flight check

```
master# kubeadm init master
Error: socat not installed. Unable to proceed.
```

> **Contributor:** Huh? What needs socat, and why?

> **Contributor Author:** It was just an example: see #26093 for some dependencies that folks have discovered. A better example might be a kernel which doesn't support the required cgroup settings.

> **@philips (Aug 10, 2016):** Why does this tool care about whether the kubelet has what it needs? In the case of CoreOS, we run kubelet in a container, so this check wouldn't work.

> **Contributor Author:** It's trying to help the user with early detection of issues on platforms where the kubelet runs on the host. I agree this may not work so well if the user is deploying containerized kubelets, so we'd at least need a way to turn it off.


### Control

*On master, after Install, kubectl is automatically able to talk to localhost:8080:*

```
master# kubectl get pods
[normal kubectl output]
```

*To mint new user credentials on the master:*

```
master# kubeadm user create -o kubeconfig-bob bob

Waiting for cluster to become ready... [done]
Creating user certificate for user... [done]
Waiting for user certificate to be signed... [done]
Your cluster configuration file has been saved in kubeconfig-bob.

laptop# scp <master-ip>:/root/kubeconfig-bob ~/.kubeconfig
laptop# kubectl get pods
[normal kubectl output]
```

### Install-addons

*Using CNI network as example:*

```
master# kubectl apply --purge -f \
https://git.io/kubernetes-addons/<X>.yaml
[normal kubectl apply output]
```

### Add-node

*Same as Install – "on node machines".*

> **Contributor:** In general, I like this idea, but it seems to gloss over how networking will be handled/configured. Would you be installing an overlay automatically, or will there be some other solution?

> **Contributor Author:** I think the idea is that pod networks should be installed as add-ons, that is, with `kubectl apply -f`. See the "Install-addons" section above. Does that answer your question?

> **Contributor:** Yes, the idea is that networking is configured as an add-on. We should make that clear in the proposal.
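Concretely, a pod network would then be installed with the same add-on flow shown above; for example (the URL follows the illustrative `git.io/kubernetes-addons` pattern from the Install-addons section and is not a real endpoint):

```
master# kubectl apply --purge -f \
    https://git.io/kubernetes-addons/pod-network.yaml
```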

### Secure

```
node# kubeadm join --token=GARBAGE node <master-ip>
Unable to join mesh network. Check your token.
```

> **Member:** Please don't say "mesh network". It's really not. I know it sounds awesome and Docker uses it, but theirs is not a mesh either.

> **@lukemarsden (Aug 11, 2016):** OK, I'll change this to "Unable to authenticate to <master-ip>. Check your token."

> **Contributor:** I think "mesh network" was presupposing that the implementation would use a gossip protocol (which does have a mesh). But saying that in the top-level doc is also incorrect.

> **Contributor Author:** This should be fixed now.


## Work streams – critical path – must have in 1.4 before feature freeze

1. [TLS bootstrapping](https://github.com/kubernetes/features/issues/43) - so that kubeadm can mint credentials for kubelets and users

* Requires [#25764](https://github.com/kubernetes/kubernetes/pull/25764) and auto-signing [#30153](https://github.com/kubernetes/kubernetes/pull/30153) but does not require [#30094](https://github.com/kubernetes/kubernetes/pull/30094).
* @philips, @gtank & @yifan-gu

1. Fix for [#30515](https://github.com/kubernetes/kubernetes/issues/30515) - so that kubeadm can install a kubeconfig which kubelet then picks up

* @smarterclayton

## Work streams – can land after 1.4 feature freeze

1. [Debs](https://github.com/kubernetes/release/pull/35) and [RPMs](https://github.com/kubernetes/release/pull/50) (and binaries?) - so that kubernetes can be installed in the first place

* @mikedanese & @dgoodwin

1. [kubeadm implementation](https://github.com/lukemarsden/kubernetes/tree/kubeadm-scaffolding) - the kubeadm CLI itself, will get bundled into "alpha" kubeadm packages

* @lukemarsden & @errordeveloper

1. [Implementation of JWS server](https://github.com/jbeda/kubernetes/blob/discovery-api/docs/proposals/super-simple-discovery-api.md#method-jws-token) from [#30707](https://github.com/kubernetes/kubernetes/pull/30707) - so that we can implement the simple UX with no dependencies

* @jbeda & @philips?

1. Documentation - so that new users can see this in 1.4 (even if it’s caveated with alpha/experimental labels and flags all over it)

* @lukemarsden

1. `kubeadm` alpha packages

* @lukemarsden, @mikedanese, @dgoodwin

### Nice to have

1. [Kubectl apply --purge](https://github.com/kubernetes/kubernetes/pull/29551) - so that addons can be maintained using k8s infrastructure

* @lukemarsden & @errordeveloper

## kubeadm implementation plan

Based on [@philips' comment here](https://github.com/kubernetes/kubernetes/pull/30361#issuecomment-239588596).
The key point with this implementation plan is that it requires basically no changes to kubelet except [#30515](https://github.com/kubernetes/kubernetes/issues/30515).
It also doesn't require kubelet to do TLS bootstrapping - kubeadm handles that.

### kubeadm init master

1. User installs and configures kubelet to look for manifests in `/etc/kubernetes/manifests`
1. API server CA certs are generated by kubeadm
1. kubeadm generates pod manifests to launch API server and etcd
1. kubeadm pushes a replica set for the prototype jws-server, plus the JWS token, into the API server, using host networking so that it listens on the master node's IP
1. kubeadm prints out the IP of the JWS server and the JWS token (a rough shell sketch of these steps follows below)
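A rough shell rendering of the steps above, with every path, flag, and file name an assumption for illustration only:

```
# 1. kubelet (already installed) watches /etc/kubernetes/manifests.
# 2. Generate the API server CA (paths assumed):
mkdir -p /etc/kubernetes/pki
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -subj "/CN=kubernetes-ca" \
    -keyout /etc/kubernetes/pki/ca.key -out /etc/kubernetes/pki/ca.crt

# 3. Drop static pod manifests; kubelet launches etcd and the API server:
cat <<'EOF' >/etc/kubernetes/manifests/kube-apiserver.yaml
# (static pod spec for the API server; contents elided in this sketch)
EOF

# 4. Submit the jws-server replica set (host networking) via the API server.
# 5. Print the master IP and the JWS token for nodes to join with.
```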

### kubeadm join node --token IP

1. User installs and configures kubelet to look for a kubeconfig at `/var/lib/kubelet/kubeconfig`; since that file doesn't exist yet, the kubelet is in a crash loop and is restarted by the host init system
1. kubeadm talks to the jws-server at IP using the token and gets the CA cert, then talks to the API server's TLS bootstrap API to get a client cert, etc., and generates a kubelet kubeconfig
1. kubeadm places kubeconfig into `/var/lib/kubelet/kubeconfig` and waits for kubelet to restart
1. Mission accomplished, we think (a sketch of the resulting kubeconfig follows below)
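The end state of the join flow is a kubeconfig the kubelet can pick up. A sketch of its shape, using the standard kubeconfig structure with every value a placeholder:

```
# After discovery and TLS bootstrap, kubeadm writes roughly this:
cat <<EOF >/var/lib/kubelet/kubeconfig
apiVersion: v1
kind: Config
clusters:
- name: default
  cluster:
    server: https://<master-ip>:443
    certificate-authority-data: <base64 CA cert, fetched from jws-server>
contexts:
- name: default
  context: {cluster: default, user: kubelet}
current-context: default
users:
- name: kubelet
  user:
    client-certificate-data: <base64 client cert from TLS bootstrap>
    client-key-data: <base64 client key>
EOF
# The crash-looping kubelet picks this up on its next restart and joins.
```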

## See also

* [Joe Beda's "K8s the hard way easier"](https://docs.google.com/document/d/1lJ26LmCP-I_zMuqs6uloTgAnHPcuT7kOYtQ7XSgYLMA/edit#heading=h.ilgrv18sg5t), which combines Kelsey's "Kubernetes the hard way" with a history of the proposed UX at the end (scroll all the way down to the bottom).
> **Contributor:** Add a link for Kelsey's doc? Also, it may be better to use GitHub handles rather than full names.