*Kubernetes pull request #30360*
# Proposal: Dramatically Simplify Kubernetes Cluster Creation

Luke Marsden & many others in [SIG-cluster-lifecycle](https://github.com/kubernetes/community/tree/master/sig-cluster-lifecycle).

17th August 2016

*This proposal aims to capture the latest consensus and plan of action of SIG-cluster-lifecycle. It should satisfy the first bullet point [required by the feature description](https://github.com/kubernetes/features/issues/11).*

See also: [this presentation to the community hangout on 4th August 2016](https://docs.google.com/presentation/d/17xrFxrTwqrK-MJk0f2XCjfUPagljG7togXHcC39p0sM/edit?ts=57a33e24#slide=id.g158d2ee41a_0_76)

## Motivation

Kubernetes is hard to install, and there are many different ways to do it today. None of them are excellent. We believe this is hindering adoption.
## Goals

Have one recommended, official, tested, "happy path" which will enable a majority of new and existing Kubernetes users to:

* Kick the tires and easily turn up a new cluster on infrastructure of their choice

* Get a reasonably secure, production-ready cluster, with reasonable defaults and a range of easily-installable add-ons

We plan to do so by improving and simplifying Kubernetes itself, rather than building lots of tooling which "wraps" Kubernetes by poking all the bits into the right place.
## Scope of project

There are logically three steps to deploying a Kubernetes cluster:

1. *Provisioning*: Getting some servers - these may be VMs on a developer's workstation, VMs in public clouds, or bare-metal servers in a user's data center.

2. *Install & Discovery*: Installing the Kubernetes core components (kubelet, etc.) on those servers, and bootstrapping the cluster to a state of basic liveness, including allowing each server in the cluster to discover the others: for example, teaching etcd servers about their peers, having TLS certificates provisioned, and so on.

3. *Add-ons*: Now that basic cluster functionality is working, installing add-ons such as DNS or a pod network (this should be possible using `kubectl apply`).

Notably, this project is *only* working on dramatically improving steps 2 and 3, from the perspective of users typing commands directly into root shells of servers. The reason is that there are a great many different ways of provisioning servers, and users will already have their own preferences.

What's more, once we've radically improved the user experience of steps 2 and 3, the job of tools that want to do all three becomes much easier.
## User stories

### Phase I

**_In time to be an alpha feature in Kubernetes 1.4._**

Note: the current plan is to deliver `kubeadm`, which implements these stories, as "alpha" packages built from master (after the 1.4 feature freeze), but capable of installing a Kubernetes 1.4 cluster.

* *Install*: As a potential Kubernetes user, I can deploy a Kubernetes 1.4 cluster on a handful of computers running Linux and Docker by typing two commands on each of those computers. The process is so simple that it is obvious how to automate it if I wish.

* *Pre-flight check*: If any of the computers don't have working dependencies installed (e.g. a bad version of Docker, or a too-old Linux kernel), I am informed early on and given clear instructions on how to fix it, so that I can keep trying until it works.

* *Control*: Having provisioned a cluster, I can obtain user credentials which allow me to remotely control it using kubectl.

* *Install-addons*: I can select from a set of recommended add-ons to install with `kubectl apply` directly after installing Kubernetes on my set of initial computers.

* *Add-node*: I can add another computer to the cluster.

* *Secure*: As an attacker with (presumed) control of the network, I cannot add malicious nodes I control to the cluster created by the user. I also cannot remotely control the cluster.
### Phase II

**_In time for Kubernetes 1.5:_**
*Everything from Phase I as a beta/stable feature, and everything below as a beta feature in Kubernetes 1.5.*

* *Upgrade*: Later, when Kubernetes 1.4.1 or any newer release is published, I can upgrade to it by typing one more command on each computer.

* *HA*: If one of the computers in the cluster fails, the cluster carries on working. I can find out how to replace the failed computer, including if the computer was one of the masters.
## Top-down view: UX for Phase I items

We will introduce a new binary, kubeadm, which ships with the Kubernetes OS packages (and binary tarballs, for OSes without package managers).
```
laptop$ kubeadm --help
kubeadm: bootstrap a secure kubernetes cluster easily.

/==========================================================\
| KUBEADM IS ALPHA, DO NOT USE IT FOR PRODUCTION CLUSTERS! |
|                                                          |
| But, please try it out! Give us feedback at:             |
| https://github.com/kubernetes/kubernetes/issues          |
| and at-mention @kubernetes/sig-cluster-lifecycle         |
\==========================================================/

Example usage:

    Create a two-machine cluster with one master (which controls the cluster),
    and one node (where workloads, like pods and containers run).

    On the first machine
    ====================
    master# kubeadm init master
    Your token is: <token>

    On the second machine
    =====================
    node# kubeadm join node --token=<token> <ip-of-master>

Usage:
  kubeadm [command]

Available Commands:
  init        Run this on the first server you deploy onto.
  join        Run this on other servers to join an existing cluster.
  user        Get initial admin credentials for a cluster.
  manual      Advanced, less-automated functionality, for power users.

Use "kubeadm [command] --help" for more information about a command.
```
### Install

*On first machine:*

```
master# kubeadm init master
Initializing kubernetes master... [done]
Cluster token: 73R2SIPM739TNZOA
Run the following command on machines you want to become nodes:
  kubeadm join node --token=73R2SIPM739TNZOA <master-ip>
You can now run kubectl here.
```
*On N "node" machines:*

```
node# kubeadm join node --token=73R2SIPM739TNZOA <master-ip>
Initializing kubernetes node... [done]
Bootstrapping certificates... [done]
Joined node to cluster, see 'kubectl get nodes' on master.
```

Note: `[done]` would be colored green in all of the above.
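The green coloring mentioned above could be produced with standard ANSI escape sequences; this is an illustrative sketch, not kubeadm's actual output code:

```shell
# Print "[done]" in green using ANSI SGR escape codes, falling back to
# plain text when stdout is not a terminal (illustrative sketch only).
if [ -t 1 ]; then
  printf '\033[32m[done]\033[0m\n'
else
  printf '[done]\n'
fi
```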
|
||
### Install: alternative for automated deploy | ||
|
||
*The user (or their config management system) creates a token and passes the same one to both init and join.* | ||
|
||
``` | ||
master# kubeadm init master --token=73R2SIPM739TNZOA | ||
Initializing kubernetes master... [done] | ||
You can now run kubectl here. | ||
``` | ||
|
||
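A config management system might generate such a token like this. The 16-character uppercase shape simply matches the example token above; the real kubeadm token format is an assumption here and may differ:

```shell
# Generate a 16-character uppercase token (hypothetical format, matching
# the shape of the example token above) for use with both init and join.
TOKEN=$(head -c 20 /dev/urandom | base32 | tr -d '=' | cut -c1-16)
echo "Cluster token: $TOKEN"

# On the master:  kubeadm init master --token="$TOKEN"
# On each node:   kubeadm join node --token="$TOKEN" <master-ip>
```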
### Pre-flight check

```
master# kubeadm init master
Error: socat not installed. Unable to proceed.
```
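A pre-flight check of this kind could be sketched as follows. The dependency list is hypothetical (the sketch checks universally-present binaries so it runs anywhere); kubeadm's real dependency set is not fixed by this proposal:

```shell
# Illustrative pre-flight check, not kubeadm's actual implementation:
# verify that required binaries are on $PATH before proceeding.
failed=0
for bin in sh tar; do   # kubeadm would check e.g. docker, socat, ethtool
  if command -v "$bin" >/dev/null 2>&1; then
    echo "OK: $bin found"
  else
    echo "Error: $bin not installed. Unable to proceed."
    failed=1
  fi
done
exit $failed
```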
### Control

*On master, after Install, kubectl is automatically able to talk to localhost:8080:*

```
master# kubectl get pods
[normal kubectl output]
```

*To mint new user credentials on the master:*

```
master# kubeadm user create -o kubeconfig-bob bob

Waiting for cluster to become ready... [done]
Creating user certificate for user... [done]
Waiting for user certificate to be signed... [done]
Your cluster configuration file has been saved in kubeconfig-bob.

laptop# scp <master-ip>:/root/kubeconfig-bob ~/.kubeconfig
laptop# kubectl get pods
[normal kubectl output]
```
### Install-addons

*Using CNI network as example:*

```
master# kubectl apply --purge -f \
    https://git.io/kubernetes-addons/<X>.yaml
[normal kubectl apply output]
```
### Add-node

*Same as Install – "on node machines".*

Note that pod networking is not configured during join: it is installed afterwards as an add-on (see Install-addons above).
### Secure

```
node# kubeadm join --token=GARBAGE node <master-ip>
Unable to authenticate to cluster. Check your token.
```
## Work streams – critical path – must have in 1.4 before feature freeze

1. [TLS bootstrapping](https://github.com/kubernetes/features/issues/43) - so that kubeadm can mint credentials for kubelets and users

   * Requires [#25764](https://github.com/kubernetes/kubernetes/pull/25764) and auto-signing [#30153](https://github.com/kubernetes/kubernetes/pull/30153) but does not require [#30094](https://github.com/kubernetes/kubernetes/pull/30094).
   * @philips, @gtank & @yifan-gu

1. Fix for [#30515](https://github.com/kubernetes/kubernetes/issues/30515) - so that kubeadm can install a kubeconfig which kubelet then picks up

   * @smarterclayton

## Work streams – can land after 1.4 feature freeze

1. [Debs](https://github.com/kubernetes/release/pull/35) and [RPMs](https://github.com/kubernetes/release/pull/50) (and binaries?) - so that kubernetes can be installed in the first place

   * @mikedanese & @dgoodwin

1. [kubeadm implementation](https://github.com/lukemarsden/kubernetes/tree/kubeadm-scaffolding) - the kubeadm CLI itself; will get bundled into "alpha" kubeadm packages

   * @lukemarsden & @errordeveloper

1. [Implementation of JWS server](https://github.com/jbeda/kubernetes/blob/discovery-api/docs/proposals/super-simple-discovery-api.md#method-jws-token) from [#30707](https://github.com/kubernetes/kubernetes/pull/30707) - so that we can implement the simple UX with no dependencies

   * @jbeda & @philips?

1. Documentation - so that new users can see this in 1.4 (even if it's caveated with alpha/experimental labels and flags all over it)

   * @lukemarsden

1. `kubeadm` alpha packages

   * @lukemarsden, @mikedanese, @dgoodwin

### Nice to have

1. [Kubectl apply --purge](https://github.com/kubernetes/kubernetes/pull/29551) - so that add-ons can be maintained using k8s infrastructure

   * @lukemarsden & @errordeveloper
## kubeadm implementation plan

Based on [@philips' comment here](https://github.com/kubernetes/kubernetes/pull/30361#issuecomment-239588596).
The key point of this implementation plan is that it requires essentially no changes to the kubelet except [#30515](https://github.com/kubernetes/kubernetes/issues/30515).
It also doesn't require the kubelet to do TLS bootstrapping - kubeadm handles that.
### kubeadm init master

1. User installs and configures the kubelet to look for pod manifests in `/etc/kubernetes/manifests`
1. kubeadm generates the API server CA certificates
1. kubeadm generates pod manifests to launch the API server and etcd
1. kubeadm pushes a replica set for the prototype jws-server (and the JWS itself) into the API server, with host networking so it listens on the master node IP
1. kubeadm prints out the IP of the JWS server and the JWS token
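Steps 1-3 above can be sketched as follows. The manifest is a simplified, hypothetical etcd example (image, version, and fields are illustrative); the manifests kubeadm actually generates will differ:

```shell
# Write a (simplified, hypothetical) static pod manifest into the directory
# the kubelet watches; the kubelet then launches the pod by itself.
MANIFEST_DIR="${MANIFEST_DIR:-/etc/kubernetes/manifests}"
mkdir -p "$MANIFEST_DIR"
cat > "$MANIFEST_DIR/etcd.yaml" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: etcd
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: etcd
    image: quay.io/coreos/etcd:v3.0.4    # hypothetical image/version
    command: ["etcd", "--data-dir=/var/lib/etcd"]
EOF
```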
### kubeadm join node --token IP

1. User installs and configures the kubelet to expect a kubeconfig at `/var/lib/kubelet/kubeconfig`; until that file exists, the kubelet stays in a crash loop and is restarted by the host init system
1. kubeadm talks to the jws-server at IP with the token and gets the CA cert, then talks to the apiserver TLS bootstrap API to get a client cert, and generates a kubelet kubeconfig
1. kubeadm places the kubeconfig into `/var/lib/kubelet/kubeconfig` and waits for the kubelet to restart
1. Mission accomplished, we think.
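Steps 2-3 could look roughly like this sketch. All paths, filenames, and field values are illustrative assumptions, and the actual certificate retrieval via the JWS server and TLS bootstrap API is elided:

```shell
# Write a kubeconfig to the path the kubelet was configured with; the
# crash-looping kubelet picks it up on its next restart. (Sketch only:
# obtaining ca.crt/kubelet.crt/kubelet.key is elided.)
KUBECONFIG_PATH="${KUBECONFIG_PATH:-/var/lib/kubelet/kubeconfig}"
mkdir -p "$(dirname "$KUBECONFIG_PATH")"
cat > "$KUBECONFIG_PATH" <<'EOF'
apiVersion: v1
kind: Config
clusters:
- name: local
  cluster:
    server: https://<master-ip>:443
    certificate-authority: /var/lib/kubelet/ca.crt
users:
- name: kubelet
  user:
    client-certificate: /var/lib/kubelet/kubelet.crt
    client-key: /var/lib/kubelet/kubelet.key
contexts:
- name: kubelet@local
  context:
    cluster: local
    user: kubelet
current-context: kubelet@local
EOF
```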
## See also

* [Joe Beda's "K8s the hard way easier"](https://docs.google.com/document/d/1lJ26LmCP-I_zMuqs6uloTgAnHPcuT7kOYtQ7XSgYLMA/edit#heading=h.ilgrv18sg5t), which combines Kelsey's "Kubernetes the hard way" with a history of the proposed UX at the end (scroll all the way down to the bottom).