Proposal: Dramatically Simplify Kubernetes Cluster Creation #30360

Contributor

@lukemarsden lukemarsden commented Aug 10, 2016

@k8s-github-robot k8s-github-robot added kind/design Categorizes issue or PR as related to design. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-label-needed labels Aug 10, 2016
@bgrant0607 bgrant0607 added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Aug 10, 2016

```
master# kubeadm init master
Error: socat not installed. Unable to proceed.
```
Contributor

Huh? What needs socat and why?

Contributor Author

It was just an example: see #26093 for some dependencies that folks have discovered. A better example might be a kernel which doesn't support the required cgroups settings.

Contributor

@philips philips Aug 10, 2016

Why does this tool care about whether the kubelet has what it needs? In the case of CoreOS we run kubelet in a container and so this check wouldn't work.

Contributor Author

It's trying to help the user with early detection of issues on platforms where the kubelet runs on the host. I agree this may not work so well if the user is deploying containerized kubelets, so we'd at least need a way to turn it off.

### Add-node

*Same as Install – "on node machines".*

Contributor

In general, I like this idea. But, it seems to gloss over how networking will be handled/configured. Would you be installing an overlay automatically or will there be some other solution?

Contributor Author

I think the idea is that pod networks should be installed as add-ons, that is, with kubectl apply -f. See above section "Install-addons". Does that answer your question?

Contributor

Yes, the idea is that networking is configured as an add-on. We should make that clear in the proposal.

@smarterclayton
Contributor

I don't recall significant support for the gossip path. Did I misinterpret the call?

@mikedanese
Member

We need a way to provide the "discovery" payload (API URL and ca.crt) to initiate this process. The most uncontroversial way to do this is the way we do it now: manually. We can improve on that mechanism incrementally, but IMO this proposal is orthogonal to, and higher priority than, implementing more advanced discovery processes.

Even if I have to bother with getting a list of apiservers and a ca.crt onto a node, this is a HUGE improvement over what we have today.
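
As a rough illustration of what a manually supplied discovery payload could look like, here is a Go sketch that validates the two pieces named above. The `DiscoveryPayload` type, its field names, and the `Validate` method are assumptions made for this example; nothing here is taken from the proposal or from kubeadm.

```go
package main

import (
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
	"net/url"
)

// DiscoveryPayload is a hypothetical container for the API server URL and
// the cluster CA certificate (the contents of ca.crt).
type DiscoveryPayload struct {
	APIServerURL string
	CACertPEM    []byte
}

// Validate checks that the URL is an https URL with a host and that the CA
// data holds a PEM-encoded certificate that parses.
func (p DiscoveryPayload) Validate() error {
	u, err := url.Parse(p.APIServerURL)
	if err != nil {
		return fmt.Errorf("invalid API server URL: %w", err)
	}
	if u.Scheme != "https" || u.Host == "" {
		return errors.New("API server URL must be an https URL with a host")
	}
	block, _ := pem.Decode(p.CACertPEM)
	if block == nil || block.Type != "CERTIFICATE" {
		return errors.New("CA data does not contain a PEM certificate")
	}
	if _, err := x509.ParseCertificate(block.Bytes); err != nil {
		return fmt.Errorf("invalid CA certificate: %w", err)
	}
	return nil
}

func main() {
	// Placeholder address and empty CA data, so validation reports an error.
	p := DiscoveryPayload{APIServerURL: "https://10.0.0.1:6443"}
	fmt.Println(p.Validate())
}
```

Validating the payload on the joining machine before contacting the API server gives an early, readable error when the file was copied incorrectly.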


## Top-down view: UX for Phase I items

We will introduce a new binary, kubeadm, which ships with the Kubernetes OS packages (and binary tarballs, for OSes without package managers).
Member

I think we can put this as a subcommand of kubelet. What do people think of that?

$ kubelet cluster join node
$ kubelet cluster init master
$ # or
$ kubelet join node
$ kubelet init master

Member

Why would we want to do that? To save the user from downloading another binary?

Member

I'm torn. The engineer in me says these are different things, so different names make sense. But this is all about streamlining and reducing moving parts, so I think less is more here.

Member

maybe kubelet admin init master to namespace a little deeper without producing multiple CLIs?

Contributor Author

From a developer experience perspective, I would strongly encourage us not to make kubelet the top-level command that the user interacts with (even with subcommands). In anything like its current form, kubelet should be plumbing, not porcelain IMO. It's highly likely that developers/users will see the docs/blog post/whatever instructions that mention kubelet and then type kubelet --help. They need to get less than a page of clear instructions on how to get started. What they get currently is anything-but. That --help text is our canvas. Let's start from a clean one.

The reason I don't think we can "clean" the kubelet "canvas" is that it would break compatibility with lots of things that already assume things about how kubelet works and which arguments it takes. We don't want to break everyone's existing kubernetes-installation systems when we introduce this new happy path.

We previously agreed in the SIG that the happy path includes "download OS packages" for most distros. Is there an issue with the OS packages including two binaries that I don't see?

Contributor Author

Thanks for the input guys. Let's start with the simplest thing, a new kubeadm binary. I'm not averse to git-style subcommands, I just don't think we need them right now to solve the problem SIG-cluster-lifecycle is trying to solve.

Contributor

We can always deprecate kubeadm if we decide otherwise. I'm partial to
kubelet init but I understand the concerns regarding it. A new binary name
is a good place to experiment.

On Fri, Aug 12, 2016 at 4:14 AM, lukemarsden notifications@github.com
wrote:

In docs/proposals/dramatically-simplify-cluster-creation.md
#30360 (comment)
:

+* Add-node: I can add another computer to the cluster.
+
+* Secure: As an attacker with (presumed) control of the network, I cannot add malicious nodes I control to the cluster created by the user. I also cannot remotely control the cluster.
+
+### Phase II
+
+In time for Kubernetes 1.5:
+Everything from Phase I as beta/stable feature, everything else below as beta feature in Kubernetes 1.5.
+
+* Upgrade: Later, when Kubernetes 1.4.1 or any newer release is published, I can upgrade to it by typing one other command on each computer.
+
+* HA: If one of the computers in the cluster fails, the cluster carries on working. I can find out how to replace the failed computer, including if the computer was one of the masters.
+
+## Top-down view: UX for Phase I items
+
+We will introduce a new binary, kubeadm, which ships with the Kubernetes OS packages (and binary tarballs, for OSes without package managers).



Member

FYI docker daemon is no longer a thing, as of 1.12 there is a separate dockerd binary...

Contributor

Is there a reason we are suggesting kubeadm instead of the simpler kube?

Contributor Author

Yeah, @jbeda was concerned that kube is possibly being taken by the hyperkube binary. It probably isn't. But we didn't want to risk it.

@thockin
Member

thockin commented Aug 11, 2016

Overall this looks fantastic. Has @kelseyhightower been weighing in on this to keep us honest?

@matchstick

@thockin
Member

thockin commented Aug 11, 2016

Where does kube-proxy come in? As an addon? I'm most concerned about the inter-relationships of addons. You can't really use DNS without some form of service VIP. The network plugins in use have some impact on kube-proxy (or replacements thereof) and vice-versa

@lukemarsden
Contributor Author

@philips it works :)

@mikedanese
Member

This needs a squash, then I will lgtm

@timothysc
Member

/cc @detiber @dgoodwin

Example usage:

Create a two-machine cluster with one master (which controls the cluster),
and one node (where workloads, like pods and containers run).
Member

Hmm, if we are building a universal bootstrapper, can we start moving away from the term "master"? One concrete reason: for HA, don't we worry about what a "master" is, with the scheduler, apiserver, and so on possibly being segregated and moving around over time? We also know there may be many apiservers.

@k8s-bot

k8s-bot commented Aug 30, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message will repeat several times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

@guybrush

guybrush commented Aug 30, 2016

I think this proposal is missing the use case where a single machine fulfills both roles: master and node. One use case for this would be a developer running a single-node cluster locally. Another would be small clusters where you want to utilize the free resources on the master node.

Maybe this could be done with a command like kubeadm init master,node?

Another thing that would be really nice, is to be able to define (and update?) roles (master, node) at runtime.

Anyway this is really huge, thank you so much for your effort!
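
A comma-separated role argument like the `master,node` suggested above could be parsed roughly as follows. This is only a sketch in Go: `parseRoles` and `knownRoles` are hypothetical names, and the set of roles is limited to the two mentioned in this comment.

```go
package main

import (
	"fmt"
	"strings"
)

// knownRoles lists the two roles mentioned in the comment above.
var knownRoles = map[string]bool{"master": true, "node": true}

// parseRoles splits an argument such as "master,node" into a de-duplicated
// list of roles in first-seen order. Unknown or empty roles are rejected.
func parseRoles(arg string) ([]string, error) {
	var roles []string
	seen := map[string]bool{}
	for _, r := range strings.Split(arg, ",") {
		r = strings.TrimSpace(r)
		if !knownRoles[r] {
			return nil, fmt.Errorf("unknown role %q", r)
		}
		if !seen[r] {
			seen[r] = true
			roles = append(roles, r)
		}
	}
	return roles, nil
}

func main() {
	roles, err := parseRoles("master,node")
	fmt.Println(roles, err)
}
```

Keeping the role set in a map makes it straightforward to add finer-grained roles later without changing the parsing logic.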

@dgoodwin
Contributor

Would this just be as simple as running a "kubeadm join node" on the same system after initializing it as a master?

I like the concept of roles though, hope we can explore that once we get a little further along so we can do granular operations like adding a new API server or perhaps an etcd node.

@errordeveloper
Member

The way I see it is more like 'kubeadm init --also-node'; we have
previously discussed some alternatives also...


@lukemarsden lukemarsden mentioned this pull request Aug 30, 2016
11 tasks
@bogdando

bogdando commented Sep 5, 2016

Review status: 0 of 1 files reviewed at latest revision, 15 unresolved discussions, some commit checks failed.


docs/proposals/dramatically-simplify-cluster-creation.md, line 91 [r4] (raw file):

    On the first machine
    ====================
    master# kubeadm init master

+1 to not using "master", so that people do not immediately start thinking about A/P, active/standby, or master/slave. Could it just be:

node1# kubeadm init cluster (or seed or member or apiserver thing)
node2# kubeadm join node --token=


Comments from Reviewable

@errordeveloper
Member

@bogdando while working on the initial prototype (#31221), we ended up with kubeadm init and kubeadm join. We are not doing HA yet for the MVP, and I'm not sure how this will project onto HA control-plane bootstrap (maybe just join --master will end up being the right thing)... What do you think?

@bogdando

bogdando commented Sep 9, 2016

LGTM, thank you for update


* *Install*: As a potential Kubernetes user, I can deploy a Kubernetes 1.4 cluster on a handful of computers running Linux and Docker by typing two commands on each of those computers. The process is so simple that it becomes obvious to me how to easily automate it if I so wish.

* *Pre-flight check*: If any of the computers don't have working dependencies installed (e.g. bad version of Docker, too-old Linux kernel), I am informed early on and given clear instructions on how to fix it so that I can keep trying until it works.
Member

Probably should cross-link the @kubernetes/sig-node feature here.

@errordeveloper
Member

We have landed kubeadm and the UX has diverged from what's described here in good ways. Is it worth updating this proposal, or maybe we should work on a design document or something else? I think this proposal was a great vehicle for our initial discussions, but I am not sure it has any value now that we have user docs and the implementation itself.

@errordeveloper
Member

We still have #30707 to talk about, and I think that is of more value, and can be turned into a design document.

@lukemarsden
Contributor Author

Yeah, I'm going to close this as I think the value was extracted during discussion. I'll probably base a plan document for 1.5 on the "Phase II" sections here, amended, unless someone else gets to that first. LMK if anyone disagrees with closing this PR.

@mikedanese
Member

@lukemarsden Please let's get this merged. These docs are important for posterity. Just squash and run ./hack/update-mungedocs.sh and this should be good to merge.

@mikedanese mikedanese reopened this Sep 28, 2016
@k8s-ci-robot
Contributor

Jenkins verification failed for commit a692dea. Full PR test history.

The magic incantation to run this job again is @k8s-bot verify test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Contributor

Jenkins GKE smoke e2e failed for commit a692dea. Full PR test history.

The magic incantation to run this job again is @k8s-bot gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@mikedanese
Member

This merged in #33673

k8s-github-robot pushed a commit that referenced this pull request Sep 28, 2016
…ify-cluster-creation

Automatic merge from submit-queue

Proposal: Dramatically Simplify Kubernetes Cluster Creation

repost of #30360
closes #30360
@errordeveloper
Member

thanks, @mikedanese!

xingzhou pushed a commit to xingzhou/kubernetes that referenced this pull request Dec 15, 2016
…ally-simplify-cluster-creation

Automatic merge from submit-queue

Proposal: Dramatically Simplify Kubernetes Cluster Creation

repost of kubernetes#30360
closes kubernetes#30360