
Make it easy to run Kubernetes on top of the Kubelet (aka self-hosting) #246

Closed
smarterclayton opened this issue Jun 25, 2014 · 23 comments

Comments

@smarterclayton (Contributor) commented Jun 25, 2014

Pulling this out from #167

I'd love to start using the kubelet in a more "static" mode to handle the master containers. That would consist of getting the kubelet running on the master (without talking to etcd) and then having it run a pod for the master components by reading a manifest file on disk.

Something I'd be happy to work on
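A minimal sketch of that static mode might look like the following. The flag name and manifest schema are illustrative and have varied across kubelet releases (early kubelets read manifests via `--config`; later versions use `--pod-manifest-path`), and the image name is a hypothetical placeholder:

```shell
# Write a pod manifest for a master component to a directory the kubelet
# watches; no etcd or apiserver is involved in running this pod.
mkdir -p /tmp/manifests
cat > /tmp/manifests/kube-apiserver.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
spec:
  hostNetwork: true
  containers:
  - name: kube-apiserver
    image: registry.example.com/kube-apiserver:latest  # hypothetical image
    command:
    - kube-apiserver
    - --etcd-servers=http://127.0.0.1:4001
EOF
# On a real master, point the kubelet at the directory (flag name varies by
# release):
#   kubelet --pod-manifest-path=/tmp/manifests
```

The kubelet then owns the lifecycle of the master components the same way it owns any other pod, with no dependency on the cluster being up.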

@jjhuff (Contributor) commented Jun 25, 2014

The kubelet already supports reading config from a file, does that work?

@lavalamp (Member) commented Jun 25, 2014

Yeah, I think the work to do here is:

  1. Finish up jbeda's work to dockerize all our components.
  2. Adjust our startup scripts to use kubelet to launch apiserver/replication-controller via the config file.
@smarterclayton (Contributor, author) commented Jun 27, 2014

As a further item (possibly as a separate issue), being able to change code in your devenv and then either run a one-line command (hack/update-local) or have the image/source automatically reloaded would be valuable for reducing code-test loop time. Probably the former, though.

@bgrant0607 (Member) commented Sep 30, 2014

/cc @proppy, since we were discussing this recently.

The Dockerization issue is #19.

@bgrant0607 (Member) commented Nov 26, 2014

Self-hosting proposal:

Kubernetes components are just applications, too. They have much the same needs for packaging/image-building, configuration, deployment, auth[nz], naming/discovery, process management, and so on. Leveraging the systems/APIs we build for our users avoids needing to build and maintain 2 systems to do the same thing, and helps us to eat our own dogfood.

The recipe for building a bootstrappable system is fairly straightforward:

  • Minimize the number of dependencies, particularly those required for steady-state operation,
  • Stratify the dependencies that remain via principled layering, and
  • Break any circular dependencies by converting hard dependencies to soft dependencies.
    • Accept data from other components via another source, such as local files, which can be manually populated at bootstrap time and then continuously updated once those components are available.
    • Make all state rediscoverable and/or reconstructable.
    • Make it easy to run temporary, bootstrap instances of all components in order to create the runtime state needed to run the components in the steady state; use a lock (master election for distributed components, file lock for local components like Kubelet) to coordinate handoff. We call this technique "pivoting".
    • Have a solution to restart dead components. For distributed components, replication works well. For local components such as Kubelet, a process manager or even a simple shell loop works.
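The "simple shell loop" for restarting local components can be sketched as below. The sentinel file is an illustrative addition to make the loop stoppable; the minimal form is just `while true; do kubelet ...; sleep 5; done`:

```shell
# Restart a local component (e.g. the kubelet) whenever it exits.
# Loop until a sentinel file appears, restarting the command on each exit.
run_supervised() {
  until [ -e /tmp/stop-supervisor ]; do
    "$@" || echo "$1 exited with status $?; restarting" >&2
    sleep 1
  done
}
# Example (commented out -- this would loop indefinitely on a real node):
#   run_supervised kubelet --pod-manifest-path=/etc/kubernetes/manifests
```

A process manager (systemd, monit, Docker restart policies) does the same job more robustly; the point is only that the local-component case needs nothing cleverer than this.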

A more concrete sketch:

Master components:

  1. Startup instances of the components all on the same host
  2. Create the services and replication controllers needed to run the components "for real" via kubectl
  3. Either kill the bootstrap instances or enable them to be adopted by the replication controllers submitted through the API. Replication controller deliberately does not care how the pods it controls were created.
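Step 2 above might be sketched as follows. The schema and names are illustrative (the API at the time differed from today's), and the image is a hypothetical placeholder:

```shell
# Define a replication controller that runs the apiserver "for real", then
# submit it through the bootstrap apiserver with kubectl.
cat > /tmp/apiserver-rc.yaml <<'EOF'
apiVersion: v1
kind: ReplicationController
metadata:
  name: kube-apiserver
spec:
  replicas: 1
  selector:
    component: kube-apiserver
  template:
    metadata:
      labels:
        component: kube-apiserver
    spec:
      hostNetwork: true
      containers:
      - name: kube-apiserver
        image: registry.example.com/kube-apiserver:latest  # hypothetical
EOF
# Against a live bootstrap apiserver:
#   kubectl create -f /tmp/apiserver-rc.yaml
```

Because the replication controller selects on labels rather than caring how its pods were created, step 3's "adoption" option falls out naturally: label the bootstrap instance to match the selector and the controller counts it toward `replicas`.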

The only tricky part is transferring the etcd state. We don't have great solutions for stateful services yet, in general (#260, #1515), but if just running on the same host or same PD, etcd would just pick it up from the volume. The data could also be replicated to other instances.

A step towards this could be to run the components on Kubelet using a local pod file.

Kubelet:

  1. Kubelet should support caching the pods read from the registry in a hostdir volume on local disk (#489). This cache would be updated when updates were received from the apiserver or etcd.
  2. Distribute an old, fallback/bootstrap Kubelet to the nodes any way we like: baked into the host/VM image, Salt, Bosh, rsync, whatever.
  3. Use the runonce feature of Kubelet to use the first-step bootstrap Kubelet to pull and start a containerized, kube-ified Kubelet. We may want to extend runonce to read from apiserver as well as from local sources.
  4. Leverage Docker restarts to restart Kubelet when it dies (#816).
  5. Use the per-node controller (#1518, #2491) to deploy new Kubelet images and/or configuration. The new Kubelet must be started before the old one terminates.
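Steps 2-4 above can be sketched as follows. All names are illustrative: the "kube-ified" kubelet image is hypothetical, and the exact runonce flags varied by release:

```shell
# The fallback/bootstrap kubelet reads a local manifest describing a
# containerized kubelet, starts it once, and exits; Docker's restart
# handling (#816) then keeps the containerized kubelet alive.
mkdir -p /tmp/bootstrap
cat > /tmp/bootstrap/kubelet.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: kubelet
spec:
  hostNetwork: true
  containers:
  - name: kubelet
    image: registry.example.com/kubelet:latest  # hypothetical kube-ified image
    securityContext:
      privileged: true  # the kubelet needs host access to manage containers
EOF
# On a real node, run the bootstrap kubelet once against the local source:
#   kubelet --runonce --config=/tmp/bootstrap
```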
@proppy (Contributor) commented Nov 26, 2014

A step towards this could be to run the components on Kubelet using a local pod file.

There was an example for that for 0.4 here: https://github.com/GoogleCloudPlatform/kubernetes/blob/master/build/run-images/bootstrap/run.sh#L22

Use the runonce feature of Kubelet to use the first-step bootstrap Kubelet to pull and start a containerized, kube-ified Kubelet. We may want to extend runonce to read from apiserver as well as from local sources.

Last time I tried, you couldn't run a kubelet in a pod with kubelet --runonce, because the child kubelet would kill its parent (due to the lack of namespacing, it thought the parent was a leftover k8s container it didn't know about).

@saad-ali (Member) commented Jan 17, 2015

CC: @saad-ali

@vipulnayyar commented Mar 6, 2015

Hello everyone,

I would like to work on this project during GSoC '15. I've gone through setting up the base dev environment and starting a local cluster. I'm going through the kubelet and apiserver code right now.

Which of the points suggested by @bgrant0607 and others do you think are critical and should be pursued for this project? The bootstrapping part definitely seems interesting, but there are a lot of kubelet-related suggestions spread across the issue queue. I'm assuming the bootstrapping process has a lot to do with other components apart from the kubelet too.

And how should I proceed in order to contribute to the component dockerization work done until now by @jbeda? I mean, what parts have been covered and where should I focus now so as to further his work through this project?

@bgrant0607 (Member) commented Mar 7, 2015

@vipulnayyar Thanks for your interest! Early next week I plan to flesh out more of the details of projects that GSoC candidates express interest in. I'll update this issue.

My current thinking is that this project would focus on running the master components (apiserver, controller-manager, etc.) in pods. A necessary first step is to properly containerize them. We have made previous attempts at that, but didn't push them through to completion, for various reasons.

@vipulnayyar commented Mar 7, 2015

@bgrant0607 Related to this topic, based on my past experience as a GSoC student, you also need to figure out the application template that students need to follow while submitting their GSoC application for Kubernetes.

@bgrant0607 (Member) commented Mar 10, 2015

I created a wiki re. participation: https://github.com/GoogleCloudPlatform/kubernetes/wiki/Google-Summer-of-Code-(GSoC)

GSoC project ideas are labeled kind/gsoc:
https://github.com/GoogleCloudPlatform/kubernetes/labels/kind%2Fgsoc

"Starter project" ideas are labeled help-wanted (note that not all are necessarily small/easy):
https://github.com/GoogleCloudPlatform/kubernetes/labels/status%2Fhelp-wanted

Bootstrapping-related issues are labeled in an obvious way:
https://github.com/GoogleCloudPlatform/kubernetes/labels/area%2Fbootstrapping

Work has started to run etcd in a container/pod (#4442). The apiserver, controller-manager, and scheduler also need to be handled. Pushing pod files via Salt is a reasonable starting point, but I'd love to be able to post the pods to Kubelet and then have the apiserver automatically pick them up (this latter bit is ongoing -- #4090).

Being able to run master components with high availability (#473) is related -- I think self-hosting them is essentially a prerequisite.

Self-hosting Kubelet is almost an entirely separate project.

@bgrant0607 (Member) commented Mar 10, 2015

See also #5011.

Sorry, @vipulnayyar. Someone has started working on this one, so I'm going to remove it from the GSoC list.

@mikedanese mikedanese self-assigned this Mar 14, 2016

vishh added a commit to vishh/kubernetes that referenced this issue Apr 6, 2016

Merge pull request kubernetes#246 from satnam6502/master
Fix a minor typo in the README.md file.

keontang pushed a commit to keontang/kubernetes that referenced this issue May 14, 2016

Merge pull request kubernetes#246 from Pendoragon/master
Implement adding new nodes to cluster
@philips (Contributor) commented Jun 28, 2016

I feel like the primary task is largely done, and then some. Here is the current status of the self-hosted and bootkube thread: https://groups.google.com/d/msg/kubernetes-sig-cluster-lifecycle/p9QFxw-7NKE/jeYJF1hBAwAJ


@bgrant0607 bgrant0607 assigned philips and unassigned mikedanese Feb 10, 2017

@bgrant0607 bgrant0607 added the triaged label Mar 9, 2017

@kargakis (Member) commented Mar 20, 2017

cc: @kubernetes/sig-cluster-lifecycle-misc

@fejta-bot commented Dec 22, 2017

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@kargakis (Member) commented Dec 22, 2017

/assign luxas
/remove-lifecycle stale

@luxas (Member) commented Dec 25, 2017

@timothysc has been working on the bootstrap checkpointing feature for the node to make this work easily with upstream k8s. kubeadm also has self-hosting support (alpha in v1.9, expected to be beta in v1.10).
/assign @timothysc

@timothysc (Member) commented Jan 18, 2018

Given the plethora of incantations that exist today, I'm going to close this root issue because it no longer tracks the details. We are working in sig-cluster-lifecycle to refine this on our road to GA.

@timothysc timothysc closed this Jan 18, 2018
