Proposal: Upstream Kubernetes-Mesos framework #6676

Closed
jdef opened this issue Apr 10, 2015 · 23 comments

@jdef
Contributor

jdef commented Apr 10, 2015

Overview

Goals

  1. It should be super-easy for any k8s user to deploy a k8s cluster on Mesos.
  2. Immediate (developer) awareness of divergence between existing k8s core and k8s-mesos integration.
  3. k8s-mesos integration no longer vendors k8s, but is a part of the k8s distribution.
  4. Compatibility with existing cluster integration tooling
  5. Phase 1 (see below) integration complete as of k8s 1.0

Dependencies / Assumptions

  1. "mesos" cloud provider implementation has been merged into k8s core
  2. k8s-mesos has achieved adequate test coverage
  3. k8s-mesos has adopted Google's Go code style/standards

Core Integration

  1. Packaging alignment
    • k8s-mesos isn't really an add-on; it's the basis for an alternate distribution/deployment
  2. Releases will consist of the current k8s core components, as well as the k8s-mesos framework
    • Larger distribution size
    • Current k8s-mesos dependencies merged into k8s dependencies (e.g. k8s will vendor mesos-go, et al.)
    • Platform builds will fail upon divergence between k8s core and k8s-mesos framework
  3. Phased approach
    • Phase 1: Build & Deploy + Framework
    • Phase 2: Cluster Provider + Salt
    • Phase 3: Wish List

Build & Deploy

$ git clone https://github.com/GoogleCloudPlatform/kubernetes k8s
$ cd k8s
$ export KUBERNETES_PROVIDER=mesos:gce              # s/:/--/ for hier providers
$ hack/dev-build-and-push.sh                        # (unchanged)
$ cluster/kube-up.sh                                # phase 2: bring up mesos/gce cloud w/ k8s

Framework

github.com/GoogleCloudPlatform/kubernetes/plugins/third_party/mesos
    doc.go                # what is this stuff?!
    pkg/
        scheduler/
        executor/
        offers/
        ...               # additional, utility classes
    cmd/
        km/               # unified framework binary, can run either as mesos scheduler or executor
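
For illustration only -- a minimal Go sketch of how a single km binary might dispatch to either component based on its first argument. runScheduler and runExecutor are hypothetical placeholders, not the actual k8s-mesos entry points.

// km: one binary, two roles (sketch under the assumptions above).
package main

import (
	"fmt"
	"os"
)

func runScheduler(args []string) error { return nil } // would start the framework scheduler
func runExecutor(args []string) error  { return nil } // would start the kubelet-executor

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: km <scheduler|executor> [flags]")
		os.Exit(1)
	}
	var err error
	switch os.Args[1] {
	case "scheduler":
		err = runScheduler(os.Args[2:])
	case "executor":
		err = runExecutor(os.Args[2:])
	default:
		err = fmt.Errorf("unknown component %q", os.Args[1])
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
}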

Cluster Provider

github.com/GoogleCloudPlatform/kubernetes/cluster/mesos--gce
    util.sh        # provide some additional configs for mesos cluster config,
    *(other).sh    # but mostly delegate to gce provider.
                   # probably set DISABLE_CORE_K8S (see Salt below)
                   # domain = kubernetes.mesos

Salt

github.com/GoogleCloudPlatform/kubernetes/cluster/salt
    ...         # k8s scheduler plugin isn’t started (via DISABLE_CORE_K8S)
                # kubelet isn’t started (via DISABLE_CORE_K8S)
                #    - no static pod support
                # use existing kube-apiserver support
                # use existing controller-manager support
                # use existing kube-proxy support
                # zk started on master node
                # mesos-slave system service started on all minions
                # mesos-master system service started on master node
                # km framework is deployed to mesos (--run_proxy=false)
                #    - this provides both the scheduler and executor components
                # kube-addons should still function if deployed by kubectl

Wish List

  1. mesos-dns integration (to resolve services in .kubernetes.mesos domain)
    • This would likely become an add-on
  2. Hierarchical cloud provider support for the “mesos” cloud provider API implementation
    • To expose the more feature-rich underlying cloud provider impl, like GCE
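
As a rough illustration of the delegation idea (the CloudProvider interface below is a simplified, hypothetical stand-in for the real k8s cloud provider interface), the "mesos" provider would wrap the provider named by the hierarchical suffix (e.g. GCE for mesos--gce) and forward calls it doesn't handle itself:

// Hypothetical sketch of hierarchical cloud provider delegation.
package cloud

// CloudProvider is a simplified stand-in for the real, much larger interface.
type CloudProvider interface {
	ExternalIP(nodeName string) (string, error)
	Zone() (string, error)
}

// mesosCloud answers Mesos-specific questions itself and delegates the rest
// to the underlying provider selected by the hierarchical name.
type mesosCloud struct {
	inner CloudProvider // e.g. the GCE implementation for mesos--gce
}

func (c *mesosCloud) ExternalIP(nodeName string) (string, error) {
	// The underlying cloud knows the actual infrastructure, so just forward.
	return c.inner.ExternalIP(nodeName)
}

func (c *mesosCloud) Zone() (string, error) {
	return c.inner.Zone()
}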

cc @bgrant0607 @davidopp @guenter @timothysc @ConnorDoyle

@roberthbailey
Contributor

/cc @zmerlynn @alex-mohr

@roberthbailey roberthbailey added sig/api-machinery, sig/cluster-lifecycle, and priority/awaiting-more-evidence labels Apr 10, 2015
@timothysc
Member

@jdef
re: k8s-mesos has adopted Google's Go code style/standards (gofmt)

Re: wish list - most likely post 1.0 at this point.

@ConnorDoyle
Contributor

Could we add the Mesos tag to this issue?

@roberthbailey
Contributor

Sure.

@davidopp davidopp added team/master and priority/backlog labels and removed the priority/awaiting-more-evidence label Apr 11, 2015
@davidopp
Member

Thanks for the writeup. My main question is about "no static pod support." We are planning to use static pods for a bunch of things: on the nodes for the monitoring/logging stack (cAdvisor, etc.), and on the master for running the master components (which will run in containers). Can you explain why static pods won't work? Independent of Mesos, we need a way to make static pods work with dynamically-added nodes anyway, since we want to support dynamically growing clusters, so I don't think the fact that the Mesos scheduler dynamically adds and removes nodes should prevent static pods from working (once we have all of this working).

cc/ @dchen1107

@jdef
Contributor Author

jdef commented Apr 12, 2015

My initial thinking was based on the presumption that static pods are expected to be present on all mesos slaves. Perhaps this thinking is incorrect. If static pods are expected to be present on only those mesos slaves running kubelet-executors, then it's easier to envision a path forward. Still, mesos would have to account for the resources used by those static pods somehow.

One way to do this would be to specify the static pod configuration (as a file or web resource) to the k8sm-scheduler. Upon launching a kubelet-executor on a slave, the k8sm-scheduler would forward the static pod configuration resource. As part of (or prior to) launching that kubelet-executor, the k8sm-scheduler would analyze the static pod configuration for resources used and update the executor's "launch manifest" so that mesos can properly account for those resources. It requires that the k8sm-scheduler is able to parse and peek into a static pod manifest, but maybe that isn't so bad.

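A rough sketch of the accounting step described above (simplified stand-in types, not the actual k8sm code): sum the static pods' resource requests and fold them into the executor's resource ask before launch, so Mesos sees the full footprint.

// Hypothetical sketch: fold static pod requests into the executor's ask.
package scheduler

type StaticPod struct {
	CPUs  float64 // cores requested by the static pod
	MemMB float64 // memory requested, in MB
}

type ExecutorResources struct {
	CPUs  float64
	MemMB float64
}

// accountStaticPods adds the static pods' requests to the kubelet-executor's
// resource ask so the Mesos master does not over-commit the slave.
func accountStaticPods(exec *ExecutorResources, pods []StaticPod) {
	for _, p := range pods {
		exec.CPUs += p.CPUs
		exec.MemMB += p.MemMB
	}
}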

@davidopp
Member

Yeah, I can't think of any reason why we'd need the static pods to be running on nodes that aren't part of the k8s cluster, i.e. don't have kubelet running.

As for your proposal, I think we should try to get the static pods running using whatever mechanism is closest to how we do it in native Kubernetes. TBH I don't remember how this will work (maybe using file source, in which case it would be somewhat close to what you are proposing). Hopefully @dchen1107 or @yujuhong can comment.

One thing I'm wondering about your question, though, is why is the resource accounting aspect new? The things we're running as static pods are things that already run on the node today, just outside of pods. So the fact that the amount of available resources needs to take them into account doesn't change when they move into static pods.

@davidopp davidopp self-assigned this Apr 12, 2015
@jdef
Contributor Author

jdef commented Apr 13, 2015

From what I remember of the code, I think that the file source can be used for static pods (and something about pod mirrors -- more reading required). So I think that aligns well with the modifications I proposed.

The resource accounting is important because even though these are static pods, they are only static from the perspective of the kubernetes-mesos framework -- not from mesos core (master). So if the mesos slaves are started on nodes and given (read: configured to consume up to) X resources, and we then secretly start static pods without telling the master that we're consuming resources, the node may become oversubscribed. In its current state the k8sm framework still has some work to do with respect to resource accounting, but accounting for the resources consumed by static pods is definitely a production-level requirement.


@vmarmol
Contributor

vmarmol commented Apr 13, 2015

@davidopp the Kubelet has a directory it watches for static pod configurations and starts any present there. The plan is to drop the config files for the desired pods there.

It seems that the resource accounting issue is mainly that the information is coming from Mesos and not Kubernetes? In theory Kubernetes could account for its own overhead on the node without over-subscribing the node (it does not today).

@timothysc
Member

@jdef, @davidopp - Why do static pods matter in the mesos use case?

@jdef
Contributor Author

jdef commented Apr 13, 2015

Not sure about cadvisor (since it now appears to be embedded within the kubelet), but if there are other supporting services that we want to deploy on Mesos slaves alongside the kubelet-executor (in the same resource sandbox) and with the same lifetime as the kubelet-executor, then it makes sense to run those as static pods. @davidopp do you have any examples other than logging and cadvisor?


@erictune
Member

@erickhan

@davidopp
Member

Static pods are accounted for in the master via the "mirror pods" feature that was recently added. See #4090

kubelet, docker, and kube-proxy don't run as pods at all, so their resources aren't accounted for at all.

I'm not sure which pods run as static pods currently vs. regular pods. The candidates are fluentd, elasticsearch, kibana, kube-dns, heapster, influx-grafana. I think it may only be fluentd (as static pods). I'll find out.

@davidopp
Member

@yujuhong says that fluentd-to-elasticsearch-kubernetes (image name fluentd-elasticsearch) is the only thing running as a static pod on the nodes (i.e. not master) right now.

@jdef
Contributor Author

jdef commented Apr 17, 2015

Thanks for the info. It sounds like static pod support is the only blocker. Can I assume that we have a green light for submitting a PR for the phase 1 stuff once static pod support is in place?

@yllierop
Contributor

👍 I'm looking forward to having this PR submitted.

@jdef
Contributor Author

jdef commented Apr 17, 2015

Alternatives to third_party for github.com/GoogleCloudPlatform/kubernetes/plugins/third_party/mesos, since third_party isn't really what we mean -- this is Go code that's not being vendored:

  • community
  • contrib
  • dist

@yllierop
Contributor

@jdef I vote for contrib. But I'd be curious to know what @brendanburns thinks.

@timothysc
Member

+1 to contrib, pretty standard.

@timothysc
Member

@jdef any update here?

@jdef
Contributor Author

jdef commented May 4, 2015

Getting closer to having things ready to upstream. The team has been working on the blocking items: improving coverage, cloud provider configuration, and static pods. I think we'll have something ready for upstream soon.


@karlkfi
Contributor

karlkfi commented Jul 27, 2015

Upstreaming is effectively complete.
We're going to track future improvements in their own issues.

TODO:

@karlkfi karlkfi closed this as completed Jul 27, 2015
@jdef
Contributor Author

jdef commented Jul 28, 2015

🎉
