Aggregator API & metrics-server: Access for clusters w/ separation of concerns architectures #55238

Closed
RRAlex opened this issue Nov 7, 2017 · 21 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
kind/feature: Categorizes issue or PR as related to a new feature.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
sig/api-machinery: Categorizes an issue or PR as relevant to SIG API Machinery.

Comments

@RRAlex

RRAlex commented Nov 7, 2017

I think this is a missing feature, which could also be seen as a bug: it breaks commonly used K8s cluster architectures by preventing them from using the new Aggregator API, forcing a less stable or less secure architecture that gives up the separation of concerns:
/kind feature
/kind bug

What happened:
When trying to use the metrics-server, although it launches just fine, the API server seems to try to reach its pod directly, which fails in a cleanly tiered setup where the controllers (apiserver, controller-manager, scheduler) & etcd are hosted in a more isolated manner.

The problem is that, following a cluster architecture such as the one suggested in Kubernetes the Hard Way, the separation between controller nodes & worker nodes means there is no kube-proxy or kubelet, and thus no overlay network, running on the controllers. I think this is a very sound way to separate concerns and keep the cluster manageable and stable.
But it is then not possible for the Aggregator API to reach the overlay network's IPs, and therefore the metrics-server's pods. Everything else works perfectly...

This is the current result of running the metrics-server:

Oct 30 10:45:07 localhost docker/kube-apiserver[944]: E1030 14:45:07.828129       1 controller.go:111] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: Error: 'dial tcp 10.1.192.30:443: i/o timeout'
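
For context on where that pod IP comes from: the aggregation layer in kube-apiserver proxies requests for metrics.k8s.io to whatever Service is named in the APIService registration, and dials the resolved backend address itself. A rough sketch of the registration the stock metrics-server manifests create (the values below are the usual defaults from those manifests, shown only to illustrate what the aggregator ends up resolving):

```yaml
# APIService registration: tells the aggregation layer to proxy
# metrics.k8s.io/v1beta1 to the metrics-server Service in kube-system.
# kube-apiserver resolves this Service and dials the backend itself,
# which is why it needs a route to the overlay/pod network.
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  service:
    name: metrics-server
    namespace: kube-system
  group: metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true   # commonly set in the early deploy manifests
  groupPriorityMinimum: 100
  versionPriority: 100
```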

You can see the previous relevant discussion with @DirectXMan12 starting from this post.

What you expected to happen:
I expect the Aggregator API, which resides within the kube-apiserver, to offer some other means of reaching the metrics-server, one compatible with all K8s architectures:
either through a proxy (in the kubelet?), or through some option to expose a port with secure TLS access, similar to what is used for other, somewhat comparable K8s parts (kubectl exec, Prometheus metrics, etc.).

How to reproduce it (as minimally and precisely as possible):
Install a cluster, for example following the KTHW guide, where the controller nodes & etcd are, for reasons of security, stability & resource manageability, not combined with the N worker nodes, which can then be scaled as a common pool of resources in which none is a special pet.
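
A quick way to confirm the broken state (assuming the standard APIService name created by the metrics-server manifests):

```sh
# The aggregated API's availability condition carries the same dial error
# as the apiserver log above.
kubectl get apiservice v1beta1.metrics.k8s.io -o yaml

# Hitting the aggregated group through the main apiserver fails the same way
# (a 503 from the aggregation proxy).
kubectl get --raw /apis/metrics.k8s.io/v1beta1
```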

Anything else we need to know?:
The cluster is otherwise healthy & works with all other kinds of resources under this cleanly separated architecture. This is the only missing piece for HPA to work.

Environment:

  • Kubernetes version: 1.8.2
  • metrics-server version: 0.2
@k8s-ci-robot added the kind/feature and kind/bug labels Nov 7, 2017
@k8s-github-robot added the needs-sig label Nov 7, 2017
@DirectXMan12
Contributor

@kubernetes/sig-api-machinery-feature-requests

@k8s-ci-robot added the sig/api-machinery label Nov 7, 2017
@k8s-github-robot removed the needs-sig label Nov 7, 2017
@mbohlool
Contributor

mbohlool commented Nov 9, 2017

/cc @lavalamp @cheftako

@yliaog
Contributor

yliaog commented Nov 9, 2017

/cc @yliaog

@lavalamp
Member

lavalamp commented Nov 9, 2017

We currently recommend placing extension apiservers in an environment where cluster DNS works and routes to pod IPs work.

The main apiserver does not require this. We currently consider extension apiservers to run in a different tier; we can maybe reconsider this, but it wouldn't change instantly.

(see, @deads2k, other people do run like this!)

@deads2k
Contributor

deads2k commented Nov 10, 2017

(see, @deads2k, other people do run like this!)

Seen. I guess we'll see how many more come out of the woodwork. This ought to bring them out in a fair hurry.

As I recall, this was because someone was running weave and they didn't have the network set up to route to the service or pod network cleanly. We definitely don't want to try to keep teaching kube-apiserver how to handle network topologies. Network topology lies outside the domain of the kube-apiserver and if you make it impossible to route, you can't have your kube-apiserver using anything self-hosted. The solution is to fix the network topology, not to try to teach kube-apiserver about all the different overlay strategies. For instance, you wouldn't want me coercing kube-apiserver to natively understand the openshift-sdn.
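
Concretely, "fix the network topology" means either running the overlay agent on the controller hosts as well, or making the pod CIDR routable from them. A minimal sketch of the static-route variant (the addresses below are hypothetical, KTHW-style placeholders, and this only helps when pod traffic is natively routable; with an encapsulated/encrypted overlay like weave you would need the agent itself on the controllers):

```sh
# On each controller host: route each worker's pod subnet via that worker.
# Subnets and next-hop addresses are placeholders for illustration.
ip route add 10.200.0.0/24 via 10.240.0.20   # pods hosted on worker-0
ip route add 10.200.1.0/24 via 10.240.0.21   # pods hosted on worker-1
```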

@RRAlex
Author

RRAlex commented Nov 10, 2017

@lavalamp
Is there documentation on how to have an "extension apiserver" inside the pod's network and how it works?

@deads2k
I'm running weave too (I need pods to be in an encrypted overlay), but any overlay network, with its own address range, will not be directly routable from the apiservers if they are strictly separated. That said, would a simple service do, with a flag to tell the API to use a Node_IP+port rather than a Pod_IP+port?

That's why I was wondering whether simply exposing a port might do it, but I'm curious to hear how @lavalamp runs the extension apiserver when the controllers are not running kubelet or any pods on an overlay network.

So, if my cluster's architecture is not to blame, I'm open to suggestions... :)

@lavalamp
Member

Just run it as a regular pod on the cluster and it should magically work. Don't run it specially outside of the network.

That said, would a simple service do, with a flag to tell the API to use a Node_IP+port rather than a Pod_IP+port?

That's actually worse, as it encourages a bad practice of using host networking without solving the problem that apiserver may not have routes to nodes or nodes may have only private IPs, etc.

kube-apiserver right now has some ssh tunneling capabilities but we want to remove them, and definitely don't want to add them to extension apiservers.

@RRAlex
Author

RRAlex commented Nov 14, 2017

I understand and agree with not moving forward with more out-of-band host-networking solutions.
That's why I was confused: I was expecting the metrics-server to reach out to the API to get what it needed, even if it had to ask the API to proxy to the kubelet or some such, but not the other way around, where the API tries to skip a hop and jump directly to a pod...

Just run it as a regular pod on the cluster and it should magically work. Don't run it specially outside of the network.

Sorry for my confusion, but which part do you mean by "it"?

If you mean metrics-server, my tests resulted in a 503 i/o timeout error.
Nothing was running outside the network:
I do have my controller components on the side, but the kubelet can reach them by their real IPs, and pods can reach them via the "x.x.x.1" API IPs on the overlay network, with a service account.
But unless I got something terribly wrong, given that everything else works, the KTHW setup itself doesn't seem to be the issue?

cheers!

(p.s. I'm going to have some time to look at this again, maybe I missed something obvious then, like a networkPolicy...?)

@DirectXMan12
Contributor

@lavalamp are you suggesting running the aggregator as a normal, separate pod, or the whole API server "bundle" (main API server, aggregator, etc.) as a pod? The former comes with its own set of issues (you've got a chicken-and-egg problem with the controller-manager, for instance).

@nickray

nickray commented Jan 13, 2018

Just want to remark that I too used this separation-of-concerns architecture and hit the equivalent problem running the dashboard.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Apr 13, 2018
@onitake

onitake commented May 2, 2018

/remove-lifecycle stale

This issue is not something that should be dismissed completely.
While the official Kubernetes documentation now explains how to bootstrap a cluster with the control plane running in pods (see https://kubernetes.io/docs/getting-started-guides/scratch/#bootstrapping-the-cluster ), this may not be the desired setup for everybody.

Would it be possible to inject custom host entries into kubedns instead, so they resolve to something the apiserver can reach without access to pod IPs?
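
To make that concrete: on a cluster running CoreDNS this could look like a hosts block in the Corefile (the address and name below are placeholders), though whether the aggregation proxy actually consults cluster DNS when resolving the backing service is a separate question:

```
# Corefile fragment: resolve the extension apiserver's name to an address
# the kube-apiserver can reach directly (placeholder values).
hosts {
    10.240.0.30 metrics-server.kube-system.svc.cluster.local
    fallthrough
}
```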

@k8s-ci-robot removed the lifecycle/stale label May 2, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Jul 31, 2018
@redbaron
Contributor

redbaron commented Aug 1, 2018

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Aug 1, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Oct 30, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot removed the lifecycle/stale label Nov 29, 2018
@k8s-ci-robot added the lifecycle/rotten label Nov 29, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@frittentheke

/reopen

@k8s-ci-robot
Contributor

@frittentheke: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@nirnanaaa

I also do not think that this is entirely irrelevant.

Kubermatic - from my understanding at least - is achieving something similar (https://docs.kubermatic.io/concepts/architecture/), yet I am not entirely sure how they do it.

Is there some sort of recommendation on how to run API servers on a decoupled platform with the aggregator API? We're running a platform concept similar to Kubermatic's and are trying to solve just that last piece.

Maybe I am just off-topic, but I think that shared clusters for API servers will become more relevant in the future, as more enterprise customers try to scale their setups efficiently.
