
Kubernetes SD: Allow restricting namespaces #2280

Closed
brancz opened this Issue Dec 14, 2016 · 21 comments

@brancz
Member

brancz commented Dec 14, 2016

Currently the Kubernetes SD lists and watches all changes for each role. This is problematic for two reasons:

  • Scalability: Watching and acting on every event for Pod and Endpoints objects can become problematic due to the sheer number of events in a large cluster.
  • Security: Advanced setups use role-based access control to limit what a service account can access, which makes Prometheus error out when it tries to list/watch all namespaces but is only allowed to access certain ones.

I see two possibilities (this part is up for discussion):

  • An array of namespaces in the Kubernetes SD config: if empty, the behavior stays as it is now (list and watch all namespaces); if specified, only the listed namespaces are watched/listed (sketched below).
  • A subsection to discover namespaces, which would also be relabelable.

The second solution seems like the more desirable one, but besides adding complexity it requires the ability to watch/list namespaces in the first place, and I'm not sure that is acceptable. Possibly this is not an either/or decision, and both are required to cover everyone's needs.
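For illustration, a minimal sketch of what the first option might look like on the Kubernetes SD configuration struct; the field name and YAML key are placeholders, not a proposed final design:

```go
// Hypothetical sketch only: an optional namespace list on the Kubernetes SD
// configuration. Field and key names are illustrative.
package kubernetes

// SDConfig shows only the fields relevant to this proposal.
type SDConfig struct {
	APIServer string `yaml:"api_server,omitempty"`
	Role      string `yaml:"role"`
	// Namespaces restricts list/watch to the given namespaces.
	// An empty list keeps the current behavior (all namespaces).
	Namespaces []string `yaml:"namespaces,omitempty"`
}
```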

This will make #2191 a bit more complicated for the Kubernetes SD, but still doable.

@fabxc @beorn7 @juliusv @brian-brazil @matthiasr @alexsomesan

@brian-brazil

Member

brian-brazil commented Dec 14, 2016

A subsection to discover namespaces, which would also be relabelable.

Let's try to avoid additional types of relabelling; they cause enough confusion as-is. A simple regex is the most we should need.
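Just to illustrate what a regex restriction could mean in practice, assuming the candidate namespace names are available; the pattern and names below are made up:

```go
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Made-up namespaces and pattern, purely for illustration.
	allNamespaces := []string{"kube-system", "monitoring", "team-a", "team-b"}
	pattern := regexp.MustCompile(`^(monitoring|team-.*)$`)

	var selected []string
	for _, ns := range allNamespaces {
		if pattern.MatchString(ns) {
			selected = append(selected, ns)
		}
	}
	fmt.Println(selected) // [monitoring team-a team-b]
}
```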

@brancz

Member Author

brancz commented Dec 14, 2016

A simple regex is the most we should need.

Agreed.

@fabxc

Member

fabxc commented Dec 14, 2016

Mh, if I cannot get a list of all namespaces due to access control, how can I apply a regex to it to filter namespaces?

@brancz

Member Author

brancz commented Dec 14, 2016

@fabxc Yes, that was my thought with:

it requires the ability to watch/list namespaces in the first place, I'm not sure if this is ok.

@matthiasr

Contributor

matthiasr commented Dec 14, 2016

@fabxc

Member

fabxc commented Dec 14, 2016

@nsams

nsams commented Dec 14, 2016

see #1951

@brancz

Member Author

brancz commented Dec 14, 2016

@fabxc What I mainly meant with the "scalability" point is that if we have the informers set up to list/watch all namespaces, we fill and update our cache far more than necessary, when we could restrict it to the objects we are actually interested in.

label selection is the way to do it. Any SD-specific mechanism we should add should also stick with the ideal methods of that particular SD.

Agreed, but then it would probably make sense to adapt here as well.
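To make the cache point concrete, a minimal client-go sketch (not Prometheus code) of scoping an informer's list/watch to a single namespace; the namespace name and resync period are made up, and the client-go API has shifted a bit between versions:

```go
package main

import (
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Scoping the list/watch to a single namespace (rather than all
	// namespaces) keeps the informer cache and event volume small.
	lw := cache.NewListWatchFromClient(
		client.CoreV1().RESTClient(), "endpoints", "monitoring", fields.Everything())
	informer := cache.NewSharedInformer(lw, &v1.Endpoints{}, 10*time.Minute)

	stop := make(chan struct{})
	defer close(stop)
	go informer.Run(stop)
	cache.WaitForCacheSync(stop, informer.HasSynced)
}
```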

@fabxc

Member

fabxc commented Dec 14, 2016

Yes, I totally agree that there's overhead and it will most likely become a scalability issue eventually. I was just wondering whether it has come up yet – but probably not, given that Kubernetes clusters are still fairly small on average.

How do you mean adapt? Just in terms of generating the appropriate config if possible, or changing the selection field in the operator in some way?

@widgetpl

widgetpl commented Dec 22, 2016

I think I have hit a problem that needs namespace restriction or something similar. My problem is described here. node_exporter and kube_state_metrics are deployed as a DaemonSet and a Deployment respectively. I do not know what @brian-brazil had in mind with

For example the configuration for the node exporters should already be a separate scrape_config so splitting it out to a separate Prometheus should be straightforward by splitting the configuration file.

as node_exporter does not have a separate configuration: it is part of a kubernetes_sd_config and is subject to autodiscovery. So I cannot see any possibility to create separate configurations for

  • node_exporter metrics
  • kube_state_metrics
  • metrics from all other namespaces
@brancz

Member Author

brancz commented Dec 22, 2016

@widgetpl This is about a different issue. I commented on the stackoverflow thread to keep this on-topic.

@widgetpl

widgetpl commented Dec 23, 2016

@brancz I do not agree that this is a completely different issue; to me it only sounds a little bit different. One of the reasons for this issue is

Scalability: Watching and acting on every event for Pod and Endpoints objects can become problematic due to the sheer number of events in a large cluster.

and I think this is the same problem I have (a large cluster and a large number of events/metrics), unless you are talking only about the Kubernetes cluster itself without annotated services, pods, etc., but I think those are included. In my opinion it depends on how you look at it. I suppose Prometheus watches and lists changes inside the cluster through the API to know where it can find metrics, and as the cluster grows there are more and more Kubernetes objects to check. Allowing Prometheus to watch and list specific namespaces, or even specific services, deployments, and daemonsets inside those namespaces, would reduce those events and also limit scraping to the metrics of interest. Then we would be able to do functional sharding inside Kubernetes.

Unfortunately I have not found out how I can do functional sharding inside a Kubernetes cluster when everything is a Kubernetes object, and my question was not about what I should do but about how I can do this inside Kubernetes.

One solution I mentioned is static_config, but it is not the best solution as my pod can float between Kubernetes nodes.

Another solution would be to set the Prometheus target in a static config to the external IP of the service, but then Prometheus would not use the Kubernetes API for service discovery.

Please correct me if I am wrong.

@brian-brazil

Member

brian-brazil commented Dec 23, 2016

@widgetpl This is a different issue, please don't pollute the bug tracker.

@carlpett

carlpett commented Apr 24, 2017

We also need to be able to run Prometheus without it having access to all namespaces. I'm willing to put together a PR, but is there a decision on which approach to take?

Our use case would be satisfied with just having a list of namespaces (that is, no need to automagically inspect which namespaces are available to the service account). Is that something everyone is satisfied with? If no config is given by the user, the current NamespaceAll would be used, I assume.

From a quick glance at the code, I'm guessing the changes would be around here? Is there anything in particular that would need to be done beyond looping over those namespaces?
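Roughly something like this is what I'm imagining for the loop; purely a sketch, and the function and variable names are made up:

```go
// Sketch: one list/watch per configured namespace, falling back to all
// namespaces when the list is empty (the current behavior).
package discovery

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

func listWatchesFor(client kubernetes.Interface, resource string, namespaces []string) []*cache.ListWatch {
	if len(namespaces) == 0 {
		namespaces = []string{metav1.NamespaceAll}
	}
	var lws []*cache.ListWatch
	for _, ns := range namespaces {
		lws = append(lws, cache.NewListWatchFromClient(
			client.CoreV1().RESTClient(), resource, ns, fields.Everything()))
	}
	return lws
}
```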

@brancz

Member Author

brancz commented Apr 24, 2017

Yes, that's where I would start as well.

I'm thinking that we actually want both. In fact, I'm imagining something like namespace discovery, similar to target discovery, where you could either specify a static list of namespaces if they are known at configuration time, or specify a label selector for listing namespaces (which would require RBAC roles granting access to the list-namespaces endpoint). Although labeling namespaces seems to be getting more widely adopted, it may also be desirable to filter on the namespace name, but I would hold off on that until it is actually a wanted feature.
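A rough sketch of the label-selector variant with client-go; the selector value is made up, this would need the list-namespaces permission, and it is not meant as the final implementation:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Resolve the namespace set from a label selector, then start the usual
	// per-namespace pod/endpoints watches for each result.
	nsList, err := client.CoreV1().Namespaces().List(context.TODO(),
		metav1.ListOptions{LabelSelector: "monitoring=enabled"})
	if err != nil {
		panic(err)
	}
	for _, ns := range nsList.Items {
		fmt.Println("would watch namespace:", ns.Name)
	}
}
```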

Wdyt @fabxc @brian-brazil @matthiasr ?

@brian-brazil

Member

brian-brazil commented Apr 24, 2017

Do we still need this given RBAC? I'd presume it'll just return the namespaces you are allowed to see.

@brancz

Member Author

brancz commented Apr 24, 2017

If you have the role to list namespaces, I would expect to get all namespace objects.

@brian-brazil

Member

brian-brazil commented Apr 24, 2017

Can we check what the k8s plans are there? In particular, what happens when you don't have that role?

@bakins

Contributor

bakins commented Apr 26, 2017

Currently, you can either list all namespaces or none. This is the same for all resources in Kubernetes: the list action applies to all objects of that resource type. For namespaced resources (pods, services, etc.), a role can grant access on a per-namespace basis; however, namespaces themselves are cluster-level.

Unsure of future plans in k8s, though.
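For what it's worth, the all-or-nothing behavior shows up client-side as a Forbidden error, so discovery could detect that and fall back to a statically configured list; a small sketch, with everything beyond the error check being illustrative:

```go
package main

import (
	"context"
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	_, err = client.CoreV1().Namespaces().List(context.TODO(), metav1.ListOptions{})
	if apierrors.IsForbidden(err) {
		// No cluster-level permission to list namespaces: fall back to a
		// statically configured namespace list instead of failing discovery.
		fmt.Println("cannot list namespaces, using configured list")
	}
}
```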

@grobie

Member

grobie commented May 25, 2017

Implemented in #2642.
