
documentation: update Kubernetes example for 1.7 #2918

Merged 1 commit on Jul 21, 2017

Conversation

@unixwitch
Contributor

Kubernetes 1.7+ no longer exposes cAdvisor metrics on the Kubelet
metrics endpoint.  Update the example configuration to scrape cAdvisor
in addition to Kubelet.

Also remove the comment about node (Kubelet) CA not matching the master
CA.  Since the example no longer connects directly to the nodes, it
doesn't matter what CA they're using.

Ref: #2916

In #2916 it was suggested that this should be split into version-specific examples. In that case, should I replace prometheus-kubernetes.yml with prometheus-kubernetes-1.6.yml and prometheus-kubernetes-1.7.yml? (I'm not sure that's the best way to do it, since this shouldn't need updating for every Kubernetes release, only for releases with breaking changes.)

- source_labels: [__meta_kubernetes_node_name]
  regex: (.+)
  target_label: __metrics_path__
  replacement: /api/v1/proxy/nodes/${1}:4194/metrics

@outcoldman

Considering that the RBAC example does not allow proxying at the cluster level, I would suggest using /api/v1/nodes/${1}:4194/proxy/metrics instead.
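That is, the relabel rule above would become roughly the following (a sketch of the suggested change, not a tested diff):

- source_labels: [__meta_kubernetes_node_name]
  regex: (.+)
  target_label: __metrics_path__
  replacement: /api/v1/nodes/${1}:4194/proxy/metrics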

@juliusv
Member

Hmm, I'm getting:

[screenshot: cadvisor_error]

...but also for the original /api/v1/proxy/nodes/${1}:4194/metrics here.

This is on a test cluster where I set up permissive bindings via:

kubectl create clusterrolebinding permissive-binding \
  --clusterrole=cluster-admin \
  --user=admin \
  --user=kubelet \
  --group=system:serviceaccounts

My nodes:

$ kubectl get nodes
NAME          STATUS    AGE       VERSION
julius-test   Ready     7d        v1.7.0

My Kubernetes versions:

Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-29T23:15:59Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-29T22:55:19Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

I wonder what I'm missing?

@outcoldman

@juliusv You actually need to enable cAdvisor to listen on port 4194. If you have set up Kubernetes with kubeadm or are using minikube, I have a blog post with some references on how to do that.
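For kubeadm-provisioned nodes this boils down to re-enabling the kubelet's standalone cAdvisor port. A rough sketch, assuming a 1.7-era kubelet that still supports the --cadvisor-port flag and a systemd-managed kubelet (the drop-in path and variable name are illustrative and vary by distribution):

# /etc/systemd/system/kubelet.service.d/20-cadvisor.conf (hypothetical drop-in)
[Service]
Environment="KUBELET_EXTRA_ARGS=--cadvisor-port=4194"

Then reload and restart the kubelet:

systemctl daemon-reload
systemctl restart kubelet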

@discostur

@outcoldman I exposed cAdvisor on port 4194 and can access it in a browser, but Prometheus still shows me "server returned HTTP status 403 Forbidden". Any idea?

@juliusv
Member

@outcoldman Thanks, that worked for me! Still waiting for the real fix for this in Kubernetes as discussed in kubernetes/kubernetes#48483, though.

@outcoldman

@discostur A 403 is some progress compared to a 503. I would assume that something is wrong with your RBAC setup; have you used the one from https://github.com/prometheus/prometheus/blob/master/documentation/examples/rbac-setup.yml?
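For reference, the relevant ClusterRole rules in that file look roughly like this (a paraphrased sketch, not a verbatim copy). Note the nodes/proxy resource, which is what grants node-level proxy access:

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy   # allows GETs via /api/v1/nodes/<node>/proxy/...
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]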

@discostur

@outcoldman Yep, that's exactly the RBAC setup I'm using. The kube-apiserver and kube-nodes endpoints are scraped correctly; only kube-cadvisor gives me a 403.

[screenshot: target status, 2017-07-12 11:29]

@outcoldman

@discostur Ah, that is easy to fix: you have not applied the change I suggested in my first review comment, #2918 (comment).

Use /api/v1/nodes/${1}:4194/proxy/metrics instead of /api/v1/proxy/nodes/${1}:4194/metrics, as the RBAC setup I referenced above only gives proxy access at the node level.

@discostur

@outcoldman Perfect, now it works! Thanks for your quick help ;)

@matthiasr
Contributor

Thank you very much for this, especially for the extensive comments!

I think materially this is ready to go, but the last few releases have all had changes that broke the example config, so I agree that we should split them out. Eventually we need to think about cleaning out old versions, but not right now (all the versions in question are still supported by GKE).

This file is referenced here; could you also send a PR to update that when you split the files?

How will we deal with new Kubernetes releases that don't require a change in Prometheus? If, say, prometheus-kubernetes-1.7.yaml applies to 1.7+ until there is a new file, where can we write this down?

@grobie
Member

grobie commented Jul 11, 2017

This issue is currently being discussed in kubernetes/kubernetes#48483; the combination of two changes in Kubernetes resulted in cAdvisor Prometheus metrics no longer being exposed by default. It might be that this will be changed back.

@unixwitch
Contributor Author

The final decision on this from Kubernetes is to expose the metrics on the same port as before, but under a new path (/metrics/cadvisor). I'll try to update the PR for that today, but the change won't be in a Kubernetes release until 1.7.3. (Ref: kubernetes/kubernetes#49079)
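In config terms that should just mean pointing the cAdvisor job's metrics path at the new endpoint. A rough sketch of the relevant relabel rule, assuming the apiserver-proxy approach used elsewhere in the example:

- source_labels: [__meta_kubernetes_node_name]
  regex: (.+)
  target_label: __metrics_path__
  replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor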

@matthiasr
Contributor

Great, thank you for sorting this out! I'd just mention 1.7.3 in a comment in the example config; then we can merge this right away. There's no good way to handle 1.7.0…1.7.2, so I don't see the point in making an extra config example just for those.

Kubernetes 1.7+ no longer exposes cAdvisor metrics on the Kubelet
metrics endpoint.  Update the example configuration to scrape cAdvisor
in addition to Kubelet.  The provided configuration works for 1.7.3+
and commented notes are given for 1.7.2 and earlier versions.

Also remove the comment about node (Kubelet) CA not matching the master
CA.  Since the example no longer connects directly to the nodes, it
doesn't matter what CA they're using.

References:

- kubernetes/kubernetes#48483
- kubernetes/kubernetes#49079
@unixwitch force-pushed the 2916-kubernetes-cadvisor-docs branch from 0c8e724 to 94103aa on July 19, 2017
@unixwitch
Contributor Author

Okay, I've pushed a new update to the PR. This includes configuration that works for 1.7.3+ and brief notes on earlier versions. (1.5 and 1.6 are still supported, so those should definitely be mentioned; adding a two-line note on 1.7.0-1.7.2 seems harmless.)

I don't think there's any point in splitting this into multiple files for different Kubernetes versions now, since this hopefully won't break again.

I haven't actually tested this, since I don't have a build of 1.7.3 handy, but it's a fairly trivial modification of the previous version, which I did test on our own clusters.

@juliusv
Member

juliusv commented Jul 20, 2017

@unixwitch Awesome, this looks great to me. @matthiasr all good?

@brancz
Member

brancz commented Jul 21, 2017

Completely agree with @matthiasr; this looks good to me.

@juliusv
Member

juliusv commented Jul 21, 2017

Awesome, merging 👍 Thanks!
