Can't access default k3s metrics server from Prometheus-Adapter #6263

Closed
alimoezzi opened this issue Oct 12, 2022 · 11 comments

@alimoezzi

I have installed Prometheus-Adapter alongside the default metrics-server that ships with k3s, which serves securely on port 4443.

Unfortunately, I get no resources when I query custom.metrics.k8s.io

$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": []
}

When I look at the Prometheus-Adapter logs, I see: unable to update list of all metrics: unable to fetch metrics for query ...: x509: certificate is valid for localhost, localhost, not metrics-server.kube-system
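One way to confirm which names a serving certificate actually covers is to dump it from inside the cluster, along these lines (a sketch; the alpine/openssl image and the metrics-server.kube-system.svc:443 address are assumptions, adjust them to your setup):

$ kubectl run --rm -i openssl-check --image=alpine/openssl --restart=Never -- \
    s_client -connect metrics-server.kube-system.svc:443 -showcerts < /dev/null

The Subject Alternative Names in the dumped certificate need to include whatever hostname the client uses to reach the metrics-server.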

I have also added a ClusterRoleBinding for the system:aggregated-metrics-reader role, but it made no difference:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/component: metrics
    app.kubernetes.io/instance: custom-metrics
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: prometheus-adapter
    app.kubernetes.io/part-of: prometheus-adapter
    app.kubernetes.io/version: v0.10.0
    argocd.argoproj.io/instance: custom-metrics
    helm.sh/chart: prometheus-adapter-3.4.0
  name: prometheus-adapter-aggregated-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:aggregated-metrics-reader
subjects:
- kind: ServiceAccount
  name: custom-metrics-prometheus-adapter
  namespace: monitoring
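
A quick way to verify that a binding like this grants what you expect (assuming the service account name and namespace above) is impersonation with kubectl auth can-i:

$ kubectl auth can-i get pods.metrics.k8s.io \
    --as=system:serviceaccount:monitoring:custom-metrics-prometheus-adapter
yes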

My k3s version: v1.24.6+k3s1

@brandond
Member

brandond commented Oct 13, 2022

I get no resources when I query custom.metrics.k8s.io

I'm not sure where you got this group from; metrics-server uses the metrics.k8s.io group.

brandond@dev01:~$ kubectl get node -o wide
NAME           STATUS   ROLES                  AGE     VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE   KERNEL-VERSION    CONTAINER-RUNTIME
k3s-server-1   Ready    control-plane,master   2m43s   v1.24.6+k3s1   172.17.0.2    <none>        K3s dev    5.19.0-1005-aws   containerd://1.6.8-k3s1

brandond@dev01:~$ kubectl api-resources --api-group=metrics.k8s.io
NAME    SHORTNAMES   APIVERSION               NAMESPACED   KIND
nodes                metrics.k8s.io/v1beta1   false        NodeMetrics
pods                 metrics.k8s.io/v1beta1   true         PodMetrics

brandond@dev01:~$ kubectl get PodMetrics -A -o wide
NAMESPACE     NAME                                      CPU        MEMORY    WINDOW
kube-system   coredns-b96499967-4mcmc                   1310411n   13456Ki   17s
kube-system   local-path-provisioner-7b7dc8d6f5-s64dp   264u       6964Ki    14s
kube-system   metrics-server-668d979685-29p2r           3645952n   17948Ki   21s
kube-system   svclb-traefik-8443bdc1-tc94w              0          580Ki     17s
kube-system   traefik-7cd4fcff68-bv9ph                  165933n    19116Ki   15s

If these commands don't work for you, check the metrics-server pod logs to see why it's failing to collect metrics from your nodes.
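For example, assuming the default deployment name that k3s uses:

$ kubectl logs -n kube-system deploy/metrics-server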

@alimoezzi
Author

@brandond I installed https://github.com/kubernetes-sigs/prometheus-adapter inside the cluster

@brandond
Member

Ok? I'm not sure how to help with that, as it's not something we package. From the error it sounds like maybe you need to disable certificate validation or something so that it can talk to the metrics server?

@alimoezzi
Author

The endpoint is registered and the HPA can reach Prometheus-Adapter, but Prometheus-Adapter itself cannot access the metrics-server, even though it has the required permissions.
I even managed to create a TLS certificate (https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster/) with kubernetes.io/kube-apiserver-client as the signer, but I still get the same error.

@brandond
Member

brandond commented Oct 14, 2022

localhost, localhost, not metrics-server.kube-system

The metrics-server pod just uses a self-signed certificate, and the apiserver does not validate that certificate when connecting to it. Have you tried disabling certificate validation on the adapter, as I suggested above?

Even I managed to create a TLS certificate

Where did you use this certificate? On the metrics-server, or on your adapter? The problem is the cert on the metrics-server; telling the adapter not to check it is probably the easiest fix...

@alimoezzi
Author

alimoezzi commented Oct 14, 2022

I created certs for Prometheus-Adapter because I thought the metrics-server certificates were signed by the cluster signer.
This is my csr.json:

{
  "hosts": [
    "prometheus-adapter",
    "prometheus-adapter.monitoring",
    "prometheus-adapter.monitoring.svc",
    "prometheus-adapter.monitoring.pod",
    "prometheus-adapter.monitoring.svc.cluster.local",
    "prometheus-adapter.monitoring.pod.cluster.local",
    "<service ip>",
    "<pods ip>"
  ],
  "CN": "prometheus-adapter.monitoring.pod.cluster.local",
  "key": {
    "algo": "ecdsa",
    "size": 256
  }
}
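For reference, following the linked docs, the CSR and private key can be generated from a file like this with cfssl (assuming it is saved as csr.json):

$ cfssl genkey csr.json | cfssljson -bare server

This writes server.csr (used in the CertificateSigningRequest below) and server-key.pem.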

and this is my CertificateSigningRequest:

apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: custom-metrics-certs-req
spec:
  request: $(cat server.csr | base64 | tr -d '\n')
  signerName: kubernetes.io/kube-apiserver-client
  usages:
  - digital signature
  - key encipherment
  - client auth

I then created a TLS secret and mounted it into Prometheus-Adapter.
Should I create another certificate for the metrics-server and mount it there instead?
I didn't disable certificate validation because I wanted to do this securely.
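
For reference, the secret from the step above can be created along these lines (the secret name here is hypothetical):

$ kubectl create secret tls prometheus-adapter-tls -n monitoring \
    --cert=server.crt --key=server-key.pem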

@brandond
Member

I thought metrics-server certificate are signed by the cluster signer.

No, as I said above, it uses a self-signed cert. Certificate validation is already disabled for the metrics-server's core use in providing metrics for the cluster autoscaler and kubectl top functionality; see https://github.com/kubernetes-sigs/metrics-server/blob/master/manifests/base/apiservice.yaml#L12. If you want to go through the work of creating and mounting a valid cert for it, you're welcome to do so, and it would probably fix your problem, but I don't think many folks do that.
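
For reference, the relevant part of the linked manifest looks roughly like this; the insecureSkipTLSVerify: true line is what disables validation of the metrics-server's serving cert:

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100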

@brandond
Member

brandond commented Oct 14, 2022

The README for prometheus-adapter indicates that it can be used to replace the default metrics-server. I'm confused why you're trying to get the adapter to pull stats from the metrics-server when it appears to serve the same purpose and could replace it. What is your end goal in linking the two?

@stale

stale bot commented Apr 12, 2023

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@caroline-suse-rancher
Contributor

Closing this as it's stale, and the component isn't within our scope.
