
Latest version metrics-server failed with goroutine error #637

Closed
system-dev-formations opened this issue Nov 13, 2020 · 12 comments

@system-dev-formations

I installed metrics-server on K8s 1.19.3 using
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.4.0/components.yaml
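
(A quick way to check the rollout afterwards, assuming the default kube-system namespace and the k8s-app=metrics-server label used by the release manifest:)

kubectl -n kube-system get pods -l k8s-app=metrics-server
kubectl -n kube-system describe deployment metrics-server

# if everything is healthy, node metrics should show up after a minute or so
kubectl top nodes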

What happened:

What you expected to happen:
Pod ok
Anything else we need to know?:

Environment:

  • Kubernetes distribution (GKE, EKS, Kubeadm, the hard way, etc.): on-prem
  • Container Network Setup (flannel, calico, etc.): calico
  • Kubernetes version (use kubectl version): 1.19.3

k logs -n kube-system metrics-server-844d9574cf-9nbd9
E1113 09:06:37.506543 1 pathrecorder.go:107] registered "/metrics" from goroutine 1 [running]:
runtime/debug.Stack(0x1942e80, 0xc000830240, 0x1bb58b5)
	/usr/local/go/src/runtime/debug/stack.go:24 +0x9d
k8s.io/apiserver/pkg/server/mux.(*PathRecorderMux).trackCallers(0xc000255b20, 0x1bb58b5, 0x8)
	/go/pkg/mod/k8s.io/apiserver@v0.19.2/pkg/server/mux/pathrecorder.go:109 +0x86
k8s.io/apiserver/pkg/server/mux.(*PathRecorderMux).Handle(0xc000255b20, 0x1bb58b5, 0x8, 0x1e96f00, 0xc0005169c0)
	/go/pkg/mod/k8s.io/apiserver@v0.19.2/pkg/server/mux/pathrecorder.go:173 +0x84
k8s.io/apiserver/pkg/server/routes.MetricsWithReset.Install(0xc000255b20)
	/go/pkg/mod/k8s.io/apiserver@v0.19.2/pkg/server/routes/metrics.go:43 +0x5d
k8s.io/apiserver/pkg/server.installAPI(0xc00000a1e0, 0xc000441440)
	/go/pkg/mod/k8s.io/apiserver@v0.19.2/pkg/server/config.go:711 +0x6c
k8s.io/apiserver/pkg/server.completedConfig.New(0xc000441440, 0x1f099c0, 0xc0006bfe00, 0x1bbdb5a, 0xe, 0x1ef29e0, 0x2cef248, 0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/apiserver@v0.19.2/pkg/server/config.go:657 +0xb45
sigs.k8s.io/metrics-server/pkg/server.Config.Complete(0xc000441440, 0xc000523440, 0xc000523b00, 0xdf8475800, 0xc92a69c00, 0x0, 0x0, 0xdf8475800)
	/go/src/sigs.k8s.io/metrics-server/pkg/server/config.go:52 +0x312
sigs.k8s.io/metrics-server/cmd/metrics-server/app.runCommand(0xc0002dadc0, 0xc00009ca20, 0x0, 0x0)
	/go/src/sigs.k8s.io/metrics-server/cmd/metrics-server/app/start.go:66 +0x157
sigs.k8s.io/metrics-server/cmd/metrics-server/app.NewMetricsServerCommand.func1(0xc00067e580, 0xc000132a00, 0x0, 0x4, 0x0, 0x0)
	/go/src/sigs.k8s.io/metrics-server/cmd/metrics-server/app/start.go:37 +0x33
github.com/spf13/cobra.(*Command).execute(0xc00067e580, 0xc00004c0b0, 0x4, 0x4, 0xc00067e580, 0xc00004c0b0)
	/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:842 +0x453
github.com/spf13/cobra.(*Command).ExecuteC(0xc00067e580, 0xc00008a180, 0x0, 0x0)
	/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950 +0x349
github.com/spf13/cobra.(*Command).Execute(...)
	/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887
main.main()
	/go/src/sigs.k8s.io/metrics-server/cmd/metrics-server/metrics-server.go:38 +0xae
I1113 09:06:37.637776 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1113 09:06:37.637802 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I1113 09:06:37.637831 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1113 09:06:37.637837 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1113 09:06:37.637872 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1113 09:06:37.637877 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1113 09:06:37.638850 1 secure_serving.go:197] Serving securely on [::]:4443
I1113 09:06:37.639030 1 dynamic_serving_content.go:130] Starting serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key
I1113 09:06:37.639158 1 tlsconfig.go:240] Starting DynamicServingCertificateController
E1113 09:06:37.644033 1 server.go:132] unable to fully scrape metrics: [unable to fully scrape metrics from node k8s-pink-node-1: unable to fetch metrics from node k8s-pink-node-1: Get "https://172.20.14.156:10250/stats/summary?only_cpu_and_memory=true": x509: cannot validate certificate for 172.20.14.156 because it doesn't contain any IP SANs, unable to fully scrape metrics from node k8s-pink-node-2: unable to fetch metrics from node k8s-pink-node-2: Get "https://172.20.14.206:10250/stats/summary?only_cpu_and_memory=true": x509: cannot validate certificate for 172.20.14.206 because it doesn't contain any IP SANs, unable to fully scrape metrics from node k8s-pink-master: unable to fetch metrics from node k8s-pink-master: Get "https://172.20.14.125:10250/stats/summary?only_cpu_and_memory=true": x509: cannot validate certificate for 172.20.14.125 because it doesn't contain any IP SANs]
I1113 09:06:37.738064 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1113 09:06:37.738098 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I1113 09:06:37.738148 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1113 09:07:06.822467 1 configmap_cafile_content.go:223] Shutting down client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1113 09:07:06.822670 1 requestheader_controller.go:183] Shutting down RequestHeaderAuthRequestController
I1113 09:07:06.822720 1 configmap_cafile_content.go:223] Shutting down client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1113 09:07:06.823078 1 dynamic_serving_content.go:145] Shutting down serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key
I1113 09:07:06.823095 1 tlsconfig.go:255] Shutting down DynamicServingCertificateController
I1113 09:07:06.823167 1 secure_serving.go:241] Stopped listening on [::]:4443
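
(The x509 errors above mean the kubelet serving certificates on those nodes carry no IP SANs, so metrics-server cannot verify them when scraping https://<node-ip>:10250. A minimal sketch of the common workarounds in the container args, assuming the deployment can be edited; note that --kubelet-insecure-tls disables kubelet certificate verification and is only appropriate for trusted clusters:)

    spec:
      containers:
        - name: metrics-server
          args:
            - --cert-dir=/tmp
            - --secure-port=4443
            # either skip kubelet certificate verification entirely...
            - --kubelet-insecure-tls
            # ...or prefer address types the kubelet certificate can actually match
            - --kubelet-preferred-address-types=Hostname,InternalDNS,InternalIP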

@serathius
Contributor

serathius commented Nov 13, 2020

Thanks for reporting this. The issue was already triaged and fixed in #630. This stack trace is totally harmless.
We will release 0.4.1 with the fix soon.
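
(Once 0.4.1 is out, upgrading should be the same one-liner against the newer release manifest, assuming the usual release layout:)

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.4.1/components.yaml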

@lboix

lboix commented Nov 30, 2020

@system-dev-formations when you described that pod, was it unhealthy and restarting in a loop because of the events Readiness probe failed: HTTP probe failed with statuscode: 403 and Liveness probe failed: HTTP probe failed with statuscode: 403?

I am currently facing this case with 0.4.1; the pod logs are:

kubectl logs -f metrics-server-7644b87b46-rx962 -n kube-system
I1130 21:21:37.127526       1 serving.go:325] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I1130 21:21:38.378578       1 secure_serving.go:197] Serving securely on [::]:4443
I1130 21:21:38.378794       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1130 21:21:38.378861       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I1130 21:21:38.378938       1 dynamic_serving_content.go:130] Starting serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key
I1130 21:21:38.379015       1 tlsconfig.go:240] Starting DynamicServingCertificateController
I1130 21:21:38.379116       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1130 21:21:38.379175       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1130 21:21:38.379254       1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1130 21:21:38.379318       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1130 21:21:38.479382       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I1130 21:21:38.479412       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
I1130 21:21:38.479382       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController 

# a few seconds with no logging, then the pod restarts right after logging these last lines:

I1130 21:22:04.475279       1 configmap_cafile_content.go:223] Shutting down client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I1130 21:22:04.475333       1 tlsconfig.go:255] Shutting down DynamicServingCertificateController
I1130 21:22:04.475339       1 dynamic_serving_content.go:145] Shutting down serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key
I1130 21:22:04.475365       1 secure_serving.go:241] Stopped listening on [::]:4443

The container args in my metrics-server.yaml are:

    spec:
      containers:
        - args:
            - --cert-dir=/tmp
            - --secure-port=4443
            - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
            - --kubelet-use-node-status-port
            - --kubelet-insecure-tls
          image: k8s.gcr.io/metrics-server/metrics-server:v0.4.1

Everything works properly with the 0.3.2 version of metrics-server and its setup (my Kubernetes version is 1.14), but of course I would love to use this up-to-date 0.4.1 version.

I am Googling everything I can regarding "metrics-server Readiness (or Liveness) probe failed", but most of the results are about a 500 or 503 HTTP code rather than a 403. Do you have any advice @system-dev-formations @serathius?

Thanks for your time reading this, and I wish both of you a great day!

@serathius
Contributor

Please file a separate issue.

@lboix

lboix commented Dec 1, 2020

@serathius I will, you are right :)

@jeff-french

For anyone landing here from a search: I had the exact same 403 issue as @lboix above. Metrics-server was saying that the liveness and readiness requests to /livez and /readyz were unauthorized. I fixed it by adding --authorization-always-allow-paths=/livez,/readyz to the args in the component.yml.
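
(For reference, a sketch of where that flag sits in the deployment args, reusing the args @lboix quoted above plus the extra line:)

    spec:
      containers:
        - args:
            - --cert-dir=/tmp
            - --secure-port=4443
            - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
            - --kubelet-use-node-status-port
            - --kubelet-insecure-tls
            # let the kubelet's liveness/readiness probes reach these paths without authorization
            - --authorization-always-allow-paths=/livez,/readyz
          image: k8s.gcr.io/metrics-server/metrics-server:v0.4.1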

@lboix

lboix commented Dec 4, 2020

Works like a charm, thanks for your help and time here @jeff-french!

@charstal

@jeff-french that works for me, thank you!

@icy

icy commented Mar 16, 2021

For anyone landing here from a search: I had the exact same 403 issue as @lboix above. Metrics-server was saying that the liveness and readiness requests to /livez and /readyz were unauthorized. I fixed it by adding --authorization-always-allow-paths=/livez,/readyz to the args in the component.yml.

Thanks a ton! I spent more than 2 hours on this issue before coming across your note. Oh my, time to update!

@VladoPortos

@lboix Did you solve the issue? I'm facing exactly the same error message. I mean, it's not even an error message, it just shuts down without one.. even at log level 10 I can see it scraping the data and then shutting down... it's frustrating as hell.

@lboix

lboix commented Aug 31, 2021

@lboix Did you solve the issue? I'm facing exactly the same error message. I mean, it's not even an error message, it just shuts down without one.. even at log level 10 I can see it scraping the data and then shutting down... it's frustrating as hell.

Yes I did, following @jeff-french's tip above: #637 (comment)

If you run kubectl logs --previous POD on your metrics-server pod that keeps restarting, do you see an error related to the readiness or liveness probe at the bottom of the output? That was exactly the symptom of my issue.
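
For example (the pod name is the one from my logs above; substitute your own):

# logs from the previous, crashed container instance
kubectl logs --previous -n kube-system metrics-server-7644b87b46-rx962

# probe failure events also show up in the pod description
kubectl describe pod -n kube-system metrics-server-7644b87b46-rx962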

@VladoPortos

@lboix that did not help me; what finally got it working for me was changing these (see the probe sketch just after):

          initialDelaySeconds: 300
          periodSeconds: 30
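
(In the deployment that means the probe blocks end up looking roughly like this; the /livez and /readyz paths and the HTTPS port are the ones the v0.4.x manifest already uses, only the two timing values change:)

          livenessProbe:
            httpGet:
              path: /livez
              port: https
              scheme: HTTPS
            initialDelaySeconds: 300
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /readyz
              port: https
              scheme: HTTPS
            initialDelaySeconds: 300
            periodSeconds: 30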

@lboix

lboix commented Sep 1, 2021

OK I see! I am glad you worked it out @VladoPortos :)
