
Crash in Kube Controller Manager #40

Closed
DaspawnW opened this issue Mar 24, 2019 · 6 comments · Fixed by #67
Comments

@DaspawnW

Actual Behavior

kube-metrics-adapter causes a panic in the kube-controller-manager.

Steps to Reproduce the Problem

  1. Configure a Secret with serving.crt and serving.key whose certificate has common_name kube-metrics-adapter and alt_names kube-metrics-adapter.kube-system,kube-metrics-adapter.kube-system.svc,kube-metrics-adapter.kube-system.svc.cluster.local
  2. Create all the files as described in the docs, but remove --skipper-ingress-metrics and --aws-external-metrics, and add --tls-cert-file=/var/run/serving-cert/serving.crt and --tls-private-key-file=/var/run/serving-cert/serving.key instead
  3. Now you can see the following logs for the kube-metrics-adapter deployment:
time="2019-03-24T15:24:44Z" level=info msg="Looking for HPAs" provider=hpa
I0324 15:24:44.156768       1 serve.go:96] Serving securely on [::]:443
time="2019-03-24T15:24:44Z" level=info msg="Found 6 new/updated HPA(s)" provider=hpa
time="2019-03-24T15:24:44Z" level=info msg="Event(v1.ObjectReference{Kind:\"HorizontalPodAutoscaler\", Namespace:\"microservices\", Name:\"ms-1\", UID:\"f861d0ed-4ce2-11e9-b661-025217a46e36\", APIVersion:\"autoscaling/v2beta1\", ResourceVersion:\"2296595\", FieldPath:\"\"}): type: 'Warning' reason: 'CreateNewMetricsCollector' Failed to create new metrics collector: format '' not supported"
E0324 15:24:51.243240       1 writers.go:149] apiserver was unable to write a JSON response: expected pointer, but got nil
E0324 15:24:51.243267       1 status.go:64] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"expected pointer, but got nil"}
  4. And for the kube-controller-manager:
I0324 15:24:37.134887       6 replica_set.go:477] Too few replicas for ReplicaSet kube-system/kube-metrics-adapter-f6cb64c84, need 1, creating 1
I0324 15:24:37.138649       6 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"kube-system", Name:"kube-metrics-adapter", UID:"ef50f882-4e48-11e9-b661-025217a46e36", APIVersion:"apps/v1", ResourceVersion:"2300163", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled up replica set kube-metrics-adapter-f6cb64c84 to 1
I0324 15:24:37.150869       6 deployment_controller.go:484] Error syncing deployment kube-system/kube-metrics-adapter: Operation cannot be fulfilled on deployments.apps "kube-metrics-adapter": the object has been modified; please apply your changes to the latest version and try again
I0324 15:24:37.161632       6 event.go:221] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"kube-system", Name:"kube-metrics-adapter-f6cb64c84", UID:"ef530b22-4e48-11e9-b661-025217a46e36", APIVersion:"apps/v1", ResourceVersion:"2300164", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: kube-metrics-adapter-f6cb64c84-jt5sf
W0324 15:24:37.731700       6 garbagecollector.go:647] failed to discover some groups: map[custom.metrics.k8s.io/v1beta1:the server is currently unable to handle the request external.metrics.k8s.io/v1beta1:the server is currently unable to handle the request]
I0324 15:24:51.233692       6 horizontal.go:777] Successfully updated status for istio-telemetry-autoscaler
E0324 15:24:51.245935       6 runtime.go:69] Observed a panic: &runtime.TypeAssertionError{_interface:(*runtime._type)(0x334f2e0), concrete:(*runtime._type)(0x39236e0), asserted:(*runtime._type)(0x390c980), missingMethod:""} (interface conversion: runtime.Object is *v1.Status, not *v1beta2.MetricValueList)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:76
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/usr/local/go/src/runtime/asm_amd64.s:522
/usr/local/go/src/runtime/panic.go:513
/usr/local/go/src/runtime/iface.go:248
/usr/local/go/src/runtime/iface.go:258
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/metrics/pkg/client/custom_metrics/versioned_client.go:269
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/metrics/pkg/client/custom_metrics/multi_client.go:136
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/metrics/rest_metrics_client.go:113
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/replica_calculator.go:158
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:347
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:274
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:550
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:318
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:210
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:198
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:164
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/usr/local/go/src/runtime/asm_amd64.s:1333
panic: interface conversion: runtime.Object is *v1.Status, not *v1beta2.MetricValueList [recovered]
	panic: interface conversion: runtime.Object is *v1.Status, not *v1beta2.MetricValueList

goroutine 3060 [running]:
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x108
panic(0x3315300, 0xc006b86030)
	/usr/local/go/src/runtime/panic.go:513 +0x1b9
k8s.io/kubernetes/vendor/k8s.io/metrics/pkg/client/custom_metrics.(*namespacedMetrics).GetForObjects(0xc006916660, 0x0, 0x0, 0x3a216bc, 0x3, 0x3efb4a0, 0xc005cb7e00, 0xc005614e20, 0x19, 0x3efb500, ...)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/metrics/pkg/client/custom_metrics/versioned_client.go:269 +0x4c6
k8s.io/kubernetes/vendor/k8s.io/metrics/pkg/client/custom_metrics.(*multiClientInterface).GetForObjects(0xc004c448b0, 0x0, 0x0, 0x3a216bc, 0x3, 0x3efb4a0, 0xc005cb7e00, 0xc005614e20, 0x19, 0x3efb500, ...)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/metrics/pkg/client/custom_metrics/multi_client.go:136 +0x118
k8s.io/kubernetes/pkg/controller/podautoscaler/metrics.(*customMetricsClient).GetRawMetric(0xc0003f44d0, 0xc005614e20, 0x19, 0xc0047b20a0, 0xd, 0x3efb4a0, 0xc005cb7e00, 0x3efb500, 0x6747b10, 0x0, ...)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/metrics/rest_metrics_client.go:113 +0x102
k8s.io/kubernetes/pkg/controller/podautoscaler.(*ReplicaCalculator).GetMetricReplicas(0xc00095af40, 0x1, 0x2710, 0xc005614e20, 0x19, 0xc0047b20a0, 0xd, 0x3efb4a0, 0xc005cb7e00, 0x3efb500, ...)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/replica_calculator.go:158 +0xb0
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).computeStatusForPodsMetric(0xc0005f2780, 0x1, 0xc0064a8918, 0x4, 0x0, 0xc004d34cc0, 0x0, 0x0, 0xc0056ed340, 0x3efb4a0, ...)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:347 +0xd8
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).computeReplicasForMetrics(0xc0005f2780, 0xc0056ed340, 0xc00690e500, 0xc004802cf0, 0x1, 0x1, 0x11, 0x3af8df0, 0x3d, 0x0, ...)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:274 +0xb10
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).reconcileAutoscaler(0xc0005f2780, 0xc0002dab30, 0xc0046c96a0, 0x12, 0x0, 0x0)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:550 +0x1678
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).reconcileKey(0xc0005f2780, 0xc0046c96a0, 0x12, 0x30439c0, 0xc004798750, 0x0)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:318 +0x278
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).processNextWorkItem(0xc0005f2780, 0x3eb7700)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:210 +0xdf
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).worker(0xc0005f2780)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:198 +0x2b
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).worker-fm()
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:164 +0x2a
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc004819940)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x54
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc004819940, 0x3b9aca00, 0x0, 0x1, 0xc000394900)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134 +0xbe
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc004819940, 0x3b9aca00, 0xc000394900)
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).Run
	/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:164 +0x1c6
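The crash above boils down to a failed Go type assertion: the custom-metrics client at versioned_client.go:269 asserts that the decoded response body is a *v1beta2.MetricValueList, but the adapter actually returned a *v1.Status. A minimal sketch of that failure mode, using hypothetical stand-in types rather than the real k8s.io/apimachinery ones:

```go
package main

import "fmt"

// Hypothetical stand-ins for the decoded API objects.
type runtimeObject interface{ Kind() string }

type status struct{}

func (status) Kind() string { return "Status" }

type metricValueList struct{}

func (metricValueList) Kind() string { return "MetricValueList" }

func main() {
	// The server answered HTTP 200, but the body decoded to a Status object.
	var obj runtimeObject = status{}

	// An unchecked assertion, like the one in the client, would panic here:
	//   mvl := obj.(metricValueList)
	// The comma-ok form turns the crash into a handleable error:
	if mvl, ok := obj.(metricValueList); ok {
		fmt.Println("got metrics:", mvl.Kind())
	} else {
		fmt.Printf("unexpected response type %T\n", obj)
	}
}
```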

Specifications

  • Version: registry.opensource.zalan.do/teapot/kube-metrics-adapter:latest
  • Platform: Kubernetes 1.13.4
  • Subsystem: Ubuntu
@szuecs
Member

szuecs commented Mar 24, 2019

I guess the problem is the Kubernetes version. We run 1.12 right now, and I bet it doesn't work because of that.
Thanks for reporting!

@mikkeloscar
Contributor

mikkeloscar commented Apr 30, 2019

@DaspawnW What do your HPAs look like? We are running kube-metrics-adapter on a few v1.13.5 clusters and haven't noticed this, so it would be cool to know how to replicate the issue.

@szuecs
Member

szuecs commented May 2, 2019

I remember seeing this issue in our early 1.13 tries. Maybe it was fixed in v1.13.5. @DaspawnW can you test it on v1.13.5, please?

@shyamjvs

We were facing a similar issue. I've explained the root cause here: kubernetes/kubernetes#80392 (comment)

@shyamjvs

shyamjvs commented Jul 24, 2019

There seems to be a bug in kube-metrics-adapter: it responds with a 200 status code even when it failed to serve the metrics, and instead silently sets 500 inside the response body. The response body is therefore actually a *v1.Status, which fails to be type-asserted to a MetricValueList inside the client and so causes the panic. kubernetes/kubernetes#80392 should fix the client-side panic.

On the adapter side, a fix is needed to set the response code properly.

mikkeloscar added a commit that referenced this issue Jul 26, 2019
Fixes the response from `GetMetricsBySelector` in case no metrics are
found. This issue caused a panic in kube-controller-manager:
kubernetes/kubernetes#80392

Fix #40

Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
@mikkeloscar
Contributor

@shyamjvs thanks a lot for looking into this. I was able to understand the issue based on your description and created a fix in #67

mikkeloscar added a commit that referenced this issue Jul 26, 2019
szuecs pushed a commit that referenced this issue Jul 26, 2019
4 participants