K8s kube-apiserver 1.12.2 nil pointer crash #70806

Open
RRAlex opened this Issue Nov 8, 2018 · 14 comments

RRAlex commented Nov 8, 2018

What happened:
The cluster started logging a lot of NODE DENY errors corresponding to
https://github.com/kubernetes/kubernetes/blob/master/plugin/pkg/auth/authorizer/node/node_authorizer.go#L198
Restarting kube-apiserver makes another one crash in a 3-apiserver setup
(e.g. systemctl restart kube-apiserver on controller-1 will crash controller 2 or 3).

It's as if nodes were shown as Ready but were no longer registered?!

What you expected to happen:
The cluster to keep working as it did in the weeks before.

How to reproduce it (as minimally and precisely as possible):
I haven't fixed this yet, and pods are stuck in Pending most of the time, even after restarting controllers, workers, etc.

Anything else we need to know?:

2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.350953       1 log.go:172] http: TLS handshake error from 172.27.39.86:42434: EOF
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.356022       1 log.go:172] http: TLS handshake error from 172.27.39.91:41646: EOF
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.360645       1 log.go:172] http: TLS handshake error from 172.27.39.82:51580: EOF
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.365578       1 log.go:172] http: TLS handshake error from 172.27.39.86:42436: EOF
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.371563       1 log.go:172] http: TLS handshake error from 172.27.39.91:41648: EOF
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.377407       1 node_authorizer.go:198] NODE DENY: unknown node "staging-worker-001" cannot get secret kube-system-monitoring/fluentd-es-token-dblxs
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.392709       1 node_authorizer.go:198] NODE DENY: unknown node "staging-monitoring-001" cannot get secret kube-system/weave-net-token-dxhg6
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.409043       1 log.go:172] http2: server: error reading preface from client 172.27.39.92:42038: read tcp 172.27.39.86:6443->172.27.39.92:42038: read: connection reset by peer
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.441335       1 node_authorizer.go:198] NODE DENY: unknown node "staging-worker-007" cannot get secret kube-system-monitoring/default-token-2hg82
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.462884       1 controller_utils.go:1034] Caches are synced for crd-autoregister controller
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.552797       1 cache.go:39] Caches are synced for autoregister controller
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.554138       1 cache.go:39] Caches are synced for APIServiceRegistrationController controller
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.562426       1 cache.go:39] Caches are synced for AvailableConditionController controller
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: W1108 15:47:33.581930       1 lease.go:226] Resetting endpoints for master service "kubernetes" to [172.27.39.79 172.27.39.85 172.27.39.86]
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: E1108 15:47:33.639877       1 memcache.go:134] couldn't get resource list for custom.metrics.k8s.io/v1beta1: the server could not find the requested resource
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: E1108 15:47:33.643202       1 memcache.go:134] couldn't get resource list for metrics.k8s.io/v1beta1: the server could not find the requested resource
2018-11-08T10:47:33-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:33.664684       1 storage_scheduling.go:100] all system priority classes are created successfully or already exist.
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: E1108 15:47:34.651498       1 runtime.go:66] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /usr/local/go/src/runtime/asm_amd64.s:573
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /usr/local/go/src/runtime/panic.go:502
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:105
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/waitgroup.go:47
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /usr/local/go/src/net/http/server.go:1947
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/requestinfo.go:39
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /usr/local/go/src/net/http/server.go:1947
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/wrap.go:41
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /usr/local/go/src/net/http/server.go:1947
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/handler.go:189
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /usr/local/go/src/net/http/server.go:2697
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /usr/local/go/src/net/http/server.go:3263
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /usr/local/go/src/net/http/h2_bundle.go:5475
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/server.go:2053
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/server.go:2053
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: /usr/local/go/src/runtime/asm_amd64.s:2361
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: E1108 15:47:34.651767       1 wrap.go:34] apiserver panic'd on GET /apis/autoscaling/v2beta1/horizontalpodautoscalers?limit=500&resourceVersion=0: runtime error: invalid memory address or nil pointer dereference
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: goroutine 8074 [running]:
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: runtime/debug.Stack(0x54819e0, 0xc42ee859d0, 0x3ed4407)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/runtime/debug/stack.go:24 +0xa7
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.WithPanicRecovery.func1.1(0x360fc20, 0x739e7b0)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/wrap.go:34 +0x74
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0xc42fe91cf0, 0x1, 0x1)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:54 +0xc7
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: panic(0x360fc20, 0x739e7b0)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/runtime/panic.go:502 +0x229
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP(0xc428379c40, 0x54819e0, 0xc42ee859d0, 0xc425fa9800)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:105 +0x404
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.WithWaitGroup.func1(0x54819e0, 0xc42ee859d0, 0xc425fa9700)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/waitgroup.go:47 +0xd4
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: net/http.HandlerFunc.ServeHTTP(0xc424fdf140, 0x54819e0, 0xc42ee859d0, 0xc425fa9700)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/net/http/server.go:1947 +0x44
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithRequestInfo.func1(0x54819e0, 0xc42ee859d0, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/requestinfo.go:39 +0x181
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: net/http.HandlerFunc.ServeHTTP(0xc424fdf170, 0x54819e0, 0xc42ee859d0, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/net/http/server.go:1947 +0x44
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.WithPanicRecovery.func1(0x54819e0, 0xc42ee859d0, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/wrap.go:41 +0x108
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: net/http.HandlerFunc.ServeHTTP(0xc428379c80, 0x5476ca0, 0xc42d199c80, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/net/http/server.go:1947 +0x44
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server.(*APIServerHandler).ServeHTTP(0xc424fdf1a0, 0x5476ca0, 0xc42d199c80, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/handler.go:189 +0x51
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: net/http.serverHandler.ServeHTTP(0xc424005e10, 0x5476ca0, 0xc42d199c80, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/net/http/server.go:2697 +0xbc
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: net/http.initNPNRequest.ServeHTTP(0xc422834000, 0xc424005e10, 0x5476ca0, 0xc42d199c80, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/net/http/server.go:3263 +0x9a
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: net/http.(Handler).ServeHTTP-fm(0x5476ca0, 0xc42d199c80, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/net/http/h2_bundle.go:5475 +0x4d
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*serverConn).runHandler(0xc42337ba40, 0xc42d199c80, 0xc425fa8b00, 0xc42590fbe0)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/server.go:2053 +0x89
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: created by k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*serverConn).processHeaders
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/server.go:1787 +0x46b
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]:
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: I1108 15:47:34.651887       1 log.go:172] http2: panic serving 172.27.39.91:46918: runtime error: invalid memory address or nil pointer dereference
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: goroutine 8074 [running]:
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*serverConn).runHandler.func1(0xc42d199c80, 0xc42fe91faf, 0xc42337ba40)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/server.go:2046 +0x190
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: panic(0x360fc20, 0x739e7b0)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/runtime/panic.go:502 +0x229
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0xc42fe91cf0, 0x1, 0x1)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x107
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: panic(0x360fc20, 0x739e7b0)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/runtime/panic.go:502 +0x229
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP(0xc428379c40, 0x54819e0, 0xc42ee859d0, 0xc425fa9800)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:105 +0x404
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.WithWaitGroup.func1(0x54819e0, 0xc42ee859d0, 0xc425fa9700)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/waitgroup.go:47 +0xd4
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: net/http.HandlerFunc.ServeHTTP(0xc424fdf140, 0x54819e0, 0xc42ee859d0, 0xc425fa9700)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/net/http/server.go:1947 +0x44
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithRequestInfo.func1(0x54819e0, 0xc42ee859d0, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/requestinfo.go:39 +0x181
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: net/http.HandlerFunc.ServeHTTP(0xc424fdf170, 0x54819e0, 0xc42ee859d0, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/net/http/server.go:1947 +0x44
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.WithPanicRecovery.func1(0x54819e0, 0xc42ee859d0, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/wrap.go:41 +0x108
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: net/http.HandlerFunc.ServeHTTP(0xc428379c80, 0x5476ca0, 0xc42d199c80, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/net/http/server.go:1947 +0x44
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server.(*APIServerHandler).ServeHTTP(0xc424fdf1a0, 0x5476ca0, 0xc42d199c80, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/handler.go:189 +0x51
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: net/http.serverHandler.ServeHTTP(0xc424005e10, 0x5476ca0, 0xc42d199c80, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/net/http/server.go:2697 +0xbc
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: net/http.initNPNRequest.ServeHTTP(0xc422834000, 0xc424005e10, 0x5476ca0, 0xc42d199c80, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/net/http/server.go:3263 +0x9a
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: net/http.(Handler).ServeHTTP-fm(0x5476ca0, 0xc42d199c80, 0xc425fa8b00)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/usr/local/go/src/net/http/h2_bundle.go:5475 +0x4d
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*serverConn).runHandler(0xc42337ba40, 0xc42d199c80, 0xc425fa8b00, 0xc42590fbe0)
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/server.go:2053 +0x89
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: created by k8s.io/kubernetes/vendor/golang.org/x/net/http2.(*serverConn).processHeaders
2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: #011/workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/golang.org/x/net/http2/server.go:1787 +0x46b
2018-11-08T10:47:35-05:00 localhost docker/kube-apiserver[1131]: E1108 15:47:35.930814       1 runtime.go:66] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
2018-11-08T10:47:35-05:00 localhost docker/kube-apiserver[1131]: /workspace/anago-v1.12.2-beta.0.59+17c77c78982180/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72

Environment:

  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-24T06:54:59Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-24T06:43:59Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}

  • Cloud provider or hardware configuration:
10 workers with 64GB each, and 3 controller VMs.

  • OS (e.g. from /etc/os-release):
    NAME="Ubuntu"
    VERSION="16.04.5 LTS (Xenial Xerus)"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 16.04.5 LTS"
    VERSION_ID="16.04"
    HOME_URL="http://www.ubuntu.com/"
    SUPPORT_URL="http://help.ubuntu.com/"
    BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
    VERSION_CODENAME=xenial
    UBUNTU_CODENAME=xenial

  • Kernel (e.g. uname -a):
    Linux staging-controller-delve-002 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  • Install tools:
    ansible

  • Others:

/kind bug

RRAlex changed the title from "K8s 1.12.2 nil pointer crash" to "K8s kube-apiserver 1.12.2 nil pointer crash" on Nov 8, 2018

RRAlex commented Nov 8, 2018

/sig api-machinery

yue9944882 commented Nov 9, 2018

/cc @liggitt

Seems that the graph in the node authorizer is not ready to handle requests; probably respecting the informer's HasSynced func could make it initialize gracefully. However, the log you provided doesn't show why the apiserver crashed. Could you provide more detail?
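
For illustration, here is a minimal Go sketch of the pattern being suggested. This is not the actual kube-apiserver wiring; the package, function, and variable names are made up, and the graph's real inputs go beyond pods and nodes. The idea is simply to start the informers that feed the authorizer graph and wait for their caches to sync before answering authorization requests:

```go
// Illustrative only: gate an informer-backed authorizer on cache sync so it
// does not answer requests from an empty graph right after startup.
package nodeauthzsketch

import (
	"fmt"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// WaitForAuthorizerGraph starts the shared informers and blocks until the pod
// and node caches (two of the inputs the graph is built from) have populated.
func WaitForAuthorizerGraph(client kubernetes.Interface, stopCh <-chan struct{}) error {
	factory := informers.NewSharedInformerFactory(client, 0)

	podsSynced := factory.Core().V1().Pods().Informer().HasSynced
	nodesSynced := factory.Core().V1().Nodes().Informer().HasSynced

	factory.Start(stopCh)

	// Until both HasSynced funcs return true, the graph may deny requests for
	// nodes it simply has not seen yet ("NODE DENY: unknown node ...").
	if !cache.WaitForCacheSync(stopCh, podsSynced, nodesSynced) {
		return fmt.Errorf("timed out waiting for node-authorizer caches to sync")
	}
	return nil
}
```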

liggitt commented Nov 9, 2018

Seems that the graph in the node authorizer is not ready to handle requests; probably respecting the informer's HasSynced func could make it initialize gracefully

if this occurred right at server start, it is expected the graph will take a short amount of time to populate

RRAlex commented Nov 9, 2018

If you tell me what more info you need, I'll be glad to provide it. Maybe the apiserver flags used?

/usr/bin/docker run --rm \
  -v /var/lib/kubernetes:/var/lib/kubernetes \
  -v /etc/hosts:/etc/hosts \
  -p 6443:6443 \
  -p 8080:8080 \
  --net=host \
  --name kube-apiserver \
  gcr.io/google-containers/kube-apiserver-amd64:v1.12.2 \
   /usr/local/bin/kube-apiserver \
    --admission-control=Initializers,NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,ResourceQuota,DefaultTolerationSeconds \
    --advertise-address=172.27.39.86 \
    --allow-privileged=true \
    --apiserver-count=3 \
    --event-ttl=1h \
    --audit-log-maxage=30 \
    --audit-log-maxbackup=3 \
    --audit-log-maxsize=100 \
    --audit-log-path=/var/log/audit.log \
    --authorization-mode=Node,RBAC \
    --bind-address=0.0.0.0 \
    --insecure-bind-address=127.0.0.1 \
    --client-ca-file=/var/lib/kubernetes/ca.pem \
    --etcd-cafile=/var/lib/kubernetes/ca.pem \
    --etcd-certfile=/var/lib/kubernetes/kubernetes-ca-bundle.pem \
    --etcd-keyfile=/var/lib/kubernetes/kubernetes-key.pem \
    --etcd-servers=https://172.27.39.86:2379,https://172.27.39.79:2379,https://172.27.39.85:2379 \
    --experimental-encryption-provider-config=/var/lib/kubernetes/encryption-config.yml \
    --kubelet-certificate-authority=/var/lib/kubernetes/ca.pem \
    --kubelet-client-certificate=/var/lib/kubernetes/kubernetes-ca-bundle.pem \
    --kubelet-client-key=/var/lib/kubernetes/kubernetes-key.pem \
    --kubelet-https=true \
    --runtime-config=api/all \
    --feature-gates=PersistentLocalVolumes=true \
    --service-account-key-file=/var/lib/kubernetes/ca-key.pem \
    --service-cluster-ip-range=10.0.0.0/16 \
    --service-node-port-range=80-32767 \
    --tls-cert-file=/var/lib/kubernetes/kubernetes-ca-bundle.pem \
    --tls-private-key-file=/var/lib/kubernetes/kubernetes-key.pem \
    --v=2 \
    --requestheader-client-ca-file=/var/lib/kubernetes/aggregator-ca/aggregator-ca.pem \
    --requestheader-allowed-names=system:auth-proxy \
    --requestheader-extra-headers-prefix=X-Remote-Extra- \
    --requestheader-group-headers=X-Remote-Group \
    --requestheader-username-headers=X-Remote-User \
    --proxy-client-cert-file=/var/lib/kubernetes/aggregator-ca/aggregator-proxy-client.pem \
    --proxy-client-key-file=/var/lib/kubernetes/aggregator-ca/aggregator-proxy-client-key.pem

Also, if I restart the one that is crash-dumping, one of the other two will start outputting the same dump.

Looking back at the logs, this seems to have started when I moved from 1.10.3 to 1.12.1 (now 1.12.2); I just hadn't realized it until recently, when a big cluster-wide problem yesterday forced us to investigate some more.
But it is still outputting this same dump non-stop, even though the cluster seems to be working at the moment and that API server does respond to kubectl commands without issues.

Any specific info you'd like me to get?

yue9944882 commented Nov 14, 2018

From the stack trace, sadly we can't find out which exact line triggered that nil pointer panic. The behavior of the apiserver can be unstable on a fresh start (due to RBAC authz, CRD/APIService registration, etc.). Are we planning a graceful start mechanism for the server? @kubernetes/sig-api-machinery-misc

liggitt commented Nov 14, 2018

are we planning a graceful start mechanism for the server?

the apiserver /healthz endpoint is used to determine if the server is ready to accept traffic, and will not return a 200 response until those startup tasks have completed
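
For illustration, a minimal Go sketch of that mechanism: poll /healthz and treat the instance as ready (e.g. eligible for load-balancer traffic) only once it returns 200. The URL, retry count, and the InsecureSkipVerify transport are placeholders, and depending on the cluster's anonymous-auth/RBAC settings the request may need credentials.

```go
// Illustrative only: wait for the apiserver's /healthz to report 200 before
// treating the instance as ready.
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"time"
)

func waitForHealthz(url string, attempts int) error {
	client := &http.Client{
		Timeout: 2 * time.Second,
		// Skipping certificate verification only to keep the sketch short.
		Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}},
	}
	for i := 0; i < attempts; i++ {
		resp, err := client.Get(url)
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil // startup tasks have completed
			}
		}
		time.Sleep(time.Second)
	}
	return fmt.Errorf("%s never returned 200", url)
}

func main() {
	if err := waitForHealthz("https://127.0.0.1:6443/healthz", 30); err != nil {
		fmt.Println(err)
	}
}
```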

RRAlex commented Nov 14, 2018

/healthz does respond ok (200), while still crash-dump-looping.

EDIT: anything I can test specifically to help differentiate between RBAC authz, crd-apiservice reg, etc?

liggitt commented Nov 14, 2018

/healthz does respond ok (200), while still crash-dump-looping.

right, I was saying that is the mechanism for graceful startup.

unfortunately, the way golang works, if you propagate a panic (as we do in the timeout handler), you lose all stack information from it by default. Some of our panic propagation code logs or preserves that, but not all. I'll open a fix to preserve that information.
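
For illustration, a small self-contained Go sketch of that behavior (the names are made up and it only mimics the timeout filter's re-panic): unless the stack is captured with debug.Stack() at the point of recovery, re-raising the panic later reports only the frames of the code that re-raised it, not the original nil dereference.

```go
// Illustrative only: shows how re-panicking loses the original stack unless
// it is captured at recover time.
package main

import (
	"fmt"
	"runtime/debug"
)

func handler() {
	var obj *int
	_ = *obj // the real nil dereference happens somewhere like this
}

func timeoutFilterLike() {
	defer func() {
		if r := recover(); r != nil {
			// Capture the stack *here*; once we re-panic, the original frames
			// (pointing at handler) are no longer part of the panic.
			fmt.Printf("recovered panic: %v\n%s\n", r, debug.Stack())
			panic(r)
		}
	}()
	handler()
}

func main() {
	defer func() { recover() }() // keep the sketch from actually crashing
	timeoutFilterLike()
}
```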

we can see the request that errored, though:

2018-11-08T10:47:34-05:00 localhost docker/kube-apiserver[1131]: E1108 15:47:34.651767 1 wrap.go:34] apiserver panic'd on GET /apis/autoscaling/v2beta1/horizontalpodautoscalers?limit=500&resourceVersion=0: runtime error: invalid memory address or nil pointer dereference

can you see if the same resource (autoscaling/v2beta1/horizontalpodautoscalers) is what causes the panic each time? if so, can you check that resource per namespace to isolate a particular problematic object?
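
For illustration, a hypothetical client-go sketch of that per-namespace check (assuming a recent client-go; the kubeconfig path and namespace list are placeholders): list autoscaling/v2beta1 horizontalpodautoscalers one namespace at a time to narrow down where the server-side panic is triggered.

```go
// Illustrative diagnostic only: list HPAs namespace by namespace.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		panic(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	hpaGVR := schema.GroupVersionResource{Group: "autoscaling", Version: "v2beta1", Resource: "horizontalpodautoscalers"}
	// Replace with the namespaces present in your cluster.
	namespaces := []string{"default", "kube-system", "kube-system-monitoring"}

	for _, ns := range namespaces {
		list, err := client.Resource(hpaGVR).Namespace(ns).List(context.TODO(), metav1.ListOptions{})
		if err != nil {
			fmt.Printf("namespace %s: list failed: %v\n", ns, err)
			continue
		}
		fmt.Printf("namespace %s: %d HPAs listed OK\n", ns, len(list.Items))
	}
}
```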

EDIT: anything I can test specifically to help differentiate between RBAC authz, crd-apiservice reg, etc?

those are not related to the panic, as far as I can tell

lavalamp commented Nov 14, 2018

Still reading, but:

when I moved from 1.10.3 to 1.12.1 (now 1.12.2)

Did you skip 1.11? Skipping releases isn't actually supported.

RRAlex commented Nov 14, 2018

Right now, having been up for a while, it alternates almost perfectly between E114 and I114 (E114 sometimes coming twice in a row), basically once per second.
These two errors correspond to the one you quoted (HPA) and to the HTTP2 panic.
The cluster was upgraded from 1.10.3 to 1.12.1 (then 1.12.2). From what I could gather last week, this started after the upgrade to the 1.12.x branch, but we didn't realize it until we had to investigate a cluster issue 3 weeks after the upgrade.

RRAlex commented Nov 14, 2018

Still reading, but:

when I moved from 1.10.3 to 1.12.1 (now 1.12.2)

Did you skip 1.11? Skipping releases isn't actually supported.

Yes, I did; I didn't know that...
Should I roll back to 1.11 and then forward again to 1.12?
Or what are the effects, and what can be done?

lavalamp commented Nov 14, 2018

There are at least two things going wrong; one is eec1b52

...and the other is whatever is actually crashing.

At this point I'm not sure whether rolling back to 1.11 would make things worse or better. If your cluster is functioning, I probably wouldn't do it; if your cluster is nonfunctional and you need it fixed, then I might roll the dice.

I think we should be able to get a fix for the first thing out soon, but it'll be a while before you'd be able to deploy it unless you're willing to build your own apiserver & run it.

RRAlex commented Nov 14, 2018

So, any way to diagnose what should have been updated in, I guess, etcd?
I mean, is this similar to a schema change that is missing some steps?
Thanks!

liggitt commented Nov 15, 2018

callstack logging fixed in #71067 and #71076
