
kube-apiserver: failover on multi-member etcd cluster fails certificate check on DNS mismatch #83028

Closed
nerzhul opened this issue Sep 23, 2019 · 30 comments · Fixed by #83735 or #83801

@nerzhul commented Sep 23, 2019

What happened: the Kubernetes API server connects to etcd over HTTPS, but the certificate check fails:

Sep 23 18:36:42 kube-control-plane-to6oho0e kube-apiserver[18881]: W0923 18:36:42.109767 18881 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {https://kube-control-plane-mo2phooj.k8s.lan:2379 0 <nil>}. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for localhost, kube-control-plane-mo2phooj.k8s.lan, not kube-control-plane-baeg4ahr.k8s.lan". Reconnecting...

What you expected to happen: when kube-apiserver connects to kube-control-plane-mo2phooj, which presents the correct certificate, the connection should not fail; instead, the client is looking for another etcd node's certificate.

How to reproduce it (as minimally and precisely as possible): set up an etcd 3.4 cluster with 3 HTTPS nodes, each node with its own TLS certificate.

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): 1.16.0
  • Cloud provider or hardware configuration: None
  • OS (e.g: cat /etc/os-release): Debian 9/arm64
  • Kernel (e.g. uname -a): 4.4.167-1213-rockchip-ayufan-g34ae07687fce
  • Install tools: N/A
  • Network plugin and version (if this is a network-related bug): N/A
  • Others: N/A
@nerzhul (Author) commented Sep 23, 2019

/sig api-machinery

@liggitt (Member) commented Sep 23, 2019

do you have your full apiserver invocation (specifically the --etcd-servers argument)?

/cc @jpbetz

@nerzhul (Author) commented Sep 23, 2019

Hello @liggitt here it is:

/usr/bin/kube-apiserver \
  --apiserver-count=3 \
  --allow-privileged=true \
  --enable-admission-plugins=DefaultStorageClass,DefaultTolerationSeconds,LimitRanger,NamespaceLifecycle,PersistentVolumeLabel,PodNodeSelector,PodSecurityPolicy,ResourceQuota,ServiceAccount \
  --authorization-mode=Node,RBAC \
  --secure-port=6443 \
  --bind-address=0.0.0.0 \
  --advertise-address=172.31.25.243 \
  --insecure-port=8080 \
  --insecure-bind-address=127.0.0.1 \
  --etcd-cafile=/etc/kubernetes/pki/etcd-ca.crt \
  --etcd-certfile=/etc/kubernetes/pki/etcd.crt \
  --etcd-keyfile=/etc/kubernetes/pki/etcd.key \
  --audit-log-maxage=30 \
  --audit-log-maxbackup=3 \
  --audit-log-maxsize=100 \
  --audit-log-path=/var/log/kube-audit.log \
  --client-ca-file=/etc/kubernetes/pki/ca.crt \
  --etcd-servers https://kube-control-plane-baeg4ahr.k8s.lan:2379,https://kube-control-plane-mo2phooj.k8s.lan:2379,https://kube-control-plane-to6oho0e.k8s.lan:2379 \
  --service-account-key-file=/etc/kubernetes/pki/sa.crt \
  --service-cluster-ip-range=10.152.0.0/16 \
  --service-node-port-range=30000-32767 \
  --tls-cert-file=/etc/kubernetes/pki/kube-apiserver.crt \
  --tls-private-key-file=/etc/kubernetes/pki/kube-apiserver.key \
  --enable-bootstrap-token-auth=true \
  --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt \
  --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key \
  --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \
  --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt \
  --requestheader-username-headers=X-Remote-User \
  --requestheader-group-headers=X-Remote-Group \
  --requestheader-allowed-names=front-proxy-client \
  --requestheader-extra-headers-prefix=X-Remote-Extra- \
  --target-ram-mb=256 \
  --v=3

Please note I run the same command (on amd64 in production) on Kubernetes 1.15.4 without any problem. It seems there is a regression in the 1.16.0 code.

@liggitt (Member) commented Sep 23, 2019

this seems like the same issue as #72102 (comment) which was supposed to be resolved by #81434 in 1.16

/assign @jpbetz @gyuho

@liggitt liggitt added this to the v1.16 milestone Sep 23, 2019
@nerzhul (Author) commented Sep 23, 2019

@liggitt note that the API server does work; but if kube-control-plane-baeg4ahr.k8s.lan (the member whose certificate is currently being looked for) is down, /healthz returns 500, although the apiserver keeps working in a degraded etcd mode.
Also note that in regular mode (all nodes up), these lines appear many times in the logs:

Sep 23 22:27:05 kube-control-plane-to6oho0e.k8s.lan etcd[17438]: rejected connection from "172.31.25.243:60930" (error "remote error: tls: bad certificate", ServerName "kube-control-plane-baeg4ahr.k8s.lan")

It seems the etcd client expects the first node's certificate on every node.

@jpbetz (Contributor) commented Sep 23, 2019

Agree with @liggitt that #72102 (comment) is the most likely cause. I'd check whether that resolves the problem first.

@liggitt (Member) commented Sep 24, 2019

@jpbetz doesn't the error message seem strange to you?

failed to connect to {https://kube-control-plane-mo2phooj.k8s.lan:2379 0 }. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for localhost, kube-control-plane-mo2phooj.k8s.lan, not kube-control-plane-baeg4ahr.k8s.lan"

The error message says the certificate is valid for the hostname we're trying to connect to, and is not valid for the first hostname listed in --etcd-servers. That certainly sounds like bug #72102, which was supposed to be fixed by #81434 in 1.16.

@jpbetz (Contributor) commented Sep 24, 2019

Oh wait. The problem cluster is already on 1.16. Looking.

@gyuho (Member) commented Sep 24, 2019

@jpbetz I don't think our fix handles DNS names in failover, since we can only get the target IP from the remote connection.

ref. etcd-io/etcd@db61ee1

@liggitt (Member) commented Sep 24, 2019

matching priority in original bug (#72102)

@liggitt (Member) commented Sep 24, 2019

Please note i have the same command (on amd64 production) but in kubernetes 1.15.4 without any problem. It seems there is a regression in 1.16.0 code

We need to verify whether this was actually a regression in 1.16, or if the same issue existed in 1.15.4 and just never got hit because the first server specified in --etcd-servers was available on startup

@jpbetz (Contributor) commented Sep 24, 2019

We need to verify whether this was actually a regression in 1.16, or if the same issue existed in 1.15.4 and just never got hit because the first server specified in --etcd-servers was available on startup

k8s 1.15 uses etcd 3.3.13 and k8s 1.16 uses etcd 3.3.15. In etcd 3.3.14 we switched over to the new gRPC-based client-side load balancer implementation. The etcd-io/etcd@db61ee1 commit that @gyuho mentioned fixed one failover issue introduced by the new balancer, but this is different. It's quite probable this is a regression, but I agree we should verify.

@jpbetz (Contributor) commented Sep 24, 2019

cc @dims

@dims (Member) commented Sep 24, 2019

ack @jpbetz will follow along :)

@jpbetz (Contributor) commented Sep 24, 2019

I've created a reproduction of the issue; it appears to affect both 1.16 and 1.15: https://github.com/jpbetz/etcd/blob/etcd-lb-dnsname-failover/reproduction.md

I'm not 100% certain I've gotten the reproduction correct, so extra eyes on it are welcome.

@liggitt liggitt removed this from the v1.16 milestone Sep 24, 2019
@liggitt liggitt changed the title apiserver (1.16.0): etcd certificate mismatch kube-apiserver: failover on multi-member etcd cluster fails certificate check on DNS mismatch Sep 24, 2019
@liggitt (Member) commented Sep 24, 2019

This was reproduced on 1.15.x as well, so it doesn't appear to be a 1.16 regression. The fix in 1.16 resolved IP TLS validation, but not hostname/DNS validation.

@liggitt (Member) commented Sep 24, 2019

Removing milestone, but leaving at critical. If a contained fix is developed, I'd recommend it be picked to 1.16.x if possible

@jpbetz (Contributor) commented Sep 24, 2019

If the 1st etcd member in the --etcd-servers endpoint list is unavailable during startup, the kube-apiserver terminates and reports an error like:

Unable to create storage backend: config (&{ /registry {[https://member1.etcd.local:2379 https://member2.etcd.local:22379 https://member3.etcd.local:32379] /usr/local/google/home/jpbetz/projects/etcd-io/src/go.etcd.io/etcd/integration/fixtures/ca.crt} true 0xc000832480 apiextensions.k8s.io/v1beta1 5m0s 1m0s}), err (dial tcp 127.0.0.1:2379: connect: connection refused)

If the etcd member becomes unavailable after the kube-apiserver is started, the kube-apiserver will continue to run but will report the issue in the logs repeatedly, e.g.:

W0924 14:44:57.825711 250871 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {https://member2.etcd.local:22379 0 }. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for member2.etcd.local, not member1.etcd.local"

@jpbetz (Contributor) commented Sep 25, 2019

Created grpc/grpc-go#3038 to discuss the issue with the gRPC team

@jpbetz (Contributor) commented Oct 8, 2019

etcd backports of the fix:

@nerzhul (Author) commented Oct 10, 2019

Thanks for your time. Can we have a backport to 1.15 too, please?

@liggitt (Member) commented Oct 10, 2019

Can we have a backport on 1.15 too please ?

unfortunately the transitive dependencies make a backport to 1.15 prohibitive. see #72102 (comment) and #72102 (comment)

@seh (Contributor) commented Oct 16, 2019

Given that we missed the cut for Kubernetes version 1.16.2, I decided to find a workaround for this problem, to allow my API servers to talk to etcd servers (running version 3.3.17). My etcd server certificates include SANs both for the machine's DNS name and the subdomain within which they sit for DNS discovery. My API servers start with a set of URLs that mention those per-machine DNS names.

Here's what turned out to work well enough for now: use a wildcard SAN in the etcd server certificates in place of the per-machine SAN. Given a subdomain for these machines like cluster-1.kubernetes.local and etcd DNS names like etcd0.cluster-1.kubernetes.local, the certificates normally have DNS name SANs as follows:

  • etcd0.cluster-1.kubernetes.local
  • cluster-1.kubernetes.local

I instead created certificates with the wildcard:

  • *.cluster-1.kubernetes.local
  • cluster-1.kubernetes.local

Restarting the etcd servers with these temporary certificates satisfied the Kubernetes API servers—tested at both version 1.16.1 and 1.16.2.

@victorgp (Contributor) commented Oct 16, 2019

I was under the impression that just upgrading etcd to version 3.3.17 fixed the issue. I don't see where it says that we need to upgrade to 1.16.2.
In the changelog, there are no notes for this version (https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#changelog-since-v1161).

Can someone give me a hint about the k8s and etcd versions needed to fix this issue?

@seh (Contributor) commented Oct 16, 2019

I'm not saying that upgrading Kubernetes is necessary. I had already upgraded to version 1.16.1 when I first noticed this problem. The question was whether to roll back to our previous version, 1.15.1, which predates this problem.

Since I was already running version 1.16.1, and had been preparing to upgrade to 1.16.2 (falsely assuming that what became #83968 would be included), I figured I'd test both of these versions and share my findings with others contemplating using either of them today.

@vgarcia-te commented Oct 16, 2019

Ok, got it. The explanation is in #72102 (comment); we still need to wait for #83968, which targets 1.16.3 for the fix.

@liggitt (Member) commented Oct 16, 2019

The question was whether to roll back to our previous version of 1.15.1, which predates this problem.

The DNS certificate check issue existed in 1.15.x as well. The handling of ipv6 addresses was what regressed in 1.16 (#83550)

@seh (Contributor) commented Oct 16, 2019

The DNS certificate check issue existed in 1.15.x as well.

I don't see our API servers running version 1.15.1 complaining like this, so perhaps it wasn't until later in the 1.15 patch sequence.

@liggitt (Member) commented Oct 16, 2019

I don't see our API servers running version 1.15.1 complaining like this, so perhaps it wasn't until later in the 1.15 patch sequence.

The cert verification issue is #72102 and has existed for many releases. It only appears when the API server fails over to a server other than the first one passed to --etcd-servers, so if the first one is available, no error is observed.

@seh (Contributor) commented Oct 16, 2019

so if the first one is available, no error is observed.

Well, we must be very lucky around here!
