Certificate error after upgrading CDK from 1.5.2 to 1.5.3 #43209

Closed
jonathanmarsaud opened this Issue Mar 16, 2017 · 10 comments


Is this a request for help?
No.

What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.):
, 1.5.3, upgrade, cdk, easyrsa, exec, logs


Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.3", GitCommit:"029c3a408176b55c30846f0faedf56aae5992e9b", GitTreeState:"clean", BuildDate:"2017-02-23T22:48:32Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.3", GitCommit:"029c3a408176b55c30846f0faedf56aae5992e9b", GitTreeState:"clean", BuildDate:"2017-02-23T22:28:16Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: VMware on-premise for control plane (easyrsa, kube-api-loadbalancer, master, etcd), heavy baremetal servers for nodes.
  • OS (e.g. from /etc/os-release): Ubuntu 16.04.2 LTS
  • Kernel (e.g. uname -a): Linux mth-k8smaster-01 4.4.0-67-generic #88-Ubuntu SMP Wed Mar 8 16:34:45 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: Canonical Distribution of Kubernetes
  • Others:

What happened:
After upgrading our CDK cluster from 1.5.2 to the latest version of every charm in the bundle (to move to Kubernetes 1.5.3), we discovered that "kubectl exec" and "kubectl logs" return a certificate error.
"exec" and "logs" are the only subcommands affected by this error; all others seem to work, as far as we have found so far.

Below are the errors we got:
$ kubectl exec -it phpbackend-w0bnv /bin/bash
Error from server: error dialing backend: x509: certificate signed by unknown authority
$ kubectl logs phpbackend-w0bnv
Error from server: Get https://ig1-k8s-03:10250/containerLogs/development/phpbackend-w0bnv/phpbackend: x509: certificate signed by unknown authority

(For info ig1-k8s-03 is one of our nodes.)

What you expected to happen:
As usual, exec should open a shell in a container of the pod, and logs should print the pod's logs to stdout.

How to reproduce it (as minimally and precisely as possible):
Deploy a 1.5.2 CDK cluster, then follow https://kubernetes.io/docs/getting-started-guides/ubuntu/upgrades/ to upgrade the charms in the correct order (EasyRSA last).

Anything else we need to know:
root@ig1-k8s-03:/srv/kubernetes# ls -l
total 40
-rwxrwx--- 1 root root 1179 Feb 15 12:32 ca.crt
-rwxrwx--- 1 root root 4367 Feb 15 12:32 client.crt
-rwxrwx--- 1 root root 1703 Feb 15 12:32 client.key
-rw------- 1 root root 10023 Feb 15 12:33 config
-rwxrwx--- 1 root root 4637 Mar 15 10:11 server.crt
-rwxrwx--- 1 root root 1704 Mar 15 10:11 server.key

-> March 15th, 10:11 is the exact date and time of the CDK upgrade.

root@ig1-k8s-03:/srv/kubernetes# openssl x509 -in server.crt -text -noout
[...]
    X509v3 extensions:
        X509v3 Basic Constraints:
            CA:FALSE
        X509v3 Subject Key Identifier:
            5D:C1:97:0C:F9:51:8E:D8:FF:70:37:F2:B9:6A:3A:BC:CF:F3:A8:FF
        X509v3 Authority Key Identifier:
            keyid:B7:DE:65:69:D5:47:3B:42:E0:6D:27:1D:BE:4B:DE:B8:EF:30:38:4C
            DirName:/CN=ig1-k8srsa-01
            serial:AF:F9:82:DA:3B:7F:74:B5

        X509v3 Extended Key Usage: 
            TLS Web Client Authentication, TLS Web Server Authentication
        X509v3 Key Usage: 
            Digital Signature, Key Encipherment
        X509v3 Subject Alternative Name: 
            DNS:ig1-k8s-03, DNS:ig1-k8s-03, DNS:ig1-k8s-03

[...]

Regards,

Contributor

Cynerva commented Mar 16, 2017

Thanks for reporting this.

This looks related to #41919 where we added server certificates to kubernetes-worker. I'm not sure why but it looks like they were signed with a different CA cert.
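
One way to test that hypothesis on a worker node would be to ask openssl whether server.crt chains to the ca.crt that sits next to it. The following is only a sketch: the function names are made up for illustration, and the paths in the example comment are the ones from the directory listing in the report.

```shell
#!/bin/sh
# Sketch: report whether a certificate chains to a given CA file.
# verify_against_ca and check_node_cert are hypothetical helper names.

verify_against_ca() {
    ca_file=$1
    cert_file=$2
    # "openssl verify" prints "<file>: OK" when the chain validates.
    openssl verify -CAfile "$ca_file" "$cert_file" 2>&1 | grep -q ': OK$'
}

check_node_cert() {
    if verify_against_ca "$1" "$2"; then
        echo "signed by this CA"
    else
        echo "NOT signed by this CA"
    fi
}

# Example (on a worker node):
#   check_node_cert /srv/kubernetes/ca.crt /srv/kubernetes/server.crt
```

If the check passes, the certificate files themselves are consistent and the problem is more likely in how the services are configured to use them.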

jonathanmarsaud commented Mar 16, 2017

@Cynerva Hmm, just for additional info:

I tried switching my local ~/.kube/config to point at https://master:6443 directly instead of https://kube-api-loadbalancer:443, and I got the same error. So I don't know whether it is tied only to the kube-api-loadbalancer certificates.

Contributor

Cynerva commented Mar 16, 2017

I tried to switch my local ~/.kube/config to a https://master:6443 directly instead of https://kube-api-loadbalancer:443 and I got the same error

Yeah, I think the x509: certificate signed by unknown authority error is actually coming from kube-apiserver trying to talk to kubelet. So this is a problem with communication between the services on kubernetes-master and kubernetes-worker. I wouldn't expect kube-api-loadbalancer's presence to change much here.

Hi,

I just discovered and solved my bug!
-> It seems that /etc/default/kubelet was not properly rendered after the CDK upgrade from 1.5.2 to 1.5.3. Maybe there was a conflict between juju upgrade-charm (which I ran first) and apt update && apt upgrade (which I ran second, after juju).

KUBELET_ARGS before upgrading to 1.5.3 from 1.5.2:

KUBELET_ARGS="--tls-private-key-file=/srv/kubernetes/server.key --cluster-domain=cluster.local --kubeconfig=/srv/kubernetes/config --client-ca-file=/srv/kubernetes/ca.crt --require-kubeconfig --network-plugin=cni --anonymous-auth=false --tls-cert-file=/srv/kubernetes/server.crt --cluster-dns=10.152.183.10"

KUBELET_ARGS after upgrading to 1.5.3 from 1.5.2:

KUBELET_ARGS="--network-plugin=cni --cluster-dns=10.152.183.10 --kubeconfig=/srv/kubernetes/config --cluster-domain=cluster.local --require-kubeconfig"

So the CA and certificate/key flags (--client-ca-file, --tls-cert-file, --tls-private-key-file) are gone, along with --anonymous-auth=false.
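
For anyone who wants to run the same comparison on their own nodes, here is a rough sketch that lists the flag names present in one KUBELET_ARGS string but absent from another. The function names are hypothetical; the two strings are the ones quoted above.

```shell
#!/bin/sh
# Sketch: list flags that are in the "before" KUBELET_ARGS but not in "after".
# flag_names and missing_flags are hypothetical helper names.

flag_names() {
    # One flag name per line, values stripped, sorted for comm.
    printf '%s\n' "$1" | tr ' ' '\n' | sed -e '/^$/d' -e 's/=.*//' | LC_ALL=C sort -u
}

missing_flags() {
    before_list=$(mktemp)
    after_list=$(mktemp)
    flag_names "$1" > "$before_list"
    flag_names "$2" > "$after_list"
    # comm -23 keeps lines that appear only in the first sorted input.
    LC_ALL=C comm -23 "$before_list" "$after_list"
    rm -f "$before_list" "$after_list"
}

BEFORE='--tls-private-key-file=/srv/kubernetes/server.key --cluster-domain=cluster.local --kubeconfig=/srv/kubernetes/config --client-ca-file=/srv/kubernetes/ca.crt --require-kubeconfig --network-plugin=cni --anonymous-auth=false --tls-cert-file=/srv/kubernetes/server.crt --cluster-dns=10.152.183.10'
AFTER='--network-plugin=cni --cluster-dns=10.152.183.10 --kubeconfig=/srv/kubernetes/config --cluster-domain=cluster.local --require-kubeconfig'

missing_flags "$BEFORE" "$AFTER"
```

With these two strings it prints --anonymous-auth, --client-ca-file, --tls-cert-file and --tls-private-key-file, one per line.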

For now, I just restored the previous KUBELET_ARGS and ran systemctl restart kubelet on all my nodes.
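
The manual workaround could be scripted roughly as follows. This is a sketch, not the charm's own mechanism: set_kubelet_args is a hypothetical helper, and it assumes the defaults file holds a single KUBELET_ARGS="..." line, as in the excerpts above. Whether the charm later re-renders the file is not covered here.

```shell
#!/bin/sh
# Sketch: replace the KUBELET_ARGS line in a defaults file.
# set_kubelet_args is a hypothetical helper name.

set_kubelet_args() {
    env_file=$1
    new_args=$2
    tmp_file=$(mktemp)
    # Keep every line except the existing assignment, then append the new one.
    grep -v '^KUBELET_ARGS=' "$env_file" > "$tmp_file"
    printf 'KUBELET_ARGS="%s"\n' "$new_args" >> "$tmp_file"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp_file" > "$env_file"
    rm -f "$tmp_file"
}

# Example (as root on each node), followed by the restart described above:
#   set_kubelet_args /etc/default/kubelet "<the pre-upgrade argument string>"
#   systemctl restart kubelet
```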

Contributor

castrojo commented Jun 14, 2017

/sig cluster-lifecycle
/assign @Cynerva

Collaborator

k8s-ci-robot commented Jun 14, 2017

@castrojo: GitHub didn't allow me to assign the following users: Cynerva.

Note that only kubernetes members can be assigned.

In response to this:

/sig cluster-lifecycle
/assign @Cynerva


Contributor

Cynerva commented Jun 14, 2017

This issue was fixed in #44500, via the addition of the kubernetes-worker.restart-needed state seen here.

Contributor

castrojo commented Jun 14, 2017

/assign

Contributor

castrojo commented Jun 14, 2017

/close
