Fix --kubelet-certificate-authority not defined #2496

Closed
woopstar wants to merge 3 commits

Conversation

woopstar
Member

--kubelet-certificate-authority is currently not set on the kube-apiserver. Enabling it causes the following error, because node IPs are not included in the signed certificates:

Error attaching, falling back to logs: error dialing backend: x509: certificate is valid for 10.50.63.11, 10.50.63.11, 10.50.63.12, 10.50.63.12, 10.50.63.13, 10.50.63.13, 10.248.0.1, 10.50.63.10, 127.0.0.1, 10.50.63.10, not 10.50.64.11
Error from server: Get https://10.50.64.11:10250/containerLogs/default/load-generator-5c4d59d5dd-mcg9h/load-generator: x509: certificate is valid for 10.50.63.11, 10.50.63.11, 10.50.63.12, 10.50.63.12, 10.50.63.13, 10.50.63.13, 10.248.0.1, 10.50.63.10, 127.0.0.1, 10.50.63.10, not 10.50.64.11
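For context, enabling that check means pointing the apiserver at the cluster CA used to sign the kubelet serving certificates. A minimal sketch of the flag usage (the CA path here is illustrative, not necessarily the exact kubespray layout):

# illustrative only: the CA path is an assumption, adjust to your cert layout
kube-apiserver \
  --kubelet-certificate-authority=/etc/kubernetes/ssl/ca.pem \
  ...   # remaining apiserver flags unchanged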

Looking at the openssl.conf file on a master reveals that no node IP addresses actually end up in any of the generated certs:

# cat /etc/kubernetes/openssl.conf
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster.local
DNS.5 = localhost
DNS.6 = odn1-kube-cluster01-master01
DNS.7 = odn1-kube-cluster01-master02
DNS.8 = odn1-kube-cluster01-master03
DNS.9 = odn1-kube-lb.privatedns.zone
IP.1 = 10.50.63.11
IP.2 = 10.50.63.11
IP.3 = 10.50.63.12
IP.4 = 10.50.63.12
IP.5 = 10.50.63.13
IP.6 = 10.50.63.13
IP.7 = 10.248.0.1
IP.8 = 10.50.63.10
IP.9 = 127.0.0.1
IP.10 = 10.50.63.10

As seen here, 10.50.64.11 is missing; it is a node, not a master.
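One way to confirm which SANs actually made it into a generated certificate is to inspect it directly (the cert path below is an assumption; point it at whichever cert the apiserver or kubelet is actually serving):

# cert path is an assumption; adjust to your deployment
openssl x509 -in /etc/kubernetes/ssl/apiserver.pem -noout -text | grep -A1 'Subject Alternative Name'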

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 19, 2018
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 19, 2018
{% set idx = idx + 1 %}
{% for host in groups['kube-node'] %}
{% if hostvars[host]['access_ip'] is defined %}
IP.{{ counter["ip"] }} = {{ hostvars[host]['access_ip'] }}{{ increment(counter, 'ip') }}
Contributor

This won't scale. It means recreating the kube-apiserver cert every time you want to add or remove a node. That is a destructive process, since it forces regeneration of service account secrets and restarting of affected pods.

Member Author

Good point, I didn't think of that. Any suggestions?

Member Author

Actually, we should generate a unique certificate per node, using an openssl.conf that contains only that node's IP/DNS entries, and then sign it with the CA.

That would be the right way to do it.
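Roughly, that per-node flow could look like the following with plain openssl (all file names, including the node-specific node01-openssl.conf, are hypothetical placeholders):

# hypothetical per-node flow: node01-openssl.conf lists only node01's IP/DNS SANs
openssl genrsa -out node01-key.pem 2048
openssl req -new -key node01-key.pem \
  -subj "/CN=system:node:node01/O=system:nodes" \
  -config node01-openssl.conf -out node01.csr
openssl x509 -req -in node01.csr -CA ca.pem -CAkey ca-key.pem -CAcreateserial \
  -extensions v3_req -extfile node01-openssl.conf -days 365 -out node01.pem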

@NicolasT
Contributor

Just ran into a similar problem while trying to replace a 'crash-and-burned' master node with a 'new' one that uses a different IP:

fatal: [node-04]: FAILED! => {"changed": false, "msg": "error running kubectl (/usr/local/bin/kubectl apply --force --filename=/etc/kubernetes/node-crb.yml) command (rc=1), out='', err='Unable to connect to the server: x509: certificate is valid for 10.100.2.70, 10.100.2.70, 10.100.1.44, 10.100.1.44, 10.100.2.160, 10.100.2.160, 10.233.0.1, 127.0.0.1, not 10.100.2.142\n'"}

Indeed, the list of 'valid' IPs contains the original nodes' addresses, including the crashed one, but not the new one.

Creating a single CA once, then having per-server keys & certs, generated when necessary or even on every run, seems like a more correct approach, no?

@woopstar
Member Author

You are totally right. I just need to update the PR to do so. Or feel free to make a PR that does, and we can close this one.

@woopstar woopstar changed the title Fix --kubelet-client-certificate not working Fix --kubelet-certificate-authority not working Mar 22, 2018
@woopstar woopstar changed the title Fix --kubelet-certificate-authority not working Update openssl.conf to count better and work with Jinja 2.9 Mar 28, 2018
@woopstar woopstar changed the title Update openssl.conf to count better and work with Jinja 2.9 Fix --kubelet-certificate-authority not defined Mar 28, 2018
@woopstar woopstar closed this Mar 28, 2018
@andrewb3000

Has this been implemented in some other PR, maybe?

@woopstar
Member Author

Has this been implemented in some other PR, maybe?

--kubelet-certificate-authority is defined now that we have switched to the kubeadm-based deployment.
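For anyone checking, on a kubeadm-deployed control plane the flag ends up in the kube-apiserver static pod manifest, so it can be verified in place (the manifest path is the standard kubeadm location; the exact CA path used as its value may differ per deployment):

# the flag should appear among the apiserver arguments if it is set
grep kubelet-certificate-authority /etc/kubernetes/manifests/kube-apiserver.yaml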
