
kube-apiserver and TLS etcd: kubectl reports unhealthy cluster #29330

Closed
xgerman opened this issue Jul 20, 2016 · 16 comments
Labels
area/apiserver needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one.

Comments

xgerman commented Jul 20, 2016

I am running v1.3.0-beta.1 and have set up etcd and Kubernetes with TLS. Here is the relevant part of the kube-apiserver configuration:
--etcd-servers=https://192.168.200.2:2379 \
--etcd-cafile=/srv/kubernetes/ca.crt \
--etcd-certfile=/srv/kubernetes/kubecfg.crt \
--etcd-keyfile=/srv/kubernetes/kubecfg.key \

I tried running it without those parameters, and kube-apiserver failed to start with this in the logs:
etcd cluster is unavailable or misconfigured

So it clearly does something with those certs.

Now, when I run
vagrant@k8-master:~$ kubectl get cs
NAME                 STATUS      MESSAGE                                                                 ERROR
scheduler            Healthy     ok
controller-manager   Healthy     ok
etcd-0               Unhealthy   Get https://192.168.200.2:2379/health: remote error: bad certificate

But when I run etcdctl using the same certificates:
vagrant@k8-master:~$ sudo etcdctl --debug --endpoints https://192.168.200.2:2379 --ca-file /srv/kubernetes/ca.crt --key-file /srv/kubernetes/kubecfg.key --cert-file /srv/kubernetes/kubecfg.crt cluster-health
Cluster-Endpoints: https://192.168.200.2:2379
cURL Command: curl -X GET https://192.168.200.2:2379/v2/members
member ce2a822cea30bfca is healthy: got healthy result from https://192.168.200.2:2379
cluster is healthy

Furthermore:
sudo etcdctl --debug --endpoints https://192.168.200.2:2379 --ca-file /srv/kubernetes/ca.crt --key-file /srv/kubernetes/kubecfg.key --cert-file /srv/kubernetes/kubecfg.crt ls registry
start to sync cluster using endpoints(https://192.168.200.2:2379)
cURL Command: curl -X GET https://192.168.200.2:2379/v2/members
got endpoints(https://192.168.200.2:2379) after sync
Cluster-Endpoints: https://192.168.200.2:2379
cURL Command: curl -X GET https://192.168.200.2:2379/v2/keys/registry?quorum=false&recursive=false&sorted=false
/registry/ranges
/registry/namespaces
/registry/events
/registry/services
/registry/serviceaccounts
/registry/secrets
/registry/deployments
/registry/replicasets
/registry/pods

I suspect some code path in kube-apiserver isn't aware of the certificates.

elcct commented Aug 24, 2016

I have the same issue.
I am running Kubernetes 1.3.5 with Etcd 2.3.7 on Ubuntu 16.04.

root@gc01:/opt/kubernetes/certs# $GOPATH/bin/etcdctl -ca-file=ca.pem -cert-file=client.pem -key-file=client-key.pem -endpoints=https://gc01.xxxxx:2379 ls registry
/registry/ranges
/registry/namespaces
/registry/services
/registry/serviceaccounts
root@gc01:/opt/kubernetes/certs# $GOPATH/bin/etcdctl -ca-file=ca.pem -cert-file=client.pem -key-file=client-key.pem -endpoints=https://gc01.xxxxx:2379 cluster-health
member 12fabe43e0b7020b is healthy: got healthy result from https://gc01.xxxxx:2379
member 3e9e6038432c0da7 is healthy: got healthy result from https://gc03.xxxxx:2379
member 5223f27d945df07a is healthy: got healthy result from https://gc05.xxxxx:2379
member 6c3dd0c34e4e7bee is healthy: got healthy result from https://gc02.xxxxx:2379
member e32849a38113a4f2 is healthy: got healthy result from https://gc04.xxxxx:2379
cluster is healthy
root@gc01:/opt/kubernetes/certs# /opt/kubernetes/bin/kubectl get cs
NAME                 STATUS      MESSAGE                                                                         ERROR
controller-manager   Healthy     ok
scheduler            Healthy     ok
etcd-0               Unhealthy   Get https://gc01.xxxxx:2379/health: remote error: bad certificate
etcd-3               Unhealthy   Get https://gc04.xxxxx:2379/health: remote error: bad certificate
etcd-4               Unhealthy   Get https://gc05.xxxxx:2379/health: remote error: bad certificate
etcd-2               Unhealthy   Get https://gc03.xxxxx:2379/health: remote error: bad certificate
etcd-1               Unhealthy   Get https://gc02.xxxxx:2379/health: remote error: bad certificate

My etcd part of the apiserver configuration is the same as in the post above:

  --etcd-cafile=/opt/kubernetes/certs/ca.pem \
  --etcd-certfile=/opt/kubernetes/certs/client.pem \
  --etcd-keyfile=/opt/kubernetes/certs/client-key.pem \
  --etcd-servers=https://gc01.xxxxx:2379,https://gc02.xxxxx:2379,https://gc03.xxxxx:2379,https://gc04.xxxxx:2379,https://gc05.xxxxx:2379 \

What to do?

elcct commented Aug 25, 2016

It seems like this is the same problem: #27343 (comment)

a9b3 commented Oct 8, 2016

@elcct Hello did you ever find a solution to this?

elcct commented Nov 7, 2016

@esayemm sadly not. Also tried Kubernetes 1.4.5 - doesn't work :(

sergeyfd commented:

Still same issue in 1.5.1

upolymorph commented:

Yes, I have the same issue with 1.5.1 too, and I checked that curl with the correct --cacert, --cert and --key options is able to GET {"health": "true"} from the etcd /health URL.

yawboateng commented:

seeing the same issue in 1.5.2

elcct commented Jan 27, 2017

It seems a fix is in the making; there is a pull request:

#39716

strugglingyouth commented:

Yes, I have the same issue too.

etcdctl version: 3.1.3
API version: 3.1
Kubernetes v1.6.0-beta.1 (master and node)

The apiserver is configured as follows:

--etcd-cafile='/var/run/kubernetes/ca.pem' \
--etcd-certfile='/var/run/kubernetes/client.pem' \
--etcd-keyfile='/var/run/kubernetes/client-key.pem' \
--client-ca-file='/var/run/kubernetes/ca.pem'

Creating deployments and everything else works fine, and etcd itself is OK.

$ kubectl get cs
NAME                 STATUS      MESSAGE                                                            ERROR
controller-manager   Healthy     ok
scheduler            Healthy     ok
etcd-0               Unhealthy   Get https://xxxx:2379/health: remote error: tls: bad certificate

javapapo commented Mar 24, 2017

+1

Kubernetes : 1.5.3
Cloud Provider: AWS
Installed with : kube-aws

etcd-2               Unhealthy   Get https://xxxx.compute.internal:2379/health: remote error: tls: bad certificate
etcd-1               Unhealthy   Get https://xxxxxx.compute.internal:2379/health: remote error: tls: bad certificate
etcd-0               Unhealthy   Get https://xxxxxx.compute.internal:2379/health: remote error: tls: bad certificate

kgrvamsi commented:

Does kubectl have an option to provide the etcd key when querying component status? Something like:

kubectl --etcd-key key.pem get cs

k8s-github-robot commented:

@xgerman There are no sig labels on this issue. Please add a sig label by:
(1) mentioning a sig: @kubernetes/sig-<team-name>-misc
(2) specifying the label manually: /sig <label>

Note: method (1) will trigger a notification to the team. You can find the team list here.

@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label May 31, 2017
xgerman (author) commented Jun 6, 2017

@kubernetes/sig-ui

x1957 (contributor) commented Jun 16, 2017

[root@c3-cloudml-srv-ct01 etcd]# kubectl version
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-19T18:44:27Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-19T18:33:17Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
[root@c3-cloudml-srv-ct01 etcd]# etcdctl --ca-file=/root/ssl/etcd/ca.pem --cert-file=/root/ssl/etcd/etcd.pem --key-file=/root/ssl/etcd/etcd-key.pem --endpoints=https://xxxx:2379,https://xxxx:2379,https://xxxx:2379 cluster-health
2017-06-16 09:41:36.277313 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
2017-06-16 09:41:36.278113 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
member 269592983f1bbc6c is healthy: got healthy result from https://xxxx:2379
member 961ff05c1312685d is healthy: got healthy result from https://xxxx:2379
member d64dfe4c6ed3bddd is healthy: got healthy result from https://xxxx:2379
cluster is healthy

[root@c3-cloudml-srv-ct01 etcd]# kubectl get cs
NAME                 STATUS      MESSAGE                                                                    ERROR
controller-manager   Healthy     ok                                                                         
scheduler            Healthy     ok                                                                         
etcd-0               Unhealthy   Get https://xxxx:2379/health: remote error: tls: bad certificate   
etcd-2               Unhealthy   Get https://xxxx:2379/health: remote error: tls: bad certificate   
etcd-1               Unhealthy   Get https://xxxx:2379/health: remote error: tls: bad certificate   

antoineco (contributor) commented:

Fixed in #39716. Will be in Kubernetes 1.7.

zultron added a commit to zultron/freeipa-cloud-prov that referenced this issue Aug 2, 2017
- Start distinguishing `master_host` into FreeIPA and k8s API servers
  - Why:
    - Dogtag CA and API server are memory-heavy; put on different machines
    - Eventual API server redundancy
  - Redo groups and hosts.yaml file
  - Replace `master_host` etc. with `freeipa_master_host`
- Parallel IPA requests breaking
  - Multiple, parallel requests to IPA server result in "Unauthorized" errors
  - No clean way to serialize; separate IPA operations into plays and
    use `serial: 1`
  - IPA cert operations:  put into a proper role
  - Other operations:  handle individually
- Update to k8s version 1.7.0
  - Earlier versions have trouble with TLS to etcd
  - kubernetes/kubernetes#29330
- Update dns and dashboard addons and manifests
- Generalize use of `etcd_cluster_token` -> `cluster_id`
- README notes
- Download kubeadm
trevor-vaughan added a commit to jeefberkey/pupmod-simp-simp_kubernetes that referenced this issue Jan 2, 2018
* Updated to work with the bumped stdlib
* Updated to use the new "proper" certs from the latest
  simp-beaker-helpers
* Tests will fail on checking 'componentstatus' using 'kubectl' due to a
  known bug in Kubernetes < 1.7 per
  kubernetes/kubernetes#29330
trevor-vaughan pushed a commit to simp/pupmod-simp-simp_kubernetes that referenced this issue Jan 2, 2018
* Adds parameters and code to manage ports related to
  kubernetes with simp/iptables
* Tests will fail on checking 'componentstatus' using 'kubectl' due to a
  known bug in Kubernetes < 1.7 per
  kubernetes/kubernetes#29330

SIMP-4158 #close
SIMP-4187 #close
avalonzst commented:

The problem still occurs for Kubernetes 1.6.0 with etcd 3.3.8:
[root@umsk8s-master kubernetes]# etcdctl --version
etcdctl version: 3.3.8
API version: 2
[root@umsk8s-master kubernetes]# kube-apiserver --version
Kubernetes v1.6.0
[root@umsk8s-master kubernetes]# kubectl get cs
NAME                 STATUS      MESSAGE                                                                        ERROR
controller-manager   Healthy     ok
scheduler            Healthy     ok
etcd-0               Unhealthy   Get https://172.30.251.200:20079/health: remote error: tls: bad certificate
etcd-1               Unhealthy   Get https://172.30.251.201:20179/health: remote error: tls: bad certificate
etcd-4               Unhealthy   Get https://172.30.251.204:20479/health: remote error: tls: bad certificate
etcd-2               Unhealthy   Get https://172.30.251.202:20279/health: remote error: tls: bad certificate
etcd-3               Unhealthy   Get https://172.30.251.203:20379/health: remote error: tls: bad certificate
