"oc cluster up" fails for OpenShift 3.10 on clean CentOS 7 install #21127

Open
leoluk opened this Issue Sep 28, 2018 · 1 comment


leoluk commented Sep 28, 2018

oc cluster up fails to provision a working cluster on CentOS 7.

Version

cat /etc/centos-release

CentOS Linux release 7.5.1804 (Core)

oc version

oc v3.10.0+0c4577e-1
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Steps To Reproduce
yum -y upgrade
yum -y install git docker vim centos-release-openshift-origin310
yum -y install origin-clients

cat <<EOF > /etc/sysconfig/docker
OPTIONS="--log-driver=journald --insecure-registry 172.30.0.0/16 --signature-verification=false"
EOF

systemctl enable docker

reboot

oc cluster up \
    --public-hostname $(hostname -f) \
    --routing-suffix apps.$(hostname --ip-address).nip.io
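
For completeness, a quick way to confirm after the reboot that the daemon actually picked up the insecure-registry setting (a sketch, assuming Docker 1.13 as shipped with CentOS 7.5, where docker info prints an Insecure Registries section):

# Verify the Docker daemon is running with the intended options
systemctl is-active docker
docker info 2>/dev/null | grep -A 2 'Insecure Registries'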

Current Result
Getting a Docker client ...
Checking if image openshift/origin-control-plane:v3.10 is available ...
Pulling image openshift/origin-control-plane:v3.10
[...]
Extracting
Image pull complete
Pulling image openshift/origin-cli:v3.10
[...]
Extracting
Image pull complete
Pulling image openshift/origin-node:v3.10
Pulled 5/6 layers, 97% complete
Pulled 6/6 layers, 100% complete
Extracting
Image pull complete
Checking type of volume mount ...
Determining server IP ...
Checking if OpenShift is already running ...
Checking for supported Docker version (=>1.22) ...
Checking if insecured registry is configured properly in Docker ...
Checking if required ports are available ...
Checking if OpenShift client is configured properly ...
Checking if image openshift/origin-control-plane:v3.10 is available ...
Starting OpenShift using openshift/origin-control-plane:v3.10 ...
I0928 20:35:15.213960    1491 config.go:42] Running "create-master-config"
I0928 20:35:17.523458    1491 config.go:46] Running "create-node-config"
I0928 20:35:18.630506    1491 flags.go:30] Running "create-kubelet-flags"
I0928 20:35:19.170574    1491 run_kubelet.go:48] Running "start-kubelet"
I0928 20:35:19.371456    1491 run_self_hosted.go:172] Waiting for the kube-apiserver to be ready ...
[times out]

docker ps

CONTAINER ID        IMAGE                                                                                                              COMMAND                  CREATED             STATUS              PORTS               NAMES
b228481f7b73        docker.io/openshift/origin-hyperkube@sha256:f85d84c28708124384f9590328760a38e6143ba1bb86747d38c7e5f2ce146eb3       "hyperkube kube-sc..."   2 minutes ago       Up 2 minutes                            k8s_scheduler_kube-scheduler-localhost_kube-system_feea0d296a38857e456efdfa89eca489_0
4f2ed0f11f14        docker.io/openshift/origin-hyperkube@sha256:f85d84c28708124384f9590328760a38e6143ba1bb86747d38c7e5f2ce146eb3       "hyperkube kube-co..."   2 minutes ago       Up 2 minutes                            k8s_controllers_kube-controller-manager-localhost_kube-system_e421ebb130b9ea5294d8bb1fc800f8ba_0
0949fd6700c0        docker.io/openshift/origin-control-plane@sha256:7b50902e49af679c5d81ad0f8d556ed83abbce41abe8ac9a458ff3d2bbea29e1   "/bin/bash -c '#!/..."   6 minutes ago       Up 6 minutes                            k8s_etcd_master-etcd-localhost_kube-system_903a600ce36860ec6a254ae466567268_0
94e7c1f8d354        openshift/origin-pod:v3.10                                                                                         "/usr/bin/pod"           6 minutes ago       Up 6 minutes                            k8s_POD_master-api-localhost_kube-system_c447e6a5a1f3a6bd9b13a247b700d66d_0
bd1a3f8cc537        openshift/origin-pod:v3.10                                                                                         "/usr/bin/pod"           6 minutes ago       Up 6 minutes                            k8s_POD_kube-scheduler-localhost_kube-system_feea0d296a38857e456efdfa89eca489_0
8524a0ea9b54        openshift/origin-pod:v3.10                                                                                         "/usr/bin/pod"           6 minutes ago       Up 6 minutes                            k8s_POD_master-etcd-localhost_kube-system_903a600ce36860ec6a254ae466567268_0
aa76cf854c5b        openshift/origin-pod:v3.10                                                                                         "/usr/bin/pod"           6 minutes ago       Up 6 minutes                            k8s_POD_kube-controller-manager-localhost_kube-system_e421ebb130b9ea5294d8bb1fc800f8ba_0
e045ea576de6        openshift/origin-node:v3.10                                                                                        "hyperkube kubelet..."   6 minutes ago       Up 6 minutes                            origin
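
Since the apiserver never becomes ready, it is also worth listing exited containers; a crash-looping container would not appear in plain docker ps. A sketch:

# Include dead containers and show their exit status
docker ps -a --format '{{.Names}}\t{{.Status}}'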

lsof -i -P -n

COMMAND    PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
chronyd    529 chrony    1u  IPv4   9918      0t0  UDP 127.0.0.1:323 
chronyd    529 chrony    2u  IPv6   9919      0t0  UDP [::1]:323 
dhclient   788   root    6u  IPv4  10105      0t0  UDP *:68 
sshd       864   root    3u  IPv4  14767      0t0  TCP *:22 (LISTEN)
sshd       864   root    4u  IPv6  14769      0t0  TCP *:22 (LISTEN)
master     967   root   13u  IPv4  16558      0t0  TCP 127.0.0.1:25 (LISTEN)
sshd       993   root    3u  IPv4  14145      0t0  TCP 5.83.163.19:22->194.169.211.233:54924 (ESTABLISHED)
hyperkube 2124   root   18u  IPv6  29971      0t0  TCP *:10250 (LISTEN)
hyperkube 2124   root   21u  IPv4  45156      0t0  TCP 127.0.0.1:41794->127.0.0.1:8443 (ESTABLISHED)
openshift 2428   root    3u  IPv6  34316      0t0  TCP *:7001 (LISTEN)
openshift 2428   root    5u  IPv6  34317      0t0  TCP *:4001 (LISTEN)
openshift 2428   root    9u  IPv6  45158      0t0  TCP 127.0.0.1:4001->127.0.0.1:56258 (ESTABLISHED)
[...]
openshift 2428   root   74u  IPv6  43558      0t0  TCP 127.0.0.1:4001->127.0.0.1:56416 (ESTABLISHED)
hyperkube 2621   root    3u  IPv6  41013      0t0  TCP *:10252 (LISTEN)
hyperkube 2621   root    5u  IPv4  44500      0t0  TCP 127.0.0.1:41908->127.0.0.1:8443 (ESTABLISHED)
hyperkube 2621   root    6u  IPv4  45237      0t0  TCP 127.0.0.1:42028->127.0.0.1:8443 (ESTABLISHED)
hyperkube 2676   root    5u  IPv6  38363      0t0  TCP *:10251 (LISTEN)
hyperkube 2676   root    6u  IPv4  45165      0t0  TCP 127.0.0.1:41818->127.0.0.1:8443 (ESTABLISHED)
hypershif 2848   root    3u  IPv4  44466      0t0  TCP *:8443 (LISTEN)
hypershif 2848   root    5u  IPv4  43486      0t0  TCP 127.0.0.1:56258->127.0.0.1:4001 (ESTABLISHED)
[...]
hypershif 2848   root  100u  IPv4  45208      0t0  TCP 127.0.0.1:41974->127.0.0.1:8443 (ESTABLISHED)
hypershif 2848   root  107u  IPv4  45214      0t0  TCP 127.0.0.1:8443->127.0.0.1:41974 (ESTABLISHED)

oc login -u system:admin

Server [https://localhost:8443]: 
The server uses a certificate signed by an unknown authority.
You can bypass the certificate check, but any data you send to the server could be intercepted by others.
Use insecure connections? (y/n): y

Error from server (InternalError): Internal error occurred: unexpected response: 400
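
A 400 here suggests something is answering on 8443 rather than nothing at all (consistent with the hypershif listener in the lsof output above). As a sketch, the apiserver can be probed directly, bypassing oc; -k is needed because oc cluster up generates a self-signed CA:

# Hit the standard Kubernetes health and version endpoints directly
curl -k https://localhost:8443/healthz
curl -k https://localhost:8443/version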

docker logs origin

I0928 18:35:19.773902    2124 remote_runtime.go:43] Connecting to runtime service unix:///var/run/dockershim.sock
I0928 18:35:19.775212    2124 kuberuntime_manager.go:186] Container runtime docker initialized, version: 1.13.1, apiVersion: 1.26.0
W0928 18:35:19.775818    2124 probe.go:218] Flexvolume plugin directory at /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ does not exist. Recreating.
I0928 18:35:19.780427    2124 csi_plugin.go:63] kubernetes.io/csi: plugin initializing...
I0928 18:35:19.781389    2124 server.go:129] Starting to listen on 0.0.0.0:10250
E0928 18:35:19.781892    2124 kubelet.go:1290] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data for container /
E0928 18:35:19.782005    2124 event.go:209] Unable to write event: 'Post https://localhost:8443/api/v1/namespaces/default/events: dial tcp 127.0.0.1:8443: getsockopt: connection refused' (may retry after sleeping)
I0928 18:35:19.782072    2124 server.go:304] Adding debug handlers to kubelet server.
I0928 18:35:19.782453    2124 server.go:957] Started kubelet
I0928 18:35:19.782524    2124 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
I0928 18:35:19.782551    2124 status_manager.go:140] Starting to sync pod status with apiserver
I0928 18:35:19.782563    2124 kubelet.go:1790] Starting kubelet main sync loop.
I0928 18:35:19.782581    2124 kubelet.go:1807] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
I0928 18:35:19.782911    2124 volume_manager.go:247] Starting Kubelet Volume Manager
I0928 18:35:19.783476    2124 desired_state_of_world_populator.go:129] Desired state populator starts to run
I0928 18:35:19.882704    2124 kubelet.go:1807] skipping pod synchronization - [container runtime is down]
I0928 18:35:19.883155    2124 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0928 18:35:19.885156    2124 kubelet_node_status.go:82] Attempting to register node localhost
E0928 18:35:19.885661    2124 kubelet_node_status.go:106] Unable to register node "localhost" with API server: Post https://localhost:8443/api/v1/nodes: dial tcp 127.0.0.1:8443: getsockopt: connection refused
I0928 18:35:20.082957    2124 kubelet.go:1807] skipping pod synchronization - [container runtime is down]
I0928 18:35:20.085778    2124 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0928 18:35:20.087888    2124 kubelet_node_status.go:82] Attempting to register node localhost
E0928 18:35:20.088310    2124 kubelet_node_status.go:106] Unable to register node "localhost" with API server: Post https://localhost:8443/api/v1/nodes: dial tcp 127.0.0.1:8443: getsockopt: connection refused
I0928 18:35:20.483228    2124 kubelet.go:1807] skipping pod synchronization - [container runtime is down]
I0928 18:35:20.488509    2124 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0928 18:35:20.490798    2124 kubelet_node_status.go:82] Attempting to register node localhost
E0928 18:35:20.491202    2124 kubelet_node_status.go:106] Unable to register node "localhost" with API server: Post https://localhost:8443/api/v1/nodes: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0928 18:35:20.698036    2124 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://localhost:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0928 18:35:20.699579    2124 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0928 18:35:20.700615    2124 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:461: Failed to list *v1.Node: Get https://localhost:8443/api/v1/nodes?fieldSelector=metadata.name%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
I0928 18:35:21.283473    2124 kubelet.go:1807] skipping pod synchronization - [container runtime is down]
I0928 18:35:21.291413    2124 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0928 18:35:21.293669    2124 kubelet_node_status.go:82] Attempting to register node localhost
E0928 18:35:21.294081    2124 kubelet_node_status.go:106] Unable to register node "localhost" with API server: Post https://localhost:8443/api/v1/nodes: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0928 18:35:21.698837    2124 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://localhost:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0928 18:35:21.700264    2124 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
E0928 18:35:21.701296    2124 reflector.go:205] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/kubelet/kubelet.go:461: Failed to list *v1.Node: Get https://localhost:8443/api/v1/nodes?fieldSelector=metadata.name%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 127.0.0.1:8443: getsockopt: connection refused
I0928 18:35:21.955777    2124 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
I0928 18:35:21.958076    2124 cpu_manager.go:155] [cpumanager] starting with none policy
I0928 18:35:21.958095    2124 cpu_manager.go:156] [cpumanager] reconciling every 10s
I0928 18:35:21.958105    2124 policy_none.go:42] [cpumanager] none policy: Start
Starting Device Plugin manager
E0928 18:35:22.015699    2124 eviction_manager.go:246] eviction manager: failed to get get summary stats: failed to get node info: node "localhost" not found
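
The full kubelet log is noisy; filtering to error-level lines makes the repeating connection-refused pattern easier to scan. For example:

# Keep only error-level (E...) lines; docker logs writes to both streams
docker logs origin 2>&1 | grep '^E'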

iptables -S

-P INPUT ACCEPT
-P FORWARD DROP
-P OUTPUT ACCEPT
-N DOCKER
-N DOCKER-ISOLATION
-N KUBE-FIREWALL
-A INPUT -j KUBE-FIREWALL
-A FORWARD -j DOCKER-ISOLATION
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A OUTPUT -j KUBE-FIREWALL
-A DOCKER-ISOLATION -j RETURN
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
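
One thing that stands out is -P FORWARD DROP: Docker 1.13 changed the default policy of the FORWARD chain to DROP, a well-known cause of broken pod networking on Kubernetes hosts. It may not be the root cause here, but it is cheap to rule out:

# Temporary workaround to exclude the Docker 1.13 FORWARD policy change
# as a factor (not persistent across reboots)
iptables -P FORWARD ACCEPT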

Both hostname -f and hostname --ip-address return the correct hostname and IP address, respectively.
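
For reference, a sketch of how to double-check that, assuming the FQDN resolves locally via /etc/hosts or DNS:

# FQDN, its advertised IP, and what the resolver actually returns for it
hostname -f
hostname --ip-address
getent hosts "$(hostname -f)"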

It worked fine with 3.9. The server has been freshly installed.

Happy to upload more logs, but it's cleanly reproducible.

Possibly a duplicate of #20726 and #20964.

Is anyone actually using oc cluster up?

Expected Result

Functional cluster.

jwforres (Member) commented Oct 11, 2018

[This comment has been minimized.]