
DEBUG Still waiting for the Kubernetes API: Get https://mydomain.kz:6443/version?timeout=32s: EOF #2615

Closed
Nurlan199206 opened this issue Nov 2, 2019 · 36 comments
Labels
platform/google triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@Nurlan199206

Nurlan199206 commented Nov 2, 2019

I want to build an OpenShift Container Platform cluster on bare metal. I am using GCP Compute Engine VMs for this.

RHEL 7 on VM instances...

I have:
1 bootstrap
3 masters
2 workers
1 LB for API (haproxy)

Version

4.2

$ openshift-install version
openshift-install v4.2.0
built from commit 90ccb37ac1f85ae811c50a29f9bb7e779c5045fb
release image quay.io/openshift-release-dev/ocp-release@sha256:c5337afd85b94c93ec513f21c8545e3f9e36a227f55d41bc1dfb8fcc3f2be129

Platform: bare metal (user-provisioned infrastructure) on GCP Compute Engine VMs

What happened?

DEBUG OpenShift Installer v4.2.0                   
DEBUG Built from commit 90ccb37ac1f85ae811c50a29f9bb7e779c5045fb 
INFO Waiting up to 30m0s for the Kubernetes API at https://api.ocp.sysadm.kz:6443... 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp.sysadm.kz:6443/version?timeout=32s: EOF 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp.sysadm.kz:6443/version?timeout=32s: EOF 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp.sysadm.kz:6443/version?timeout=32s: EOF 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp.sysadm.kz:6443/version?timeout=32s: EOF 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp.sysadm.kz:6443/version?timeout=32s: EOF 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp.sysadm.kz:6443/version?timeout=32s: EOF 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp.sysadm.kz:6443/version?timeout=32s: EOF 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp.sysadm.kz:6443/version?timeout=32s: EOF 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp.sysadm.kz:6443/version?timeout=32s: EOF 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp.sysadm.kz:6443/version?timeout=32s: EOF 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp.sysadm.kz:6443/version?timeout=32s: EOF 
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp.sysadm.kz:6443/version?timeout=32s: EOF


What you expected to happen?

OpenShift can't find the API.

How to reproduce it (as minimally and precisely as possible)?

$ ./openshift-install wait-for bootstrap-complete --log-level debug

Anything else we need to know?

[screenshots: my DNS records]

My LB config:

listen stats
    bind :9000
    mode http
    stats enable
    stats uri /
    monitor-uri /healthz
frontend openshift-api-server
    bind 10.172.0.3:6443
    default_backend openshift-api-server
    mode tcp
    option tcplog
backend openshift-api-server
    balance source
    mode tcp
    server bootstrap 10.132.0.2:6443 check
    server master0 10.166.0.2:6443 check
    server master1 10.164.0.23:6443 check
    server master2 10.166.0.6:6443 check
    
frontend machine-config-server
    bind 10.172.0.3:22623
    default_backend machine-config-server
    mode tcp
    option tcplog
backend machine-config-server
    balance source
    mode tcp
    server bootstrap 10.132.0.2:22623 check
    server master0 10.166.0.2:22623 check
    server master1 10.164.0.23:22623 check
    server master2 10.166.0.6:22623 check
  
frontend ingress-http
    bind 10.172.0.3:80
    default_backend ingress-http
    mode tcp
    option tcplog
backend ingress-http
    balance source
    mode tcp
    server worker0 10.166.0.4:80 check
    server worker1 10.166.0.5:80 check
   
frontend ingress-https
    bind 10.172.0.3:443
    default_backend ingress-https
    mode tcp
    option tcplog
backend ingress-https
    balance source
    mode tcp
    server worker0 10.166.0.4:443 check
    server worker1 10.166.0.5:443 check
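
A quick way to sanity-check a config like this (a sketch, assuming haproxy is installed on the LB host, the file lives at /etc/haproxy/haproxy.cfg, and the stats/healthz listener on :9000 is configured as above):

# validate the configuration syntax without restarting the service
haproxy -c -f /etc/haproxy/haproxy.cfg

# confirm the stats listener's monitor-uri responds on the LB host
curl -s http://10.172.0.3:9000/healthz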

@Nurlan199206
Author

ANY HELP????

@abhinavdahiya
Contributor

Make sure you have the DNS, LB, and connectivity set up correctly based on:
https://docs.openshift.com/container-platform/4.2/installing/installing_bare_metal/installing-bare-metal.html#installation-network-user-infra_installing-bare-metal
https://docs.openshift.com/container-platform/4.2/installing/installing_bare_metal/installing-bare-metal.html#installation-dns-user-infra_installing-bare-metal
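
For reference, a quick way to spot-check the required records from a client machine (a sketch, not taken from the docs verbatim; substitute your own cluster name and base domain, e.g. ocp.sysadm.kz from this report):

# api and api-int must resolve to the load balancer
dig +short api.ocp.sysadm.kz
dig +short api-int.ocp.sysadm.kz

# the ingress wildcard must also resolve to the load balancer
dig +short test.apps.ocp.sysadm.kz

# etcd SRV records for the cluster domain
dig +short SRV _etcd-server-ssl._tcp.ocp.sysadm.kz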

Also, you can capture the failure logs by using

openshift-install gather bootstrap --bootstrap <bootstrap-host-ip> --master <control-plane-host-ip> [--master <control-plane-host-ip> ...]

which will provide us the necessary logs to debug the failure.

@Nurlan199206
Author

Nurlan199206 commented Nov 5, 2019

@abhinavdahiya do I need to buy something from here? https://cloud.redhat.com/openshift/install/metal/user-provisioned — for example, the pull secret?

@abhinavdahiya
Contributor

@abhinavdahiya do I need to buy something from here? https://cloud.redhat.com/openshift/install/metal/user-provisioned — for example, the pull secret?

I'm not sure what you mean by "buy something from here"; you need the pull secret so that you can pull the container images for the Red Hat components.

@redmark-redhat

redmark-redhat commented Nov 15, 2019

I'm seeing the same error here. Any solution?

fatal: [192.168.79.2]: FAILED! => {"changed": true, "cmd": "openshift-install --dir=pwd wait-for bootstrap-complete --log-level debug", "delta": "0:30:00.132730", "end": "2019-11-15 10:11:17.169260", "msg": "non-zero return code", "rc": 1, "start": "2019-11-15 09:41:17.036530", "stderr": "level=debug msg="OpenShift Installer unreleased-master-1805-g425e4ff0037487e32571258640b39f56d5ee5572"\nlevel=debug msg="Built from commit 425e4ff"\nlevel=info msg="Waiting up to 30m0s for the Kubernetes API at https://api.ocp-ppc64le-test-099bdc.redhat.com:6443...\"\nlevel=debug msg="Still waiting for the Kubernetes API: Get https://api.ocp-ppc64le-test-099bdc.redhat.com:6443/version?timeout=32s: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kube-apiserver-lb-signer\")"

Also tried wget

wget https://api.ocp-ppc64le-test-099bdc.redhat.com:6443
--2019-11-15 10:27:53-- https://api.ocp-ppc64le-test-099bdc.redhat.com:6443/
Resolving api.ocp-ppc64le-test-099bdc.redhat.com (api.ocp-ppc64le-test-099bdc.redhat.com)... 192.168.122.168
Connecting to api.ocp-ppc64le-test-099bdc.redhat.com (api.ocp-ppc64le-test-099bdc.redhat.com)|192.168.122.168|:6443... connected.
ERROR: The certificate of ‘api.ocp-ppc64le-test-099bdc.redhat.com’ is not trusted.
ERROR: The certificate of ‘api.ocp-ppc64le-test-099bdc.redhat.com’ hasn't got a known issuer.

@abhinavdahiya
Contributor

@redmark-redhat

I'm seeing the same error here, an solution?

It isn't the same error:

DEBUG Still waiting for the Kubernetes API: Get https://api.ocp.sysadm.kz:6443/version?timeout=32s: EOF 

vs yours

Still waiting for the Kubernetes API: Get https://api.ocp-ppc64le-test-099bdc.redhat.com:6443/version?timeout=32s: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kube-apiserver-lb-signer\")
  1. Is this the same platform as above, i.e. GCP?
  2. How are you creating the cluster?

Also, are you using a layer-4 LB? Make sure your LB is not doing the TLS termination.
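
For context, a layer-4 pass-through frontend in haproxy looks like the sketch below (mode tcp forwards the TLS stream untouched to the backends; an http-mode frontend with certificates on the bind line would terminate TLS at the LB, which breaks the cluster's own API certificates):

frontend openshift-api-server
    bind *:6443
    mode tcp          # layer 4: raw TCP pass-through, no TLS termination
    option tcplog
    default_backend openshift-api-server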

@redmark-redhat

No, the platform is RHEL 8 with the OpenShift cluster configured in a KVM environment. We have a set of Ansible playbooks configuring the cluster. This is the command that fails:

- name: wait for bootstrap complete
  tags: config
  shell: openshift-install --dir=`pwd` wait-for bootstrap-complete --log-level debug
  args:
    chdir: "{{ workdir }}"
  retries: 1
  delay: 0

Yesterday the error message was a little different as seen here.

Still waiting for the Kubernetes API: Get https://api.ocp-ppc64le-test-099bdc.redhat.com:6443/version?timeout=32s: EOF\"\nlevel=debug msg=\"Still waiting for the Kubernetes API: Get https://api.ocp-ppc64le-test-099bdc.redhat.com:6443/version?timeout=32s: EOF\"\nlevel=debug

I don't remember making a change to any of the install playbooks. Let me run it again.

@Nurlan199206
Author

@abhinavdahiya

./openshift-install gather bootstrap --bootstrap 10.132.0.2 --master ocp-master01.sysadm.kz
INFO Pulling debug logs from the bootstrap machine 
FATAL failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain

@Nurlan199206
Author

But SSH via ssh root@ocp-master01.sysadm.kz works between the bootstrap and master01 nodes.

@Nurlan199206
Author

Nurlan199206 commented Nov 23, 2019

Still getting the endless :6443/version?timeout=32s: EOF. Help! The LB and DNS settings are correct!

@Nurlan199206
Author

Does OpenShift 4.x support only Red Hat CoreOS? Because I'm using RHEL 7 for the cluster.

@ChrystianDuarte

Still endless :6443/version?timeout=32s: EOF HELP!!!! LB,DNS settings correct!!!

I have the same problem
Any ideas?

@jomeier
Contributor

jomeier commented Dec 1, 2019

I had the same problem yesterday. I often create / delete VMs for tests.

Restart the load balancer. In my case that helped.
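
For anyone hitting the same thing, a minimal sketch of that workaround on a systemd host (assuming haproxy is the LB, as in the configs in this thread, and using the API hostname from this issue as an example):

# reload picks up backend state without dropping existing connections;
# fall back to a full restart if a reload is not enough
sudo systemctl reload haproxy || sudo systemctl restart haproxy

# then watch the API endpoint come back through the LB
curl -k https://api.ocp.sysadm.kz:6443/version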

@abhinavdahiya
Contributor

but SSH via ssh root@ocp-master01.sysadm.kz it works between bootstrap and master01 nodes..

Make sure you are using RHCOS for the control plane; that's the only supported OS.
Also, the user used by the installer's gather is core, not root.

If you specified the public SSH key during installation, the machines should already have it.

As for the error: the only way we can help debug is if you provide the log bundle, using openshift-install gather bootstrap --bootstrap <bootstrap-host-ip> --master <control-plane-0-ip> [--master <control-plane-$idx-ip>]

You can run openshift-install gather bootstrap --help for information on how to specify the SSH key; otherwise it tries to use an already-running SSH agent.
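
A minimal sketch of both options (the key path and IPs below are placeholders; the private key must match the sshKey you put in install-config.yaml, and the installer connects as the core user):

# option 1: pass the matching private key explicitly
openshift-install gather bootstrap --key ~/.ssh/id_rsa \
  --bootstrap <bootstrap-host-ip> --master <control-plane-0-ip>

# option 2: load the key into a running ssh-agent and omit --key
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
openshift-install gather bootstrap --bootstrap <bootstrap-host-ip> --master <control-plane-0-ip>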

@abhinavdahiya abhinavdahiya added the triage/needs-information Indicates an issue needs more information in order to work on it. label Dec 2, 2019
@whls

whls commented Dec 19, 2019

@abhinavdahiya I have the same error:
[root@api ocp4]# ./openshift-install wait-for bootstrap-complete --log-level debug
DEBUG OpenShift Installer v4.2.1
DEBUG Built from commit e349157
INFO Waiting up to 30m0s for the Kubernetes API at https://api.ocp4.whls.com:6443...
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.whls.com:6443/version?timeout=32s: EOF
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.whls.com:6443/version?timeout=32s: EOF

I already collected logs with this command:
[root@api log]# /root/ocp4/openshift-install gather bootstrap --bootstrap bootstrap.ocp4.whls.com --master master0.ocp4.whls.com
INFO Pulling debug logs from the bootstrap machine
INFO Bootstrap gather logs captured here "log-bundle-20191219151525.tar.gz"

Could you please help to debug this problem?
log-bundle-20191219151525.tar.gz

@jomeier
Contributor

jomeier commented Dec 19, 2019 via email

@whls

whls commented Dec 19, 2019

@jomeier Thanks for your reply.
Yes, I have an HAProxy server for the LB.
Here is my HAProxy server configuration:

[root@api ocp4]# cat /etc/haproxy/haproxy.cfg

global
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    stats socket /var/lib/haproxy/stats

defaults
mode                    http
log                     global
option                  httplog
option                  dontlognull
option http-server-close

option                  redispatch
retries                 3
timeout http-request    10s
timeout queue           1m
timeout connect         10s
timeout client          1m
timeout server          1m
timeout http-keep-alive 10s
timeout check           10s
maxconn                 3000

listen stats
bind :9000
mode http
stats enable
stats uri /
monitor-uri /healthz


frontend openshift-api-server
bind *:6443
default_backend openshift-api-server
mode tcp
option tcplog

backend openshift-api-server
balance source
mode tcp
server bootstrap 9.98.30.45:6443 check
server master0 9.98.30.46:6443 check
server master1 9.98.30.47:6443 check
server master2 9.98.30.48:6443 check

frontend machine-config-server
bind *:22623
default_backend machine-config-server
mode tcp
option tcplog

backend machine-config-server
balance source
mode tcp
server bootstrap 9.98.30.45:22623 check
server master0 9.98.30.46:22623 check
server master1 9.98.30.47:22623 check
server master2 9.98.30.48:22623 check

frontend ingress-http
bind *:80
default_backend ingress-http
mode tcp
option tcplog

backend ingress-http
balance source
mode tcp
server worker0 9.98.30.54:80 check
server worker1 9.98.30.55:80 check
server worker2 9.98.30.56:80 check

frontend ingress-https
bind *:443
default_backend ingress-https
mode tcp
option tcplog

backend ingress-https
balance source
mode tcp
server worker0 9.98.30.54:443 check
server worker1 9.98.30.55:443 check
server worker2 9.98.30.56:443 check

The HAProxy service is running, and the ports are listening:

[root@api ocp4]# netstat -tunlp |grep 80
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      5294/haproxy
udp        0      0 0.0.0.0:67              0.0.0.0:*                           7780/dnsmasq
[root@api ocp4]# netstat -tunlp |grep 443
tcp        0      0 0.0.0.0:6443            0.0.0.0:*               LISTEN      5294/haproxy
tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN      5294/haproxy
[root@api ocp4]# netstat -tunlp |grep 22623
tcp        0      0 0.0.0.0:22623           0.0.0.0:*               LISTEN      5294/haproxy
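
Beyond checking that the ports are listening, it can also help to probe the frontends end to end (a sketch using the hostnames from this setup; a connection that resets or returns EOF usually means no backend behind the LB is serving yet):

# Kubernetes API through the LB (cert is not publicly trusted at this stage, hence -k)
curl -kv https://api.ocp4.whls.com:6443/version

# machine-config-server on the bootstrap/masters, also through the LB
curl -kv https://api-int.ocp4.whls.com:22623/healthz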

@whls

whls commented Dec 19, 2019

Here is my DNS configuration

[root@ns1 ignition]# cat /var/named/data/whls.com.zone
$TTL 1W
@       IN      SOA     ns1.whls.com.   root (
                        2019070700      ; serial
                        3H              ; refresh (3 hours)
                        30M             ; retry (30 minutes)
                        2W              ; expiry (2 weeks)
                        1W )            ; minimum (1 week)
        IN      NS      ns1.whls.com.
        IN      MX 10   smtp.whls.com.
;
;
ns1     IN      A       9.98.30.44
smtp    IN      A       9.98.30.44
;
; The api points to the IP of your load balancer
api.ocp4                IN      A       9.98.30.59
api-int.ocp4            IN      A       9.98.30.59
;
; The wildcard also points to the load balancer
*.apps.ocp4             IN      A       9.98.30.59
;
; Create entry for the bootstrap host
bootstrap.ocp4  IN      A       9.98.30.45
;
; Create entries for the master hosts
master0.ocp4            IN      A       9.98.30.46
master1.ocp4            IN      A       9.98.30.47
master2.ocp4            IN      A       9.98.30.48
;
; Create entries for the worker hosts
worker0.ocp4            IN      A       9.98.30.54
worker1.ocp4            IN      A       9.98.30.55
worker2.ocp4            IN      A       9.98.30.56
;
; The ETCd cluster lives on the masters...so point these to the IP of the masters
etcd-0.ocp4     IN      A       9.98.30.46
etcd-1.ocp4     IN      A       9.98.30.47
etcd-2.ocp4     IN      A       9.98.30.48
;
; The SRV records are IMPORTANT....make sure you get these right...note the trailing dot at the end...
_etcd-server-ssl._tcp.ocp4.whls.com     IN      SRV     0 10 2380 etcd-0.ocp4.whls.com.
_etcd-server-ssl._tcp.ocp4.whls.com     IN      SRV     0 10 2380 etcd-1.ocp4.whls.com.
_etcd-server-ssl._tcp.ocp4.whls.com     IN      SRV     0 10 2380 etcd-2.ocp4.whls.com.
;
;EOF


[root@ns1 ignition]# cat /var/named/data/named.whls.zone
$TTL 1W
@       IN      SOA     ns1.whls.com.   root (
                        2019070700      ; serial
                        3H              ; refresh (3 hours)
                        30M             ; retry (30 minutes)
                        2W              ; expiry (2 weeks)
                        1W )            ; minimum (1 week)
        IN      NS      ns1.whls.com.
;
; syntax is "last octet" and the host must have fqdn with trailing dot
46      IN      PTR     master0.ocp4.whls.com.
47      IN      PTR     master1.ocp4.whls.com.
48      IN      PTR     master2.ocp4.whls.com.
;
45      IN      PTR     bootstrap.ocp4.whls.com.
;
59      IN      PTR     api.ocp4.whls.com.
59      IN      PTR     api-int.ocp4.whls.com.
;
54      IN      PTR     worker0.ocp4.whls.com.
55      IN      PTR     worker1.ocp4.whls.com.
56      IN      PTR     worker2.ocp4.whls.com.
;
;EOF
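
A quick way to verify this zone data from a helper node (a sketch, querying the DNS server at 9.98.30.44 defined above; the console hostname is just an example of an *.apps record):

# forward records for the API entries
dig +short @9.98.30.44 api.ocp4.whls.com api-int.ocp4.whls.com

# wildcard apps record
dig +short @9.98.30.44 console-openshift-console.apps.ocp4.whls.com

# etcd SRV records (note the trailing-dot targets in the zone)
dig +short @9.98.30.44 SRV _etcd-server-ssl._tcp.ocp4.whls.com

# reverse record for one of the masters
dig +short @9.98.30.44 -x 9.98.30.46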

@jomeier
Contributor

jomeier commented Dec 19, 2019

Have you restarted HAProxy right after the bootstrap server has finished / after the control plane with the masters was ready?

@abhinavdahiya
Contributor

@abhinavdahiya I Ihave the some error:
[root@api ocp4]# ./openshift-install wait-for bootstrap-complete --log-level debug
DEBUG OpenShift Installer v4.2.1
DEBUG Built from commit e349157
INFO Waiting up to 30m0s for the Kubernetes API at https://api.ocp4.whls.com:6443...
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.whls.com:6443/version?timeout=32s: EOF
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.whls.com:6443/version?timeout=32s: EOF

I already collect logs with command:
[root@api log]# /root/ocp4/openshift-install gather bootstrap --bootstrap bootstrap.ocp4.whls.com --master master0.ocp4.whls.com
INFO Pulling debug logs from the bootstrap machine
INFO Bootstrap gather logs captured here "log-bundle-20191219151525.tar.gz"

Could you please help to debug this problem?
log-bundle-20191219151525.tar.gz

from bootstrap/journals/release-image.service

Dec 19 05:58:46 bootstrap.ocp4.whls.com release-image-download.sh[1602]: Error: error pulling image "quay.io/openshift-release-dev/ocp-release@sha256:dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0": unable to pull quay.io/openshift-release-dev/ocp-release@sha256:dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0: unable to pull image: Error initializing source docker://quay.io/openshift-release-dev/ocp-release@sha256:dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0: pinging docker registry returned: Get https://quay.io/v2/: dial tcp: lookup quay.io on 9.98.30.44:53: server misbehaving

The bootstrap host cannot connect to quay.io to download the release image. That seems to be the cause of the failure.
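
A quick way to confirm this from the bootstrap host itself (a sketch; SSH in as core and test name resolution and registry reachability — the image digest below is the one from the log above):

ssh core@bootstrap.ocp4.whls.com

# then, on the bootstrap host:
# does the configured resolver answer for quay.io?
getent hosts quay.io

# can the registry endpoint be reached at all?
curl -sI https://quay.io/v2/ | head -n 1

# pull test via podman (present on RHCOS)
sudo podman pull quay.io/openshift-release-dev/ocp-release@sha256:dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0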

@whls

whls commented Dec 20, 2019

@abhinavdahiya I Ihave the some error:
[root@api ocp4]# ./openshift-install wait-for bootstrap-complete --log-level debug
DEBUG OpenShift Installer v4.2.1
DEBUG Built from commit e349157
INFO Waiting up to 30m0s for the Kubernetes API at https://api.ocp4.whls.com:6443...
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.whls.com:6443/version?timeout=32s: EOF
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.whls.com:6443/version?timeout=32s: EOF
I already collect logs with command:
[root@api log]# /root/ocp4/openshift-install gather bootstrap --bootstrap bootstrap.ocp4.whls.com --master master0.ocp4.whls.com
INFO Pulling debug logs from the bootstrap machine
INFO Bootstrap gather logs captured here "log-bundle-20191219151525.tar.gz"
Could you please help to debug this problem?
log-bundle-20191219151525.tar.gz

from bootstrap/journals/release-image.service

Dec 19 05:58:46 bootstrap.ocp4.whls.com release-image-download.sh[1602]: Error: error pulling image "quay.io/openshift-release-dev/ocp-release@sha256:dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0": unable to pull quay.io/openshift-release-dev/ocp-release@sha256:dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0: unable to pull image: Error initializing source docker://quay.io/openshift-release-dev/ocp-release@sha256:dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0: pinging docker registry returned: Get https://quay.io/v2/: dial tcp: lookup quay.io on 9.98.30.44:53: server misbehaving

The bootstrap-host cannot connect to quay.io to download the release-image. That seems to be the cause for failure..

Thanks for your help.
Yes, I checked my DNS server; it can't resolve quay.io.
Must all nodes be able to access quay.io, including bootstrap, masters, and workers?

@jomeier
Copy link
Contributor

jomeier commented Dec 20, 2019 via email

@whls

whls commented Dec 20, 2019

@abhinavdahiya @jomeier
Thanks for all your help!
After setting up DNS forwarding to a public resolver, I completed the cluster installation. :)
Another question:
I configured 3 worker nodes for the cluster, but after installation only 2 worker nodes joined. Do only two worker nodes join automatically by default? If you want more worker nodes, do you need to join them to the cluster manually?
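
A minimal sketch of the kind of BIND forwarding block that lets a private named instance resolve public names such as quay.io (the forwarder addresses are examples, not taken from this thread; named-checkconf validates the file and systemctl restart named applies it):

options {
    directory "/var/named";
    recursion yes;
    allow-query { any; };
    forward only;
    forwarders {
        8.8.8.8;      # example public resolver
        1.1.1.1;      # example public resolver
    };
};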

@Nurlan199206
Author

openshift-install gather bootstrap --bootstrap 10.166.0.2 --master 10.132.0.2
INFO Pulling debug logs from the bootstrap machine 
FATAL failed to run remote command: Process exited with status 127 

[screenshot: 2020-01-26 00:58:41]

@abhinavdahiya
Contributor

openshift-install gather bootstrap --bootstrap 10.166.0.2 --master 10.132.0.2
INFO Pulling debug logs from the bootstrap machine 
FATAL failed to run remote command: Process exited with status 127 

[screenshot: 2020-01-26 00:58:41]

What image are you using to boot your bootstrap, control plane, and compute nodes?

@Dennys503

I have the same problem: openshift-install wait-for bootstrap-complete --log-level debug
2020-01-22T17:22:24-06:00" level=debug msg="OpenShift Installer v4.2.13"
level=debug msg="Built from commit 46f909e"
level=info msg="Waiting up to 30m0s for the Kubernetes API at https://api.openshift.empresa.com:6443..."
level=debug msg="Still waiting for the Kubernetes API: Get https://api.openshift.empresa.com:6443/version?timeout=32s: EOF"
level=debug msg="Still waiting for the Kubernetes API: Get https://api.openshift.empresa.com:6443/version?timeout=32s: EOF"

openshift-install gather bootstrap --bootstrap bootstrap.openshift.empresa.com --master master.openshift.empresa.com
INFO Pulling debug logs from the bootstrap machine
FATAL failed to create SSH client, ensure the proper ssh key is in your keyring or specify with --key: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain

@Dennys503

@abhinavdahiya I Ihave the some error:
[root@api ocp4]# ./openshift-install wait-for bootstrap-complete --log-level debug
DEBUG OpenShift Installer v4.2.1
DEBUG Built from commit e349157
INFO Waiting up to 30m0s for the Kubernetes API at https://api.ocp4.whls.com:6443...
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.whls.com:6443/version?timeout=32s: EOF
DEBUG Still waiting for the Kubernetes API: Get https://api.ocp4.whls.com:6443/version?timeout=32s: EOF
I already collect logs with command:
[root@api log]# /root/ocp4/openshift-install gather bootstrap --bootstrap bootstrap.ocp4.whls.com --master master0.ocp4.whls.com
INFO Pulling debug logs from the bootstrap machine
INFO Bootstrap gather logs captured here "log-bundle-20191219151525.tar.gz"
Could you please help to debug this problem?
log-bundle-20191219151525.tar.gz

from bootstrap/journals/release-image.service

Dec 19 05:58:46 bootstrap.ocp4.whls.com release-image-download.sh[1602]: Error: error pulling image "quay.io/openshift-release-dev/ocp-release@sha256:dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0": unable to pull quay.io/openshift-release-dev/ocp-release@sha256:dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0: unable to pull image: Error initializing source docker://quay.io/openshift-release-dev/ocp-release@sha256:dc782b44cac3d59101904cc5da2b9d8bdb90e55a07814df50ea7a13071b0f5f0: pinging docker registry returned: Get https://quay.io/v2/: dial tcp: lookup quay.io on 9.98.30.44:53: server misbehaving

The bootstrap-host cannot connect to quay.io to download the release-image. That seems to be the cause for failure..

Thanks for your help.
Yes, I checked my DNS server. It can't be resolved quay.io.
Must all nodes be able to access quay.io? include bootstrap, master and worker?

How did you test your DNS resolution for quay.io?

@Nurlan199206
Author

Nurlan199206 commented Feb 1, 2020

How do I get past this? I'm stuck on an endless "unable to get REST mapping" error.
log-bundle-20200201134119.tar.gz

Feb 01 18:41:21 localhost bootkube.sh[6878]: "99_openshift-machineconfig_99-worker-ssh.yaml": unable to get REST mapping for "99_openshift-machineconfig_99-worker-ssh.yaml": no matches for kind "MachineConfig" in version "machineconfiguration.openshift.io/v1"
Feb 01 18:41:21 localhost bootkube.sh[6878]: [#2652] failed to create some manifests:
Feb 01 18:41:21 localhost bootkube.sh[6878]: "99_openshift-machineconfig_99-master-ssh.yaml": unable to get REST mapping for "99_openshift-machineconfig_99-master-ssh.yaml": no matches for kind "MachineConfig" in version "machineconfiguration.openshift.io/v1"
Feb 01 18:41:21 localhost bootkube.sh[6878]: "99_openshift-machineconfig_99-worker-ssh.yaml": unable to get REST mapping for "99_openshift-machineconfig_99-worker-ssh.yaml": no matches for kind "MachineConfig" in version "machineconfiguration.openshift.io/v1"
Feb 01 18:41:21 localhost bootkube.sh[6878]: [#2653] failed to create some manifests:
Feb 01 18:41:21 localhost bootkube.sh[6878]: "99_openshift-machineconfig_99-master-ssh.yaml": unable to get REST mapping for "99_openshift-machineconfig_99-master-ssh.yaml": no matches for kind "MachineConfig" in version "machineconfiguration.openshift.io/v1"
Feb 01 18:41:21 localhost bootkube.sh[6878]: "99_openshift-machineconfig_99-worker-ssh.yaml": unable to get REST mapping for "99_openshift-machineconfig_99-worker-ssh.yaml": no matches for kind "MachineConfig" in version "machineconfiguration.openshift.io/v1"
Feb 01 18:41:21 localhost bootkube.sh[6878]: [#2654] failed to create some manifests:
Feb 01 18:41:21 localhost bootkube.sh[6878]: "99_openshift-machineconfig_99-master-ssh.yaml": unable to get REST mapping for "99_openshift-machineconfig_99-master-ssh.yaml": no matches for kind "MachineConfig" in version "machineconfiguration.openshift.io/v1"
Feb 01 18:41:21 localhost bootkube.sh[6878]: "99_openshift-machineconfig_99-worker-ssh.yaml": unable to get REST mapping for "99_openshift-machineconfig_99-worker-ssh.yaml": no matches for kind "MachineConfig" in version "machineconfiguration.openshift.io/v1"
Feb 01 18:41:22 localhost bootkube.sh[6878]: [#2655] failed to create some manifests:
Feb 01 18:41:22 localhost bootkube.sh[6878]: "99_openshift-machineconfig_99-master-ssh.yaml": unable to get REST mapping for "99_openshift-machineconfig_99-master-ssh.yaml": no matches for kind "MachineConfig" in version "machineconfiguration.openshift.io/v1"
Feb 01 18:41:22 localhost bootkube.sh[6878]: "99_openshift-machineconfig_99-worker-ssh.yaml": unable to get REST mapping for "99_openshift-machineconfig_99-worker-ssh.yaml": no matches for kind "MachineConfig" in version "machineconfiguration.openshift.io/v1"
[screenshots: photo_2020-02-02 00:46:14, 00:46:20, 00:46:25]

@vrutkovs
Member

vrutkovs commented Feb 1, 2020

CVO doesn't have a place to run:

I0201 18:41:21.143569       1 apps.go:115] Deployment cluster-version-operator is not ready. status: (replicas: 1, updated: 1, ready: 0, unavailable: 1, reason: MinimumReplicasUnavailable, message: Deployment does not have minimum availability.)

The log bundle contains only one master, which is not sufficient for the install. You need 3 masters + 2 workers; see https://docs.openshift.com/container-platform/4.2/installing/installing_bare_metal/installing-bare-metal.html#machine-requirements_installing-bare-metal

/close

@openshift-ci-robot
Contributor

@vrutkovs: Closing this issue.

In response to this:

CVO doesn't have a place to run:

I0201 18:41:21.143569       1 apps.go:115] Deployment cluster-version-operator is not ready. status: (replicas: 1, updated: 1, ready: 0, unavailable: 1, reason: MinimumReplicasUnavailable, message: Deployment does not have minimum availability.)

log bundle contains only one master, which is not sufficient for install. You'd need 3 masters + 2 workers, see https://docs.openshift.com/container-platform/4.2/installing/installing_bare_metal/installing-bare-metal.html#machine-requirements_installing-bare-metal

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@milan-dikkumburage

Hi @Nurlan199206, were you able to fix the issue? What steps did you take to resolve it?

I'm getting a similar error:
[screenshot]

[core@okd4-services ~]$ openshift-install gather bootstrap --dir=install_dir/ --bootstrap xxx.xxx.xxx.xxx --master xxx.xxx.xxx.xxx
INFO Pulling debug logs from the bootstrap machine
FATAL failed to run remote command: Process exited with status 127

@josephsadek

@abhinavdahiya @jomeier
Thanks for all your help!
After setup DNS forward to public, I have completed the cluster installation. :)
Another question:
I configuration 3 worker nodes for cluster, but after installation, only 2 worker nodes joined cluster, So whether only two work nodes can join automatically by default, If you want more work nodes, you need to join the cluster manually?

Can you show me how to configure DNS forwarding to a public resolver?

@sheetalp304

@abhinavdahiya @jomeier
Thanks for all your help!
After setup DNS forward to public, I have completed the cluster installation. :)
Another question:
I configuration 3 worker nodes for cluster, but after installation, only 2 worker nodes joined cluster, So whether only two work nodes can join automatically by default, If you want more work nodes, you need to join the cluster manually?

I am facing the same issue; I'm not able to resolve quay.io.
Can you provide the steps to set up DNS forwarding to a public resolver, which worked in your case?

@ablaabiyad

Still endless :6443/version?timeout=32s: EOF HELP!!!! LB,DNS settings correct!!!

I have the same issue on VirtualBox. If you managed to correct this, would you please share a hint?

@ablaabiyad

@ablaabiyad check this:

https://github.com/Nurlan199206/okd4/blob/master/local

https://github.com/Nurlan199206/okd4/blob/master/haproxy.cfg

I still have the same issue using your haproxy config, and I cannot even retrieve logs, even though I can SSH to the bootstrap machine as both root and core.
FATAL failed to create SSH client: failed to use the provided keys for authentication: ssh: handshake failed: ssh: unable to authenticate,
