
vSphere: oc logs ... results in "net/http: TLS handshake timeout" #86

Closed
jomeier opened this issue Feb 25, 2020 · 13 comments
Labels
triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@jomeier
Contributor

jomeier commented Feb 25, 2020

Hi,

OKD version: 4.4.0-0.okd-2020-02-25-003044

I can't get the logs of pods, neither in the web UI nor on the console. I always get this:

~/okd-4.4-onprem$ oc logs apiserver-9kfkf
Error from server: Get https://10.0.224.179:10250/containerLogs/openshift-apiserver/apiserver-9kfkf/openshift-apiserver: net/http: TLS handshake timeout

(10.0.224.179 is master2)

oc project openshift-apiserver

oc get pods -o wide
NAME              READY   STATUS    RESTARTS   AGE   IP            NODE                NOMINATED NODE   READINESS GATES
apiserver-9kfkf   1/1     Running   0          54m   10.254.2.13   chmuokd4c2master2   <none>           <none>
apiserver-cqjpq   1/1     Running   0          54m   10.254.0.15   chmuokd4c2master1   <none>           <none>
apiserver-rq8nf   1/1     Running   0          54m   10.254.1.33   chmuokd4c2master0   <none>           <none>
oc get nodes
NAME                STATUS   ROLES           AGE   VERSION
chmuokd4c2master0   Ready    master,worker   62m   v1.17.1
chmuokd4c2master1   Ready    master,worker   62m   v1.17.1
chmuokd4c2master2   Ready    master,worker   62m   v1.17.1
chmuokd4c2worker0   Ready    worker          52m   v1.17.1
chmuokd4c2worker1   Ready    worker          52m   v1.17.1
chmuokd4c2worker2   Ready    worker          52m   v1.17.1
oc get csr
NAME        AGE   REQUESTOR                                                                   CONDITION
csr-24b5f   62m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-4h2jv   52m   system:node:chmuokd4c2worker1                                               Approved,Issued
csr-4wgpb   52m   system:node:chmuokd4c2worker2                                               Approved,Issued
csr-5p9gj   53m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-5z6h2   61m   system:node:chmuokd4c2master1                                               Approved,Issued
csr-8ljb2   54m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-bxzg7   61m   system:node:chmuokd4c2master2                                               Approved,Issued
csr-ckf7p   61m   system:node:chmuokd4c2master0                                               Approved,Issued
csr-dwppt   53m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-hl785   52m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-mv9mq   62m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-s4ghg   52m   system:node:chmuokd4c2worker0                                               Approved,Issued
csr-sgzft   52m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-zhhck   62m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued

On master2, in the logs of the api-server pod (sudo crictl logs api-server):

...
E0225 12:26:49.999186       1 status.go:71] apiserver received an error that is not an metav1.Status: &url.Error{Op:"Get", URL:"https://10.0.224.179:10250/containerLogs/openshift-apiserver/apiserver-9kfkf/openshift-apiserver", Err:http.tlsHandshakeTimeoutError{}}
...

Does anybody have an idea what could be wrong in my setup?

Greetings,

Josef

SOLUTION: My cluster is behind a corporate proxy. I had to add the subnet of my VMs (as a CIDR) to the noProxy field in the install-config.yaml file!
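
For reference, a minimal sketch of what the relevant proxy section in install-config.yaml could look like; the proxy URL, domain, and CIDR values here are placeholders, not the actual values of this cluster:

    proxy:
      httpProxy: http://proxy.example.com:3128
      httpsProxy: http://proxy.example.com:3128
      # add the machine network CIDR so node-to-node traffic (e.g. kubelet port 10250) bypasses the proxy
      noProxy: .example.com,10.0.224.0/24,10.254.0.0/16,172.30.0.0/16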

@vrutkovs
Member

The API server can't reach the kubelet's port. Ensure port 10250 is not blocked on masters/workers by any kind of firewall.
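
A quick way to check that from a master node (a sketch, assuming nc and curl are available on the host; the IP is the one from this issue):

    # does the kubelet port on the other master answer at all?
    nc -zv 10.0.224.179 10250
    # an HTTP 401/Unauthorized from curl means the TLS connection itself works
    curl -sk -o /dev/null -w '%{http_code}\n' https://10.0.224.179:10250/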

@jomeier
Contributor Author

jomeier commented Feb 25, 2020

I'm behind a corporate proxy. If I unset my https_proxy vars, at least the curl command

curl https://10.0.224.179:10250/containerLogs/openshift-apiserver/apiserver-9kfkf/openshift-apiserver -k

returns Unauthorized.

But

oc logs ...

doesn't work. I get the same net/http TLS timeout error.

I'm wondering whether oc shouldn't use hostnames instead of IP addresses, because in my NO_PROXY env variable I set our intranet domain, so internal hostnames aren't sent to the corporate proxy. This worked in the past.

Can I somehow set the log level on the oc logs command to get more debug info?
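
For what it's worth, oc has a client-side verbosity flag that dumps the HTTP requests the client makes; a sketch, assuming --loglevel is accepted by your oc build:

    oc --loglevel=8 logs apiserver-9kfkf -n openshift-apiserver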

@vrutkovs
Member

Does curl work from the master node? Did you set up the proxy according to https://docs.openshift.com/container-platform/4.3/installing/installing_bare_metal/installing-restricted-networks-bare-metal.html#installation-configure-proxy_installing-restricted-networks-bare-metal?

I'm wondering if oc shouldn't use hostnames

oc asks the API server to get the logs, and the API server uses the InternalDNS address to contact the kubelet. Masters must be able to reach nodes using their short hostnames.
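
A sketch of how to verify that from a master (hostnames are the ones from this cluster; getent availability on the host is assumed):

    # the short node name must resolve ...
    getent hosts chmuokd4c2master2
    # ... and the kubelet port must be reachable by that name (401/Unauthorized is fine here)
    curl -sk -o /dev/null -w '%{http_code}\n' https://chmuokd4c2master2:10250/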

@jomeier
Contributor Author

jomeier commented Feb 25, 2020

I have done that.

But because in the past the proxy information in the install-config.yaml file was not served from the bootstrap machine to the masters and workers, I patch it into their Ignition files. Shouldn't I do that?

@jomeier
Contributor Author

jomeier commented Feb 25, 2020

The curl from the master also says Unauthorized. The oc logs command does not work there either.

@jomeier
Contributor Author

jomeier commented Feb 25, 2020

Maybe you remember #19

@jomeier
Contributor Author

jomeier commented Feb 25, 2020

Something must have changed in the last few weeks, because I used to get the logs.

@vrutkovs vrutkovs added the triage/needs-information Indicates an issue needs more information in order to work on it. label Feb 25, 2020
@jomeier
Contributor Author

jomeier commented Feb 25, 2020

I add this section to all Ignition files for the masters and workers:

    "storage": {
        "files": [
          {
            "filesystem": "root",
            "group": {},
            "path": "/etc/hostname",
            "user": {},
            "contents": {
              "source": "data:text/plain;charset=utf-8,chmuokd4c2master0",
              "verification": {}
            },
            "mode": 420
          },
          {
            "group": {},
            "overwrite": true,
            "path": "/etc/profile.d/proxy.sh",
            "user": {
                "name": "root"
            },
            "contents": {
                "source": "data:text/plain;charset=utf-8;base64,<SECRET>",
                "verification": {}
            },
            "mode": 420
          },
          {
            "group": {},
            "overwrite": true,
            "path": "/etc/systemd/system.conf.d/10-default-env.conf",
            "user": {
                "name": "root"
            },
            "contents": {
                "source": "data:text/plain;charset=utf-8;base64,<SECRET>",
                "verification": {}
            },
            "mode": 420
          }
        ]
      },

@vrutkovs
Member

The curl from the master also says:

Okay, now try the curl from the API server container.
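
A sketch of how that could be done (assumes curl is present in the apiserver image; pod and namespace are the ones from this issue):

    oc rsh -n openshift-apiserver apiserver-9kfkf
    # inside the container: show the proxy-related environment the apiserver actually sees
    env | grep -i proxy
    # repeat the request with that environment in place
    curl -k https://10.0.224.179:10250/containerLogs/openshift-apiserver/apiserver-9kfkf/openshift-apiserver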

@jomeier
Contributor Author

jomeier commented Feb 25, 2020

That's not working.

In the api-server container the HTTPS_PROXY env variable is set to my corporate proxy. NO_PROXY is also set, but without the IPs of my masters.

If I unset the proxy variables in the API server container, the curl responds with Unauthorized. The problem seems to be that the master's IP address is not in the NO_PROXY variable.

@vrutkovs
Member

Also NO_PROXY is set but without the IPs of my masters.

What does your install-config.yaml proxy section look like? The docs want all of these set:

A comma-separated list of destination domain names, domains, IP addresses, or other network CIDRs to exclude proxying. Preface a domain with . to include all subdomains of that domain. Use * to bypass proxy for all destinations.

@jomeier
Contributor Author

jomeier commented Feb 25, 2020

You mean I must also add the subnet of my VMs to the no_proxy variable? I'll try that.

@jomeier
Contributor Author

jomeier commented Feb 25, 2020

Thanks Vadim, that was the problem :-)! I had to add the subnet CIDR of my VMs to the noProxy field in the install-config.yaml file.

@jomeier jomeier closed this as completed Feb 25, 2020