pkg/start: use loopback kubeconfig to talk to API #28

jhixson74 · 2019-07-23T23:45:45Z

This code modifies cluster-bootstrap to use a kubeconfig configured for localhost API access.

This is necessary due to a limitation with Azure internal load balancers. See limitation #2 here: https://docs.microsoft.com/en-us/azure/load-balancer/load-balancer-overview#limitations

"Unlike public Load Balancers which provide outbound connections when transitioning from private IP addresses inside the virtual network to public IP addresses, internal Load Balancers do not translate outbound originated connections to the frontend of an internal Load Balancer as both are in private IP address space. This avoids potential for SNAT port exhaustion inside unique internal IP address space where translation is not required. The side effect is that if an outbound flow from a VM in the backend pool attempts a flow to frontend of the internal Load Balancer in which pool it resides and is mapped back to itself, both legs of the flow don't match and the flow will fail."

kubeconfig-loopback is generated by the installer.

https://jira.coreos.com/browse/CORS-1094

abhinavdahiya · 2019-07-24T00:01:31Z

Something I noticed:

cluster-bootstrap already uses localhost to talk to the bootstrap-api-server

cluster-bootstrap/pkg/start/start.go

Lines 90 to 102 in c040eb0

    
    // We don't want the client contact the API servers via load-balancer, but only talk to the local API server.
    // This will speed up the initial "where is working API server" process.
    localClientConfig := rest.CopyConfig(restConfig)
    localClientConfig.Host = "localhost:6443"
    // Set the ServerName to original hostname so we pass the certificate check.
    hostURL, err := url.Parse(restConfig.Host)
    if err != nil {
    	return err
    }
    localClientConfig.ServerName, _, err = net.SplitHostPort(hostURL.Host)
    if err != nil {
    	return err
    }

and switches to using the load balancer when the bootstrapping is finished.

So we could maybe add the feature to copy the kubeconfig-local to the secrets so the objects created by the cluster-bootstrap (static pods) can use that.
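To make the quoted logic easier to follow, here is a minimal standard-library sketch of the same idea: connect to localhost, but keep the original hostname for the certificate check. The `localEndpoint` type and `loopbackFor` function are hypothetical stand-ins for the two `rest.Config` fields used above, and the example hostname is made up.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// localEndpoint is a hypothetical stand-in for the two rest.Config
// fields that the quoted snippet sets.
type localEndpoint struct {
	Host       string // address the client actually dials
	ServerName string // name checked against the serving certificate
}

// loopbackFor mirrors the quoted steps: parse the original host URL,
// strip the port, and keep the hostname for TLS verification while
// pointing the connection at localhost:6443.
func loopbackFor(originalHost string) (localEndpoint, error) {
	hostURL, err := url.Parse(originalHost)
	if err != nil {
		return localEndpoint{}, err
	}
	// SplitHostPort returns an error if the host carries no port.
	serverName, _, err := net.SplitHostPort(hostURL.Host)
	if err != nil {
		return localEndpoint{}, err
	}
	return localEndpoint{Host: "localhost:6443", ServerName: serverName}, nil
}

func main() {
	// The URL below is an illustrative assumption, not taken from the PR.
	ep, err := loopbackFor("https://api.example.com:6443")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("dial %s, verify certificate as %s\n", ep.Host, ep.ServerName)
}
```

One consequence visible in the sketch: a host URL without an explicit port makes `net.SplitHostPort` fail, so this approach assumes the configured host always includes one.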

jhixson74 · 2019-07-25T00:36:57Z

> Something I noticed:
>
> cluster-bootstrap already uses localhost to talk to the bootstrap-api-server
>
> cluster-bootstrap/pkg/start/start.go
>
> Lines 90 to 102 in c040eb0
>
>     // We don't want the client contact the API servers via load-balancer, but only talk to the local API server.
>     // This will speed up the initial "where is working API server" process.
>     localClientConfig := rest.CopyConfig(restConfig)
>     localClientConfig.Host = "localhost:6443"
>     // Set the ServerName to original hostname so we pass the certificate check.
>     hostURL, err := url.Parse(restConfig.Host)
>     if err != nil {
>     	return err
>     }
>     localClientConfig.ServerName, _, err = net.SplitHostPort(hostURL.Host)
>     if err != nil {
>     	return err
>     }
>
> and switches to using the load balancer when the bootstrapping is finished.
>
> So we could maybe add the feature to copy the kubeconfig-local to the secrets so the objects created by the cluster-bootstrap (static pods) can use that.

Doesn't this do that?

https://github.com/openshift/cluster-bootstrap/pull/28/files#diff-d32f9bfe59296f6d39f436924d7bb03aR37

(sorry, I don't know how to make it show inline)

sttts · 2019-07-25T09:28:08Z

> and switches to using the load balancer when the bootstrapping is finished.
> So we could maybe add the feature to copy the kubeconfig-local to the secrets so the objects created by the cluster-bootstrap (static pods) can use that.

We intentionally use the LB after bootstrapping because we want to take down the bootstrap apiserver as soon as possible because it is not consuming configuration created by the operators. Switching the second phase to loopback as well will destroy this.

sttts · 2019-07-25T09:28:17Z

/hold

abhinavdahiya · 2019-07-25T22:48:43Z

So the goal is to make sure the bootstrap-control-plane is using localhost to talk to the apiserver.
openshift/cluster-kube-controller-manager-operator#270
openshift/cluster-kube-scheduler-operator#156

So rather than changing the kubeconfig used by cluster-bootstrap, what if we copy the loopback one to bootstrap-secrets so that bootstrap-{kcm,ks} talk to the bootstrap-apiserver on localhost?

Or do you think we should use env variables, like https://github.com/openshift/cluster-version-operator/blob/210264aea17b9d6e278435a371775c8eb507568e/bootstrap/bootstrap-pod.yaml#L30-L33, so that bootstrap-{kcm,ks} talk to the bootstrap-apiserver on localhost?

cc @sttts @jhixson74
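For readers unfamiliar with the env-variable option: Kubernetes clients that use in-cluster configuration build the API server address from `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT`. The sketch below shows that derivation with the standard library only. It assumes the linked pod manifest sets those two variables, which is not confirmed in this thread, and `apiServerURL` is a hypothetical helper name.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
)

// apiServerURL builds the API server address from the two environment
// variables read by in-cluster client configuration. The getenv
// parameter makes the lookup easy to substitute for illustration.
func apiServerURL(getenv func(string) string) (string, error) {
	host := getenv("KUBERNETES_SERVICE_HOST")
	port := getenv("KUBERNETES_SERVICE_PORT")
	if host == "" || port == "" {
		return "", errors.New("KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must both be set")
	}
	return "https://" + net.JoinHostPort(host, port), nil
}

func main() {
	u, err := apiServerURL(os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("API server:", u)
}
```

With the host set to a loopback address, a client configured this way would reach the local API server without going through the load balancer, which is the effect being discussed.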

sttts · 2019-07-26T07:59:18Z

/hold cancel

I think this is actually fine.

sttts · 2019-07-26T07:59:29Z

/retest

sttts · 2019-07-26T11:47:06Z

/approve


abhinavdahiya · 2019-07-31T07:04:42Z

/test e2e-aws

jhixson74 · 2019-07-31T19:45:26Z

/test e2e-aws

abhinavdahiya · 2019-07-31T20:08:26Z

/lgtm

openshift-ci-robot · 2019-07-31T20:08:33Z

@abhinavdahiya: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

sttts · 2019-08-01T06:35:18Z

/lgtm

openshift-ci-robot · 2019-08-01T06:35:23Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, jhixson74, sttts

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [sttts]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-bot · 2019-08-01T10:24:51Z

/retest

Please review the full test history for this PR and help us cut down flakes.

…ys for etcd-signer

Since the pivots to prefer loopback Kube-API access:

* bf59ebf (azure: generate loopback kubeconfig to access API locally, 2019-07-17, openshift#2085).
* 82d81d9 (data/data/bootstrap: use loopback kubeconfig for API access, 2019-07-24, openshift#2086).
* openshift/cluster-bootstrap@61d1428bea (pkg/start: use loopback kubeconfig to talk to API, 2019-07-23, openshift/cluster-bootstrap#28).
* possibly more

logs on the bootstrap machine have contained distracting errors like these reported in [1]:

    $ grep 'not localhost\|etcd-signer' journal-bootstrap.log
    ...
    Aug 20 10:33:56 cnv-qe-08.cnvqe.lab.eng.rdu2.redhat.com podman[8366]: 2019-08-20 10:33:56.090073216 +0000 UTC m=+2.644782091 container start d0dcc42a1335c1224df35a48a279f63f1cb7a03c94de5ebb29e2633e6ee6c429 (image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f20394d571ff9a28aed9366434521d221d8d743a6efe2a3d6c6ad242198a522e, name=etcd-signer)
    Aug 20 10:33:58 cnv-qe-08.cnvqe.lab.eng.rdu2.redhat.com openshift.sh[2867]: error: unable to recognize "./99_kubeadmin-password-secret.yaml": Get https://localhost:6443/api?timeout=32s: x509: certificate is valid for api.bm1.oc4, not localhost
    Aug 20 10:34:01 cnv-qe-08.cnvqe.lab.eng.rdu2.redhat.com approve-csr.sh[2870]: Unable to connect to the server: x509: certificate is valid for api.bm1.oc4, not localhost
    ...
    Aug 20 10:43:55 cnv-qe-08.cnvqe.lab.eng.rdu2.redhat.com openshift.sh[2867]: error: unable to recognize "./99_kubeadmin-password-secret.yaml": Get https://localhost:6443/api?timeout=32s: x509: certificate is valid for api.bm1.oc4, not localhost
    Aug 20 10:43:59 cnv-qe-08.cnvqe.lab.eng.rdu2.redhat.com podman[15272]: 2019-08-20 10:43:59.68789639 +0000 UTC m=+0.188325679 container died d0dcc42a1335c1224df35a48a279f63f1cb7a03c94de5ebb29e2633e6ee6c429 (image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f20394d571ff9a28aed9366434521d221d8d743a6efe2a3d6c6ad242198a522e, name=etcd-signer)
    ...

With this commit, we pass the localhost cert to etcd-signer so we can form the TLS connection to gracefully say "sorry, I'm not really a Kube API server". Fixes [2].

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1743661
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1743840
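The x509 errors in that journal come from hostname verification: the client dials localhost, but the serving certificate only lists the load-balancer name. The following sketch reproduces that kind of mismatch with Go's standard library. It is an illustration only, not the installer's or etcd-signer's code; `certFor` is a hypothetical helper that creates a throwaway self-signed certificate, and the only value taken from the log is the name `api.bm1.oc4`.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// certFor creates a throwaway self-signed certificate whose subject
// alternative names are exactly the given DNS names.
func certFor(dnsNames ...string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "illustrative-serving-cert"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames,
	}
	// Passing the template as its own parent makes the cert self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	// Certificate that names only the load-balancer hostname.
	lbOnly, err := certFor("api.bm1.oc4")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("lb-only cert, localhost:", lbOnly.VerifyHostname("localhost"))

	// Certificate that also names localhost passes the same check.
	withLocal, err := certFor("api.bm1.oc4", "localhost")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("cert with localhost SAN, localhost:", withLocal.VerifyHostname("localhost"))
}
```

The first check should fail with a hostname error resembling the journal lines, while the second returns nil, which is why serving a certificate valid for localhost lets the TLS connection form.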

openshift-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jul 23, 2019

openshift-ci-robot requested review from deads2k and sttts July 23, 2019 23:46

openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 25, 2019

openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 26, 2019

openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 26, 2019

abhinavdahiya reviewed Jul 29, 2019


pkg/start: use loopback kubeconfig to talk to API

61d1428

jhixson74 force-pushed the master_azure_restrict_bootstrap_clients branch from 0608e6d to 61d1428 Compare July 30, 2019 23:05

openshift-ci-robot assigned sttts Aug 1, 2019

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 1, 2019

openshift-merge-robot merged commit 106f76b into openshift:master Aug 1, 2019
