CORS-3260: CAPI: Create GCP Internal LB #8151
Conversation
Skipping CI for Draft Pull Request.
/label platform/google
Force-pushed 4c5cf49 to ccc11ba
@@ -5,6 +5,7 @@ import (
	"os"
	"path/filepath"

	"github.com/Azure/go-autorest/autorest/to"
I will change this to a different package
Can this one be used?
"k8s.io/utils/pointer"
The MAO is incompatible with the external proxy load balancer (the MAO requires target pools, which only work with a passthrough LB). I just pushed some new changes to try some workarounds and see if we can get a non-production-ready PoC. 24143b2 has some temporary workarounds that rename the CAPG instance group to try to be compatible with the MAO instance group (instances can only be in a single instance group). Only the first master is provisioning correctly; why the other machines are failing is still unclear. Master-1 and 2 are failing like:
Workers are apparently being created but cannot be found(?)...
Force-pushed 980c36b to 9dbd867
Force-pushed ff26926 to e29ed72
@patrickdillon: This pull request references CORS-3260 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@@ -85,9 +109,74 @@ func createInternalLBAddress(ctx context.Context, in clusterapi.InfraReadyInput)
	},
}

if _, err := service.HealthChecks.Insert(in.InstallConfig.Config.GCP.ProjectID, healthCheck).Context(ctx).Do(); err != nil {
_, err = service.RegionHealthChecks.Insert(projectID, region, healthCheck).Context(ctx).Do()
if err != nil {
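The hunk above swaps the global `HealthChecks` service for `RegionHealthChecks`, which scopes the check to a region as well as a project. A minimal stand-in sketch of the addressing difference (the real code uses google.golang.org/api/compute/v1; the types and helper below are illustrative only):

```go
package main

import "fmt"

// HealthCheck is a local stand-in for the compute API's health-check type.
type HealthCheck struct {
	Name string
	Type string
}

// regionalHealthCheckPath mimics how a regional health check is addressed:
// unlike a global check, it lives under both a project and a region, which
// is why the Insert call above gains a region argument.
func regionalHealthCheckPath(projectID, region, name string) string {
	return fmt.Sprintf("projects/%s/regions/%s/healthChecks/%s", projectID, region, name)
}

func main() {
	hc := HealthCheck{Name: "mycluster-api-internal", Type: "HTTPS"}
	fmt.Println(regionalHealthCheckPath("my-project", "us-east1", hc.Name))
}
```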
I'm wondering if we need to augment the delete code. I see this getting created, but when I destroy the cluster I don't see it being deleted. Only this health check resource remains, e.g.:
$ gcloud compute health-checks describe https://www.googleapis.com/compute/v1/projects/openshift-dev-installer/regions/us-east1/healthChecks/bfournie-capg-test-gn6g7-api-internal
checkIntervalSec: 2
creationTimestamp: '2024-04-06T10:28:41.909-07:00'
description: Created By OpenShift Installer
healthyThreshold: 3
httpsHealthCheck:
port: 6443
proxyHeader: NONE
requestPath: /readyz
id: '7659072829660084390'
kind: compute#healthCheck
name: bfournie-capg-test-gn6g7-api-internal
region: https://www.googleapis.com/compute/v1/projects/openshift-dev-installer/regions/us-east1
selfLink: https://www.googleapis.com/compute/v1/projects/openshift-dev-installer/regions/us-east1/healthChecks/bfournie-capg-test-gn6g7-api-internal
timeoutSec: 2
type: HTTPS
unhealthyThreshold: 3
Added 7de950d to delete HTTPS health checks. Before we were only handling HTTP health checks. Have not tested yet, but will now.
Hm, no. Tested, and I don't think this is the right way. The CAPG-created HTTPS health check is getting deleted, but the internal one still is not. Will debug that soon.
Also my install failed with:
ERROR failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed preparing ignition data: ignition failed to provision storage: failed to create storage: failed to create bucket: googleapi: Error 409: Your previous request to create the named bucket succeeded and you already own it., conflict
I suspect #8056 broke the capg flow. #8056 adds ignition bucket creation in tfvars, which is still executed in the capi flow, so I think the same logic is being run twice and we hit this error. cc @barbacbd
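One way to tolerate the duplicated bucket creation is to treat the 409 ("you already own it") as success. A sketch under that assumption, with a local stand-in for `googleapi.Error` (from google.golang.org/api/googleapi); the actual fix discussed for #8056 may instead remove the duplicated logic:

```go
package main

import (
	"errors"
	"fmt"
)

// apiError is a local stand-in for googleapi.Error, which carries the
// HTTP status code of a failed request.
type apiError struct{ Code int }

func (e *apiError) Error() string { return fmt.Sprintf("googleapi: Error %d", e.Code) }

// ignoreAlreadyOwned makes bucket creation idempotent by treating HTTP 409
// as success, so running the creation logic twice (tfvars flow plus CAPI
// flow, as described above) would not fail the install.
func ignoreAlreadyOwned(err error) error {
	var ae *apiError
	if errors.As(err, &ae) && ae.Code == 409 {
		return nil
	}
	return err
}

func main() {
	// A duplicate create (409) is tolerated; other errors propagate.
	fmt.Println(ignoreAlreadyOwned(&apiError{Code: 409}) == nil)
	fmt.Println(ignoreAlreadyOwned(&apiError{Code: 403}) != nil)
}
```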
Left further comments on how to fix #8056 (comment)
// TODO: the subnet is only relevant for internal load balancer
op, err := service.BackendServices.Patch(projectID, extBesvcName, extBesvc).Context(ctx).Do()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this code will probably go away when we create the load balancers in CAPG, but until then it would be useful to have a comment here explaining why this patch is necessary.
Force-pushed 0d79d96 to f91a7cb
Force-pushed 163fa38 to 7de950d
Force-pushed 7de950d to 7b6b4f4
By default, CAPG creates an LB forwarding rule for port 443. Update to 6443 for OpenShift.
Creates an internal passthrough LB to serve the API and machine configs to the cluster. Utilizes the instance groups created by CAPG.
CAPG sets the APIServerPort to 443 unless it is explicitly set. Explicitly setting it to the 6443 default.
CAPG installs will only use backend services (no target pools), so we need to remove target pools from the machine and control-plane machineset manifests so the machine-api operator doesn't throw an error looking for a target pool that doesn't exist.
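The target-pool removal described in the last commit message can be sketched as a small edit on an unstructured providerSpec. The field name follows the GCP machine providerSpec layout, but treat the shapes here as illustrative, not the installer's actual manifest code:

```go
package main

import "fmt"

// stripTargetPools removes the targetPools field from an unstructured GCP
// machine providerSpec. With CAPG-created backend services there are no
// target pools, and the machine-api operator would otherwise error while
// looking for a target pool that doesn't exist.
func stripTargetPools(providerSpec map[string]interface{}) {
	delete(providerSpec, "targetPools")
}

func main() {
	spec := map[string]interface{}{
		"machineType": "n2-standard-4",
		"targetPools": []string{"mycluster-api"},
	}
	stripTargetPools(spec)
	_, stillThere := spec["targetPools"]
	fmt.Println(stillThere) // targetPools removed; other fields untouched
}
```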
Force-pushed 7b6b4f4 to 12bfe75
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: bfournie. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
This commit adds the capability to delete regional health checks along with the global health checks. To do this, it refactors the health-check code so that the logic remains essentially the same, but different services (health check or regional health check) can be injected.
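The injection idea in that commit message can be sketched with a small interface: one shared deletion loop driven by whichever service (global or regional) is passed in. Names and shapes below are a sketch of the pattern, not the installer's actual types:

```go
package main

import "fmt"

// healthCheckService abstracts the two GCP APIs so one deletion routine can
// drive either the global HealthChecks service or the RegionHealthChecks
// service.
type healthCheckService interface {
	Delete(name string) error
}

// globalHealthChecks stands in for the global service.
type globalHealthChecks struct{ deleted []string }

func (g *globalHealthChecks) Delete(name string) error {
	g.deleted = append(g.deleted, name)
	return nil
}

// regionalHealthChecks stands in for the region-scoped service.
type regionalHealthChecks struct {
	region  string
	deleted []string
}

func (r *regionalHealthChecks) Delete(name string) error {
	r.deleted = append(r.deleted, r.region+"/"+name)
	return nil
}

// deleteClusterHealthChecks is the shared logic; the injected service decides
// whether the deletes are global or regional.
func deleteClusterHealthChecks(svc healthCheckService, names []string) error {
	for _, n := range names {
		if err := svc.Delete(n); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	g := &globalHealthChecks{}
	r := &regionalHealthChecks{region: "us-east1"}
	names := []string{"mycluster-api", "mycluster-api-internal"}
	deleteClusterHealthChecks(g, names)
	deleteClusterHealthChecks(r, names)
	fmt.Println(len(g.deleted), len(r.deleted))
}
```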
Force-pushed 12bfe75 to 4765e43
@patrickdillon: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Merged 0bbbb02 into openshift:master
[ART PR BUILD NOTIFIER] This PR has been included in build ose-installer-altinfra-container-v4.16.0-202404222343.p0.g0bbbb02.assembly.stream.el8 for distgit ose-installer-altinfra.
Adds an internal passthrough LB using the instance groups created by CAPG.
Currently uses a patch in vendored CAPG as a shortcut; see the commit message for more detail. To use this, make sure to delete the capg binary from
cluster-api/bin
so that the provider will be rebuilt with the patch. It should be possible to move this functionality into the installer, and I will test that ASAP.
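Deleting the cached binary might look like the following. The exact binary filename under cluster-api/bin is an assumption here; adjust the glob to whatever your build produces:

```shell
# Remove the cached CAPG binary so the next install run rebuilds the
# provider with the vendored patch. Binary name pattern is assumed.
rm -f cluster-api/bin/cluster-api-provider-gcp*
```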