Fix certmanager race condition and version numbers. #1134
* To fix the race condition with certmanager (kubeflow#1125) we move the KF issuer into a separate package from the package deploying kubeflow. This way we can wait for cert-manager to start before deploying resources.
* Labels need to be immutable otherwise upgrades won't work (see kubeflow#1131). So remove version number from common labels and application selector so that apply will work to update resources.
/assign @krishnadurai
@krishnadurai @johnugeorge @yanniszark could one of you review this please?
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: yanniszark
* GoogleCloudPlatform/kubeflow-distribution#33 is tracking GCP blueprints on private GKE with VPC-SC
  * This PR doesn't fully enable that but it includes a lot of necessary changes.
* cluster-private-patch.yaml is a cluster patch that turns on a lot of settings to deploy a private GKE cluster
  * For ease of use we make the master publicly accessible anywhere; users could configure that behavior if desired using patch overlays.
* Use kpt setters to name all the networking resources (firewall rules, networks, etc.)
  * This ensures the names are unique based on the KF deployment name and won't conflict with existing rules.
  * The setters also ensure that the references get set correctly; e.g. the firewall rules correctly refer to the newly created network.
* Add a CNRM resource to enable CloudDNS.
  * Per GoogleCloudPlatform/kubeflow-distribution#31 we should probably use CNRM and not AnthosCLI to enable all required services.
* Add a kpt setter to control firewall rule logging
  * Enabling firewall rule logging can be useful to debug why connections are blocked. Enable logging on firewall rules.
* Add an extra firewall rule for ISTIO
  * Per https://istio.io/docs/setup/platform-setup/gke/ we need to manually create an additional firewall rule to allow traffic to the ISTIO pilot webhook port.
* Add a NAT to allow outbound internet egress
  * Egress is still blocked by firewall rules
  * Per kubeflow/gcp-blueprints#34 this was an attempt to make it possible to pull images from DockerHub and Quay.IO. This was partially successful; pulling from DockerHub works, but for Quay.IO the firewall rules are still blocking required connections.
* Fix the v3 version of the cert-manager package.
  * kubeflow#1134 moved the kubeflow issuer into its own package to avoid race conditions
  * That refactoring means that the v3 packages no longer include the actual cert-manager resources
  * This PR fixes that by having the v3 package pull in the base package
To fix the race condition with certmanager (#1121, "certmanager install has race condition - try to create KF certmanager resources before cert manager is available") we move the KF
issuer into a separate package from the package deploying kubeflow.
This way we can wait for cert-manager to start before deploying resources.
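
The ordering described above can be sketched roughly as follows. This is only an illustration, not the tooling used in this PR: the package names `cert-manager` and `kubeflow-issuer`, the `apply` callback, and the `is_ready` check are all hypothetical stand-ins for whatever actually applies a package and checks that cert-manager is up.

```python
import time
from typing import Callable


def wait_until_ready(is_ready: Callable[[], bool],
                     timeout_s: float = 300.0,
                     interval_s: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll is_ready() until it returns True or timeout_s has elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


def deploy_in_order(apply: Callable[[str], None],
                    is_ready: Callable[[], bool],
                    timeout_s: float = 300.0,
                    interval_s: float = 5.0,
                    sleep: Callable[[float], None] = time.sleep) -> None:
    """Apply cert-manager first, wait for it, then apply the issuer package.

    Package names are hypothetical; the point is that the issuer lives in
    its own package, so it can be applied as a separate, later step.
    """
    apply("cert-manager")
    if not wait_until_ready(is_ready, timeout_s, interval_s, sleep):
        raise TimeoutError("cert-manager did not become ready in time")
    # Only now create resources that depend on cert-manager being available.
    apply("kubeflow-issuer")
```

Keeping the issuer in the same package as cert-manager itself would leave no point at which a wait like this could be inserted, which is the motivation for the split.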
Labels need to be immutable otherwise upgrades won't work (see #1131, "commonLabels need to be immutable for upgrades - remove version from commonLabels").
So remove version number from common labels and application selector
so that apply will work to update resources.
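
To illustrate why a version number in the common labels breaks upgrades, here is a minimal sketch. It assumes, as kustomize's `commonLabels` does, that common labels are also merged into workload selectors, and it relies on the fact that a Deployment's selector cannot be changed after creation. The helper names and label keys are hypothetical and are not taken from the manifests in this PR.

```python
def render_selector(base_selector: dict, common_labels: dict) -> dict:
    """Mimic merging common labels into a workload's selector."""
    merged = dict(base_selector)  # copy so the caller's dict is not mutated
    merged.update(common_labels)
    return merged


def can_apply_upgrade(base_selector: dict,
                      old_common: dict,
                      new_common: dict) -> bool:
    """An in-place apply only works if the rendered selector is unchanged."""
    old = render_selector(base_selector, old_common)
    new = render_selector(base_selector, new_common)
    return old == new


base = {"app": "example"}  # hypothetical workload selector

# Version in the common labels: the selector changes on every release.
versioned_ok = can_apply_upgrade(
    base,
    {"app.kubernetes.io/version": "v1"},
    {"app.kubernetes.io/version": "v2"},
)

# No version in the common labels: the selector is stable across releases.
unversioned_ok = can_apply_upgrade(
    base,
    {"app.kubernetes.io/name": "example"},
    {"app.kubernetes.io/name": "example"},
)
```

Under these assumptions `versioned_ok` is false and `unversioned_ok` is true, which matches the reasoning above: once the version is out of the labels that feed the selector, apply can update the existing resources.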