Container Cluster stuck in non-ready state because channel update rejected #194
Comments
Hey @jlewi, the underlying API's canonical form for the release channel is all uppercase: https://cloud.google.com/kubernetes-engine/docs/reference/rest/v1beta1/projects.locations.clusters#channel Curiously, the underlying API accepts the lowercase form as well, but subsequent GET requests return the uppercase form, and we currently don't suppress that diff. If you create your cluster with
Thanks, that's super helpful. I will fix that on our end. Feel free to close this issue unless you want to leave it open to keep tracking more graceful handling of the diff.
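To illustrate the mismatch discussed above: the resource spec says `regular`, the API returns `REGULAR`, and a plain string comparison sees a change on every reconcile. Below is a minimal sketch of what a case-insensitive comparison could look like. The function names are hypothetical and this is not Config Connector's actual implementation.

```python
def channel_diff_suppressed(desired: str, observed: str) -> bool:
    """Return True when two release-channel values differ only by case.

    Hypothetical helper for illustration; not part of Config Connector.
    """
    return desired.strip().upper() == observed.strip().upper()


def needs_update(desired: str, observed: str) -> bool:
    """An update is needed only when the channels differ beyond letter case."""
    return not channel_diff_suppressed(desired, observed)


# The spec uses lowercase, the API's GET response uses uppercase.
print(needs_update("regular", "REGULAR"))  # False: same channel, no update
print(needs_update("rapid", "REGULAR"))    # True: a real channel change
```

With a comparison like this, reapplying an unchanged resource would not trigger an update call at all.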
* Fix a bunch of issues with GCP blueprints for private GKE.
* Tracking issue GoogleCloudPlatform/kubeflow-distribution#33
* Fix the setters on firewall rules. They should be partial setters so we don't lose the suffixes.
* Add a firewall rule to allow cert-manager webhooks. This is necessary to work with private GKE; ref https://docs.cert-manager.io/en/release-0.11/getting-started/webhook.html#running-on-private-gke-clusters
* Add kpt/kustomize function to configure the transform to replace images with the mirrored image versions.
* Update image mirroring configs
  * Instead of using "*" to match all images, we list out image prefixes to match so we are a bit more intentional.
  * We want to include gcr.io images in order to support working with VPC-SC. For VPC-SC, gcr.io images need to be mirrored as well because they are unlikely to be within the perimeter.
  * Use the location gcr.io/${PROJECT}/mirror. It looks like the mirroring pipeline includes the registry name.
* Change the release channel on the cluster to be upper case
  * Per GoogleCloudPlatform/k8s-config-connector#194 we need release channels to be upper case, otherwise updates fail.
* centraldashboard v3 kustomization.yaml needs an image stanza
  * Without this we end up deploying using tag "latest", which isn't what we want.
* Use CNRM to enable services GoogleCloudPlatform/kubeflow-distribution#31
* Remove cert-manager ACME challenge from excluded paths for JWT validation
  * We no longer use cert-manager so we no longer need to allow that path.
* We need to add a default network route in order to give cloudnat outbound internet access
  * Need to access jwks
* Give routes and nat resources unique names based on the KF name.
* Route to public internet should be higher priority so google apis take precedence.
* Regenerate tests.
Describe the bug
A clear and concise description of what the bug is.
ConfigConnector Version
Run the following command to get the current ConfigConnector version
kubectl get ns cnrm-system -o jsonpath='{.metadata.annotations.cnrm\.cloud\.google\.com/version}'
cnrm.cloud.google.com/version: 1.9.1
To Reproduce
Steps to reproduce the behavior:
Create a ContainerCluster CNRM resource setting the release channel for the cluster to get
the status
Apply the resource to create the cluster
Reapply the resource
Container cluster reports
The release channel shouldn't be changing.
I suspect this is an issue in the update logic, since there are some restrictions on mutations to the
release channel:
https://cloud.google.com/kubernetes-engine/docs/concepts/release-channels
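Following the maintainer's note that the API's canonical form is uppercase, one possible client-side workaround is to uppercase the channel before applying the resource. The sketch below works on a manifest already parsed into a Python dict. The field path `spec.releaseChannel.channel`, the example manifest, and the helper name are assumptions for illustration, not taken from this issue.

```python
import copy


def normalize_release_channel(manifest: dict) -> dict:
    """Return a copy of the manifest with the release channel uppercased.

    Hypothetical helper; assumes the channel lives at
    spec.releaseChannel.channel. Manifests without that field are
    returned as an unchanged copy.
    """
    result = copy.deepcopy(manifest)
    spec = result.get("spec") or {}
    release_channel = spec.get("releaseChannel")
    if isinstance(release_channel, dict):
        channel = release_channel.get("channel")
        if isinstance(channel, str):
            release_channel["channel"] = channel.upper()
    return result


# Hypothetical ContainerCluster manifest, already parsed from YAML.
cluster = {
    "apiVersion": "container.cnrm.cloud.google.com/v1beta1",
    "kind": "ContainerCluster",
    "metadata": {"name": "example-cluster"},
    "spec": {"releaseChannel": {"channel": "regular"}},
}

normalized = normalize_release_channel(cluster)
print(normalized["spec"]["releaseChannel"]["channel"])  # prints REGULAR
```

The helper copies the manifest rather than mutating it, so the original input stays available for comparison.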
YAML snippets: