
Removing stuck with the message "Waiting on cluster-scoped-gc" #35345

Closed
oed-guzym opened this issue Nov 2, 2021 · 5 comments

Rancher Server Setup

  • Rancher version: v2.5.8
  • Installation option: Docker

Information about the Cluster

  • Kubernetes version: v1.19.4
  • Cluster Type: Downstream
    • Custom

Describe the bug
In the Rancher UI, a cluster is stuck with the message "Waiting on cluster-scoped-gc". The cluster cannot be removed and we can no longer interact with it; clicking the "Delete" button has no effect. Via the API we see the following transition message:

"state": "removing", "transitioning": "error", "transitioningMessage": "waiting on cluster-scoped-gc"

When we click on the cluster in the UI, we get the message:

ClusterUnavailable 503: cluster not found

Having run out of ideas, we already tried to rebuild the cluster via the same automated process, but we get the following error message:

Bad response statusCode [422]. Status [422 Unprocessable Entity]. Body: [code=MissingRequired, fieldName=ClusterTemplateRevision, message=this cluster is created from a clusterTemplateRevision, please pass the clusterTemplateRevision, baseType=error]

A connection via "kubectl" can no longer be established either.

The cluster's infrastructure no longer exists; only the entry in the interface remains, and it cannot be removed. We currently have no idea how to get the cluster deleted.

To Reproduce
Unfortunately, we do not know exactly how we got into this state. The problem appeared after we automatically deployed and removed the cluster for testing purposes.

Result
As a result of the error, an orphaned cluster remains visible in the interface, and unfortunately the entry cannot be removed.

Expected Result
It would be good if the UI offered a way to force deletion of orphaned artifacts/clusters and all associated references.

Screenshots
Cluster-error-message

@bartcee

bartcee commented Nov 17, 2021

I encountered the same issue on Rancher v2.5.9. The cluster was imported with the Rancher CLI tool and deleted with rancher cluster rm cluster-name. It is stuck with the Waiting on cluster-scoped-gc error message.

@stale

stale bot commented Jan 17, 2022

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 60 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@stale stale bot added the status/stale label Jan 17, 2022
@Heiko-san

Any workaround to fix clusters in that state? :(

@stale stale bot removed the status/stale label Jan 25, 2022
@vichaos

vichaos commented Feb 8, 2022

Switch to the Rancher cluster's kube-context and run the following command, which looks up the management cluster object by display name and clears its finalizers so deletion can complete (replace ${STUCK_CLUSTER} with the stuck cluster's name or ID):

kubectl get clusters.management.cattle.io $(kubectl get clusters.management.cattle.io \
  -o jsonpath='{range .items[*]}{@.spec.displayName}:{@.metadata.name}{"\n"}{end}' | \
  grep ${STUCK_CLUSTER} | cut -d ":" -f2) -o json | jq '.metadata.finalizers = null' | kubectl apply -f -
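A shorter alternative (not from this thread, just a hedged sketch) is a JSON merge patch, since patching finalizers to null deletes the field directly. The cluster id c-abc12 below is a hypothetical placeholder for the stuck object's metadata.name, and the sketch only prints the command rather than running it against a live cluster:

```shell
# Hedged sketch: the same finalizer-clearing workaround via `kubectl patch`.
# A JSON merge patch with "finalizers": null removes the field entirely.
# c-abc12 is a hypothetical placeholder id; to stay side-effect free, the
# sketch echoes the command instead of executing it.
STUCK_CLUSTER_ID="c-abc12"
PATCH='{"metadata":{"finalizers":null}}'
echo kubectl patch clusters.management.cattle.io "$STUCK_CLUSTER_ID" \
  --type=merge -p "$PATCH"
```

As with the jq variant, clearing finalizers bypasses Rancher's cleanup handlers, so any resources those finalizers would have garbage-collected may need to be removed manually.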


@stale

stale bot commented Apr 10, 2022


@stale stale bot added the status/stale label Apr 10, 2022
@stale stale bot closed this as completed Apr 24, 2022