
Leftover cluster namespaces after cluster deletion #31546

Closed
ryanelliottsmith opened this issue Mar 1, 2021 · 10 comments

Labels
internal
kind/bug: Issues that are defects reported by users or that we know have reached a real release
team/hostbusters: The team that is responsible for provisioning/managing downstream clusters + K8s version support

Milestone
v2.6.4 - Triaged

Comments

ryanelliottsmith commented Mar 1, 2021

What kind of request is this (question/bug/enhancement/feature request):
bug

Steps to reproduce (least amount of steps as possible):
Unclear at present; it appears that deletion of the cluster namespace object is not triggered when a downstream cluster is deleted.

Result:
Users have reported issues with Rancher local clusters that have more cluster namespaces than actual downstream clusters:

$ kubectl get clusters.management.cattle.io --no-headers |wc -l
275
$ kubectl get namespaces --no-headers |grep "c-" |wc -l
753
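
A minimal sketch (not from the original report) for listing which cluster namespaces are the leftovers, assuming the cluster namespaces are named exactly after the cluster IDs (c-xxxxx) as in the counts above; bash is assumed for the process substitution:

# Cluster namespaces with no matching clusters.management.cattle.io object
comm -23 \
  <(kubectl get namespaces --no-headers | awk '{print $1}' | grep -E '^c-[a-z0-9]+$' | sort) \
  <(kubectl get clusters.management.cattle.io --no-headers | awk '{print $1}' | sort)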

Environment information

  • Rancher version (rancher/rancher or rancher/server image tag, or shown bottom left in the UI): v2.4.8
  • Installation option (single install/HA): k8s

gz#14931

ryanelliottsmith added the kind/bug and internal labels on Mar 1, 2021
@deniseschannon

@ryanelliottsmith For the multiple customers reporting this, what type of downstream clusters are we talking about? Is it all cluster types or a specific one?


aaronyeeski commented Mar 19, 2021

Tried reproducing this behavior with the following steps:

  • On the local cluster of a v2.5.7 Rancher server
  • Run the following commands to check the number of clusters and namespaces:
# Outputs number of clusters
kubectl get clusters.management.cattle.io --no-headers |wc -l
# Outputs number of namespaces for downstream clusters in the local cluster
kubectl get namespaces --no-headers |grep "c-" |wc -l
  • Log in as a standard user.
  • Deploy 2 downstream clusters.
  • Run the two commands.
  • Delete 1 downstream cluster.
  • Run the two commands.

Result
I had a setup with 2 downstream clusters and saw 4 namespaces:

Aarons-MBP:~ aaronyee$ kubectl get namespaces --no-headers |grep "c-"
c-mspvq                                      Active   10m
c-n25mq                                      Active   11m
cluster-fleet-default-c-mspvq-2f2207275e45   Active   5m24s
cluster-fleet-default-c-n25mq-98a5777b0eb5   Active   5m57s

After deleting 1 downstream cluster, I saw 3 namespaces:

Aarons-MBP:~ aaronyee$ kubectl get namespaces --no-headers |grep "c-"
c-mspvq                                      Active   24m
cluster-fleet-default-c-mspvq-2f2207275e45   Active   18m
cluster-fleet-default-c-n25mq-98a5777b0eb5   Active   19m
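
As a follow-up to the steps above, a hedged sketch of how the delete-and-check could be scripted; the cluster ID is just the one deleted in this example, and it assumes deleting the clusters.management.cattle.io object is equivalent to deleting the cluster from the UI:

# Delete one downstream cluster and wait for its namespace to go away;
# the wait times out if the cluster namespace is left over
CLUSTER_ID=c-n25mq
kubectl delete clusters.management.cattle.io "$CLUSTER_ID"
kubectl wait --for=delete namespace/"$CLUSTER_ID" --timeout=300s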


aaronyeeski commented Mar 19, 2021

Tried reproducing this behavior with the following steps:

  • On the local cluster of a v2.4.8 Rancher server
  • Run the following commands to check the number of clusters and namespaces:
# Outputs number of clusters
kubectl get clusters.management.cattle.io --no-headers |wc -l
# Outputs number of namespaces for downstream clusters in the local cluster
kubectl get namespaces --no-headers |grep "c-" |wc -l
  • Log in as a standard user.
  • Deploy 20 downstream clusters.
  • Run the two commands.
  • Delete 5 downstream clusters.
  • Run the two commands.

Result:
No extra namespaces were left over after deletion.
Before deletion:

Aarons-MBP:~ aaronyee$ kubectl get clusters.management.cattle.io --no-headers |wc -l
      21
Aarons-MBP:~ aaronyee$ kubectl get namespaces --no-headers |grep "c-" |wc -l
      20

After deleting 5 clusters:

Aarons-MBP:~ aaronyee$ kubectl get clusters.management.cattle.io --no-headers |wc -l
      16
Aarons-MBP:~ aaronyee$ kubectl get namespaces --no-headers |grep "c-" |wc -l
      15


aaronyeeski commented Mar 19, 2021

Tested an upgrade and rollback scenario with the following steps:

  • On the local cluster of a v2.4.8 Rancher server
  • Run the following commands to check the number of clusters and namespaces:
# Outputs number of clusters
kubectl get clusters.management.cattle.io --no-headers |wc -l
# Outputs number of namespaces for downstream clusters in the local cluster
kubectl get namespaces --no-headers |grep "c-" |wc -l
  • Log in as a standard user.
  • Deploy 5 downstream clusters.
  • Run the two commands.
  • Upgrade the Rancher setup to v2.5.5.
  • Run the two commands.
  • Downgrade the Rancher setup to v2.4.8.
  • Run the two commands.

Result:
On v2.4.8 with 5 clusters:

Aarons-MBP:~ aaronyee$ kubectl get clusters.management.cattle.io --no-headers |wc -l
       6
Aarons-MBP:~ aaronyee$ kubectl get namespaces --no-headers |grep "c-" |wc -l
       5

After upgrading Rancher to v2.5.5, fleet namespaces are added:

Aarons-MBP:~ aaronyee$ kubectl get clusters.management.cattle.io --no-headers |wc -l
       6
Aarons-MBP:~ aaronyee$ kubectl get namespaces --no-headers |grep "c-" |wc -l
      10
Aarons-MBP:~ aaronyee$ kubectl get clusters.management.cattle.io --no-headers
c-5g6tf   90m
c-66dbk   89m
c-nzlkh   91m
c-smgmr   90m
c-st6mt   90m
local     100m
Aarons-MBP:~ aaronyee$ kubectl get namespaces --no-headers |grep "c-"
c-5g6tf                                      Active   90m
c-66dbk                                      Active   89m
c-nzlkh                                      Active   91m
c-smgmr                                      Active   90m
c-st6mt                                      Active   90m
cluster-fleet-default-c-5g6tf-522d75803d01   Active   7m45s
cluster-fleet-default-c-66dbk-75899698761d   Active   7m45s
cluster-fleet-default-c-nzlkh-8b881c18f996   Active   7m45s
cluster-fleet-default-c-smgmr-447d8c721eb3   Active   7m45s
cluster-fleet-default-c-st6mt-0938c20f8fcf   Active   7m45s

After rollback to v2.4.8, fleet namespaces are not removed:

Aarons-MBP:~ aaronyee$ kubectl get clusters.management.cattle.io --no-headers |wc -l
       6
Aarons-MBP:~ aaronyee$ kubectl get namespaces --no-headers |grep "c-" |wc -l
      10

After deleting a cluster from the rolled-back setup, I saw a fleet namespace that was not deleted.

Aarons-MBP:~ aaronyee$ kubectl get clusters.management.cattle.io --no-headers |wc -l
       5
Aarons-MBP:~ aaronyee$ kubectl get namespaces --no-headers |grep "c-" |wc -l
       9
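
A hedged sketch for spotting such leftovers: it lists cluster-fleet-default-* namespaces whose embedded cluster ID no longer has a clusters.management.cattle.io object, assuming the cluster-fleet-default-<cluster-id>-<hash> naming shown in the output above.

# Fleet cluster namespaces whose cluster object no longer exists
for ns in $(kubectl get namespaces --no-headers | awk '{print $1}' | grep '^cluster-fleet-default-'); do
  id=$(echo "$ns" | sed -E 's/^cluster-fleet-default-(c-[a-z0-9]+)-.*$/\1/')
  kubectl get clusters.management.cattle.io "$id" >/dev/null 2>&1 || echo "leftover: $ns"
done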

@deniseschannon

@ryanelliottsmith Had your users attempted to upgrade and then roll back?


deniseschannon commented Mar 19, 2021

Since this was after a rollback from 2.5 to 2.4, we do not expect the fleet namespaces to be cleaned up; this is expected behavior.

deniseschannon added the team/hostbusters label on Dec 1, 2021
deniseschannon added this to the v2.6.4 - Triaged milestone on Dec 1, 2021
@thedadams

I was able to reproduce this behavior on Rancher v2.6.3. It is intermittent, but I could reproduce it more reliably when machine deletion took longer than expected or failed; I then forced the deletion of the machine. The machines and cluster objects were all cleaned up, but the cluster namespace was left over.

I reproduced this using v2 provisioning (an RKE2 cluster), but the same controllers are used to create the cluster namespace in that case.
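
For reference, a sketch of what "forcing the deletion of the machine" can look like for a v2-provisioned (CAPI-backed) cluster; the machine name and the fleet-default namespace are assumptions here, and clearing finalizers skips the normal teardown, so this is only useful for reproducing the stuck-deletion scenario:

MACHINE=my-rke2-cluster-pool1-xxxxx   # hypothetical machine name
kubectl delete machines.cluster.x-k8s.io "$MACHINE" -n fleet-default --wait=false
# Force the deletion by clearing finalizers (bypasses normal cleanup)
kubectl patch machines.cluster.x-k8s.io "$MACHINE" -n fleet-default \
  --type=merge -p '{"metadata":{"finalizers":null}}'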


timhaneunsoo commented Feb 14, 2022

Test Environment:

Rancher version: v2.6-head 11a7451
Rancher cluster type: HA
Docker version: 20.10

Downstream cluster type: Various (RKE1 AWS, RKE2 AWS, RKE1 DigitalOcean)


Testing:

Tested this issue with the following steps:

  1. Create multiple clusters.
  2. Force delete a machine in a cluster.
  3. Check that the cluster namespace is cleaned up.

Result - Pass
After force deleting machines of RKE1 and RKE2 clusters, the cluster namespace is no longer left over. Rancher local clusters now have the same number of cluster namespaces as actual downstream clusters.

Fresh install - Pass
Upgrade from 2.6.3 to 2.6-head 11a7451 - Pass

@wpwoodjr

kubectl get namespaces --no-headers |grep "c-" |wc -l
3

Wait, what about the cluster-fleet-default namespaces? Here is what I see:

% kubectl get clusters.management.cattle.io --no-headers       
c-5gdxx   48d
c-6kpd4   68d
c-ch2jz   130d
c-fwmlt   130d
c-hmb78   130d
c-jsklh   130d
c-rcp68   130d
local     130d


% kubectl get namespaces --no-headers |grep "c-"       
c-5gdxx                                                   Active   48d
c-6kpd4                                                   Active   68d
c-ch2jz                                                   Active   130d
c-fwmlt                                                   Active   130d
c-hmb78                                                   Active   130d
c-jsklh                                                   Active   130d
c-rcp68                                                   Active   130d
cluster-fleet-default-c-5gdxx-aa4329d03704                Active   48d
cluster-fleet-default-c-6kpd4-bcabef9fadcf                Active   68d
cluster-fleet-default-c-ch2jz-a5c8bd69999f                Active   130d
cluster-fleet-default-c-fwmlt-a94cb4a59c0d                Active   130d
cluster-fleet-default-c-hmb78-d7c2bc579677                Active   130d
cluster-fleet-default-c-jsklh-6f62b10e803e                Active   130d
cluster-fleet-default-c-rcp68-c5f10c100033                Active   130d

I.e., there is one cluster-fleet-default namespace for each cluster namespace. Is this not expected? I'm on v2.6.2.

@thedadams

@wpwoodjr Those are managed by Fleet, not Rancher, and are outside the scope of this issue.
