
[CI] Minor updates to playground cleanup #10802

Merged
merged 6 commits into meshery:master on May 8, 2024

Conversation

cpepper96
Member

Notes for Reviewers

A couple of minor tweaks to the playground cleanup script:

  1. Added a line that deletes all resources in a namespace prior to deleting the namespace itself (sketched below). I think this allows for a more graceful deletion of namespaces.
  2. Removed the patch command, as it does not work.

Given that this is essentially a prod cluster, I'm hesitant to add the other command I found to force delete a namespace. I think for now it might be best to just investigate "stuck" namespaces as they come up. Thoughts?
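
For context, item 1 boils down to something like the following sketch (the namespace list variable and the timeout values are illustrative assumptions, not the exact merged script):

    # Hedged sketch of the cleanup flow described above, not the literal script.
    # $NAMESPACES is assumed to hold the user-created namespaces to clean up.
    for ns in $NAMESPACES; do
      # Delete workloads and Services first so the namespace can drain gracefully.
      kubectl delete all --all -n "$ns" --timeout=60s
      # Then delete the (now mostly empty) namespace itself.
      kubectl delete namespace "$ns" --timeout=60s
    done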

Signed commits

  • Yes, I signed my commits.

Signed-off-by: Jerod Culpepper <cpepper96@gmail.com>

@leecalcote
Member

What does more harm:

  • Leftover namespaces that take up space and scheduling cycles, if not empty.
  • Ungraceful deletion of unwanted workloads.

@cpepper96
Member Author

What does more harm:

  • Leftover namespaces that take up space and scheduling cycles, if not empty.
  • Ungraceful deletion of unwanted workloads.

Honestly, I'm not sure what the worst-case scenario for force-deleting a namespace is. When I was researching this issue, people recommended against force-deleting namespaces because you aren't fixing the root cause, and that root cause can persist (e.g., if we had force-deleted the "Terminating" namespaces I was looking at earlier, we might never have discovered that broken API services were causing them to hang, and those API services would not have been deleted).
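
For reference, investigating a stuck namespace looks roughly like this (illustrative commands, not part of this PR; the namespace name is a placeholder):

    # Illustrative diagnostics for a namespace stuck in "Terminating".
    ns=stuck-example   # hypothetical namespace name
    # The namespace's conditions usually say why deletion is blocked.
    kubectl get namespace "$ns" -o jsonpath='{.status.conditions}'
    # Broken aggregated API services are a common culprit.
    kubectl get apiservice | grep False
    # List anything left in the namespace that may still hold finalizers.
    kubectl api-resources --verbs=list --namespaced -o name \
      | xargs -n 1 kubectl get --show-kind --ignore-not-found -n "$ns"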

@nikzayn

nikzayn commented Apr 30, 2024

What does more harm:

  • Leftover namespaces that take up space and scheduling cycles, if not empty.
  • Ungraceful deletion of unwanted workloads.

Honestly, I'm not sure what the worst-case scenario for force-deleting a namespace is. When I was researching this issue, people recommended against force-deleting namespaces because you aren't fixing the root cause, and that root cause can persist (e.g., if we had force-deleted the "Terminating" namespaces I was looking at earlier, we might never have discovered that broken API services were causing them to hang, and those API services would not have been deleted).

I think while we are forcefully deleting the unused namespaces, we should first check whether those namespaces are linked to any of the resources below (see the sketch after this list):

  • Pods: if linked to Pods, first drain the Pods (in the case of Deployments), then delete them.
  • Services: if linked to Services, we need to update the selectors and labels.
  • PVs and PVCs: if linked to either, we need to delete those as well.
  • ConfigMaps and Secrets: if linked to either, we need to delete these as well.
  • Then we can decommission or delete the namespaces gracefully.
  • Last, we can verify that all the namespaces were deleted properly: kubectl get ns and kubectl get all -A.
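
Roughly, those checks could look like this (a hedged sketch; the namespace name is illustrative):

    # Hedged sketch of the per-namespace checks listed above; names are illustrative.
    ns=playground-user-ns
    kubectl get deployments,pods,services -n "$ns"   # workloads and Services
    kubectl get pvc -n "$ns"                         # PersistentVolumeClaims (and any bound PVs)
    kubectl get configmaps,secrets -n "$ns"          # ConfigMaps and Secrets
    kubectl delete namespace "$ns"                   # then delete the namespace gracefully
    kubectl get ns; kubectl get all -A               # final verification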

@cpepper96
Member Author

What does more harm:

  • Leftover namespaces that take up space and scheduling cycles, if not empty.
  • Ungraceful deletion of unwanted workloads.

Honestly, I'm not sure what the worst-case scenario for force-deleting a namespace is. When I was researching this issue, people recommended against force-deleting namespaces because you aren't fixing the root cause, and that root cause can persist (e.g., if we had force-deleted the "Terminating" namespaces I was looking at earlier, we might never have discovered that broken API services were causing them to hang, and those API services would not have been deleted).

I think while we are forcefully deleting the unused namespaces, we should first check whether those namespaces are linked to any of the resources below:

  • Pods: if linked to Pods, first drain the Pods (in the case of Deployments), then delete them.
  • Services: if linked to Services, we need to update the selectors and labels.
  • PVs and PVCs: if linked to either, we need to delete those as well.
  • ConfigMaps and Secrets: if linked to either, we need to delete these as well.
  • Then we can decommission or delete the namespaces gracefully.
  • Last, we can verify that all the namespaces were deleted properly: kubectl get ns and kubectl get all -A.

By "linked" do you mean those resources are deployed to the namespace? The script currently deletes all resources in the target namespace (kubectl delete all --all -n $ns) and then deletes that namespace. Is there a better way of deleting those resources?

@nikzayn

nikzayn commented Apr 30, 2024

What does more harm:

  • Leftover namespaces that take up space and scheduling cycles, if not empty.
  • Ungraceful deletion of unwanted workloads.

Honestly, I'm not sure what the worst-case scenario for force-deleting a namespace is. When I was researching this issue, people recommended against force-deleting namespaces because you aren't fixing the root cause, and that root cause can persist (e.g., if we had force-deleted the "Terminating" namespaces I was looking at earlier, we might never have discovered that broken API services were causing them to hang, and those API services would not have been deleted).

I think while we are forcefully deleting the unused namespaces, we should first check whether those namespaces are linked to any of the resources below:

  • Pods: if linked to Pods, first drain the Pods (in the case of Deployments), then delete them.
  • Services: if linked to Services, we need to update the selectors and labels.
  • PVs and PVCs: if linked to either, we need to delete those as well.
  • ConfigMaps and Secrets: if linked to either, we need to delete these as well.
  • Then we can decommission or delete the namespaces gracefully.
  • Last, we can verify that all the namespaces were deleted properly: kubectl get ns and kubectl get all -A.

By "linked" do you mean those resources are deployed to the namespace? The script currently deletes all resources in the target namespace (kubectl delete all --all -n $ns) and then deletes that namespace. Is there a better way of deleting those resources?

Yes, you can delete all the resources, but we only need to decommission the specific namespace. Note that kubectl delete all --all will only delete the main resources such as Pods, Deployments, Services, and ReplicaSets; it will not delete other namespaced resources like ConfigMaps, Secrets, and PVCs.
Secondly, I am thinking of using kubectl delete namespace <namespace> because it's safe and controlled.
Lastly, if we are using kubectl delete all --all -n $ns, are we sure that other critical data is not affected, given that we are targeting everything with all here?
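
If those extra kinds need to be covered, one common (illustrative) pattern is to delete every namespaced, deletable resource type instead of relying on the all category:

    # Illustrative: delete every deletable, namespaced resource type in $ns,
    # not just the kinds covered by the "all" category. Not part of this PR.
    kubectl delete "$(kubectl api-resources --namespaced=true --verbs=delete -o name \
      | tr '\n' ',' | sed 's/,$//')" --all -n "$ns"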

@MUzairS15
Contributor

We are only deleting namespaces/resources deployed by users as they come and try out the playground. The namespaces that affect the working of the playground environment stay intact. Also, using kubectl delete all --all -n <ns> will not delete cluster-wide resources, so even if some cluster-wide resources (CRs, PVs, etc.) are somehow linked, they will not get deleted (e.g., an existing ClusterRoleBinding used in prod that is referenced by a Role in one of the user-created namespaces will not get deleted, even with --all).
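
For reference, the cluster-scoped kinds that a namespaced delete never touches can be listed like this (an illustrative check, not part of the script):

    # Cluster-scoped resource types (PVs, CRDs, ClusterRoleBindings, ...) are not
    # affected by "kubectl delete all --all -n <ns>"; list them to double-check.
    kubectl api-resources --namespaced=false -o name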

@sangramrath
Contributor

I second @MUzairS15. This is a playground, meaning it is understood that users are not supposed to run anything production-grade here. We could put a warning at some stage in the workflow stating this, and also that their work will be lost in the nightly cleanup. This is common for cloud sandboxes (public clouds).

@cpepper96
Member Author

I added the "Terminating" namespace check back, along with the patch command to force-delete a "Terminating" namespace. I also cut the kubectl delete $ns timeout down to 1 minute. Let me know what y'all think @MUzairS15 @sangramrath @nikzayn @leecalcote
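
Roughly, the new behavior follows this pattern (a hedged sketch, not the literal diff; the exact patch command in the merged script may differ):

    # Hedged sketch of the re-added "Terminating" handling; not the literal diff.
    # Cap the namespace delete at 1 minute instead of waiting indefinitely.
    kubectl delete namespace "$ns" --timeout=60s || true
    # If the namespace is still stuck in Terminating, strip its finalizers
    # (one common force-delete pattern; the script's actual patch may differ).
    if kubectl get namespace "$ns" -o jsonpath='{.status.phase}' 2>/dev/null | grep -q Terminating; then
      kubectl patch namespace "$ns" --type merge -p '{"metadata":{"finalizers":null}}'
    fi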

@MUzairS15 MUzairS15 merged commit daae859 into meshery:master May 8, 2024
9 checks passed