Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing PVC finalizer to let it be terminated #6629

Closed
naor2013 opened this issue Aug 31, 2021 · 25 comments · Fixed by #7260
Closed

Removing PVC finalizer to let it be terminated #6629

naor2013 opened this issue Aug 31, 2021 · 25 comments · Fixed by #7260
Assignees
Labels

Comments

@naor2013
Copy link

Summary

What happened/what you expected to happen?
When Argo deletes a PVC at the end of the run, it stays as "Terminated" until the Workflow object is deleted.
I'd expect it to terminate it completely to save space.
It is caused by a finalizer 'kubernetes.io/pvc-protection' protecting the PVC from being terminated and it was discussed here #3095

Note: If the PVC is deleted, while resubmitting works, retry does not. So maybe deleting it should be optional when failed with configuration but I don't think there is any problem doing it for completed successfully workflows.

Diagnostics

👀 Yes! We need all of your diagnostics, please make sure you add it all, otherwise we'll go around in circles asking you for it:

What Kubernetes provider are you using?
OpenShift 4.x (One of latest ones, 4.5 I think)

What version of Argo Workflows are you running?
3.1.2

What executor are you running? Docker/K8SAPI/Kubelet/PNS/Emissary
Emissary

Did this work in a previous version? I.e. is it a regression?
I don't think it ever worked.


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@naor2013
Copy link
Author

naor2013 commented Sep 2, 2021

@alexec would love to get your feedback on this 😄

@alexec
Copy link
Contributor

alexec commented Sep 7, 2021

I'm not clear what the problem here is. You want the PVC to be deleted when is in GCed? But it is not?

@naor2013
Copy link
Author

naor2013 commented Sep 8, 2021

Exactly.
What causing it is the finalizer in the PVC that connects it to the "completed" pod and until the pods are removed completely, the PVC does not get terminated.

@alexec
Copy link
Contributor

alexec commented Sep 8, 2021

What if you enable Pod GC?

@naor2013
Copy link
Author

naor2013 commented Sep 8, 2021

I want to see the pod's data (logs, workflow UI, etc) in the Argo UI after it is finished, potentially for a long time.

@sarabala1979
Copy link
Member

@naor2013 Is PVC gets deleted once the pod is deleted or workflow is deleted? PVC will have ownerreference of workflow.
Can you verify this?

@sarabala1979
Copy link
Member

I think this PR is protecting PVC kubernetes/kubernetes#66923

@naor2013
Copy link
Author

naor2013 commented Sep 8, 2021

I'm not sure if it gets deleted when the pods get deleted or when the wf gets deleted but why does it matter? I don't want neither to be deleted because I want to see all the logs and everything in the Argo UI.
I can check if it's important.

@naor2013
Copy link
Author

@sarabala1979 any updates about this? 😄

@sarabala1979
Copy link
Member

@naor2013 as perhttps://github.com/kubernetes/kubernetes/pull/66923, if PVC mount to pod, PVC will not completely removed until pod get deleted

@sarabala1979
Copy link
Member

It will release memory. I think it will say in 'terminate1` state

@naor2013
Copy link
Author

Right, but Argo does have a feature to delete PVC, which doesn't work because of that.
I would love if Argo had a feature (maybe configurable in some way) that removes the finalizer in the same code that it deleted the PVC.
That will help free up space automatically instead of letting the users do it manually.

@sarabala1979
Copy link
Member

sarabala1979 commented Sep 17, 2021

@naor2013 Got it. I think you can disable this protection in k8s
kubernetes/kubernetes#79437

The finalizer is also added by the pv-protection and pvc-protection controllers. You can disable that feature by passing --feature-gates=StorageObjectInUseProtection=false to the kube-controller-manager. Note that this is not recommended, and can result in data loss, and the feature will be unconditionally enabled in future releases.

@naor2013
Copy link
Author

@sarabala1979 The problem with that is we have a big k8s infrastructure that is managed in our organization, and the Argo parts are just a minor part of everything.
Asking our k8s team to disable that feature for everyone just for our use-case will be probably not work.

For now, we remove the finalizer in our code that submits the Argo workflow but when we use resubmit in the UI it does not call our code that submits the Workflow so the finalizer stays.

It would be great if it will be implemented in the controller level.

@sarabala1979
Copy link
Member

We can enhance the VolumeClaimGC with force flag. if the flag is true then it will remove the finalizer. I will work it

@alexec
Copy link
Contributor

alexec commented Sep 17, 2021

@sarabala1979, if @naor2013 is keen on this, could he submit the PR with your guidance?

@sarabala1979 sarabala1979 added this to the v3.2 milestone Sep 17, 2021
@naor2013
Copy link
Author

@alexec Unfortunately I don't know golang well enough to do that. I write mostly Python.

@sarabala1979 a flag for that feature would be excellent! Thank you so much 😄

@alexec alexec added solution/workaround There's a workaround, might not be great, but exists invalid labels Sep 21, 2021
@stale stale bot removed wontfix labels Sep 21, 2021
@alexec alexec added wontfix and removed solution/workaround There's a workaround, might not be great, but exists invalid labels Sep 21, 2021
@naor2013
Copy link
Author

@alexec @sarabala1979 I know you're working on many tasks for Argo but is this issue something that is in the backlog is some way or has an ETA?
I see it was removed from the "To Do" column and it's under the Q3 milestone which I assume is a mistake so I'm a bit confused.

@alexec alexec removed this from the 2021 Q3 milestone Oct 19, 2021
@alexec
Copy link
Contributor

alexec commented Nov 22, 2021

I want to make sure I understand the problem, so let me play it back:

You want the pvc-protection. finaliser to be removed from a workflow's PVC when the workflow is completed.

Should this also be protected by a check to make sure no other pods have the PVC mounted?

@naor2013
Copy link
Author

That's a good description of what I want, yes.

I don't see a use-case where any pods will be mounted to the Argo-generated PVC after the Workflow is finished but if you think there is a use-case for it and checking for it is possible, it can be added I guess, but for my use-case this kind of check is not required.

@alexec
Copy link
Contributor

alexec commented Nov 22, 2021

Good point. We don't need to check for others. Just when the workflow is complete, we can patch the finalizer.

@alexec
Copy link
Contributor

alexec commented Nov 22, 2021

These are created here:

func (woc *wfOperationCtx) createPVCs(ctx context.Context) error {

Deleted here:

func (woc *wfOperationCtx) deletePVCs(ctx context.Context) error {

alexec added a commit to alexec/argo-workflows that referenced this issue Nov 23, 2021
Signed-off-by: Alex Collins <alex_collins@intuit.com>
@alexec alexec linked a pull request Nov 23, 2021 that will close this issue
1 task
@alexec alexec self-assigned this Nov 23, 2021
alexec added a commit that referenced this issue Nov 23, 2021
Signed-off-by: Alex Collins <alex_collins@intuit.com>
@tiwarisanjay
Copy link
Contributor

@Alfablos
Copy link

Alfablos commented Mar 2, 2022

Solution : https://argoproj.github.io/argo-workflows/environment-variables/#:~:text=ARGO_REMOVE_PVC_PROTECTION_FINALIZER

This removes the PVC but the PV is still there even though the workflows finishes successfully. It does have pvc-protection as finalizer. The storageClass has "Delete" as reclaimPolicy :/ I'm not sure whether this is intentional or not...

@alexec
Copy link
Contributor

alexec commented May 3, 2022

Please comment on #8592.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants