Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Volume can be deleted in the middle of cloning #414

Closed
jsafrane opened this issue Feb 26, 2020 · 6 comments · Fixed by #422
Closed

Volume can be deleted in the middle of cloning #414

jsafrane opened this issue Feb 26, 2020 · 6 comments · Fixed by #422
Assignees

Comments

@jsafrane
Copy link
Contributor

We use finalizers to prevent volumes from being deleted while a snapshot is being taken. Should we use the same precaution to protect volumes that are being cloned? Or do we assume that the cloning is faster / easier and storage backend will sort it out somehow?

@msau42 @xing-yang, what do you think?

@xing-yang
Copy link
Contributor

Actually snapshotting is usually faster than cloning if no upload is needed. I think it is a good idea to add a finalizer to prevent the source volume from being deleted in cloning.

@jsafrane
Copy link
Contributor Author

Looking at the external provisioner, it's simple to add a finalizer to PVC that is being cloned.

But it's quite hard to remove it reliably. Here is a brief design:

  • During ProvisionExt(), the provisioner adds a finalizer to PVC before calling cloning CreateVolumeRequest.

  • CSI external-provisioner must watch PVCs via a new workqueue.

    • When PVC X has DeletionTimestamp && it has the finalizer:
      • If all PVCs that have X as its data source are not Pending, the finalizer is removed.
      • If there is a Pending PVC that has X as its data source (i.e. cloning may be in progress), the PVC is added back to the workqueue with exp. backoff, so the PVC is re-evaluated periodically, until it is cloned.
    • Therefore CSI external-provisioner must instantiate and run its own PV/PVC/StorageClass shared informers and pass them to provisioning library as options.
    • As side effect, if a PVC that has X as data source is wrong (has wrong storage class, size, ...), it effectively blocks X from being deleted, even though there is no cloning in progress.
      • Checking if cloning is in progress for a PVC is currently impossible, as the cloning may have been started by a previous provisioner instance that has just been restarted. Any in-memory cache will not survive this.
  • During ProvisionExt(), the provisioner clones a PVC even if the source PVC has DeletionTimestamp: the cloning operation may have been started when the source PVC was "ok" and the operation timed out. The provisioner must call CreateVolume again to get the final error code or provisioned volume. [This is actually current behavior, posting here for completeness].

@jsafrane
Copy link
Contributor Author

jsafrane commented Mar 9, 2020

/help

@k8s-ci-robot
Copy link
Contributor

@jsafrane:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Mar 9, 2020
@Danil-Grigorev
Copy link
Contributor

/assign

@jsafrane
Copy link
Contributor Author

/help cancel

@jsafrane jsafrane removed the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Mar 16, 2020
kbsonlong pushed a commit to kbsonlong/external-provisioner that referenced this issue Dec 29, 2023
…go_modules/google.golang.org/grpc-1.53.0

Bump google.golang.org/grpc from 1.52.3 to 1.53.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants