feat(zfspv): handling unmounted volume #76
Merged
Conversation
There can be cases where the openebs namespace has been accidentally deleted (Optoro case: https://mdap.zendesk.com/agent/tickets/963). There the driver attempted to destroy the dataset, which first unmounts the dataset and then tries to destroy it; the destroy fails because the volume is busy. Here, as mentioned in the steps to recover, we have to manually mount the dataset:

```
6. The driver might have attempted to destroy the volume before going down,
   which sets mounted to "no" (strange behavior observed on GKE Ubuntu 18.04),
   so we have to mount the dataset. Go to each node and check if there is any
   unmounted volume:

       zfs get mounted

   If there is any unmounted dataset with this option as "no", we should do
   the below:

       mountpath=$(zfs get -Hp -o value mountpoint <dataset name>)
       zfs set mountpoint=none <dataset name>
       zfs set mountpoint=$mountpath <dataset name>

   This will get the dataset mounted.
```

So in this case the volume will be unmounted while the mountpoint property is still set to the mount path. If the application pod is deleted later on, it will try to mount the ZFS dataset. Here just setting the `mountpoint` property is not sufficient: since we have unmounted the ZFS dataset (via zfs destroy in this case), we have to explicitly mount the dataset, **otherwise the application will start running without any persistent storage**.

This PR automates the manual steps performed to resolve the problem: the code now checks whether the ZFS dataset is mounted after setting the mountpoint property and, if it is not, attempts to mount it. This is not an issue for zvols, as the driver does not attempt to unmount them, so zvols are fine.

Also, the NodeUnpublish operation MUST be idempotent: if this RPC failed, or the CO does not know whether it failed, it can choose to call NodeUnpublish again. This is handled by returning success if the volume is not mounted. Descriptive error messages have also been added at a few places.

Signed-off-by: Pawan <pawan@mayadata.io>
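Below is a minimal sketch of the two behaviors described above, assuming the driver shells out to the zfs CLI. The helper names (`mountedProperty`, `ensureMounted`, `umountVolume`, `isMountPoint`) are hypothetical illustrations, not the actual zfs-localpv functions:

```go
package zfs

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// mountedProperty returns the value of the dataset's read-only
// "mounted" property: "yes", "no", or "-" for zvols.
func mountedProperty(dataset string) (string, error) {
	out, err := exec.Command(
		"zfs", "get", "-Hp", "-o", "value", "mounted", dataset,
	).Output()
	if err != nil {
		return "", fmt.Errorf("could not read mounted property of %s: %v", dataset, err)
	}
	return strings.TrimSpace(string(out)), nil
}

// ensureMounted sets the mountpoint property and, if the dataset is
// still unmounted afterwards, mounts it explicitly so the application
// does not end up running without persistent storage.
func ensureMounted(dataset, mountpath string) error {
	if err := exec.Command("zfs", "set", "mountpoint="+mountpath, dataset).Run(); err != nil {
		return fmt.Errorf("could not set mountpoint of %s: %v", dataset, err)
	}
	mounted, err := mountedProperty(dataset)
	if err != nil {
		return err
	}
	if mounted == "no" {
		// Setting mountpoint alone does not remount a dataset that was
		// unmounted by a failed destroy; mount it explicitly.
		if err := exec.Command("zfs", "mount", dataset).Run(); err != nil {
			return fmt.Errorf("could not mount %s: %v", dataset, err)
		}
	}
	return nil
}

// umountVolume is idempotent: a retried NodeUnpublish on an already
// unmounted target path returns success instead of an error.
func umountVolume(target string) error {
	mounted, err := isMountPoint(target)
	if err != nil {
		return err
	}
	if !mounted {
		return nil // already unmounted, nothing to do
	}
	return exec.Command("umount", target).Run()
}

// isMountPoint reports whether target appears in /proc/mounts.
func isMountPoint(target string) (bool, error) {
	data, err := os.ReadFile("/proc/mounts")
	if err != nil {
		return false, err
	}
	for _, line := range strings.Split(string(data), "\n") {
		fields := strings.Fields(line)
		if len(fields) >= 2 && fields[1] == target {
			return true, nil
		}
	}
	return false, nil
}
```

Note that checking the `mounted` property also keeps zvols out of the mount path, since they report "-" rather than "no".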
Codecov Report
```
@@           Coverage Diff           @@
##           master      #76   +/-  ##
=======================================
  Coverage   23.57%   23.57%
=======================================
  Files          14       14
  Lines         475      475
=======================================
  Hits          112      112
  Misses        362      362
  Partials        1        1
```
Continue to review the full report at Codecov.
kmova reviewed Apr 9, 2020
kmova approved these changes Apr 9, 2020
automation moved this from RC2 - Due Apr 10, 2020 to Done in 1.9 Release Tracker - Due Apr 15th on Apr 9, 2020
There can be cases where the openebs namespace has been accidentally deleted. If that happens, deletion of the volume CRs will be triggered, and volume CR deletion triggers the dataset deletion process. Prior to the actual deletion of the data, the driver will first unmount the dataset and then attempt to destroy it.
But since the volume is actively consumed by a pod, the destroy will fail, as the volume is busy. The zvols continue to operate; with datasets, however, the dataset gets unmounted (setting mounted="no") and the subsequent delete fails.
Now, there are two actions that the user can take:
(a) Continue the clean-up, so that the setup can be re-created.
(b) Reinstall openebs and try to create volumes that point to the underlying datasets.
In case (a), the user will have to delete the application pods and then start deleting the PVCs. However, the PV deletion will fail, because the clean-up steps do not find the volume mounted and are aborted.
Let us assume case (b): prior to actually reinstalling, since the volumes are still intact, applications are expected to continue to access the data.
However, if a node restart occurs, the pods will not be able to access the volume, because the mounted property is set to "no". The data stored by the pod will not persist, as it is not backed by persistent storage.
To recover from the partial clean-up, the following needs to be done on each of the nodes:
zfs get mounted
Check if there is any unmounted dataset with this property set to "no". If there is, mount it:
zfs mount <dataset name>
The above commands will result in mounting the dataset.
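For reference, here is a small self-contained sketch that automates these per-node recovery steps by shelling out to the zfs CLI (illustrative only, not code from this PR):

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

func main() {
	// List every dataset on the node.
	out, err := exec.Command("zfs", "list", "-H", "-o", "name").Output()
	if err != nil {
		panic(err)
	}
	for _, ds := range strings.Fields(string(out)) {
		// Read the "mounted" property; zvols report "-" and are skipped.
		val, err := exec.Command("zfs", "get", "-Hp", "-o", "value", "mounted", ds).Output()
		if err != nil {
			panic(err)
		}
		if strings.TrimSpace(string(val)) == "no" {
			// Mount any dataset left unmounted by the partial clean-up.
			fmt.Println("mounting", ds)
			if err := exec.Command("zfs", "mount", ds).Run(); err != nil {
				panic(err)
			}
		}
	}
}
```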
Here in this PR:
- If the ZFS dataset is found unmounted after setting the mountpoint property, the driver explicitly mounts it.
- NodeUnpublish is made idempotent: it returns success if the volume is already unmounted.
- Descriptive error messages have been added at a few places.
Signed-off-by: Pawan <pawan@mayadata.io>