Data lost by k3s-uninstall.sh #3264
My proposal would be to:
If needed I could propose a PR.
Signed-off-by: angelnu <git@angelnucom>
@bradtopol - thanks for merging! Would you consider a backport of this fix to 1.20? I would be happy to trigger a PR there.
@angelnu install.sh is only served off master, and is live as soon as merged - so there's no point in backporting it. You will need to re-run the installer to get the updated uninstall script though.
I see - I did a test to see if I was getting the fix, but it did not work for me. The reason is that I install with Ansible, and it turns out that project keeps a derived copy of install.sh at https://github.com/PyratLabs/ansible-role-k3s/blob/main/templates/k3s-killall.sh.j2 I will do a PR to commit the fix there as well. Thanks! Update: opened PyratLabs/ansible-role-k3s#113
Fixed upstream - see k3s-io/k3s#3264
@angelnu Following the steps to reproduce, I am seeing that mounts in /var/lib/kubelet are not unmounted after running k3s-uninstall.
During uninstall, umount reports that the target is busy.
@ShylajaDevadiga - could you please check what process is keeping the mount busy? I tested with Ceph, and there, after killing all the pods (done a few lines earlier in killall), the unmount works. Maybe the volumedevices plugin requires additional cleanup. And for confirmation - did the killall abort when hitting the busy error? This should prevent the unexpected delete if the unmount fails.
@angelnu yes by deleting the pod that uses the pvc, umount is successful.
Without deleting the pod here is the fuser results if it helps.
If the umount fails then at least some files are still in use. If it is a process within the container, it should be killed when the pod is deleted within k3s-killall.sh. This is why my suggestion is to check, after the umount fails, which process is keeping the mount busy with lsof. I suspect that your CSI is starting a process outside the pod that keeps the mount busy, and that it is not killed by k3s-killall.sh. Handling for those processes would need to be added if this gets confirmed. When you cleanly delete the container, the CSI does the unmount.
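As a sketch of the check suggested above, the following snippet lists the processes holding a mount busy. The mount path is a placeholder to substitute with whatever path reported "target is busy"; the fuser/lsof usage is a generic diagnostic example, not anything taken from the k3s scripts.

```shell
#!/bin/sh
# Placeholder: replace with the mount point that umount reported as busy.
MOUNTPOINT="/var/lib/kubelet/plugins/kubernetes.io/csi/pv"

# fuser -m lists PIDs with files open on the filesystem containing the path;
# -v adds user, command name, and access type.
fuser -vm "$MOUNTPOINT" 2>&1 || true

# lsof +D recursively scans the directory and lists open file handles under it.
lsof +D "$MOUNTPOINT" 2>/dev/null || true
```

If the PIDs shown belong to processes outside any pod (e.g. a CSI helper daemon), those are the candidates that k3s-killall.sh would additionally need to terminate.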
@angelnu I had used hostpath in the earlier scenario. After internal discussion we decided to use the Longhorn CSI. Validated the fix on k3s version v1.21.1+k3s1; umount was successful.
Environmental Info:
K3s Version:
Node(s) CPU architecture, OS, and Version:
Linux test-k3s2 5.4.0-72-generic #80-Ubuntu SMP Mon Apr 12 17:35:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration:
3 servers
Describe the bug:
The k3s-killall.sh script does not unmount all folders under /var/lib/kubelet. Specifically, it does not unmount the CSI Ceph mount points, which are placed under /var/lib/kubelet/plugins/kubernetes.io/csi/pv. As a result, k3s-uninstall deletes their content later on when it does rm -rf /var/lib/kubelet.
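The direction of the fix discussed in this issue can be sketched as a shell fragment: unmount everything beneath /var/lib/kubelet before any rm -rf, and refuse to delete a directory whose unmount failed. This is an illustrative sketch, not the actual k3s-killall.sh patch; the /proc/self/mounts parsing and the exact paths are assumptions about how such a guard could look.

```shell
#!/bin/sh
# Sketch: unmount all mounts under a prefix before deleting it.
do_unmount() {
    # Field 2 of /proc/self/mounts is the mount point; match the prefix,
    # sort in reverse so child mounts are unmounted before their parents.
    awk -v p="$1" '$2 ~ ("^" p) {print $2}' /proc/self/mounts | sort -r | \
    while read -r m; do
        umount "$m" || {
            echo "umount of $m failed, refusing to delete its content" >&2
            exit 1
        }
    done
}

for d in /var/lib/kubelet/pods /var/lib/kubelet/plugins; do
    do_unmount "$d" || echo "leaving $d in place, unmount incomplete" >&2
done
```

Keying the rm -rf on the unmount succeeding is what prevents the data loss described above: a busy Ceph mount stays mounted, and its backing volume content is never touched.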
Steps To Reproduce:
k3s-uninstall.sh
Expected behavior:
Actual behavior:
Ceph volume content lost
Additional context / logs:
NA