Cannot do two etcd restores in a row on the same host #353
Comments
This was encountered in k3s, but I have moved the issue to rke2 because it should have the same problem here and I want it in the GA milestone.
cc @rancher-max - this is the issue I was chatting with you about while prepping for the demo
I thought that we had done this on purpose to prevent users from accidentally putting the restore command in their systemd unit and wedging it into a restore loop. I think we shouldn't change this behavior, but maybe add an additional message instructing the user to delete the etcd-old directory if they really do intend to restore again. Or do you think we should delete the directory automatically?
I understand protecting against multiple accidental restores, but there needs to be some way to do more than one restore. Telling the user in logs to delete a random directory is not a great experience. Restore is now tied to …
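For anyone hitting this in the meantime, the manual workaround being discussed amounts to clearing that guard directory before attempting a second restore. A minimal sketch, assuming the default k3s data directory (adjust the path if the server was started with a custom --data-dir):

```bash
# The first restore leaves this directory behind; its presence is what
# blocks the second restore attempt.
OLD_DB_DIR=/var/lib/rancher/k3s/server/db/etcd-old

# Rename it out of the way (safer than deleting outright) so the
# restore check passes again.
if [ -d "$OLD_DB_DIR" ]; then
  sudo mv "$OLD_DB_DIR" "${OLD_DB_DIR}.$(date +%s)"
fi
```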
Validated using a k3s commit id (k3s version v1.19.2+k3s-c3c98319) as well as rke2 beta23. Following the above instructions, restoring multiple times from the snapshots was successful.
Environmental Info:
K3s Version:
1.19.1+k3s1 (actually it was a late RC that I used in a demo)
Node(s) CPU architecture, OS, and Version:
A medium/average-sized DigitalOcean droplet running Ubuntu 20.04
Cluster Configuration:
3 masters using embedded etcd (though it should repro with just 1 master)
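For reference, a sketch of how a comparable 3-server embedded-etcd cluster might be brought up with the standard k3s install script; the version, token, and IP below are placeholders, not values from the original report:

```bash
# First server: initialize the embedded etcd cluster
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.19.1+k3s1" \
  sh -s - server --cluster-init

# Remaining servers: join the first one using its node token
# (found at /var/lib/rancher/k3s/server/node-token on the first server)
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.19.1+k3s1" \
  K3S_TOKEN="<token-from-first-server>" \
  sh -s - server --server https://<first-server-ip>:6443
```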
Describe the bug:
If you try to do two etcd restores from the same host, the second one will fail.
We apparently have a check that looks for
/var/lib/rancher/k3s/server/db/etcd-old/
and refuses to do the restore if that directory is there, thinking that the db has "already" been restored.

Steps To Reproduce:
Warning: these steps are mostly from memory, so may not be perfect
2a. stop k3s
2b. perform the restore just using the k3s binary (see the command sketch after these steps):
2c. start k3s back up
2d. Observe cluster come back up successfully
Repeat steps 2a–2d to perform a second restore. This will break.
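A rough rendering of those steps as shell commands, assuming a default systemd install and an existing snapshot under the default snapshot directory; the snapshot filename is a placeholder, and the restore is done with the --cluster-reset / --cluster-reset-restore-path flags used for embedded etcd restores:

```bash
# 2a. stop k3s
sudo systemctl stop k3s

# 2b. perform the restore using only the k3s binary,
#     pointing at an existing snapshot file
sudo k3s server \
  --cluster-reset \
  --cluster-reset-restore-path=/var/lib/rancher/k3s/server/db/snapshots/<snapshot-name>

# 2c. start k3s back up
sudo systemctl start k3s

# 2d. observe the cluster come back up
sudo k3s kubectl get nodes

# Repeating 2a-2d a second time is where it breaks: the restore is refused
# because /var/lib/rancher/k3s/server/db/etcd-old/ already exists.
```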
Expected behavior:
Cluster should restore cleanly the second time
Actual behavior:
Cluster fails to restore the second time: the check described above rejects the restore because /var/lib/rancher/k3s/server/db/etcd-old/ already exists.