Option to NOT run backup as soon as the replicationsource is applied to the cluster #627
Maybe implementing Point In Time Recovery would solve this problem too, if it can be implemented in VolSync. But I believe it requires the backup method to support incremental backups. I'm fine with having an option not to back up on creation, like this issue describes, if that's too hard to implement.
@onedr0p If you're looking for something you can do right now, you could take a look at `spec.paused = true` in the ReplicationSource spec. Something like:
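A minimal sketch of what that could look like, assuming a restic-based source (the PVC, schedule, and repository secret names here are illustrative):

```yaml
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: frigate-backup
spec:
  paused: true              # no mover pod will run while this is set
  sourcePVC: frigate-data   # illustrative PVC name
  trigger:
    schedule: "0 4 * * *"
  restic:
    repository: frigate-restic-secret
    copyMethod: Snapshot
```

Removing `paused: true` (or setting it to `false`) later lets the mover job run normally.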
What this will do is still create a job for the ReplicationSource backup, however the job will run with `parallelism = 0`, so no actual pod will run until you remove the `paused` setting. However, be aware that if you're using a copyMethod of Snapshot or Clone, the snapshot/clone will still be taken immediately, so when you do unpause later on, this first backup will still be of the (empty?) PVC from when you initially created the ReplicationSource.
@tesshuflower that isn't really all that ideal, because I would have to edit a ton of ReplicationSources in a GitOps repo and then revert them afterwards. I am curious about your thoughts on whether there could be a global option for the controller to handle this? For example, in the Helm values:

```yaml
# When a ReplicationSource is created, run a backup immediately
backupImmediately: false
manageCRDs: true
metrics:
  disableAuth: true
```

With that option set, I could always do a restic restore when needed.
It could be implemented as a global option that causes CRs without a status to get populated as though a sync had just completed. However, I'm not sure that all users on a cluster would necessarily agree about what the create-time behavior should be, which is a problem with a controller-wide option. There could also be problems where the schedule happens around the same time the object is created, leading to a delay that isn't long enough. I'm wondering if there's perhaps another way to accomplish the sequencing you want. Could you describe your setup a bit more?
I run a Kubernetes cluster at home with GitOps/FluxCD and sometimes find myself nuking the cluster and re-provisioning it. This means I use VolSync to back up and restore the data of my stateful applications when that happens. For example, I use Frigate for my NVR, and it has a sqlite database I need backed up. In that app's folder I have a volsync.yaml containing the ReplicationSource. Now let's say I nuke my cluster: when I go back to re-provision it, Flux will deploy Frigate and also apply that ReplicationSource, which starts backing up right away. On the restore side, I have a pretty wild Taskfile of scripts to handle pulling the data back down.
@JohnStrunk It would be kind of neat to do this the way cloudnative-pg handles it with an `immediate` option on their scheduled backups: https://cloudnative-pg.io/documentation/1.19/backup_recovery/#scheduled-backups I can try to PR this if you are happy with that solution; any pointers on implementation details you would like to see would be welcome. Thanks!
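For reference, the cloudnative-pg pattern looks roughly like this, based on their ScheduledBackup documentation (resource and cluster names here are illustrative):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: nightly-backup
spec:
  schedule: "0 0 4 * * *"  # CNPG uses a 6-field (seconds-first) cron spec
  immediate: false         # don't take a backup as soon as this resource is created
  cluster:
    name: pg-cluster
```

The analogous idea for VolSync would be a per-CR flag controlling whether the first sync runs at creation time or waits for the first scheduled trigger.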
@onedr0p I was discussing this with @JohnStrunk and he had an idea that might give you a more robust solution, assuming we understand your use-case correctly. If we simply skip the 1st sync on creation, we still have the issue that, depending on when you deploy all your CRs, it might still try to do a backup (before your app has started) if the cron schedule happens to fall around that time.

We had thought you were perhaps creating a ReplicationDestination in direct mode to your PVC to restore your data at the same time as creating a ReplicationSource with that same PVC as the sourcePVC. If this is the case, there are some potential options with the new volume populator feature that could help and avoid the timing issue I mentioned above.

However, re-reading the above: are you hoping simply to skip the 1st sync while your app deploys for the first time?
Generally, for me that would likely never happen. I back up at night/early morning, when I would never be redeploying my cluster.
Yes, that would be ideal. I've seen other backup tools offer this option to skip the first backup and then only run on schedule, so it would be nice to have an option for this here too.
@onedr0p I guess we were assuming you needed to delay the 1st sync because you were putting down all the YAMLs at once, including a ReplicationDestination that would restore a PVC at the same time as a ReplicationSource that would start the backups of that PVC. Is that not actually necessary in your setup? That is, you are looking to deploy a new, empty PVC and app, and just want the backups to start, but not until the 1st scheduled sync (when presumably your app is up and has written its initial data)?
I described my full cluster restore process in this comment (with the downsides and why this feature would be nice to have), so I am unsure if you need me to put this into better words, but I'll try. My current process for full cluster backup/restore:
The downside to this process is that during step (3), as soon as the ReplicationSource is applied, a backup is taken immediately. Maybe we're having a hard time communicating this over text? I am down for a quick voice chat to get anything cleared up if needed.
Your use case seems like something we should support, and I'd like to be able to do it in a robust way w/o extensive scripting on the user's part. Would the following help? If you use the volume populator that was recently added, during cluster restore you could:
This should cause:
The downside is that you'll get an initial backup that is identical to the one you just restored. However, I don't think the above will require external sequencing (or concern over the timing relative to the cronspec).
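The restore side of the approach described above could be sketched as a PVC that references a ReplicationDestination directly via the VolSync volume populator (the PVC name, size, and ReplicationDestination name here are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: frigate-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
  dataSourceRef:
    kind: ReplicationDestination
    apiGroup: volsync.backube
    name: frigate-restore   # an existing ReplicationDestination in the same namespace
```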
I'm definitely interested in giving the volume populator method a try, although I'm curious how that will work with StatefulSets and volume claim templates. Do you think it should also cover that use case?
I'll try to explain it a bit - I'm supposed to be writing documentation for the volume populator, so that should be coming soon - for now, here's a high-level overview: the way the volume populator works, you can create a PVC with a ReplicationDestination as its `dataSourceRef`.
Like John mentioned, this does mean the 1st backup will probably still contain the same contents as what was just restored. When it comes to StatefulSets, you may just need to be careful to name your PVC the name the StatefulSet is expecting, but startup of StatefulSets should re-use existing PVCs, so I don't think it should be an issue.
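On the StatefulSet naming point: the PVC name has to match what `volumeClaimTemplates` derives, which is `<template-name>-<statefulset-name>-<ordinal>`. A sketch of pre-creating the PVC for ordinal 0 (all names illustrative):

```yaml
# A StatefulSet named "frigate" with a volumeClaimTemplate named "data"
# expects PVCs named data-frigate-0, data-frigate-1, ...
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-frigate-0   # pre-created so ordinal 0 re-uses the restored data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
  dataSourceRef:
    kind: ReplicationDestination
    apiGroup: volsync.backube
    name: frigate-restore
```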
@tesshuflower that makes sense; I suppose this issue can wait until that feature is released and documented. I'll circle back then and give the new method a shot instead of my hacky scripts. Thanks for taking the time to explain all that, I look forward to trying it out.
@onedr0p I have a PR up with my 1st pass at documenting the volume populator if you want to take a look: https://volsync--833.org.readthedocs.build/en/833/ If you have comments/suggestions about the content itself, feel free to add them directly in the PR: #833
Overall it looks good, but I am not sure how this helps out when you completely nuke your cluster and then want to re-provision it.
@onedr0p on a re-provisioned cluster it should be something like this:
Now the PVC will remain in a Pending state until the ReplicationDestination has finished and has created a latestImage (i.e., pulled the data down from your restic repo and written a VolumeSnapshot). Once the ReplicationDestination is done, the PVC will be populated with the contents of that VolumeSnapshot and become ready to use, at which point the pod for the app that's trying to mount the PVC can start.
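That flow could be sketched with two manifests applied together on the fresh cluster (the names, repository secret, and sizes here are illustrative, and this assumes a restic-based setup):

```yaml
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: frigate-restore
spec:
  trigger:
    manual: restore-once          # run a single restore iteration
  restic:
    repository: frigate-restic-secret
    copyMethod: Snapshot          # produces the VolumeSnapshot used as latestImage
    capacity: 10Gi
    accessModes: ["ReadWriteOnce"]
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: frigate-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
  dataSourceRef:                  # the VolSync volume populator holds this Pending
    kind: ReplicationDestination  # until the restore above completes
    apiGroup: volsync.backube
    name: frigate-restore
```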
That sounds awesome, I'll be sure to give it a shot once the new version of VolSync is released!
Volume Populator is indeed awesome, and it really simplifies bootstrapping a cluster after a disaster. Thanks ❤️ While it's low priority, I think having the feature described here would still be nice, because once the Volume Populator runs and the app comes online, a backup is made immediately, so there's still a needless backup taken.
Understood - I think the main hesitation at the moment to implement something like this is that we essentially don't want to break the existing scheduling behavior with changes, and there's still no guarantee you won't make this backup earlier than you'd like, depending on your schedule.
Describe the feature you'd like to have.
Hi 👋🏼
Please add a configuration option to not have the backups run when the replicationsource is applied to the cluster.
What is the value to the end user? (why is it a priority?)
When bootstrapping a new cluster using GitOps the replicationsource is applied and will start backing up data right away. This isn't really ideal since I want to recover from a previous backup.
How will we know we have a good solution? (acceptance criteria)
An option is added to prevent backups being taken as soon as a replicationsource is applied to the cluster.
Additional context