syncoid exclude snapshot types by name #153

Open
graham00 opened this issue Oct 20, 2017 · 38 comments

Comments

@graham00

graham00 commented Oct 20, 2017

I've made some changes to add a new command, --exclude-snaps, so that I can avoid replicating certain types of snapshots. For example:

syncoid --exclude-snaps="hourly"

This skips snapshots in getsnaps() matching that term, and it then loops syncdataset with -i from oldest matching snapshot to next-oldest non-excluded snapshot until it gets a return value indicating it reached the last snapshot. Seems to be cutting the amount of replicated data for most of my VMs over WAN down to less than 50% of what it was.
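The hop-by-hop idea described above can be sketched in shell. This is only an illustration, not the code from the fork: the function name `plan_incrementals` and the snapshot names in the usage note are made up, the term is assumed to be non-empty, and the first non-excluded snapshot on stdin is assumed to be the newest one that source and target have in common. It prints the planned `zfs send -i` hops instead of running them.

```shell
#!/bin/sh
# plan_incrementals TERM   (hypothetical helper, for illustration only)
# Reads snapshot names, oldest first and one per line, on stdin. Prints one
# "zfs send -i FROM TO" line per hop between consecutive snapshots whose
# snapshot part (after '@') does not contain TERM. Nothing is executed.
plan_incrementals() {
    term=$1
    prev=
    while IFS= read -r snap; do
        case $snap in
            *@*"$term"*)
                ;;  # excluded: the snapshot name contains TERM
            *)
                if [ -n "$prev" ]; then
                    printf 'zfs send -i %s %s\n' "$prev" "$snap"
                fi
                prev=$snap
                ;;
        esac
    done
}
```

Feeding it `pool/vm@daily_1`, `pool/vm@hourly_1`, `pool/vm@hourly_2`, `pool/vm@daily_2` with the term `hourly` yields a single hop from `daily_1` to `daily_2`, which is why the hourly data never crosses the WAN.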

I can make a pull request with the changes, but if you're not wanting to approach this issue from that direction, then I'll probably just add it in my rsync fork as a personal workaround for now. Thanks

Edit - I did end up adding support for excluding to my fork.

@sithson

sithson commented Nov 14, 2017

Hi Graham and jimsalterjrs.

I would also love to see the --exclude functionality as I don't need to replicate hourly snapshots to my backup site.

As of now, hourly snaps always replicate, then sanoid removes them every minute on the backend site.
Very annoying, and causes unnecessary stress on my backup disks.

Thanks for a great tool jimsalterjrs, and I hope you consider Graham's request =)
Cheers!

@phreaker0
Collaborator

@graham00 did you create a branch/pr anywhere with your changes?

@redmop
Contributor

redmop commented Nov 21, 2018

Does Sanoid handle not trying to delete snapshots that are cloned? That would be a handy addition.

@phreaker0
Collaborator

@redmop zfs destroy will error out in this case by default:

zfs destroy test/a@test
cannot destroy 'test/a@test': snapshot has dependent clones
use '-R' to destroy the following datasets:
test/clone

@redmop
Contributor

redmop commented Nov 21, 2018

So I'll receive this error every time sanoid runs until the snapshot is excluded or can actually be deleted. Maybe a patch to include an option to ignore this error?

@phreaker0
Collaborator

@redmop A patch to silence the error would be possible, but it would need to be a configurable option that defaults to off. In the normal use case, if you instruct sanoid to prune specific snapshots, you want to know if it failed.

@redmop
Contributor

redmop commented Nov 21, 2018

Default to off, of course.
I could always just rename the snapshot when I clone it too.

@phreaker0
Collaborator

@graham00 ping

@graham00
Author

graham00 commented Nov 22, 2018

@graham00 did you create a branch/pr anywhere with your changes?

@phreaker0 I did - see the link to my fork in my opening comment. The usage for excluding hourly snapshots is documented. If you're only after that, then I suggest you also add "--no-rsync" as is described above that section, so that my fork otherwise behaves like the original syncoid does. (btw - sorry I didn't catch your comment in September!)

Also - I was occasionally syncing my fork up with changes to the main syncoid, but I haven't for a while. My fork is currently based on syncoid-1.4.16, so any feature in normal syncoid beyond that won't work in my fork.

@phreaker0
Collaborator

@graham00 thx, just FYI syncoid now supports resumable replication.

@matveevandrey
Contributor

Would be happy to see this functionality too

@dbuades

dbuades commented Jan 27, 2019

+1 for this feature too.
Have you considered making a pull request with the changes in your fork, @graham00 ?

@jackmurray

+1, would love to see this

@amartin3225

+1

@Ratief

Ratief commented Mar 27, 2019

This is a very important feature for me as it would save a lot of wasted/useless data transfer.

@ryanjaeb

ryanjaeb commented Jun 5, 2019

I could also make use of this. I re-use the snapshot name @borgsnap to create a stable location for running Borg backups. When those snapshots end up on my syncoid targets, it causes future syncs to fail. For now it's easy enough to prune those snapshots before running syncoid, but it would be great if I could just exclude them in the first place via --exclude-snaps=borgsnap.

@redmop
Contributor

redmop commented Jun 5, 2019

@ryanjaeb Might I suggest using the daily/weekly/monthly snapshot for borg backups? That's how I'm doing it.

@phreaker0
Collaborator

@ryanjaeb I just list the latest snapshot and use it for borg backup.

@ryanjaeb

ryanjaeb commented Jun 5, 2019

@redmop @phreaker0 How do you ensure those snapshots aren't reaped during a long running Borg backup?

@redmop
Contributor

redmop commented Jun 5, 2019

That is why I use dailies. I keep them for 30 days. My frequents don't always last long enough to be backed up during the backup window.

@phreaker0
Collaborator

@ryanjaeb I don't need to, zfs destroy won't work if the snapshot is mounted:

$ sudo mount -t zfs test@test /mnt/tmp/
$ sudo zfs destroy test@test
cannot destroy snapshot test@test: dataset is busy

@pquan

pquan commented Jun 6, 2019

There's an example of how to exclude specific snapshots in a fork of syncoid here: https://github.com/graham00/syncoid-rsync
Unfortunately that fork is not maintained any more, but it's still useful, I hope.

Excluding hourly snapshots from remote backups saves a huge amount of data.

@phreaker0
Collaborator

@pquan it's drifted too far apart; syncoid handles many more (error) cases now. I'm planning to implement this but I don't know when :-)

@discostur

+1 would be really cool to exclude specific snapshots from syncing

@graham00
Author

Sorry for the delayed response! Life has been crazy lately.

@dbuades

Have you considered making a pull request with the changes in your fork, @graham00 ?

I haven't, since I discussed it with Jim beforehand and he wanted to wait out official resume support. Understandable, since what I did was a fairly ugly (uses a lot of shell tools on both ends) implementation...though of course it does work lol.

The other more minor changes (exceptions/etc) could be merged more easily - honestly, I'm just severely lacking time in life so I haven't had the time to. Anyone should feel free to do so though! The code for the non-resume stuff should be pretty easy to merge back in.

@pquan

fork is not maintained any more

I just updated it with a new security-related feature (--no-rollback) that I've been testing personally for a while offline, but yeah, updates are few and far between and the newer syncoid checks for more error conditions, as phreaker0 mentioned. Basically, I just hack quick changes into it now and then as I personally need something. Anyone should feel free to diff it out and add the code to the main branch if helpful though! Honestly, part of it is that I'm not particularly familiar with how to do pull/etc stuff on github - this branch and whatnot was the first time I've ever done anything beyond download from github.

@yanky83

yanky83 commented Aug 19, 2019

+1

@thorro

thorro commented Feb 9, 2020

+1

EDIT: As I needed the function ASAP, I just added the following (hackish) code to syncoid getsnaps() function, in each of the two foreach blocks.

            if ($line =~ /$fs\@__replicate.*/) {
                next;
            }

This will skip all snapshots starting with __replicate.

@phreaker0
Collaborator

@thorro but this will only work if __replicate is always the most recent snapshot or you are using the --no-stream option.

@mreymann

mreymann commented Oct 5, 2020

Any news on the --exclude-snaps feature? Would be nice to have!

@RubyFeinstein

Just opened a pull request with the required change, let me know if I missed anything:
#593

@segdy

segdy commented Mar 3, 2021

This looks like such a must-have for me that I wonder: Is there any reason this has not been added in nearly four years?

I came across this because I was wondering why my --exclude options were ignored.

I think it's a very obvious requirement, that you don't want to have frequent or hourly snapshots replicated to a backup server but do want to have daily/weekly/monthly replicated.

So now I have the issue that frequent+hourly snapshots land on my backup server (to which I replicate once a day) and totally clutter everything. Also when I manually delete them, they get re-transferred.

Is there any workaround whatsoever?

@graham00
Author

graham00 commented Mar 29, 2021

I've been updating my servers (and associated scripts) lately, and I felt motivated to update syncoid from the old modified version I was using so I could take advantage of all the nice new error checking and other features that everyone's added in recent years.

I still very much want to be able to exclude snapshots though, so I just ported my --exclude-snapshots code over into the current syncoid. It's located here:

https://github.com/graham00/sanoid/blob/master/syncoid

That link only contains modifications necessary for --exclude-snapshots to work - it doesn't have any of the other messy stuff I had done in the past to implement zfs resume before zfs itself had it, etc, since that's now built in to ZFS (and stable...) and supported by syncoid.

@phreaker0 or @jimsalterjrs - I tried to do a pull request, but I don't have rights to create branches, and github appears to not allow people to create more than one fork. I'd rather not delete my old fork, if for no other reason than just to leave all that stuff I cobbled together for the zfs resume there for reference for me in the future if I ever end up doing anything similar (splitting up streams with shell utilities, etc) with perl.

Anyway, since I can't do a pull request - if the above looks good, if you want to merge that in that'd be great. It's working for me, but I haven't tested every possible flag combination, and I'm sure it might have some issue. I think it's the most complete implementation of a "skip snapshots X/Y/Z" sort of feature without bunches of extra stuff kludged together like I had before, though, so it might be worth merging and just filing bugs for anything that's missing for others to implement in the future? Up to you all of course - just wanted to put it out there since I've seen people commenting here over the years wanting that feature in a more updated version of mainstream syncoid.

I updated the command summary at the bottom, but - for anyone wanting to run this, you basically just add something like:
--exclude-snaps=hourly,frequent
...it splits whatever you feed it by comma delimiters, and then compares each snapshot to be sent against each term with ($line =~ /$fs@.*$term/) and skips if any term matches.
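For anyone who wants to preview which snapshots a term list would drop before running a patched syncoid, the split-and-match behaviour can be approximated in plain shell. This is a rough sketch, not the Perl from the branch: `filter_snaps` is a made-up name, and it uses literal substring matching where the Perl test above is a regex match, so results may differ for terms containing regex metacharacters.

```shell
#!/bin/sh
# filter_snaps TERMS   (hypothetical helper, for illustration only)
# TERMS is a comma-separated list such as "hourly,frequent". Reads snapshot
# names on stdin and prints those whose snapshot part (after '@') contains
# none of the terms. Empty terms (e.g. from a trailing comma) are ignored.
filter_snaps() {
    terms=$1
    while IFS= read -r snap; do
        skip=0
        rest=$terms,
        while [ -n "$rest" ]; do
            term=${rest%%,*}   # text before the first comma
            rest=${rest#*,}    # text after the first comma
            [ -n "$term" ] || continue
            case $snap in
                *@*"$term"*) skip=1; break ;;
            esac
        done
        if [ "$skip" -eq 0 ]; then
            printf '%s\n' "$snap"
        fi
    done
}
```

On a real system the input would typically come from something like `zfs list -H -o name -t snapshot`, piped into `filter_snaps hourly,frequent`; what remains is the set of snapshots that would still be candidates for replication.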

@segdy

segdy commented Mar 29, 2021

@graham00 amazing!!
Would love to see this so much in syncoid :)

Hope @jimsalterjrs agrees too.

I’ll check out your code to test meanwhile

@phreaker0
Collaborator

@graham00 you don't need another fork to do a pull request. Just create a new branch in your fork based on upstream's master branch and apply your changes there. From this branch you can make a pull request against the original master branch.
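On the command line, that workflow might look roughly like the sketch below, run from inside a local clone of the fork. The remote name `upstream`, the helper name `new_pr_branch`, and the branch name in the usage note are arbitrary choices for illustration, and the upstream default branch is assumed to be `master` as mentioned above.

```shell
#!/bin/sh
# new_pr_branch UPSTREAM_URL BRANCH   (hypothetical helper)
# Run inside a clone of your fork. Creates BRANCH so that it starts out
# identical to upstream's master, regardless of what the fork's own
# branches contain.
new_pr_branch() {
    upstream_url=$1
    branch=$2
    # Register the original repository as a second remote (or update it).
    git remote add upstream "$upstream_url" 2>/dev/null ||
        git remote set-url upstream "$upstream_url"
    # Download upstream's current history.
    git fetch upstream
    # Start the new branch at upstream/master; --no-track keeps a later
    # "git push -u origin BRANCH" pointed at the fork, not at upstream.
    git checkout --no-track -b "$branch" upstream/master
}
```

After something like `new_pr_branch <upstream-url> exclude-snaps`, the remaining steps would be to edit only the `syncoid` file, commit, run `git push -u origin exclude-snaps`, and open the pull request from that branch, so the diff contains just the one change.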

@graham00
Author

graham00 commented Mar 31, 2021

@phreaker0 Uhhh, sorry, I'm completely lost. I did look around online earlier, but couldn't figure out any way to make this happen then and similar stackoverflow/etc questions didn't seem to be giving a "this is possible" vibe.

Okay, so I went to https://github.com/graham00/syncoid-rsync and went to "branches" (https://github.com/graham00/syncoid-rsync/branches). I then don't have any option to create a new branch... I have a "new pull request" option, but when I do that, it says it can't merge automatically, and gives me no other options. When I went ahead and did it, it seems to have submitted a pull request back on https://github.com/jimsalterjrs, wanting to just merge everything in my fork. .......I hate github >.>

Oh, and when doing the pull request, I tried to intentionally reverse the direction (to pull the master back to my fork on a new branch), but it wouldn't let me do that.

Anyway, I deleted that pull request because I'm obviously not trying to merge everything in my fork back in. *shrug*

@graham00
Author

Okay, never mind, I realized that there is apparently no way to create a new branch from the "branches" screen, but if you go to the front page and instead search for a branch that doesn't exist, then you will have the option to create it. Yikes github.......

Anyway, I did that, and then on that branch, deleted all but the single "syncoid" file and tried to do a pull request against that branch. Part of the pull request is then deleting all those other files from the main repository.....

How do I get that new branch exactly like the main repository, so that I can then modify only "syncoid" and submit the pull request?

@graham00
Author

Phew, ookay, I gave up and just wiped out my fork after archiving it offline, made a new fork, made the same changes back to "syncoid" again and then did a pull request back into the main repo:

#632

@amartin3225

@phreaker0 any updates on reviewing this PR? I am having difficulty with some large syncoid runs due to a lot of changing data in hourly snapshots, so being able to exclude those would be extremely useful. Thanks!
