Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --include-snaps and --exclude-snaps options to syncoid #699

Merged
merged 11 commits into from
Jan 13, 2024

Conversation

mr-vinn
Copy link
Contributor

@mr-vinn mr-vinn commented Nov 30, 2021

This PR adds new options to syncoid that filter out snapshots from the source and prevent their transfer to the destination. I added an --exclude-snaps option and an --include-snaps option; both accept a regex argument and both can be specified multiple times. If both options are used in the same syncoid call, then snapshots matching both patterns will be excluded.

In contrast to PR #632, using --exclude-snaps or --include-snaps does not imply --no-stream. That lets you start with a set of source snapshots like this:

  • daily1
  • hourly1
  • hourly2
  • daily2
  • hourly3
  • hourly4
  • daily3

And end up with these snaps on the destination by using --exclude-snaps='hourly':

  • daily1
  • daily2
  • daily3

In other words, using one of the new options causes syncoid to simulate a -I sync by doing a series of -i syncs between intermediate snapshots that are not filtered.

A note on the print statement refactoring: I wanted to remove the many if (debug) and if (!quiet) statements, but it does make this diff a lot bigger. Maintainers, please let me know if you'd like me to rebase this without commit 79aaf45.

@mr-vinn
Copy link
Contributor Author

mr-vinn commented Nov 30, 2021

This addresses issues #153 and #250.

@mr-vinn
Copy link
Contributor Author

mr-vinn commented Dec 1, 2021

I realized that my first push had some changes in the wrong commit, so I rebased to fix that.

@DarwinAwardWinner
Copy link

One thing you should clarify in the docs is whether the regex is matched against only the snapshot name or the entire string of dataset@snapshot. It might also be good to add similar documentation to the --exclude option since there's now more than one.

@mr-vinn
Copy link
Contributor Author

mr-vinn commented Dec 3, 2021

@DarwinAwardWinner good point about the potential confusion between --exclude and the new options. I pushed another commit that adds a new --exclude-datasets option to replace --exclude, and prints out a deprecation warning when --exclude is used. I also added more to the README that hopefully clears up what the regex patterns are matched against.

@DarwinAwardWinner
Copy link

Note that you have a typo in a few places: rexeg instead of regex.

Other than that, it looks good!

@mr-vinn
Copy link
Contributor Author

mr-vinn commented Dec 3, 2021

Oops, thanks for catching that typo @DarwinAwardWinner. I edited the last commit to fix that.

@discostur
Copy link

Would love to see this getting merged ...

@comrade-meowski
Copy link

I've been testing this PR heavily on multiple ZFS hosts for the last 3 or 4 weeks. The new exclude functionality adds the only feature I was missing from sanoid/syncoid.

Good work @mr-vinn

@DarwinAwardWinner
Copy link

I've also been using this branch on my systems as well.

@discostur
Copy link

discostur commented Jan 8, 2022 via email

@sienar1
Copy link

sienar1 commented Mar 11, 2022

This would be a fantastic feature to have. Any chance this is getting merged soon?

@Ratief
Copy link

Ratief commented Dec 22, 2022

This is the number 1 feature that syncoid is missing.

Replace `print` and `warn` statements with a logging function.
Build the zfs send and receive commands in a new subroutine, and
implement other subroutines that can be called instead of building a zfs
command and running it with system();
Process snapshots in one pass rather than looping separately for both
guid and create time.
Add --include-snaps and --exclude-snaps options to filter the snapshots
that syncoid uses.
 Update the README with the new --include-snaps and --exclude-snaps
 syncoid options.
Test the new --include-snaps and --exclude-snaps options for syncoid.
Add a new option, --exclude-datasets, to replace --exclude. This makes
the naming more consistent now that there are options to filter both
snapshots and datasets.

Also add more information to the README about the distinction between
--exclude-datasets and --(in|ex)clude-snaps.
@amartin3225
Copy link

Any updates on when this could get merged? I am also very excited about this functionality

@fmoledina
Copy link

I just started using this branch for a couple datasets where I'm storing a lot of frequently snapshots on the host, but don't need them on the backup target. It's really great!

I'd like to suggest including some indication of the $sourcefs during the send/recv. One dataset I backup has a lot of child datasets and it's never clear to me which dataset is currently being operated on. Possible locations include having $sourcefs in either the a top-level output per dataset or with each incremental sync. Thanks.

@phreaker0 phreaker0 added enhancement revision needed Changes are required for this PR to be merged labels Mar 21, 2023
@phreaker0
Copy link
Collaborator

@mr-vinn i rebased your PR to the latest master to resolve conflicts. This looks really great and is quite a huge change but the restructuring looks really clean.
Unfortunately some tests fail.

($stdout, $exit) = tee_stdout {
system("$synccmd")
};
($exit, $stdout) = syncincremental($sourcehost, $sourcefs, $targethost, $targetfs, $matchingsnap, $newsyncsnap, defined($args{'no-stream'}));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here stdout doesn't seems to include the output needed which is later parsed for "contains partially-complete state" and "estination already exists" which likely lead to the test failure.
in the actual replication function capturing stdout or stderr is decided by the pv size which I don't quite get but I didn't looked to close. Can you look into it?

@amartin3225
Copy link

@mr-vinn any updates on this?

@phreaker0
Copy link
Collaborator

I guess I'll take over here because the original author is inactive and this is a highly requested feature.
I rebased the PR to the latest master to resolve conflicts.
I also found the refactoring error which lead to replication failures in some cases, all tests run through now so this looks ready for prime time now.

@jimsalterjrs jimsalterjrs merged commit 680bf23 into jimsalterjrs:master Jan 13, 2024
@jimsalterjrs
Copy link
Owner

Thanks @phreaker0 !

@phreaker0
Copy link
Collaborator

I hope it gets some broader testing in the master. I will roll it out over my systems in the next time to make sure there is no breakage.

@jimsalterjrs
Copy link
Owner

I hope it gets some broader testing in the master. I will roll it out over my systems in the next time to make sure there is no breakage.

At the very least, I end up testing the master branch myself--whenever I deploy a new system, I use master, not the latest release. Mostly for this exact reason. :)

@comrade-meowski
Copy link

Finally! Thanks for merging this at last. I'm already building+deploying master to some dev systems where they will be heavily tested - a mix of Ubuntu and Arch systems, x86_64 and aarch64. I shall report back.

@comrade-meowski
Copy link

This is long since closed but nonetheless after two weeks I can report back that testing has gone flawlessly. Thanks again for pushing this over the finish line at last.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement revision needed Changes are required for this PR to be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants