Config questions, and taking over existing replicas #337
Supplementary question: if you want sinks with different root_fs for different clients, do you have to create separate sink jobs listening on different ports? Or can you have multiple sink jobs listening on the same port, with the client identity used to route each connection to the correct sink job?
I renamed all the existing destination snapshots to new names. At first it didn't appear they could be taken over. However, I did a "zfs rollback" on each filesystem, and after that zrepl was happy, performing incremental updates on top of what was already there. Success!

It would be nice to rename the sanoid snapshots to match the zrepl naming scheme. The date_time part is obvious, but I don't know what the _000 suffix represents. Still, this isn't important: I can just let the sanoid snapshots age out and then delete them manually. Actually, I will just continue to run sanoid with

Many thanks for releasing such an excellent tool!

(Aside: whilst syncoid does work, I have a requirement for replicating huge filesystems over TCP without ssh. Syncoid only works over ssh, so I had to compile ssh-hpn with crypto disabled to get decent performance - that was painful.)
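For anyone following the same path, the takeover steps on the destination looked roughly like this (a sketch only; the dataset and snapshot names are made up, and the rollback target has to be the newest snapshot both sides still have in common):

```sh
# create the parent hierarchy the sink expects (root_fs/client_identity/<sender path>)
zfs create -p storage1/backup/nuc1/zfs/lxd/containers

# move the existing replica into place ("web1" is a hypothetical container name)
zfs rename storage1/backup/web1 storage1/backup/nuc1/zfs/lxd/containers/web1

# discard anything newer than the most recent snapshot both sides still share,
# so the next incremental send has a common base
zfs rollback -r storage1/backup/nuc1/zfs/lxd/containers/web1@autosnap_2020-05-01_00:00:01_daily
```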
Hi,
zrepl's pruning grid works on the snapshots' creation times as tracked by the snapshot's creation ZFS property.
Out of curiosity: does your CPU have AES instructions (aesni)? Have you tried running vanilla openssh (whatever your distro offers by default) but forcing AES cipher suites, especially AES-CTR if your openssh version allows that mode? Many people seem to try regular openssh for [...]

After some dirty benchmarking on my Haswell desktop i7-4790 (avx2!), aes128-gcm@openssh.com seems to be faster than aes128-ctr on my machine, which was a little surprising to me; YMMV. Results on my server were similar but somewhat unsteady.
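(For reference, the kind of quick-and-dirty throughput test I mean - the host name and cipher list below are just examples:)

```sh
# push a fixed amount of zeroes through ssh once per cipher and compare
# the throughput that dd reports at the end of each run
for cipher in aes128-ctr aes128-gcm@openssh.com chacha20-poly1305@openssh.com; do
  echo "== $cipher =="
  dd if=/dev/zero bs=1M count=1024 | ssh -c "$cipher" backuphost 'cat > /dev/null'
done
```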
That's very cool - I will test now. I hadn't realised that zrepl would be able to do time-based pruning of any snapshot, not just ones created by zrepl - and that it doesn't rely on the format of the snapshot name. I think that's worth saying explicitly in the prune page.
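For reference, a receiver-side rule of that shape would, if I'm reading the pruning docs right, look roughly like this (the grid spec and regex here are only illustrative):

```yaml
pruning:
  keep_sender:
    - type: not_replicated
    - type: last_n
      count: 10
  keep_receiver:
    - type: grid
      grid: 1x1h(keep=all) | 24x1h | 30x1d | 6x30d
      # match any snapshot name, i.e. sanoid-created snapshots as well as zrepl's
      regex: ".*"
```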
Thank you for the suggestion, I hadn't tried changing ciphers. I went back to a couple of these machines - they are Xeon Bronze 3104 (6 cores/12 threads, 1.7GHz), and the "aes" flag does show up in /proc/cpuinfo.

Trying your benchmark without any cipher selection: that was the sort of bottleneck I was seeing. Trying your sweep across ciphers: aes128 is better, but still slower than the disk arrays can generate. ssh is single-threaded, and these Xeon Bronze processors have a low clock speed per core; your older i7-4790 has a base speed of 3.6GHz and turbo of 4.0GHz - more than twice as fast.

It would be interesting to see what zrepl's tls transport is capable of - although I don't see any settings for selecting the cipher. I suppose it will pick the crypto/tls library's preferred default?
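For anyone wanting to try the same comparison, the TLS sink transport seems to be configured roughly like this, going by the transports docs (the paths, port and client CNs below are placeholders):

```yaml
jobs:
  - name: sink_tls
    type: sink
    root_fs: storage1/backup
    serve:
      type: tls
      listen: ":8888"
      ca: /etc/zrepl/ca.crt
      cert: /etc/zrepl/backuphost.crt
      key: /etc/zrepl/backuphost.key
      client_cns:
        - "nuc1"
        - "nuc2"
```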
@InsanePrawn thanks for jumping in so quickly!
Job names are not currently (and hopefully never) transmitted over the wire. Maybe at some point we are going to assign GUIDs to jobs and transfer those, but not ATM.
Thanks for your experiment, I imagine this thread could be quite insightful to other syncoid users who are considering switching.
Yes, I saw you comment on #253, I'll continue that discussion there.
Yeah, that's sometimes necessary, I have not fully figured out why.
Thank you very much, compliments like this make the whole thing worthwhile :)
Would you mind opening up a small doc PR for this as well? Fresh eyes are usually the best doc writers ;)

Meta
I have considered setting up Discourse a few times, but I guess unless we have some SSO options, the bar of creating a new account is quite high.
Thank you. I'll see if I can find time for PRs, though it won't be this week. There was one other question I had. Suppose you have multiple clients pushing to the same server, but you want different clients to use a different root_fs. Do these have to be different sink jobs listening on different ports? Or can you have multiple sink jobs bound to the same port, with the server using the client identity to decide which sink job to associate each session with?
Yes, you need to create two sinks ATM. Anyway, this limitation exists only because I had to pick some config model at the time of zrepl's initial release, based on pure gut feeling.
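Concretely, something like this (just a sketch - the ports, root_fs values and client addresses are made up):

```yaml
jobs:
  - name: sink_nuc1
    type: sink
    root_fs: storage1/backup
    serve:
      type: tcp
      listen: ":8888"
      clients:
        "192.168.1.11": "nuc1"

  - name: sink_nuc2
    type: sink
    root_fs: storage2/backup
    serve:
      type: tcp
      listen: ":8889"
      clients:
        "192.168.1.12": "nuc2"
```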
I'll close this issue since all questions seem to have been answered. An open, unassigned issue for a quick-start guide is out here: #368
(Sorry if this is not the right place to post, but I couldn't find a linked forum/community group)
I am looking to migrate from an existing zfs replication tool (syncoid) to zrepl, but after a first pass of reading the docs I still had some questions.
I think I am able to answer this after re-reading the docs.
At https://zrepl.github.io/v0.3.0-rc1/configuration/overview.html#n-push-jobs-to-1-sink it says:
Then I had to dig further: "client identity" is transport-specific. For example, with the TCP transport it's the originating IP address, mapped to an identity via a configuration table in the sink.
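If I've understood the transport docs correctly, that table lives in the sink job's serve section, roughly like this (the addresses and identities below are made up):

```yaml
serve:
  type: tcp
  listen: ":8888"
  clients:
    "192.168.1.11": "nuc1"   # originating IP -> client identity
    "192.168.1.12": "nuc2"
```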
It's still not immediately clear if the push job names on different push hosts must be distinct. If the job names are only of local significance (i.e. not sent over the wire) then they could have the same job name.
It appears not: here it says:
and I don't see a way in a push job to strip the path prefix.
Let me give a specific example. I am currently doing syncoid push jobs to replicate lxd containers like this:
At the source side (host nuc1), I have datasets like this:
and snapshots like this:
The @syncoid... snapshot is effectively a bookmark, and the others are just periodic snapshots, which are also replicated.

On another source (host nuc2), I have datasets like this:
On the destination side I have:
That is, the containers from both hosts are replicated into the same target parent dataset - the idea being that I can move containers between hosts without ending up with two backups.
Also, note that the original path prefix "zfs/lxd" is not present in the destination.
As far as I can tell, if I make a zrepl sink job with root_fs "storage1/backup", then on replication I will be forced to deliver to
Is that correct?
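(To spell out my understanding, which may be wrong: the sink places received filesystems under root_fs, then the client identity, then the sender-side path, i.e. something like the following, where "web1" is a made-up container name.)

```
# my reading of the sink layout; "web1" is hypothetical
#   <root_fs>/<client_identity>/<sender-side dataset path>
storage1/backup/nuc1/zfs/lxd/containers/web1
```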
This is not too big a deal: I can rename all my existing replica datasets to match the new schema. Also, if a container moves I can do a corresponding rename on the backup server.
After this renaming, I'll test whether zrepl is able to pick up the existing target dataset and apply further incremental replication, or whether it's going to insist on re-replicating from scratch. I will report back here - I think it's a use case which is worth documenting.