auto-apply --raw send flag (ZREP_SEND_FLAGS) on encrypted datasets #144


@onlime onlime commented Dec 12, 2019

We need some auto-detection of encrypted zfs datasets so that the --raw zfs send flag is added without having to explicitly set the ZREP_SEND_FLAGS environment variable. As we usually want to replicate all datasets managed by zrep using zrep sync -f all, we cannot use different ZREP_SEND_FLAGS (with or without --raw) for individual datasets.

To be on the safe side, this patch only sets ZREP_SEND_FLAGS=--raw if the environment variable is undefined or empty.
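The detection described above could be sketched roughly as follows. This is not the code from the patch: `pick_send_flags` is a hypothetical helper name, and the sketch assumes `zfs get -H -o value encryption` prints `off` for unencrypted datasets and the cipher name (for example `aes-256-ccm`) for encrypted ones.

```shell
# Hypothetical helper, not taken from zrep itself.
# Prints the send flags to use for the dataset given as $1.
pick_send_flags() {
    dataset=$1

    # An explicit, non-empty ZREP_SEND_FLAGS always wins.
    if [ -n "${ZREP_SEND_FLAGS:-}" ]; then
        printf '%s\n' "$ZREP_SEND_FLAGS"
        return 0
    fi

    # Assumption: "off" (or "-" / empty on failure) means not encrypted.
    enc=$(zfs get -H -o value encryption "$dataset" 2>/dev/null) || enc=off

    case $enc in
        off|-|'') printf '\n' ;;            # unencrypted: no extra flags
        *)        printf '%s\n' '--raw' ;;  # encrypted: raw send
    esac
}
```

A caller could then run something like `ZREP_SEND_FLAGS=$(pick_send_flags epool/subvol-test)` before starting the transfer.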

@ppbrown
Member

ppbrown commented Dec 12, 2019 via email

@darkpixel

What about the use case where the local filesystem is encrypted and the remote filesystem uses the exact same key and has it loaded? In that case there is no need to pass --raw.

@ppbrown
Member

ppbrown commented Dec 12, 2019 via email

@darkpixel

Quoth the manpage:

-w, --raw
    For encrypted datasets, send data exactly as it exists on disk. This allows backups to
    be taken >>even if encryption keys are not currently loaded<<. The backup may then
    be received on an untrusted machine since that machine will not have the encryption
    keys to read the protected data or alter it without being detected. Upon being received,
    the dataset will have the same encryption keys as it did on the send side, although the
    keylocation property will be defaulted to prompt if not otherwise provided. For
    unencrypted datasets, this flag will be equivalent to -Lec.  Note that if you do not use
    this flag for sending encrypted datasets, data will be sent unencrypted and may be
    re-encrypted with a different encryption key on the receiving system, which will disable
    the ability to do a raw send to that system for incrementals.

Basically the raw flag lets you send encrypted datasets to the receiver without sending the key.
Without the raw flag, you're sending the decrypted data to the remote. The remote may store it unencrypted, or if it's going to a pool or dataset that has encryption set up, it will encrypt it using the receiver's encryption/key setup.

@ppbrown
Member

ppbrown commented Dec 13, 2019 via email

@onlime
Author

onlime commented Dec 13, 2019

Wait... To the original poster... Why EXACTLY dont you just have --raw in your ZREP_SEND_FLAGS all the time? That is not made clear to me.

As I tried to explain in my first comment:

As we usually want to replicate all datasets managed by zrep using zrep sync -f all, we could not use different ZREP_SEND_FLAGS (with or without --raw) for individual datasets.

To be more specific: we have about 10 unencrypted and only 2 encrypted datasets. If we set ZREP_SEND_FLAGS=--raw, it is applied to every zrep replication, because we want to keep the luxury of your all keyword, which detects zrep-managed datasets. It would be uncool to use zrep list and loop over those datasets ourselves (in a wrapper script), applying --raw only to encrypted datasets. Or are you saying that using --raw on unencrypted datasets does no harm at all and that we shouldn't worry about using it for all replications?
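For reference, the wrapper-script approach mentioned above might look roughly like this. It is only a sketch: `sync_each` is a hypothetical name, and it assumes that `zrep list` prints one managed filesystem per line and that `zfs get -H -o value encryption` reports `off` for unencrypted datasets.

```shell
# Hypothetical wrapper, not part of zrep.
# Syncs every zrep-managed filesystem, adding --raw only where the
# source dataset is encrypted.
sync_each() {
    zrep list | while read -r fs; do
        enc=$(zfs get -H -o value encryption "$fs" 2>/dev/null) || enc=off

        case $enc in
            off|-|'')
                # Unencrypted: sync with whatever flags are already set.
                # stdin is redirected so the sync cannot eat the dataset list.
                zrep sync "$fs" </dev/null
                ;;
            *)
                # Encrypted: set the flag in a subshell so it does not
                # leak into the next loop iteration.
                (
                    ZREP_SEND_FLAGS="--raw"
                    export ZREP_SEND_FLAGS
                    zrep sync "$fs" </dev/null
                )
                ;;
        esac
    done
}
```

This gives up the convenience of `all`, which is exactly the drawback described above.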

Our setup looks like this:

# Create encrypted zpool
$ zpool create -O acltype=posixacl -O encryption=on -O keylocation=prompt -O keyformat=passphrase epool sdf
$ zfs get encryption,keylocation,keystatus,keyformat epool
NAME   PROPERTY     VALUE        SOURCE
epool  encryption   aes-256-ccm  -
epool  keylocation  prompt       local
epool  keystatus    available    -
epool  keyformat    passphrase   -

# Create dataset inside epool
$ zfs create epool/subvol-test

epool exists on both host nodes with the same key (passphrase) and is correctly mounted on each. Initializing zrep reports:

hn2$ zrep -i epool/subvol-test hn1 epool/subvol-test
Setting zrep properties on epool/subvol-test
Creating snapshot epool/subvol-test@zrep_000000
Sending initial replication stream to hn1:epool/subvol-test
cannot send epool/subvol-test@zrep_000000: encrypted dataset epool/subvol-test may not be sent with properties without the raw flag
cannot receive: failed to read from stream
Destroying any zrep-related snapshots from epool/subvol-test
Removing zrep-related properties from epool/subvol-test
Error: Error transferring epool/subvol-test@zrep_000000 to hn1:epool/subvol-test. Resetting

There should be no need to pass --raw if the encryption keys are the same on both hosts and are loaded. Besides, this works flawlessly (without passing --raw) on ZFS volumes with a fixed volsize and a btrfs filesystem inside (which should not make any difference, as ZFS does not know about the underlying data).

So maybe we should not build a workaround like the one I proposed, but rather figure out why zrep asks for --raw when it is not actually required in this use case.

@ppbrown
Member

ppbrown commented Dec 13, 2019 via email

@darkpixel

@onlime Yeah, you can use --raw regardless of the dataset being encrypted or not.
From that man page snippet: For unencrypted datasets, this flag will be equivalent to -Lec.

So you can definitely use ZREP_SEND_FLAGS="--raw" for all of them.
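In practice that suggestion amounts to something like the sketch below. `sync_all_raw` is a hypothetical wrapper name; the only pieces taken from this thread are the `ZREP_SEND_FLAGS` variable and zrep's `all` keyword.

```shell
# Hypothetical wrapper: apply --raw to every replication, encrypted or not.
sync_all_raw() {
    (
        # Set in a subshell so the caller's environment stays untouched.
        ZREP_SEND_FLAGS="--raw"
        export ZREP_SEND_FLAGS
        zrep sync all
    )
}
```

For unencrypted datasets the flag then behaves like -Lec, as the manpage excerpt above states.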

@ppbrown I'll give real-world use-cases for each example.

Encrypted source uses --raw to send to the destination to ensure the destination can't access the data. This is how we back up to the cloud and to our datacenter. We don't store the keys in the cloud so no one can get the keys/data.

Encrypted source does not use --raw to send to the destination since the destination has its own encryption. This is how we back up to our external USB drives since they have a different encryption key.

Unencrypted source does not use --raw to send to an encrypted destination. We have a few boxes that don't use ZFS-native encryption (they have encryption at the LUKS layer and those LUKS devices are turned into a zpool) but need to be backed up in an encrypted format to the destination (be it a locally attached USB drive or our cloud/datacenter).

Basically I don't think zrep should be trying to figure out if it needs to use --raw based on encryption properties. Encryption properties can't tell you the whole story. :)
