ZFS Encryption #5769
Conversation
mention-bot
commented
Feb 9, 2017
@tcaputi, thanks for your PR! By analyzing the history of the files in this pull request, we identified @behlendorf, @grwilson and @ahrens to be potential reviewers.
Several updates have been made to the code since it was moved (thanks to @ahrens for the review thus far). These are the highlights:
These changes do constitute an on-disk format change, so anyone updating from an earlier version of the PR will have to recreate their pool.
6cf566f to ae18e6d
grahamperrin
commented
Feb 10, 2017
Thanks to everyone so far.
For clarity (for any newcomer to this PR), from https://github.com/tcaputi/zfs/blob/ae18e6d0a8c1b32991094594d1b751646bc5b4ca/man/man8/zfs.8 with added emphasis:
(The first – a definition of user data – includes some things that might otherwise be viewed as metadata.) Also from the opening of this PR:
– could be updated to reflect the split that is highlighted in the first (human) comment.
grahamperrin left a comment
Documentation: first pass …
    accessing the data allow for it. Deduplication with encryption will leak
    information about which blocks are equivalent in a dataset and will incur an
    extra CPU cost per block written.
    This feature enables the creation and management of natively encrypted datasets.
grahamperrin
Feb 10, 2017
As details of the feature are more in the manual page for zfs(8) than in the page for zpool(8), so the SEE ALSO part of this zpool-features(5) page should be expanded to include zfs(8).
    Selecting \fBencryption\fR=\fBon\fR when creating a dataset indicates that the
    default encryption suite will be selected, which is currently \fBaes-256-ccm\fR.
    In order to provide consistent data protection, encryption must be specified at
    dataset creation time and it cannot be changed afterwards.
    .sp
    For more details and caveats about encryption see \fBzpool-features\fR.
grahamperrin
Feb 10, 2017
The SEE ALSO part of this zfs(8) manual page should be expanded to include zpool-features(5).
    accessing the data allow for it. Deduplication with encryption will leak
    information about which blocks are equivalent in a dataset and will incur an
    extra CPU cost per block written.
    This feature enables the creation and management of natively encrypted datasets.
tcaputi
Feb 10, 2017
Author
Contributor
Yes. This functions the same way as any other ZFS property (compression, deduplication, etc.).
c4daa03 to 8594e08
niieani
commented
Feb 11, 2017
Fantastic work
    @@ -245,26 +247,35 @@ dsl_dataset_phys(dsl_dataset_t *ds)
    #define DS_UNIQUE_IS_ACCURATE(ds) \
        ((dsl_dataset_phys(ds)->ds_flags & DS_FLAG_UNIQUE_ACCURATE) != 0)

    /* flags for holding the dataset */
    #define DS_HOLD_FLAG_DECRYPT (1 << 0) /* needs access encrypted data */
ahrens
Feb 12, 2017
Member
Let's use an enum for this, so that it's clearer what values the function argument can have. (Unfortunately there's a bad example set by some other flags; cleanup has been on my to-do list for a while now.)
    if (dd == NULL) {
        dsl_dir_t *winner;

        dd = kmem_zalloc(sizeof (dsl_dir_t), KM_SLEEP);
        dd->dd_object = ddobj;
        dd->dd_dbuf = dbuf;
        dd->dd_pool = dp;

        if (doi.doi_type == DMU_OTN_ZAP_METADATA &&
ahrens
Feb 12, 2017
Member
I think you can use dsl_dir_is_zapified() instead (at the cost of maybe doing dmu_object_info() twice; shouldn't be a big deal).
    /*
     * encrypted datasets can only be moved if they are
     * an encryption root (locally set keyformat).
ahrens
Feb 12, 2017
Member
Why don't we allow that? Seems like a common use case would be "encrypt all datasets below this the same way". "whoops, named a child dataset wrong, rename it". As long as the new location is under the same encryption root, I think we can allow the rename. Would that be hard to implement?
I thought there was a restriction that an unencrypted dataset couldn't be under an encrypted dataset. Doesn't that need to be enforced here?
I didn't see either of these restrictions in the manpages. Seems like they should at least be in the zfs rename section, and maybe also the Encryption section.
tcaputi
Feb 12, 2017
Author
Contributor
Why don't we allow that? Seems like a common use case would be "encrypt all datasets below this the same way". "whoops, named a child dataset wrong, rename it"
To clarify, renaming an encryption child is OK; moving it to a new parent is not, currently.
As long as the new location is under the same encryption root, I think we can allow the rename. Would that be hard to implement?
I think I can implement it like that. Shouldn't be a big deal.
I thought there was a restriction that an unencrypted dataset couldn't be under an encrypted dataset. Doesn't that need to be enforced here?
Whoops. It looks like I must've rebased that away at some point. I will put it back.
I didn't see either of these restrictions in the manpages. Seems like they should at least be in the zfs rename section, and maybe also the Encryption section.
I will make sure this is documented in both places (I think there is a sentence about it in the encryption section).
    locally, the dataset is an encryption root. Encryption roots share their keys
    with all datasets that inherit this property. This means that when a key is
    loaded for the encryption root, all children that inherit the \fBkeylocation\fR
    property are automatically loaded as well.
    .sp
    Even though the encryption suite cannot be changed after dataset creation, the
    keylocation can be with \fBzfs change-key\fR. If the dataset is an encryption
    root this property may also be changed with \fBzfs set\fR. If \fBprompt\fR is
ahrens
Feb 12, 2017
Member
If it's not an encryption root, can the key be changed with zfs change-key? The current wording implies it can, by calling out the restriction only for zfs set
tcaputi
Feb 12, 2017
Author
Contributor
That is correct. Non-encryption roots can use zfs change-key, which will make them an encryption root. I will add a sentence to the man page to reflect this.
    if (ret != 0)
        goto error;

    return (0);
ahrens
Feb 12, 2017
Member
Seems like this (the success case) would work correctly if it fell through to the error label.
tcaputi
Feb 12, 2017
Author
Contributor
I can change it. I personally like separate error labels that don't share code.
    * the keys may not be loaded.
    */
    if (wbuf == NULL && BP_IS_ENCRYPTED(bp))
        zio_flags |= ZIO_FLAG_RAW;
tcaputi
Feb 12, 2017
Author
Contributor
I don't see why not. I just didn't want to impact existing functionality. I'll run some tests to make sure that works and make the change.
        BP_SET_ENCRYPTED(bp, B_TRUE);
    }

    /* Perform the encryption. This should not fail */
ahrens
Feb 12, 2017
Member
Can we use a more precise word than "should"? You're gracefully handling failure just below, so presumably it does fail in some cases... Maybe This will fail for block types that are partially encrypted (dnode, zil) if no encryption is needed.
tcaputi
Feb 12, 2017
Author
Contributor
I will change the "should". I am not gracefully handling errors. The only acceptable return codes here are 0 and ZIO_NO_ENCRYPTION_NEEDED (see the assertion below).
tcaputi
Feb 13, 2017
Author
Contributor
No, it's a special return code to indicate that the block didn't need to be encrypted for some reason (for instance, a dnode block with no encryptable bonus buffers).
    * evenly across the bits of the checksum. Therefore,
    * when truncating a weak checksum we XOR the first
    * 2 words with the last 2 so that we don't "lose" any
    * entropy unnecessarily.
tcaputi
Feb 12, 2017
Author
Contributor
I'm not aware of any, but I know all secure checksums support being truncated without issue whereas there may be weaknesses for XOR folding.
    uint32_t *iv2 = (uint32_t *)(iv + sizeof (uint64_t));

    ASSERT(BP_IS_ENCRYPTED(bp));
    bp->blk_dva[2].dva_word[0] = LE_64(*((uint64_t *)salt));
ahrens
Feb 12, 2017
Member
This trick doesn't work on processors which don't support unaligned memory access. You'll need to do it byte at a time (if it's unaligned, at least - but if you implement it both ways, make sure we can test hitting both code paths).
Same goes for the following functions.
See zap_leaf_array_read() for an example (the la_array is not aligned).
Thanks for updating the PR, will build out a new test mule this week.
There are no restrictions, but I would argue that there is also nothing that would prevent them from running a
afee592 to 2284451
84eeb12 to 5b21838
    *
    * The L1ARC has a slightly different system for storing encrypted data.
    * Raw (encrypted + possibly compressed) data has a few subtle differences from
    * data that is just compressed. The biggest difference is that it is not always
ahrens
Feb 23, 2017
Member
I think you can delete always, because it is not possible to decrypt ... if keys aren't loaded (right?)
    * may have both an encrypted version and a decrypted version of its data at
    * once. When a caller needs a raw arc_buf_t, it is allocated and the data is
    * copied out of this header. To avoid complications with b_pabd, raw buffers
    * cannot be shared.
ahrens
Feb 23, 2017
Member
Not sure where is the best place to explain this in the code, but at least for my own edification:
When would we have both encrypted and unencrypted data in the ARC? If we scrub when the keys are not loaded, we'll have encrypted dnode blocks, I think? And once we have encrypted send, that could cache encrypted user data in the ARC? And if we later load the keys and do an unencrypted read, it would decrypt from b_rabd to b_abd.
Once we have both b_rabd and b_abd, do we ever evict one of them, to reduce memory usage? Or are we stuck with 2x memory usage until we happen to evict the whole hdr?
tcaputi
Feb 23, 2017
Author
Contributor
When would we have both encrypted and unencrypted data in the ARC? If we scrub when the keys are not loaded, we'll have encrypted dnode blocks, I think? And once we have encrypted send, that could cache encrypted user data in the ARC? And if we later load the keys and do an unencrypted read, it would decrypt from b_rabd to b_abd.
All correct.
Once we have both b_rabd and b_abd, do we ever evict one of them, to reduce memory usage? Or are we stuck with 2x memory usage until we happen to evict the whole hdr?
If we have both we evict the encrypted one as soon as all encrypted buffers are destroyed
tcaputi
Feb 23, 2017
Author
Contributor
Not sure where is the best place to explain this in the code...
There is actually already a comment for this in arc_buf_destroy_impl() (where the header freeing takes place).
ahrens
Feb 23, 2017
Member
OK cool, I haven't got to reading that part of the code yet, it's a lot to fit in my brain at once :-)
    @@ -817,6 +835,8 @@ static arc_state_t *arc_l2c_only;
    #define arc_need_free ARCSTAT(arcstat_need_free) /* bytes to be freed */
    #define arc_sys_free ARCSTAT(arcstat_sys_free) /* target system free bytes */

    /* encrypted + compressed size of entire arc */
    #define arc_raw_size ARCSTAT(arcstat_raw_size)
ahrens
Feb 23, 2017
Member
Encryption doesn't change the size of the data, so how is this different from arc_compressed_size?
tcaputi
Feb 23, 2017
Author
Contributor
This just gives an extra stat for how much raw encrypted data is stored in the ARC. I thought it would be useful, seeing as this memory operates separately from compressed memory.
ahrens
Feb 23, 2017
Member
So it's the sum of the size of the b_rabds? Could you update the comment to say that?
    @@ -112,20 +123,22 @@ typedef enum arc_flags
    ARC_FLAG_L2_WRITING = 1 << 11, /* write in progress */
    ARC_FLAG_L2_EVICTED = 1 << 12, /* evicted during I/O */
    ARC_FLAG_L2_WRITE_HEAD = 1 << 13, /* head of write list */
    /* encrypted on disk (may or may not be encrypted in memory) */
    ARC_FLAG_ENCRYPT = 1 << 14,
ahrens
Feb 23, 2017
Member
How about ARC_FLAG_ENCRYPTED (and HDR_ENCRYPTED()), since this is not being used as a verb.
    /* encryption parameters */
    uint8_t b_salt[DATA_SALT_LEN];
    uint8_t b_iv[DATA_IV_LEN];
    uint8_t b_mac[DATA_MAC_LEN];
ahrens
Feb 23, 2017
Member
b_mac seems different from the other encryption parameters, because it's used not when writing to L2ARC, but rather when decrypting an encrypted hdr. In that case I think the caller could pass in the BP and we could get the mac there. It might be worth explaining that we keep the mac here for convenience in this case, so the caller doesn't have to pass in the BP again.
tcaputi
Feb 23, 2017
Author
Contributor
Interesting point. It looks like the only place this is really used is in arc_hdr_decrypt(). I will add that comment. I could probably also assert that the mac in l2arc_apply_transforms() matches the one in the header.
    * pool. When this is the case, we must first compress it if it is
    * compressed on the main pool before we can validate the checksum.
    */
    if (!HDR_COMPRESSION_ENABLED(hdr) && compress != ZIO_COMPRESS_OFF) {
ahrens
Feb 23, 2017
Member
Which comment specifically?
I think this code is now assuming that we write blocks to the L2ARC exactly as they are in the main pool, including compressed, unencrypted blocks, even if the b_pabd is uncompressed?
tcaputi
Feb 23, 2017
Author
Contributor
Oh. I'm sorry I thought this was a different piece of code. You are correct, this change allows the L2ARC to write out data as it is in the main pool, removing the need for this step.
    @@ -1652,15 +1701,14 @@ arc_hdr_set_compress(arc_buf_hdr_t *hdr, enum zio_compress cmp)
    */
    if (!zfs_compressed_arc_enabled || HDR_GET_PSIZE(hdr) == 0) {
        arc_hdr_clear_flags(hdr, ARC_FLAG_COMPRESSED_ARC);
        HDR_SET_COMPRESS(hdr, ZIO_COMPRESS_OFF);
ahrens
Feb 23, 2017
Member
Is this an intentional change to now set the hdr's compression algorithm even if it is not compressed?
tcaputi
Feb 23, 2017
Author
Contributor
Yes. Previously, disabled arc compression worked by simply not storing the compression at all. This change allows us to keep track of the compression algorithm for raw blocks even if arc compression is disabled.
    * This function will take a header that only has raw encrypted data in
    * b_crypt_hdr.b_rabd and decrypts it into a new buffer which is stored in
    * b_l1hdr.b_pabd. If designated on the header, this function will also
    * decompress the data as well.
    }

    if (HDR_GET_COMPRESS(hdr) != ZIO_COMPRESS_OFF &&
        !HDR_COMPRESSION_ENABLED(hdr)) {
ahrens
Feb 23, 2017
Member
This could use a comment explaining this case. I think it's saying that the data on disk is compressed, and therefore the raw data in b_rabd is compressed, but we don't want this hdr to be stored compressed (i.e. b_pabd should not be compressed), so we need to decompress it.
    tmp = abd_borrow_buf(cabd, arc_hdr_size(hdr));

    ret = zio_decompress_data(HDR_GET_COMPRESS(hdr),
        hdr->b_l1hdr.b_pabd, tmp, HDR_GET_PSIZE(hdr),
ahrens
Feb 23, 2017
Member
If it's possible this is used in a performance-sensitive code path, it would be good to avoid the copy from b_pabd to linearize it for decompression. This could be avoided by allocating b_pabd as linear, before decrypting into it. (And then replacing b_pabd with the decompressed scatter ABD, as you're already doing here.)
tcaputi
Feb 23, 2017
Author
Contributor
My only issue with that is that then we will not be honoring zfs_abd_scatter_enabled. I do not think this is an incredibly performance-sensitive path either, since this will realistically only be run in 2 circumstances: decrypting dnode blocks and reading data that is already in the ARC due to a scrub.
Let me know what you think and I can change it if you still feel it is appropriate.
ahrens
Feb 23, 2017
Member
We could still honor zfs_abd_scatter_enabled because the data stored in b_pabd would be scatter (or not). It's only the intermediary, short-lived buffer (decrypted but not yet uncompressed) that would be linear.
But it does seem like not a super common code path so I could go either way.
tcaputi
Feb 23, 2017
Author
Contributor
I see what you mean now... This might be a bit messy to do since arc_hdr_alloc_abd() cannot currently decide how it would like its abd to look. Unfortunately the code for that is a few functions up the stack. I'm inclined to leave it for now unless it becomes a problem.
Oh, I should mention, in the illumos patches, before I included your commits yesterday, when we did a
It could be an old bug by now, but if it is trivial for you to set lz4, please test
rottegift
commented
Mar 4, 2017
Further to @lundman's comment, we have openzfsonosx/zfs#557
@lundman I don't know what version you were on before, but the problem seems to be fixed now.
Woo \o/
@sempervictus
@teknoman117 The issue does not seem to reproduce on illumos, so we must have changed something at some point. I hope that I can get to the bottom of it tomorrow.
@teknoman117 I figured it out. The problem is actually just with your patch. You need this in the patch as well:
Let me know if I'm missing something or if you need anything else.
My assumption to leave that guard in was based on this comment in aes_impl.c. That being said, this does indeed seem to let the generic C implementation load a dataset encrypted using the aesni or x86_64 optimized method. However, removing the guard completely breaks the accelerated methods. I suppose this means that for x86_64 the byte swap is only needed for the generic implementation.
jbrodriguez
commented
Apr 1, 2018
My question is: why is this not stable yet? I need to run zfs-linux-git on Arch Linux, sort of unofficially. Is testing what's keeping this from stable?
grantwwu
commented
Apr 1, 2018
@jbrodriguez It's not stable yet because it's a Hard Problem, and because the developers have run into some issues, most notably #6845. I don't think you really ought to complain unless you're paying the OpenZFS developers... and if you are, you probably have internal channels for that :)
jbrodriguez
commented
Apr 1, 2018
Thanks @grantwwu, I don't know how officially you speak for the project; I was just asking a question. Nevertheless, thanks for pointing out the issue, I'll subscribe to it.
Yeah. I think that comment was meant to be taken in the context of the entire file. In other words, I think the
The 3 big reasons:
jbrodriguez
commented
Apr 1, 2018
•
Thanks @tcaputi, the major version comment is spot on. I'm looking to start from scratch, so I may have overlooked some considerations. Can't praise enough the work you've done here, kudos.
This change incorporates three major pieces:

The first change is a keystore that manages wrapping and encryption keys for encrypted datasets. These commands mostly involve manipulating the new DSL Crypto Key ZAP Objects that live in the MOS. Each encrypted dataset has its own DSL Crypto Key that is protected with a user's key. This level of indirection allows users to change their keys without re-encrypting their entire datasets. The change implements the new subcommands "zfs load-key", "zfs unload-key" and "zfs change-key" which allow the user to manage their encryption keys and settings. In addition, several new flags and properties have been added to allow dataset creation and to make mounting and unmounting more convenient.

The second piece of this patch provides the ability to encrypt, decrypt, and authenticate protected datasets. Each object set maintains a Merkle tree of Message Authentication Codes that protect the lower layers, similarly to how checksums are maintained. This part impacts the zio layer, which handles the actual encryption and generation of MACs, as well as the ARC and DMU, which need to be able to handle encrypted buffers and protected data.

The last addition is the ability to do raw, encrypted sends and receives. The idea here is to send raw encrypted and compressed data and receive it exactly as is on a backup system. This means that the dataset on the receiving system is protected using the same user key that is in use on the sending side. By doing so, datasets can be efficiently backed up to an untrusted system without fear of data being compromised.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes openzfs#494
Closes openzfs#5769
taoeffect
commented
May 30, 2019
This looks awesome, congrats! Is there a guide / tutorial anywhere on how to set this up? Can it be integrated with an existing LUKS-encrypted partition setup? (How?) Thanks for helping out a newb!
Manpages show usage. LUKS is just the header for most dm-crypt setups, and dm-crypt can be used under ZFS to protect its unencrypted metadata, at the overhead of a whole extra encrypt/decrypt round under the ZFS implementation.
taoeffect
commented
May 30, 2019
Which manpage, for LUKS or ZFS? (And if ZFS, which command?) I might not have been super clear. It's my understanding that "ZFS on root" (i.e. ZFS managing the entire drive) doesn't work well on Linux, so what people do is create a ZFS partition. That's what this guide does and that's what I've been following. In that guide a partition is created for
But now that zfsonlinux has encryption, what's the best thing to do, and how?
benley
commented
May 30, 2019
I'm not sure if this is the best thing to do, but on my laptop I've been using a small fat32 partition for
I have an experimental version of the HOWTO for Debian that builds 0.8.0 from experimental: I haven't updated that since the rc4 package. The final 0.8.0 will likely be in experimental soon, at which point I'll update that again. I'm not sure how much I want to publicize this, though, as a hand-built backport isn't necessarily the best idea.
LUKS is a Linux component, ZFS is not; they share zero code. One is a header for full-block encryption of a device, while the other is data encryption inside a filesystem. They're not related by any direct linkage.
taoeffect
commented
May 31, 2019
@rlaager wrote:
Wow, thanks for that. It seems like it could be useful as a reference, but also seems quite complicated. I'm wondering if there's a simpler way? Currently exploring the following approach (with Fedora) for a VPS setup:
Is that doable? Would it be simpler? (Anyone know of any guides for that?)
taoeffect
commented
May 31, 2019
I should have mentioned in my previous comments: one of the reasons I can't follow that approach is that, as mentioned in that link, you don't always have the required RAM available to use a Live CD (as with a VPS). Also, my web host does not offer a Fedora Live CD, so even if I had the RAM, I'd be forced to use the "anaconda" troubleshooting environment, which doesn't even have a working package manager and is a nightmare to use.
We both use accelerated encryption implementations, but due to licensing restrictions they are separate implementations.
Is this still true now that SIMD support is cut upstream due to GPL-only export of FPU functions? 4.14+ kernels have that commit applied nowadays, since it got backported to 4.14.120.
@taoeffect I run ZFS as the root file system on my Linux machine. There's a small fat32 partition at the beginning which uses an efistub kernel, and the ZFS "pool" is the second partition, which contains datasets for /, /home, and some others.
If you have that commit then yes, SIMD support is disabled. We are still working on figuring out a workaround there.
We build kernels in-house with ZFS built into the main kernel image (avoids certain instrumentation passes for modules at runtime), which permits us the opportunity to simply revert that commit. Most of our clients cannot do this, and folks are talking about moving to 0.8 this year already due to crypto being available. I'm a bit worried that more will stop using ZFS for being too slow: we've had a few drop in the last year or so because modern storage hardware is much faster than the ZFS I/O pipeline, KPTI smashed even more overhead into the execution of the "slow" fs code paths, and with no SIMD this is just going to get worse and worse.
Based on what little I saw, I must, unfortunately, agree. You wouldn't happen to have a link to the issue in question, would you? At this time, it looks to me like the Linux kernel project is effectively controlled by zealots. I see this as the first step in a "whack-a-mole" process (with aggressive backporting) to cripple ZoL in favor of the broken BTRFS. As with any religious war, the only thing that matters is the religion; facts are irrelevant. I don't believe this is a war that ZoL can win. With that in mind, I'm in the process of creating a message to all my ZoL customers, suggesting that they consider migrating their file servers from Linux/ZoL to FreeBSD/OpenZFS.
jbrodriguez
commented
Jun 1, 2019
If need be, as the five and dime shop we are, we'll figure out how to push our own kernel build (+patches) into our Arch Linux setup. We used to use Oracle Solaris 11 11/11 for its ZFS encryption capabilities. Switched to dm-crypt, and jumped in blindfolded into Linux ZFS encryption as soon as Tom Caputi and team made it available. We're not going back. Only forward.
taoeffect
commented
Jun 1, 2019
@teknoman117 wrote:
You and @benley both seem to be doing this. However, every tutorial I have seen for ZFS on root is insanely complicated, and also does not seem to always work on VPS setups with only 2GB of RAM because of the LiveCD requirements. I wish deeply that ZFS were the default on Linux, and that it were simple to set up. That unfortunately is not the reality. Also, I have not seen any comparisons between the latest ZFS encryption support and ZFS on LUKS. It would be useful to know what the tradeoffs are in terms of ease-of-use, security, and performance.
I'm not really one to get into the politics of this whole situation, but I will say that despite the statement from some of the Linux maintainers, nothing has really changed in this regard. The Linux kernel has always had a set of functions and symbols that it exports in one release and stops exporting in the next, and its internal APIs are always changing. As a "third party" module, we have always had to adapt our software to these changes, and we must always write code that can support past, present, and future versions as best as possible. Every time a new kernel comes out, our testing builders fail, and within a few days / weeks someone takes a look and fixes any new incompatibilities that have arisen. This particular issue is a bit trickier to work around, but I don't believe we are fundamentally incapable of dealing with it as we have in the past.
@pashford: I wouldn't send that email out yet; it's likely not to work. ZoL has the accounting feature which, IIRC, prevents use on non-Linux systems. This is why we started making all new pools with that feature disabled, then patched ZFS to make disabled the default.
@taoeffect: the livecd is squashfs mapped to RAM, not much to do with ZFS. As far as dm-crypt vs ANY filesystem crypto goes, you probably want to get a firm understanding of what block data and file data are relative to each other, because you keep asking questions indicative of a thought process which does not account for the differences in the logical tiers of storage required to provide a user with access to a file (encrypted or otherwise). As far as performance goes, see above: Linux is actively trying to degrade our performance by making the relevant interface GPL-only. So performance where? On 4.14.119 or today? What are you trying to use ZFS for from a business-logic perspective, and why not pay the hosting company for the proper amount of resources?
taoeffect
commented
Jun 2, 2019
Per @kpande's request, I've opened up two discussion threads on the mailing list:
(BTW, this is amazing mailing list software! Never seen it before!) But to quickly answer your questions @sempervictus:
I mean Linux in general. Whatever the latest kernel is. I doubt people will stay on old kernels for long due to the constant security problems that are being created, found, fixed, and re-created.
A 2GB VPS seems like a reasonable target for ZFS, no? Feel free to post any replies via the mailing list!
So far as 2GB goes, it really depends on the workload. ARC usually takes half your RAM, so if the kernel and userspace are under a gig, you're OK until you run software. If you have a large pool with a bunch of dedup blocks, your DDTs will take up even more. Swap and ARC/DDTs don't mix well...
tcaputi commented Feb 9, 2017 (edited)
Native encryption in zfsonlinux (See issue #494)
The change incorporates 2 major pieces:
The first feature is a keystore that manages wrapping and encryption keys for encrypted datasets. The commands are similar to those of Solaris but with a few key enhancements to make them more predictable, more consistent, and require less manual maintenance. It is fully integrated with the existing zfs create and zfs clone functions, and it also exposes a new set of commands via zfs key for managing the keystore. For more info on the issues with the Solaris implementation see my comments here and here. The keystore operates on a few rules: specifying keylocation or keyformat while creating a dataset causes the dataset to become the root of an encryption tree.
The second feature is the actual data and metadata encryption. All user data in an encrypted dataset is stored encrypted on-disk. User-provided metadata is also encrypted, but metadata structures have been left plain so that scrubbing and resilvering still work without the keys loaded. The design was originally inspired by this article but has been changed fairly significantly since.
Implementation details that should be looked at:
- key_mapping_t during dsl_dataset_tryown(). I added a flag to this function for code that wishes to own the dataset but does not require encrypted data, such as the scrub functions. I did my best to confirm that all owners set this flag correctly, but someone should confirm them, just to be sure.
- zfs send and zfs recv do not currently do anything special with regards to encryption. The format of the send file has not changed, and zfs send requires the keys to be loaded in order to work. At some point there should probably be a way to do raw sends.
- lzc_create() and lzc_clone() to support hidden arguments. I understand that the purpose of libzfs_core is to have a stable API for interacting with the ZFS ioctls. However, these functions need to accept wrapping keys separately from the rest of their parameters because they need to use the (new) hidden_args framework to support hiding arguments from the logs. Without this, the wrapping keys would get printed to the zpool history.

EDIT 5/4/16: Updated to reflect the current state of the PR
EDIT 1/3/17: Updated to reflect the current state of the PR
EDIT 2/9/17: reopened under a new PR due to long Github load times (previous PR was #4329)