Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZFS Crypto support #494

Closed
FransUrbo opened this Issue Dec 14, 2011 · 159 comments

Comments

Projects
None yet
@FransUrbo
Copy link
Member

FransUrbo commented Dec 14, 2011

As of ZFS Pool Version 30, there is support for encryption. This part is unfortunatly closed source, so an opensource implementation would be required. That means it would probably not be compatible with the Solaris version 'but who cares' :).

Illumos is apparently working on this at https://www.illumos.org/projects/zfs-crypto. Source repository can be found at https://bitbucket.org/buffyg/illumos-zfs-crypto. Unfortunatly there is no changes since the fork from illumos-gate. Should ZoL start thinking about this or should we just take the back seat?

Don't know how big of a problem this would be, but 'copying' the way that LUKS (Linux Unified Key Setup) do it seems to be a good place to start.

@behlendorf

This comment has been minimized.

Copy link
Member

behlendorf commented Dec 14, 2011

I'd like to hold off on this for the moment, we have enough other work on our plate and this is a huge change! If Illumos puts together an implementation we'll happily look at integrating it. We would could even use the source from ZFS Pool Version 30 if Oracle decides to release the source 6-12 months from now (unlikely but possible).

@FransUrbo

This comment has been minimized.

Copy link
Member Author

FransUrbo commented Dec 14, 2011

If you don't mind LUKS, I might have some time to look at this in a week or two.

@behlendorf

This comment has been minimized.

Copy link
Member

behlendorf commented Dec 14, 2011

I'm OK with making it easier to layer zfs on top of LUKS, that would be nice. It's just not what most people think of when they say zfs encryption support.

@FransUrbo

This comment has been minimized.

Copy link
Member Author

FransUrbo commented Dec 15, 2011

I was rather thinking of 'cloning'/'copying' the way LUKS works. Or rather, use the LUKS API inside ZFS. LUKS is used by 'cryptsetup' (configures encrypted block devices) and 'dmsetup' (The Linux Kernel Device Mapper userspace library). So it seems LUKS is an API for device encryption.

Using ZFS 'on top of' something like that would probably be easier, but not, as you say, not the intention...

@behlendorf

This comment has been minimized.

Copy link
Member

behlendorf commented Dec 15, 2011

It would be interesting to investigate if what your suggesting is possible. It would result in a second version of zfs encryption which isn't compatible with the pool v30 version but that might not be a big deal. We should be integrating the new feature flag support early next year so it could end up as a Linux-only feature.

@FransUrbo

This comment has been minimized.

Copy link
Member Author

FransUrbo commented Dec 15, 2011

I don't think we'll ever going to be compatible with v30... Not any of us, not unless Oracle all of a sudden 'sees the light', and I'm not holding my breath on that! :)

Best would be if we could come up with a solution, that would be portable to other OS'es. Don't know how much Linux the 'Linux Unified Key Setup' is, but it's worth a look. I'll start that once I have a workable sharesmb patch.

@baryluk

This comment has been minimized.

Copy link

baryluk commented Dec 15, 2011

How about at least reverse engineering v30 format?

@FransUrbo

This comment has been minimized.

Copy link
Member Author

FransUrbo commented Dec 15, 2011

Be my guest! Reverse engineering something, especially a crypt algorithm isn't any where near as simple as it sounds!

@baryluk

This comment has been minimized.

Copy link

baryluk commented Dec 16, 2011

We know it is using SHA-256, and AES-128 with Incremental mode probably, so actually there is nothing complicated, only some on-disk meta-data needs to be reverse enginered, like which bit is what, and where is salt, and where is stored information that it is AES-128 and not 192 or 256. It should be easy. Unfortunately I do not have access to Solaris right now to test it.

@FransUrbo

This comment has been minimized.

Copy link
Member Author

FransUrbo commented Dec 16, 2011

That DO sound easy :). Unfortunatly, we probably have to... I've spent the day looking into LUKS, but it does not seem to fit the purpose :(.

It is intended for being placed between the device and the FS. Which means it needs one device (either one physical disk or multiple disks presented as one through raid/md) where it can store data linearly... Kind of. But since ZFS is both a FS and a ... 'device mapper' (?) which have multiple devices, I doubt it will be possible to have LUKS split the data and it's key storage partitions split over multiple physical disk. I haven't looked at the code yet, just the specs but that's what it looks like so far.

@patrykk

This comment has been minimized.

Copy link

patrykk commented Dec 25, 2011

Hi,
"Oracle Solaris 11 Kernel Source-Code Leaked"
more information:

http://www.phoronix.com/scan.php?page=news_item&px=MTAzMDE

@akorn

This comment has been minimized.

Copy link
Contributor

akorn commented Dec 26, 2011

Of course, you shouldn't look at the leaked source if you work on ZFS lest Oracle accuse you of copyright infringement.

@patrykk

This comment has been minimized.

Copy link

patrykk commented Dec 27, 2011

Yes, You are right.

@baryluk

This comment has been minimized.

Copy link

baryluk commented Dec 27, 2011

LUKS is not an option. ZFS performs encryption on per-dataset/volume/file basis, LUKS works on device level. We already have crypto primitives available in kernel, we already have on-disk format designed, we just need to reverse enginer it (it should be slightly easier than designing it - which in case of crypto-stuff is hard to do properly/securely). Probably ZIL will be the hardest part.

Of course looking at leaked source-code is not an option at all. Even for second I wasn't thinking about it.

@dajhorn

This comment has been minimized.

Copy link
Member

dajhorn commented Jan 3, 2012

An interim solution is ecryptfs, which can be installed on top of ZFS.

Most RPM and DEB systems have built-in management for ecryptfs, which makes it easy to configure.

For maximum performance, dedup and compression should be disabled on any ZFS dataset that hosts a crypto layer.

@atoponce

This comment has been minimized.

@pyavdr

This comment has been minimized.

Copy link
Contributor

pyavdr commented May 6, 2012

This ( http://src.opensolaris.org/source/xref/zfs-crypto ) looks very nice, CDDL and lots of zfs crypto stuff. Maybe we should try to cooperate with Illumous for a common port to linux. In any case processors with AES-NI should be supported to gain optimal performance.

@maxximino

This comment has been minimized.

Copy link
Contributor

maxximino commented May 6, 2012

I would like to point out that the code linked above has ZPL_VERSION = 3 and SPA_VERSION=15. That's quite old!!
(source: http://src.opensolaris.org/source/xref/zfs-crypto/gate/usr/src/uts/common/sys/fs/zfs.h#318)
Oracle didn't merge that code until version 30 (http://hub.opensolaris.org/bin/view/Community+Group+zfs/30).

@behlendorf

This comment has been minimized.

Copy link
Member

behlendorf commented May 10, 2012

We should certainly work with the other ZFS implementations when any crypto work is being considered. Also it's my understanding that the link your referencing is to some of the early crypto work and it has been significantly reworked before being include in v30. That said, it's still probably a reasonable place to get familiar with the basic design decisions.

@ryao

This comment has been minimized.

Copy link
Member

ryao commented Jul 22, 2012

Here is Sun's design document for ZFS encryption support:

http://hub.opensolaris.org/bin/download/Project+zfs-crypto/files/zfs-crypto-design.pdf

We can check out the early code by doing hg clone ssh://anon@hg.opensolaris.org//hg/zfs-crypto/gate.

@mcr-ksh

This comment has been minimized.

Copy link

mcr-ksh commented Sep 26, 2012

I would love to see that as well. crypto is an amazing feature.

@FlorianFranzen

This comment has been minimized.

Copy link

FlorianFranzen commented Feb 13, 2013

The last post was 5 month ago. Did you guys decide on anything? What is the current state?

@gua1

This comment has been minimized.

Copy link

gua1 commented Feb 20, 2013

@FloFra
This is marked "Milestone: 1.0.0" and I think the zfsonlinux developers hope that illumos would have implemented crypto support by then or I suppose if illumos hasn't then they would work on it. My interpretation is "to be done in the distant future".

@grahamperrin

This comment has been minimized.

Copy link

grahamperrin commented Feb 20, 2013

In the ZFS on Linux area

https://groups.google.com/a/zfsonlinux.org/forum/?fromgroups#!searchin/zfs-discuss/crypto

https://groups.google.com/a/zfsonlinux.org/forum/?fromgroups#!searchin/zfs-devel/crypto leads to pool version 33, zfs-crypto (2011-12-22).

In the illumos area

Whilst https://www.illumos.org/projects/zfs-crypto is not recently updated, there's http://wiki.illumos.org/display/illumos/Project+Ideas (2012-07-18)

Device drivers

Niagra Crypto
… Re-implement the crypto acceleration drivers for the SPARC sun4v cpus.

File systems

ZFS encryption
… Import and update the work started by Darren Moffat to provide cryptographic support for ZFS.

I'll align myself with the latter.

Elsewhere

In irc://irc.freenode.net/#zfs on 2013-11-09, someone attention to code on GitHub. We acknowledged the need for someone to audit that code, so I didn't follow the link.

@ryao

This comment has been minimized.

Copy link
Member

ryao commented Sep 2, 2013

@grahamperrin mentioned some encryption code on github. It was determined on the mailing list that it includes code from the Solaris 11 leak and is therefore encumbered. We will not be using it.

@sempervictus

This comment has been minimized.

Copy link
Contributor

sempervictus commented Apr 28, 2014

I believe the encryption code referred to is located at https://github.com/zfsrogue/zfs-crypto. I've been able to merge and build both the SPL changes (https://github.com/zfsrogue/spl-crypto) and the ZFS branch.
Doesnt merge well with the pending ARC cache changes though so not being tested on my current labrats.
I'll post results once i free up a test cycle

@FransUrbo

This comment has been minimized.

Copy link
Member Author

FransUrbo commented Apr 28, 2014

@sempervictus I would be very, very careful using that code (IF you can get it to merge). There's a very, very (yes, yes! :) high risk that that code is the source of my loss of my pool (16TB almost full)....
We have not been able to absolutly pinpoint exactly why, but I've been running that code on my live server for almost a year, upgrading as I went and somehow, somewhere it (the crypto code) messed up the meta data so I/we might be unable to fix it...

@FransUrbo

This comment has been minimized.

Copy link
Member Author

FransUrbo commented Apr 28, 2014

Also, Rougue isn't really maintaining the code any more (ZoL have gone through a lot of changes since he created the repository).

@lundman

This comment has been minimized.

Copy link
Contributor

lundman commented Apr 28, 2014

Actually he is, updating the osx one when I ask etc. He's most likely waiting on 0.6.3 to tag and release.

You can ask him to do a merge anytime if there is something you want sooner.

@FransUrbo

This comment has been minimized.

Copy link
Member Author

FransUrbo commented Apr 28, 2014

@lundman All I've seen is that he have 'come in' once a month (or 'every now and then') accepting patches, sometimes without any review. There was a couple of pulls I wanted to discuss before merge (they required other ZoL pulls I did to be accepted first, which they weren't/haven't yet - and might not ever be).
And after the core-update of ZoL (some Illumos merge), the update wasn't really up to snuff. Considering how fast ZoL is moving right now, I think it's important that he keeps a much closer eye on what's happening, not just accepting others pulls willy-nilly. And basically, that's all he does now...
Don't misunderstand me, I want his code as much as the next guy, but the core issue is that that code somehow, somewhere messed up my pool. My take of all that is that it's because he isn't maintaining it properly. It (I) doesn't mean that the code shouldn't be considered, just that care should be taken before using it...

@lundman

This comment has been minimized.

Copy link
Contributor

lundman commented Mar 7, 2016

openzfsonosx/zfs@0dd851d

Ok, so now I compile it the whole way and load the kernel.
Although, I have a handful of #if 0 //portme still to address, mostly the uio places.

You create quite a number of new kernel modules, which is the standard on Linux after all. On FBSD and OSX, will probably stay at 2, and on IllumOS, 0. Only annoyance there is the mixture of Linux module sources, and actual crypto-api-sources, like that of icp/io/aes.c, sha2_mod.c modhash.c modconf.c. Perhaps the Linux module sources could be separated out?

cmd/icp/Makefile is in configure.ac, but not in git. I assume it was left in configure.ac by mistake, referring to test cmds you use in debug.

I'm a bit surprised how much ahead O3X is now to ZOL, since there was changes in the zio_checksum API calls, (the edonr and skein commit) but it is trivial to change the number of parameters, and test the flags instead. Or calls to zio_checksum_SHA256 will panic.

and I have ignored the assembler files, they don't work here. (gcc vs clang?)

@tcaputi

This comment has been minimized.

Copy link
Contributor

tcaputi commented Mar 7, 2016

@lundman

You create quite a number of new kernel modules, which is the standard on Linux after all. On FBSD and OSX, will probably stay at 2, and on IllumOS, 0. Only annoyance there is the mixture of Linux module sources, and actual crypto-api-sources, like that of icp/io/aes.c, sha2_mod.c modhash.c modconf.c. Perhaps the Linux module sources could be separated out?

I'm not really creating any new modules other than the ICP itself, although I considered doing that at one point (and it is still possible). What I am currently doing is simply emulating Illumos's kernel module structure within a single module. I stole the directory structure from Illumos, so that the root directory structure represents /usr/src/uts/common/crypto and the algs directory represents /usr/src/common/crypto The only other directories I added were asm-* folders which I needed to get architecture specific code working, and the os folder which has supporting code from various parts of Illumos, including modhash.c and modconf.c as you pointed out. I wanted to keep the directory structure as close to Illumos as possible to it would be to maintain as Illumos gets updates that we want.

cmd/icp/Makefile is in configure.ac, but not in git. I assume it was left in configure.ac by mistake, referring to test cmds you use in debug.

I'm not sure what you mean by this.... I have never had a cmd/icp directory and I don't see it in my condigure.ac.

and I have ignored the assembler files, they don't work here. (gcc vs clang?)

I wonder why the assembly doesnt work. Perhaps I can borrow a mac from someone at work and see what is going on there.

@lundman

This comment has been minimized.

Copy link
Contributor

lundman commented Mar 8, 2016

@tcaputi The dir-structure is all fine, and as you say, is like IllumOS. I meant more that any file that includes modctl.h (Linux file yes?) and mod_init calls, I took out with #ifdef LINUX, but those files also contain sources for the IllumOS crypto API, so the files are mixed with LINUX only code and IllumOS API code. No matter :)

cmd/icp/Makefile hmm, I had to remove it here, but now that I check my patch again for it, it is nowhere to be found. So I'm putting it down to my typo while resolving conflicts.

I took the assembler out for now, as I have other things to do first, before it can even run, so its not worth worrying about just yet.

Also, my comments are not meant to be negative, just reporting in as I port it over to OSX. I am pleased something is happening in this area :)

@tcaputi

This comment has been minimized.

Copy link
Contributor

tcaputi commented Mar 8, 2016

@lundman

The dir-structure is all fine, and as you say, is like IllumOS. I meant more that any file that includes modctl.h (Linux file yes?) and mod_init calls, I took out with #ifdef LINUX, but those files also contain sources for the IllumOS crypto API, so the files are mixed with LINUX only code and IllumOS API code. No matter :)

The *_mod_init functions are from Illumos. I renamed them to avoid name conflicts (in Illumos they were all just called _init() because each one belongs to a separate module). The only Linux-specific functions are illumos_crypto_exit() and illumos_crypto_init(), which are the entry points of the module and are needed to call these init functions, since they are not separate modules in the ICP. There are no Linux headers in the ICP.

@lundman

This comment has been minimized.

Copy link
Contributor

lundman commented Mar 8, 2016

Huh that is interesting. Then it would be mixed Linux or Illumos code, so #ifdef's it is :)

@lundman

This comment has been minimized.

Copy link
Contributor

lundman commented Mar 8, 2016

There, I have brought all that mod* stuff back in, and initialise as expected. I have ported everything I had missed. Currently it dies during init, here:

panic(cpu 1 caller 0xffffff7f8f655ec2): "rwlock 0xffffff8c89abeec0 not initialised\n"@spl-rwlock.c:88
(lldb) up
frame #3: 0xffffff7f8f7f9c44 zfs`mod_hash_insert(hash=0xffffff8c89abeec0, key=0xffffff7f8f93eae0, val=0xffffff7f8f93eb00) + 36 at modhash.c:601
-> 601      rw_enter(&hash->mh_contents, RW_WRITER);

(lldb) up
frame #4: 0xffffff7f8f7f089a zfs`kcf_init_mech_tabs + 1146 at kcf_mech_tabs.c:245
-> 245                  (void) mod_hash_insert(kcf_mech_hash,
   246                      (mod_hash_key_t)me_tab[i].me_name,
   247                      (mod_hash_val_t)&(me_tab[i].me_mechid));

(lldb) up
frame #5: 0xffffff7f8f7de292 zfs`illumos_crypto_init + 18 at illumos-crypto.c:91
   88       mod_hash_init();
   89
   90       /* initialize the mechanisms tables supported out-of-the-box */
-> 91       kcf_init_mech_tabs();

I have no call to initialise mh_contents in my patch, so maybe I missed something during the merge.

@lundman

This comment has been minimized.

Copy link
Contributor

lundman commented Mar 8, 2016

Hmm actually looks like IllumOS do not initialise it, tsk tsk eh :)

@tcaputi

This comment has been minimized.

Copy link
Contributor

tcaputi commented Mar 8, 2016

I tried to catch most of those, but there were a lot of places where they rely on zero'd allocation to initialize locks and such. They also didn't have *_fini functions for all of their *_init ones since a lot of their code is not designed to be removable. I tried to fix that as best as I could.

@tcaputi

This comment has been minimized.

Copy link
Contributor

tcaputi commented Mar 8, 2016

In case anyone is watching the PR, I pushed a couple of small fixes for encrypted zvols (which are also encrypted now)

@tcaputi

This comment has been minimized.

Copy link
Contributor

tcaputi commented Mar 8, 2016

@behlendorf or anyone who knows:
I'm looking into getting the test builds to work. Currently it fails when building the ICP. It seems that the current build system flattens the directory structure of all the modules. Can anybody tell me why this is needed? If it is necessary (as I assume it is for some reason or another) can anyone tell me what the preferred way to structure the ICP would be? I like the current structure because it largely emulates the structure from Illumos, which I think is valuable for maintenance purposes.

While I'm asking questions, does anyone know why zil_claim() needs to call dmu_objset_own() as opposed to dmu_objset_hold(). My current implementation requires that the dataset's keychain is loaded for dmu_objset_own() to succeed so the zil currently fails to claim encrypted datasets. Could it just do a hold and then check to see if it has an owner (under the dataset mutex)? Are there any circumstances where zil_claim() could be called while the dataset is already owned? The documentation for owning says:

Legitimate long-holders (including owners) should be long-running, cancelable tasks that should cause "zfs destroy" to fail.

Is zil_claim() really a long running process? It certainly doesn't seem cancelable.... If nobody knows, Ill just make the change and see if it breaks anything, but it would be good to know for sure why it was written like that to begin with.

@lundman

This comment has been minimized.

Copy link
Contributor

lundman commented Mar 9, 2016

I can report partial success. I cleaned up the missing rw_init with openzfsonosx/zfs@b8f2cc6
fixed the UIO code for apple, and the change to spl-list.c (allowing list_insert_before to pass NULL placement). Corrected the zio_checksum 3-args to 4-args change.

I can create a pool, and encrypted dataset, and copy a file to it, which does not show in the pool image file.

First sign of trouble is trying to import the said pool again, but will look into that too.

If any OSXers wish to play with it, it is under https://github.com/openzfsonosx/zfs/compare/wip_crypt3

@tcaputi

This comment has been minimized.

Copy link
Contributor

tcaputi commented Mar 14, 2016

So I've hit a bit of a roadblock and am not sure what the best solution is. If anybody knows a solution off hand, I would appreciate the input. The problem is regarding partially encrypted types, particularly dnodes. Encrypted datasets currently encrypt dnodes by encrypting bonus buffers, but not the dnode itself. Only a few bonus types need to be encrypted, and for right now I'm just working with System Attribute bonus buffers. Other bonus buffers are left in the clear. This setup should hopefully allow us to scrub datasets and check for errors without the keys being loaded.

Before keys can be loaded, several functions (spa_verify_load() during zpool import for instance) are called that need to read dnodes off the disk. These functions inevitably call arc_read() at some point, whether directly or through dnode_hold(). At this time, however, the keychains are not yet loaded into zfs, so the zio layer returns the block of dnodes with all encrypted bonus buffers still encrypted. This is OK because nothing before this point needs the encrypted bonus buffers.

The problem arises when the dnode block is used again later, when the keys are loaded. At this point zfs sees that the dnode block is cached in the ARC and doesnt bother to reread the data so it comes back with the bonus buffer still encrypted. This obviously breaks any code that relies on that.

The solutions I can think of are:

  1. Figure out all of the functions that don't need the bonus buffers and rework them to use zio_read() instead of arc_read(). This is hard because theres a lot of them, and a lot of them and I'm afraid I'll mess a few of them up in a way that we won't catch until much later wheen it breaks someone's system. This is also difficult because many of these functions go through dnode_hold(), and so I would need to add a flag parameter to that (very used) function.
  2. Force all encrypted bonus buffers into spill blocks. The spill blocks can be fully encrypted and leave dnodes fully unencrypted. Therefore, reading a block of dnodes should not read the bonus buffers. This is what Solaris chose, according to their article, but as @behlendorf commented above, this has performance repercussions (2 random reads for every dnode with an encrypted bonus buffer).
  3. Modify the ARC layer somehow to be able to mark buffers as encrypted / decrypted. If I could do this, then I could probably make it behave such that decrypt reads overtake non-decrypted reads. I can look into this, but I am not incredibly familiar with the ARC layer and I'm not sure how hard this would be to do. I also hesitate to add a flag field to arc_buf_t since (if my understanding is correct) there is one of those structs for each block in the ARC.

I apologize for the wall of text, but I would appreciate it if anyone had any input on how I should go about doing this.

@sempervictus

This comment has been minimized.

Copy link
Contributor

sempervictus commented Mar 14, 2016

@tcaputi: would it be possible to mark the bonus buffers cached in ARC as dirtied elsewhere and thus force a re-read from disk through the decryption routines? Essentially tell the ARC its somehow out of sync with the on-disk data instead of asking it to track every object as encrypted or not (which, being a boolean is small overhead, but as you point out, that's precious space).

@tcaputi

This comment has been minimized.

Copy link
Contributor

tcaputi commented Mar 14, 2016

@sempervictus: Since I posted the question I found arc_buf_header_t, pointed to by arc_buf_t, which appears to be 1-1 mapped. This struct does have a field for flags, which solves the space problem. I'm still not sure what changes would need to be made to the ARC for this to work. From the top of my head, I know that the ARC has a hashmap of buffers, and I imagine trying to maintain 2 copies of the same buffer in that hashmap could cause problems, since the second one would attempt take the first's place. Short of any other suggestions I plan on spending a lot of time tomorrow reading through arc.c.

@sempervictus

This comment has been minimized.

Copy link
Contributor

sempervictus commented Mar 14, 2016

@tcaputi: you may want to reach out to @tuxoko regarding ARC since his ABD patch stack significantly alters its behavior and, hopefully, will soon become the defacto ARC behavior on Linux.

@tuxoko

This comment has been minimized.

Copy link
Member

tuxoko commented Mar 14, 2016

@tcaputi
While bonus buffer are part of dnode. They are never operated "in place". If you follows dmu_bonus_hold, you'll see that they always allocate another buffer for bonus. And an interesting part is that this bonus buffer is never itself a part of ARC, and they will not be allocated during scrub.

So what you could do is don't do anything to dnode_phys, make it always have encrypted bonus. And make the decryption happens when you read the bonus into the separate bonus buffer.

But of course, for other stuff like user data and stuff, you'll likely need to mark it as encrypted/decrypted in ARC. The Illumos camp are working on compressed ARC, you might want to look at it. Maybe you can find some useful idea. https://drive.google.com/file/d/0B5hUzsxe4cdmbEh2eEZDbjY3LXM/view?usp=sharing

@sempervictus

This comment has been minimized.

Copy link
Contributor

sempervictus commented Mar 14, 2016

@tuxoko: would it be feasible to create something along the lines of a negative DVA in the arc_buf_hdr_t structure for encrypted data such that repeated access to the encrypted state can still be cached, and the real DVA for decrypted data?

@tcaputi

This comment has been minimized.

Copy link
Contributor

tcaputi commented Mar 14, 2016

@tuxoko:
Thanks for taking the time to respond. If what you say is true that bonus buffers are not held in the ARC, then I'm not sure what is happening in my code anymore. I will need to run some more tests to see what is going on. From my previous debugging tests, I saw the dnodes getting read through the zio layer from the disk into the arc during spa_verify_load(). After that I didn't see any other zio activity related to the encrypted dnodes. Then zfs panicked on an SA magic number assert from one of the encrypted dnodes. Somewhere, my tests must be misleading me.

@tcaputi

This comment has been minimized.

Copy link
Contributor

tcaputi commented Mar 15, 2016

I think I'm approaching a solution a solution, although there is now an additional (but hopefully easier) problem. It seems like the best place to decrypt the bonus buffers would be in dbuf_read_impl(), called by dmu_bonus_hold() (thanks very much to @tuxoko). Currently all encrypted bonus buffers in a DMU_OT_DNODE block are encrypted together, so they collectively have 1 IV and 1 MAC, which is stored in the blkptr_t to the block (as it is for all other object types).

The issue is that decrypting a single bonus buffer requires decrypting all the bonus buffers in the block. I believe the full DMU_OT_DNODE block should be available as long as a bonus buffer within it is available, but this is still at the very least inefficient. Perhaps this wouldn't be quite as bad of an option if it also cached the other decrypted bonus buffers instead of throwing them away.

That said, I had mentioned earlier that we could fit the IV and MAC for a dnode into dnode_phys_t's dn_pad3. This, however, would use 32 bytes of the remaining 36 bytes of padding in that struct, so I would prefer not to do that.

I'm not sure which method is better here, but I'm (again) open to any input. In the meantime I will continue looking myself.

@kpande

This comment has been minimized.

Copy link
Contributor

kpande commented Mar 15, 2016

excellent work!

@tcaputi

This comment has been minimized.

Copy link
Contributor

tcaputi commented Mar 31, 2016

I had to work on another project for a few weeks, but now I am back to working on this full time. After looking at the status of large dnode support in #3542 I am now less averse to using some of the padding in dnode_phys_t. I don't think it will be possible to encrypt / decrypt bonus buffers without keeping the IV / MAC in there or being incredibly inefficient. Without storing these parameters on a per bonus buffer basis, entire dnode_phys_t blocks need to be decrypted together in order to get the contents of just one bonus buffer. The only other alternative seems to be to push all encrypted bonus buffer types into spill blocks, but that seems to work against #3542, which looks like it has the end goal of deprecating spill blocks altogether.

For now I will start working on adding encryption parameters to dnode_phys_t.

@tcaputi

This comment has been minimized.

Copy link
Contributor

tcaputi commented May 4, 2016

@ALL ZFS encryption is (just about) ready for review (#4329). There are some issues with the buildbot's tests right now (which I will resolve shortly), but everything is otherwise implemented.

@cytrinox

This comment has been minimized.

Copy link

cytrinox commented Sep 14, 2016

Is it possible to get this into 0.7.0?
I've build a debian package for my local apt mirror including tcaputis patch.
Encryption is now productive on my fileserver (raidz2, 6x4TB) with multiple datasets and different options (crypt+compr, crypt-only, compr-only).
Building my own dkms package is trivial, but continue testing with keeping track on current master branch is very difficult for such a big patch. Maybe we can include but mark it as experimental for 0.7.0?

@tcaputi

This comment has been minimized.

Copy link
Contributor

tcaputi commented Sep 14, 2016

@cytrinox I would make the comment on the #4329 PR. I think most of the conversation about this feature is happening there at this point.

@tcaputi tcaputi referenced this issue Feb 9, 2017

Merged

ZFS Encryption #5769

Conan-Kudo added a commit to Conan-Kudo/zfs that referenced this issue Aug 17, 2017

Native Encryption for ZFS on Linux
This change incorporates three major pieces:

The first change is a keystore that manages wrapping
and encryption keys for encrypted datasets. These
commands mostly involve manipulating the new
DSL Crypto Key ZAP Objects that live in the MOS. Each
encrypted dataset has its own DSL Crypto Key that is
protected with a user's key. This level of indirection
allows users to change their keys without re-encrypting
their entire datasets. The change implements the new
subcommands "zfs load-key", "zfs unload-key" and
"zfs change-key" which allow the user to manage their
encryption keys and settings. In addition, several new
flags and properties have been added to allow dataset
creation and to make mounting and unmounting more
convenient.

The second piece of this patch provides the ability to
encrypt, decyrpt, and authenticate protected datasets.
Each object set maintains a Merkel tree of Message
Authentication Codes that protect the lower layers,
similarly to how checksums are maintained. This part
impacts the zio layer, which handles the actual
encryption and generation of MACs, as well as the ARC
and DMU, which need to be able to handle encrypted
buffers and protected data.

The last addition is the ability to do raw, encrypted
sends and receives. The idea here is to send raw
encrypted and compressed data and receive it exactly
as is on a backup system. This means that the dataset
on the receiving system is protected using the same
user key that is in use on the sending side. By doing
so, datasets can be efficiently backed up to an
untrusted system without fear of data being
compromised.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes zfsonlinux#494
Closes zfsonlinux#5769
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.