Skip to content
This repository was archived by the owner on Nov 7, 2019. It is now read-only.

8727 Native data and metadata encryption for zfs #124

Closed
wants to merge 1 commit into from

Conversation

lundman
Copy link

@lundman lundman commented Jun 7, 2016

Original ZOL PR openzfs/zfs#5769

@zettabot
Copy link

zettabot commented Jun 7, 2016

Can one of the admins verify this patch?

@ahrens
Copy link
Member

ahrens commented Jun 7, 2016

@zettabot go

@@ -39,7 +39,7 @@ MODSRCS_DIR = ../../../common/modules/zfs
GENUNIX_DIR = ../../../common/modules/genunix

CPPFLAGS += -I../../../../../lib/libzpool/common \
-I../../../../../uts/common/fs/zfs
-I../../../../../uts/common/fs/zfs -I../../../../../common/zfs
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are a little easier to read (and manage later) if written one-per-line.

@lundman lundman force-pushed the zol_crypto branch 3 times, most recently from 47c1067 to 5e1f9c1 Compare June 8, 2016 11:41
@ahrens
Copy link
Member

ahrens commented Jun 8, 2016

@zettabot go

@@ -204,6 +206,7 @@ static zfs_command_t command_table[] = {
{ "holds", zfs_do_holds, HELP_HOLDS },
{ "release", zfs_do_release, HELP_RELEASE },
{ "diff", zfs_do_diff, HELP_DIFF },
{ "key", zfs_do_crypto, HELP_CRYPTO },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless there's a good reason, these 3 strings should have the same "x":
"x", zfs_do_x, HELP_X

It looks like the indentation is wrong here, but maybe that's just github.

@gwr
Copy link

gwr commented Jun 8, 2016

One high-level question about this work: Is any of it derived from the old OpenSolaris "zfs-crypto" project? If so, can we get some references to (and/or differences from) the previous code?

{ DMU_BSWAP_UINT64, TRUE, FALSE, "DSL deadlist map hdr" },
{ DMU_BSWAP_ZAP, TRUE, FALSE, "DSL dir clones" },
{ DMU_BSWAP_UINT64, TRUE, FALSE, "bpobj subobj" },
{ DMU_BSWAP_ZAP, TRUE, FALSE, "DSL Keychain" }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless there's a really good reason, we should use the DMU_OTN_* types rather than adding a new object type.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used DMU_OT_* because I wanted the DSL keychain to appear with a name in zdb. Of course, I can change this easily. May I ask what the reason is for not wanting to add more types?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I know it isn't as nice for debugging :-(

The types here are statically allocated, so if someone else wants to add a new type, they would use the same enum value as you. Then it's impossible to integrate the two features into one codebase. For example, we have implemented several new features at Delphix that introduce new metadata objects. If we added new enum values here, we wouldn't be able to integrate both our features and your feature upstream.

Copy link

@tcaputi tcaputi Jun 10, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahrens

Quick question: I'm adding this change right now and everything works. Only problem is that zdb core dumps. The reason is that DSL Keychain is no longer a designated type. Instead it is now just a generic ZAP. As a result, zap_lookup() from dump_zap() panics because it should be using zap_lookup_uint64(). So my question is essentially what is the best way to fix this?

I can modify zap_dump_stats() to also return zap_phys_t->zap_flags in the zap stats. I could then look for this flag in dump_zap() . However, this gets messy because (technically) dump_zap() doesn't use the stats directly; it just calls dump_zap_stats(). So now dump_zap_stats() needs to return the flags by reference somehow (which is a messy) and all callers need to be changed to deal with this.

So should I do all of this work or leave DMU_OT_DSL_KEYCHAIN in the enum? Personally I still think being able to tell an object is a keychain in ZDB without looking for its parent is valuable as well, but that's just me.

EDIT:

To clarify what I meant above, in broader terms, with the DMU_OTN_* interface, we lose the ability to create behaviors for specific object types. Linux IOCTL definitions actually solve this exact problem by including a magic number bitmask However, they have 32 bits to work with and we have 8, so this probably won't help us much.

I see that this is a lot of text over a little zdb print function and I'm quickly leaving the scope of 'ZFS Encryption', so I can just make the change as I stated before, even if its not the cleanest code in the world. However, I think this topic will probably come up again sometime as people add new object types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding an accessor function to get both kinds of flags:
int zap_get_flags(objset_t *os, uint64_t object, int *normflagsp, zap_flags_t *flagsp);

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did something a little different in the end. I added attr.za_binary_key boolean to the ZAP iterator API. Basically the field is set if the ZAP uses binary keys.

@lundman
Copy link
Author

lundman commented Jun 8, 2016

All the indention issues are just my emacs setting, I will get to fixing them. (Don't suppose there is an official .emacs for illumos? :) )

@tcaputi
Copy link

tcaputi commented Jun 8, 2016

For those of you who don't know me, I am the person maintaining the original version of this PR in ZFS on Linux. I would like to apologize upfront for any style errors you might have found. I made a grievous mistake with my editor early in development and did not notice it for a long time.

@gwr

One high-level question about this work: Is any of it derived from the old OpenSolaris "zfs-crypto" project? If so, can we get some references to (and/or differences from) the previous code?

It is to a small extent. When I first started working on this project I was using it as a reference. The first iteration of the changes in libzfs_crypto.c did look a lot like that project. However, as I started developing I realized that the old project was not going to support the features I wanted and I started redesigning everything from scratch. At this point, pretty much all of the code has been reworked several times (without using the old project), so I don't think a diff would be particularly useful.

@ahrens ahrens mentioned this pull request Jun 8, 2016
@lundman
Copy link
Author

lundman commented Jun 8, 2016

dmake: Warning: Command failed for target `packages.i386/system-test-zfstest.dep'
dmake: Warning: Target `install' not remade because of errors

No idea what that means.

Entries present in manifests but not proto area:
    dir group=bin mode=0755 owner=root path=opt/zfs-tests/tests/functional/cli_root/zfs_key
    file NOHASH group=bin mode=0555 owner=root path=opt/zfs-tests/tests/functional/cli_root/zfs_key/cleanup

looks like I missed zfs_key subdirectory in functional/cli_root/Makefile - Will be fixed next push.

@lundman
Copy link
Author

lundman commented Jun 9, 2016

Ok, I have done the indentation fixes, where I could find them, and also ran all the source files through cstyle and corrected its complaints. Not sure it is worth pushing that yet, since github will then hide the comments? (and it has no logic changes).

I should point out I had to rewrite zdb.c logic pointing uint64_t to za_name as the compiler complained of possible mis-aligned 64bit access.

hopefully, I got it right

         zap_cursor_t zc;
         zap_attribute_t attr;
         dsl_crypto_key_phys_t dckp;
-        uint64_t *txgid;
+        uint64_t txgid;
         size_t keylen;

         dump_zap_stats(os, object);
@@ -13,16 +13,16 @@
         for (zap_cursor_init(&zc, os, object);
                 zap_cursor_retrieve(&zc, &attr) == 0; zap_cursor_advance(&zc)) {

-                txgid = ((uint64_t *)attr.za_name);
+                bcopy(attr.za_name, &txgid, sizeof (txgid));

-                VERIFY0(zap_lookup_uint64(os, object, txgid, 1, 1,
+                VERIFY0(zap_lookup_uint64(os, object, &txgid, 1, 1,
                         sizeof (dsl_crypto_key_phys_t), &dckp));

                 keylen = BYTES_TO_BITS(
                         zio_crypt_table[dckp.dk_crypt_alg].ci_keylen);

                 (void) printf("\t\ttxg %llu : wkeylen = %u\n",
-                        (u_longlong_t)*txgid, (uint_t)keylen);
+                        (u_longlong_t)txgid, (uint_t)keylen);
         }
         zap_cursor_fini(&zc);
 }

@lundman lundman force-pushed the zol_crypto branch 4 times, most recently from 4278e74 to 8748ffb Compare October 5, 2017 03:42
@GernotS
Copy link

GernotS commented Oct 5, 2017

sorry to report that send/receive (see Aug. 18th/19th) still crashes when testing with more than 10GB of data:

panic[cpu1]/thread=ffffff000faf4c40:
BAD TRAP: type=e (#pf Page fault) rp=ffffff000faf4870 addr=68 occurred in module "zfs" due to a NULL pointer dereference
sched:
#pf Page fault
Bad kernel fault at addr=0x68
pid=0, pc=0xfffffffff7d49176, sp=0xffffff000faf4960, eflags=0x10202
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 68
cr3: 1dc00000
cr8: c

    rdi:               68 rsi: ffffff03dde12120 rdx: ffffff000faf4c40
    rcx:                d  r8: ffffff03db3662a0  r9:                0
    rax:                0 rbx: ffffff03dde12120 rbp: ffffff000faf4970
    r10: fffffffffb858740 r11: fffffffffb800983 r12:                0
    r13: ffffffffc001af00 r14: ffffffffc001e890 r15: ffffff03dde12120
    fsb:                0 gsb: ffffff03c9b89080  ds:               4b
     es:               4b  fs:                0  gs:              1c3
    trp:                e err:                0 rip: fffffffff7d49176
     cs:               30 rfl:            10202 rsp: ffffff000faf4960
     ss:               38

ffffff000faf4750 unix:real_mode_stop_cpu_stage2_end+b21c ()
ffffff000faf4860 unix:trap+e08 ()
ffffff000faf4870 unix:_cmntrap+e6 ()
ffffff000faf4970 zfs:arc_buf_remove+16 ()
ffffff000faf49b0 zfs:arc_buf_destroy_impl+bd ()
ffffff000faf4a10 zfs:arc_evict_hdr+b3 ()
ffffff000faf4aa0 zfs:arc_evict_state_impl+154 ()
ffffff000faf4b40 zfs:arc_evict_state+12e ()
ffffff000faf4b70 zfs:arc_adjust_impl+44 ()
ffffff000faf4ba0 zfs:arc_adjust+d9 ()
ffffff000faf4c20 zfs:arc_reclaim_thread+ae ()
ffffff000faf4c30 unix:thread_start+8 ()

@lundman lundman force-pushed the zol_crypto branch 2 times, most recently from ad60b8f to bc464bb Compare October 6, 2017 06:01
@arekinath
Copy link

arekinath commented Oct 11, 2017

I think you missed this from ZOL in the latest update:

diff --git a/usr/src/uts/common/fs/zfs/bptree.c b/usr/src/uts/common/fs/zfs/bptree.c
index c74d072..25c08f6 100644
--- a/usr/src/uts/common/fs/zfs/bptree.c
+++ b/usr/src/uts/common/fs/zfs/bptree.c
@@ -211,7 +211,8 @@ bptree_iterate(objset_t *os, uint64_t obj, boolean_t free, bptree_itor_t func,
        err = 0;
        for (i = ba.ba_phys->bt_begin; i < ba.ba_phys->bt_end; i++) {
                bptree_entry_phys_t bte;
-               int flags = TRAVERSE_PREFETCH_METADATA | TRAVERSE_POST;
+               int flags = TRAVERSE_PREFETCH_METADATA | TRAVERSE_POST |
+                   TRAVERSE_NO_DECRYPT;
 
                err = dmu_read(os, obj, i * sizeof (bte), sizeof (bte),
                    &bte, DMU_READ_NO_PREFETCH);

(without it, after the recent change to turn async destroy back on for encrypted datasets, you get a panic)

Matching code in ZOL at https://github.com/zfsonlinux/zfs/blob/master/module/zfs/bptree.c#L215

@lundman
Copy link
Author

lundman commented Oct 11, 2017

Ah well spotted, thanks!

@arekinath
Copy link

Re: the dump issues, doing this:

diff --git a/usr/src/uts/common/fs/zfs/zvol.c b/usr/src/uts/common/fs/zfs/zvol.c
index c6a6338..080b2cd 100644
--- a/usr/src/uts/common/fs/zfs/zvol.c
+++ b/usr/src/uts/common/fs/zfs/zvol.c
@@ -2157,6 +2157,9 @@ zvol_dumpify(zvol_state_t *zv)
 	if (zv->zv_flags & ZVOL_RDONLY)
 		return (SET_ERROR(EROFS));

+	if (os->os_encrypted)
+		return (SET_ERROR(ENOTSUP));
+
 	if (zap_lookup(zv->zv_objset, ZVOL_ZAP_OBJ, ZVOL_DUMPSIZE,
 	    8, 1, &dumpsize) != 0 || dumpsize != zv->zv_volsize) {
 		boolean_t resize = (dumpsize > 0);

Seems to be good enough to guard against dumpifying encrypted zvols, preventing the panic. Attempting to use dumpadm on them after adding those two lines exits like this:

# zfs get encryption zones
NAME   PROPERTY    VALUE        SOURCE
zones  encryption  aes-256-ccm  -
# zfs create -V 8G zones/dump2
# dumpadm -d /dev/zvol/dsk/zones/dump2
dumpadm: dumps not supported on /dev/zvol/dsk/zones/dump2

@lundman
Copy link
Author

lundman commented Oct 12, 2017

I prefer print myself, but there were no uses of print in existing zfs-tester test files, whereas there was precedent in using tr -d in zfs_create_014_pos.

@ikozhukhov
Copy link

you can use 'print' instead of 'echo' - it is applicable for portability.
zfs-tests was developed on illumos based platform and for more universal/portable way we can update it.
for example: with DilOS i have GNU coreutils in primary place instead of illumos tools and i have the same issues what can found others non-illumos based platforms.
i did some tricks to use illumos tools from /system/bin/ location for ZFS tests, but right now my env with zfs tests is broken by latest big update by delphix - where they wants to use easy way with PATH.
for my opinion - ZFS tests should be more universal for others platforms and we have to use universal tools - not depend on illumos versions of tools.
more universal will be:
to use GNU coreutils - more platforms have it, but only illumos based platforms have it at /usr/gnu/
to use GNU find - all platforms are using it
GNU SED, GREP/EGREP, DD, etc.
it is why it was more important to me to have environment variables file where we can redefine tools locations if needed like:
export ECHO=/usr/gnu/bin/echo
and use ${ECHO} where we need it, instead of to use PATH with links to tools.

@lundman
Copy link
Author

lundman commented Oct 12, 2017

I agree with everything you say, I already have had to change a quite a few things to the $ENV version on osx. It certainly would be nice if such a standard was agreed upon

@lundman
Copy link
Author

lundman commented Oct 12, 2017

Looks like we managed to pass all tests finally. I plan to squash everything into one single commit unless there is some other number of commits people would prefer..?

@ikozhukhov
Copy link

i think single commit will be fine.

@lundman lundman force-pushed the zol_crypto branch 4 times, most recently from b4a2aaf to 6ce01a1 Compare October 19, 2017 01:03
@lundman lundman changed the title Native data and metadata encryption for zfs 8727 Native data and metadata encryption for zfs Oct 21, 2017
@lundman lundman force-pushed the zol_crypto branch 2 times, most recently from 43e9d1a to 9bb6c76 Compare October 24, 2017 01:09
    This change incorporates three major pieces:

    The first change is a keystore that manages wrapping
    and encryption keys for encrypted datasets. These
    commands mostly involve manipulating the new
    DSL Crypto Key ZAP Objects that live in the MOS. Each
    encrypted dataset has its own DSL Crypto Key that is
    protected with a user's key. This level of indirection
    allows users to change their keys without re-encrypting
    their entire datasets. The change implements the new
    subcommands "zfs load-key", "zfs unload-key" and
    "zfs change-key" which allow the user to manage their
    encryption keys and settings. In addition, several new
    flags and properties have been added to allow dataset
    creation and to make mounting and unmounting more
    convenient.

    The second piece of this patch provides the ability to
    encrypt, decyrpt, and authenticate protected datasets.
    Each object set maintains a Merkel tree of Message
    Authentication Codes that protect the lower layers,
    similarly to how checksums are maintained. This part
    impacts the zio layer, which handles the actual
    encryption and generation of MACs, as well as the ARC
    and DMU, which need to be able to handle encrypted
    buffers and protected data.

    The last addition is the ability to do raw, encrypted
    sends and receives. The idea here is to send raw
    encrypted and compressed data and receive it exactly
    as is on a backup system. This means that the dataset
    on the receiving system is protected using the same
    user key that is in use on the sending side. By doing
    so, datasets can be efficiently backed up to an
    untrusted system without fear of data being
    compromised.

    Reviewed by: Matthew Ahrens <mahrens@delphix.com>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed-by: Jorgen Lundman <lundman@lundman.net>
    Signed-off-by: Tom Caputi <tcaputi@datto.com>

Send / Recv Fixes following b52563

    This patch fixes several issues discovered after
    the encryption patch was merged:

    Fixed a bug where encrypted datasets could attempt
    to receive embedded data records.

    Fixed a bug where dirty records created by the recv
    code wasn't properly setting the dr_raw flag.

    Fixed a typo where a dmu_tx_commit() was changed to
    dmu_tx_abort()

    Fixed a few error handling bugs unrelated to the
    encryption patch in dmu_recv_stream()

    Signed-off-by: Tom Caputi <tcaputi@datto.com>

Encryption patch follow-up

* HKDF implementation moved to its own file and tests added to ensure
correctness.

* Ztest can now create and test encrypted datasets. This is currently
disabled until issue ZOL #6526 is resolved, but otherwise functions as
advertised.

* Several small bug fixes discovered after enabling ztest to run on
encrypted datasets.

* Fixed coverity defects added by the encryption patch.

* Updated man pages for encrypted send / receive behavior.

* Fixed a bug where encrypted datasets could receive
  DRR_WRITE_EMBEDDED records.

* Minor code cleanups / consolidation.

Disable crypto tests in ztest

* Includes fix in dmu_free_long_object_impl

Unless permission is given to compile the crypto framework in userland
the crypto tests in ztest are disabled on IllumOS.
@ahrens
Copy link
Member

ahrens commented Oct 31, 2017

@lundman This PR is just the first encryption commit from ZoL, right? In that case please squash to one commit. To raise the visibility of this, you might open a new PR and mention in the description that you think this is ready to integrate, and also call for any additional help you might need (e.g. testing and code review?).

@lundman
Copy link
Author

lundman commented Oct 31, 2017

@ahrens This is the first PR, followed by all fixes, two more PRs and fixes. It has everything in it that ZoL has. I don't think it is so much about what I need, but what you want to happen next.

@lundman
Copy link
Author

lundman commented Nov 1, 2017

Closed in favour of #489

@lundman lundman closed this Nov 1, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.