
A manually set zpool 'compatibility' property is not preserved over system reboots #12261

Closed
siebenmann opened this issue Jun 21, 2021 · 25 comments
Labels
Type: Defect (Incorrect behavior, e.g. crash or hang)

Comments

@siebenmann
Contributor

System information

Type Version/Name
Distribution Name Fedora
Distribution Version 33
Linux Kernel 5.12.11-200.fc33.x86_64
Architecture x86_64
ZFS Version 2.1.99-316_gc4c162c1e (current git tip)
SPL Version 2.1.99-316_gc4c162c1e (current git tip)

Describe the problem you're observing

The zpool compatibility property documented in the zpoolprops manual page appears to be how zpool create creates pools with only certain features enabled, and how zpool upgrade implements limited feature upgrades. The latter makes it quite attractive to set on existing old pools, to limit how far zpool upgrade will upgrade them. However, experimentation shows that a manually set compatibility property is not preserved over a reboot. If I run, e.g., zpool set compatibility=openzfs-2.0-linux <pool>, zpool get compatibility will show it set at that point, but after a system reboot zpool get compatibility reports that the value is back to off.
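For illustration (pool and device names here are hypothetical), the workflow this enables looks something like:

# zpool create -o compatibility=openzfs-2.0-linux newpool /dev/sdX
# zpool set compatibility=openzfs-2.0-linux oldpool
# zpool upgrade oldpool

where the upgrade should enable only the features permitted by the compatibility setting.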

Further, zpool history shows compatibility being set, but the value is gone now:

# zpool history ssddata
[...]
2021-06-21.12:30:44 zpool set compatibility=openzfs-2.0-linux ssddata
2021-06-21.14:24:24 zpool import -c /etc/zfs/zpool.cache -aN
# zpool get compatibility ssddata
NAME     PROPERTY       VALUE          SOURCE
ssddata  compatibility  off            default

There is no mention of this behavior in the manual pages, and it would be highly unusual for a pool property setting to be recorded in the pool history and then discarded. If the pool must have some required feature in order for this to work, that should at least be mentioned in the manpage; better yet, zpool set compatibility=... should either fail outright or give an error message.

Describe how to reproduce the problem

Set compatibility to some valid value on an old pool that lacks it and has pending feature upgrades. Run zpool get compatibility to observe that it's reported as set, and zpool history to see that it's in the history. Reboot, and check zpool get compatibility again. If you're running git tip or a development version and your pool has not yet been upgraded, you can further verify this by setting compatibility to openzfs-2.0-linux, which excludes draid, rebooting, and doing a zpool upgrade, which will now enable draid on you.
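Condensed into commands (the pool name here is a placeholder):

# zpool set compatibility=openzfs-2.0-linux oldpool
# zpool get compatibility oldpool
# zpool history oldpool
# reboot
[...]
# zpool get compatibility oldpool

Before the reboot, zpool get reports the value with SOURCE local; afterwards it reports off with SOURCE default.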

@behlendorf
Contributor

@siebenmann I agree, not having the property persist would be very strange behavior. It's definitely intended to be persistent. I did a little bit of local testing using the master branch but wasn't able to reproduce the issue. The property remained set after an import/export cycle, and also after a reboot. This is just a guess, but is it possible you were using mismatched user utilities and kernel modules? Perhaps there's a case there which was overlooked.

@siebenmann
Contributor Author

Based on the timestamps on my machines, I believe it was not a mismatched kernel/userland. I normally have a coherent user and kernel package set, because I only upgrade the ZFS packages to the current git tip shortly before I install a new kernel and reboot. zpool history says that on one machine, I set the property at 11:41; logs then say I upgraded the packages at 12:19 and rebooted a bit later (after installing the kernel itself and doing other prep) at 12:28, and I noticed the problem shortly afterwards when I did a zpool upgrade and saw it had enabled draid. There's a similar time gap on the second machine, the one that the zpool history log in my initial report comes from: the property was set at 12:30, I installed new ZFS RPMs at 14:11, and the kernel rebooted at 14:24.

I won't be able to reboot machines and do other more disruptive testing until some time tomorrow or Wednesday. My pool on one machine has been upgraded and so may not reproduce this, but two pools on my other machine are as yet un-upgraded.

@siebenmann
Contributor Author

In case it's useful, on the system whose pools haven't been upgraded, zpool upgrade says that both of my pools are missing:

      redaction_bookmarks
      redacted_datasets
      bookmark_written
      log_spacemap
      livelist
      device_rebuild
      zstd_compress
      draid

@behlendorf
Contributor

That's very strange. Before performing an upgrade on another pool you may want to consider creating a checkpoint with zpool checkpoint. This way you'll be able to roll back the entire pool to its pre-upgrade state if you decide you need to.
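For reference, that rollback workflow is roughly the following (pool name is an example):

# zpool checkpoint tank
[... perform and test the upgrade ...]
# zpool export tank
# zpool import --rewind-to-checkpoint tank

and zpool checkpoint -d tank discards the checkpoint once you're satisfied.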

This property isn't that different from other ZFS properties, so I don't have a great explanation for what might have happened here. @colmbuckley, by chance did you happen to observe anything like this in your testing? Or perhaps have an idea what caused the reported behavior?

@colmbuckley
Contributor

That is indeed very strange. There's no difference (at least, no intentional difference) between how the compatibility property is stored and any other property. I have not encountered this issue during any of my testing (which, admittedly, has all been on Debian rather than Fedora), and I haven't been able to reproduce it.

@siebenmann What versions were you upgrading from/to in your previous comment? Is it possible that the property was lost during the package/module upgrade rather than the reboot? (Mind you, I don't know what mechanism would cause that either, but it feels like it might be slightly more likely than over the reboot.)

If this is the case, then (very handwavey working theory) we might look to confirm that newly-set property values can survive a module unload/reload without an explicit pool export/import. I'll poke at this shortly.

@colmbuckley
Contributor

Not able to reproduce here (Debian buster, freshly-installed ZFS from git master head) - the compatibility and comment properties each survived an export/import, module unload/reload and reboot.

Digging in the source, I don't see anything which would cause this property to be stored or to behave any differently from any other. If you do get the opportunity to test on another system, could you please try setting an additional property as well as compatibility (e.g. comment, as below), so that we can see whether the problem is isolated to just this property or potentially affects all pool properties?

root@zfstest:~# modprobe zfs
root@zfstest:~# zpool create -d tank /dev/disk/by-id/google-zpool
root@zfstest:~# zpool get comment,compatibility tank
NAME  PROPERTY       VALUE          SOURCE
tank  comment        -              default
tank  compatibility  off            default
root@zfstest:~# zpool set comment="Test Comment" tank
root@zfstest:~# zpool set compatibility=openzfs-2.0-linux tank
root@zfstest:~# zpool get comment,compatibility tank
NAME  PROPERTY       VALUE              SOURCE
tank  comment        Test Comment       local
tank  compatibility  openzfs-2.0-linux  local
root@zfstest:~# umount /tank
root@zfstest:~# modprobe -r zfs
root@zfstest:~# modprobe zfs
root@zfstest:~# zpool import -a
root@zfstest:~# zpool get comment,compatibility tank
NAME  PROPERTY       VALUE              SOURCE
tank  comment        Test Comment       local
tank  compatibility  openzfs-2.0-linux  local
root@zfstest:~# zpool export tank
root@zfstest:~# zpool import -a
root@zfstest:~# zpool get comment,compatibility tank
NAME  PROPERTY       VALUE              SOURCE
tank  comment        Test Comment       local
tank  compatibility  openzfs-2.0-linux  local
root@zfstest:~# reboot
[...]
root@zfstest:~# modprobe zfs
root@zfstest:~# zpool import -a
root@zfstest:~# zpool get comment,compatibility tank
NAME  PROPERTY       VALUE              SOURCE
tank  comment        Test Comment       local
tank  compatibility  openzfs-2.0-linux  local
root@zfstest:~# zfs --version
zfs-2.1.99-1
zfs-kmod-2.1.99-1
root@zfstest:~# zpool history tank
History for 'tank':
2021-06-22.09:03:28 zpool create -d tank /dev/disk/by-id/google-zpool
2021-06-22.09:03:36 zpool set comment=Test Comment tank
2021-06-22.09:03:42 zpool set compatibility=openzfs-2.0-linux tank
2021-06-22.09:03:56 zpool export tank
2021-06-22.09:04:00 zpool import -a
2021-06-22.09:04:17 zpool import -a
2021-06-22.09:05:53 zpool import -a

@rincebrain
Contributor

rincebrain commented Jun 22, 2021

Well, I can reproduce something on Fedora 34 with 5.12.11-300 and ba91311.

$ sudo zpool get all test  | grep compat
test  compatibility                  zol-0.8                        local
$ sudo zpool history test
History for 'test':
2021-06-22.09:05:12 zpool create test /apool -o compatibility=zol-0.8
2021-06-22.09:05:45 zpool export test -F
2021-06-22.09:07:29 zpool import -c /etc/zfs/zpool.cache -aN
2021-06-22.09:07:44 zpool set compatibility=openzfs-2.0-linux test
2021-06-22.09:07:48 zpool export test -F
2021-06-22.09:11:42 zpool import -c /etc/zfs/zpool.cache -aN

(those export -Fs were automatic when I rebooted the host)

Since this didn't reproduce for me just from doing zpool import, I'm going to guess the -c is important? (The reboots or -aN might be too, but that would be unfortunate.)

edit: Nah, no reboot needed.

$ sudo zpool create test /apool  -o compatibility=zol-0.8
$ sudo zpool get compatibility test
NAME  PROPERTY       VALUE          SOURCE
test  compatibility  zol-0.8        local
$ sudo zpool set compatibility=openzfs-2.0-linux test
$ sudo zpool get compatibility test
NAME  PROPERTY       VALUE              SOURCE
test  compatibility  openzfs-2.0-linux  local
$ sudo zpool export test -F
$ sudo zpool import -c /etc/zfs/zpool.cache;
cachefile import failed, retrying
nvpair_value_nvlist(nvp, &rv) == 0 (0x16 == 0)
ASSERT at ../../module/nvpair/fnvpair.c:586:fnvpair_value_nvlist()
Aborted
$ sudo zpool import -c /etc/zfs/zpool.cache -aN;
$ sudo zpool get compatibility test
NAME  PROPERTY       VALUE          SOURCE
test  compatibility  zol-0.8        local

@colmbuckley
Contributor

colmbuckley commented Jun 22, 2021 via email

@rincebrain
Contributor

$ sudo zpool create test /apool  -o compatibility=zol-0.8 -o comment="meow"
$ sudo zpool get all test | egrep '(comment|compat)'
test  comment                        meow                           local
test  compatibility                  zol-0.8                        local
$ sudo zpool set compatibility="openzfs-2.0-linux" test
$ sudo zpool set comment="woof" test
$ sudo zpool get all test | egrep '(comment|compat)'
test  comment                        woof                           local
test  compatibility                  openzfs-2.0-linux              local
$ sudo zpool export -F test
$ sudo zpool import -c /etc/zfs/zpool.cache -aN
$ sudo zpool get all test | egrep '(comment|compat)'
test  comment                        meow                           local
test  compatibility                  zol-0.8                        local
$

@siebenmann
Contributor Author

I was previously running 2.1.99-314_gff3175040, which was the current git tip when I built and installed it. Since I have live ZFS pools, I don't believe that my kernel modules were unloaded and reloaded during the package update, and there's no sign of this in the kernel logs. I've now reproduced this with a test virtual machine that's currently running 2.1.99-314_gff3175040; I booted the VM, set the compatibility property, rebooted, and it wasn't there any more. zpool history shows:

2021-06-22.10:36:22 zpool set compatibility=openzfs-2.1-linux tank
2021-06-22.10:36:56 zpool import -c /etc/zfs/zpool.cache -aN

There's no export during the reboot, and I don't believe I normally see any.

@colmbuckley
Contributor

$ sudo zpool create test /apool  -o compatibility=zol-0.8 -o comment="meow"
$ sudo zpool get all test | egrep '(comment|compat)'
test  comment                        meow                           local
test  compatibility                  zol-0.8                        local
$ sudo zpool set compatibility="openzfs-2.0-linux" test
$ sudo zpool set comment="woof" test
$ sudo zpool get all test | egrep '(comment|compat)'
test  comment                        woof                           local
test  compatibility                  openzfs-2.0-linux              local
$ sudo zpool export -F test
$ sudo zpool import -c /etc/zfs/zpool.cache -aN
$ sudo zpool get all test | egrep '(comment|compat)'
test  comment                        meow                           local
test  compatibility                  zol-0.8                        local
$

This would seem to indicate that it's not just the compatibility property which is getting messed with (whew!), but that there's something in this export/import which is losing settings on all recently-set properties.

I'm not familiar with the -F option to zpool export (as opposed to -f); what does that do? I will see if I can reproduce this on Debian shortly.

@rincebrain
Contributor

rincebrain commented Jun 22, 2021

Neither am I, really, I just plundered it from zpool history.

From a quick look at the source, it's undocumented in most places and labeled "hardforce" where it's processed. A comment in spa.c above spa_export_common() mentions:

 * configuration from the cache afterwards. If the 'hardforce' flag is set, then
 * we don't sync the labels or remove the configuration cache.

which lines up with the observed behavior of being able to do a zpool import -c afterward.

edit: it also doesn't seem to be all recently-set properties - when I tried setting autoreplace, it persisted as one would expect, even though the other two did not (though that was from a non-default to default value).

even more edit:

$ sudo zpool get all test | egrep '(comment|compat|autoreplace)'
test  autoreplace                    off                            default
test  comment                        meow                           local
test  compatibility                  zol-0.8                        local
$ sudo zpool set autoreplace=on test
$ sudo zpool set comment="woof" test
$ sudo zpool set compatibility="openzfs-2.0-linux" test
$ sudo zpool get all test | egrep '(comment|compat|autoreplace)'
test  autoreplace                    on                             local
test  comment                        woof                           local
test  compatibility                  openzfs-2.0-linux              local
$ sudo zpool export -F test
$ sudo zpool import -c /etc/zfs/zpool.cache -aN
$ sudo zpool get all test | egrep '(comment|compat|autoreplace)'
test  autoreplace                    on                             local
test  comment                        meow                           local
test  compatibility                  zol-0.8                        local
$

@colmbuckley
Contributor

My suspicion here is that -F is exporting the pool without updating some or all of its information in the cachefile, and the later import is trusting stale information. This is probably dangerous, but I don't at this point think it's due to the compatibility property per se.

Compare and contrast:

root@zfstest:~# zpool create -d tank /dev/disk/by-id/google-zpool -o comment="meow"
root@zfstest:~# zpool set comment=woof tank
root@zfstest:~# zpool set cachefile=/etc/zfs/zpool.cache tank
root@zfstest:~# zpool export -F tank
root@zfstest:~# zpool import -c /etc/zfs/zpool.cache -aN
root@zfstest:~# zpool get comment tank
NAME  PROPERTY  VALUE    SOURCE
tank  comment   woof     local

The act of setting the cachefile property seems to force a sync of local state to the cachefile, and the import then works as expected. If the cachefile is not explicitly changed:

root@zfstest:~# zpool create -d tank /dev/disk/by-id/google-zpool -o comment="meow"
root@zfstest:~# zpool set comment=woof tank
root@zfstest:~# zpool export -F tank
root@zfstest:~# zpool import -c /etc/zfs/zpool.cache -aN
root@zfstest:~# zpool get comment tank
NAME  PROPERTY  VALUE    SOURCE
tank  comment   meow     local

So I think we're seeing a strange interaction between the undocumented -F option to zpool export, the set of properties which are synced to the cachefile, and whatever information zpool import -c trusts.

This needs the attention of someone with a little more broad-based understanding of the system logic than I currently have, but it doesn't seem to be related specifically to the compatibility property.

@colmbuckley
Contributor

(A question for the OP, or anyone who knows about Fedora's inner workings: does it routinely do zpool export -F on reboot? That seems like a bad idea.)

@rincebrain
Contributor

I suspect this (the zpool export -F in the zfs-dracut scripts, or the equivalent line in mount-zfs.sh.in) is to blame, but it has been the case for 5 years.

And indeed, if I remove the zfs-dracut package and run sudo dracut --force, it does not do that on reboot; if I do the reverse, it starts again.

@siebenmann
Contributor Author

I don't have the zfs-dracut package installed on my Fedora machines, and it doesn't do any sort of export on shutdown. (I've never installed zfs-dracut because I've never used root on ZFS.)

@colmbuckley
Contributor

colmbuckley commented Jun 22, 2021

My suspicion is that this is a long-dormant bug (under some circumstances, certain pool properties are not written synchronously to the cachefile) which has only been noticed now because compatibility is a significant property which happens to trigger it (and because @siebenmann has a keen eye, of course).

It might be that Fedora has been doing unclean shutdowns for years and nobody ever noticed before.

I think this issue might be better described as “Some zpool properties are lost across a zpool export -F / zpool import -c cycle” as it's clearly not limited to the compatibility property. Would you consider renaming it so that it can be further triaged?

@colmbuckley
Contributor

I see something interesting at line 6427 of module/zfs/spa.c, in the spa_export_common() function:

                if (!hardforce)
                        spa_write_cachefile(spa, B_TRUE, B_TRUE);

which I suspect is the root of the problem here. Pending config changes will not be written to the cachefile on a hard-force export. It's not yet clear to me why changing pool properties like comment and compatibility doesn't trigger a cachefile write, nor why a shutdown (on your system, at least) is equivalent to a hardforce export.

I suspect there's a 'dirty' flag, meant to be set by the zpool set operation to trigger a cachefile write, which isn't being updated for some of the settings. I'll have a poke at this again tomorrow to see if I can find it.

@colmbuckley
Contributor

Best guess: this is in spa_sync_props() (module/zfs/spa.c), where we call vdev_config_dirty() when setting comment or compatibility (around line 8730); something might be wrong with the logic there, but I honestly don't know enough about the internals to confirm. @behlendorf, any ideas who might know most about the ins and outs of the SPA config manager?

@behlendorf
Contributor

Thanks for digging into this. You're right, the issue is in spa_sync_props(), and I'm a bit surprised I didn't notice this when reviewing the original patch. ZFS stores pool properties in the DMU_OT_POOL_PROPS object, which gets written out every time a property is modified, so they're always persistent. However, it looks like there was one exception to this rule: the comment property was added to the pool configuration object instead. Since the configuration rarely changes, it's only updated on a clean pool export or when doing something like adding or removing a device from the pool.

The problem here is that the compatibility property was mistakenly added to the configuration object instead of the DMU_OT_POOL_PROPS object, which is why it can get lost. We're going to need to move it to the right object. That means updating spa_sync_props(), something like this:

diff --git a/module/zfs/spa.c b/module/zfs/spa.c
index 26995575a..e2dcaa902 100644
--- a/module/zfs/spa.c
+++ b/module/zfs/spa.c
@@ -8725,19 +8725,6 @@ spa_sync_props(void *arg, dmu_tx_t *tx)
                        spa_history_log_internal(spa, "set", tx,
                            "%s=%s", nvpair_name(elem), strval);
                        break;
-               case ZPOOL_PROP_COMPATIBILITY:
-                       strval = fnvpair_value_string(elem);
-                       if (spa->spa_compatibility != NULL)
-                               spa_strfree(spa->spa_compatibility);
-                       spa->spa_compatibility = spa_strdup(strval);
-                       /*
-                        * Dirty the configuration on vdevs as above.
-                        */
-                       if (tx->tx_txg != TXG_INITIAL)
-                               vdev_config_dirty(spa->spa_root_vdev);
-                       spa_history_log_internal(spa, "set", tx,
-                           "%s=%s", nvpair_name(elem), strval);
-                       break;
 
                default:
                        /*
@@ -8804,6 +8791,11 @@ spa_sync_props(void *arg, dmu_tx_t *tx)
                        case ZPOOL_PROP_MULTIHOST:
                                spa->spa_multihost = intval;
                                break;
+                       case ZPOOL_PROP_COMPATIBILITY:
+                               strval = fnvpair_value_string(elem);
+                               if (spa->spa_compatibility != NULL)
+                                       spa_strfree(spa->spa_compatibility);
+                               spa->spa_compatibility = spa_strdup(strval);
                        default:
                                break;
                        }

And then updating spa_ld_get_props() to properly restore the value and set spa->spa_compatibility on import.
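Roughly, that restore might look like the following (a sketch only: the helper name is invented for illustration, and the exact ZAP lookup may differ from what a real patch would do):

/*
 * Hypothetical sketch, not a tested patch: read 'compatibility' back
 * out of the DMU_OT_POOL_PROPS object during import and cache it in
 * the spa_t, mirroring what spa_sync_props() would store there.
 */
static void
spa_ld_restore_compatibility(spa_t *spa)
{
        char strval[ZAP_MAXVALUELEN];

        if (spa->spa_pool_props_object == 0)
                return;

        if (zap_lookup(spa->spa_meta_objset, spa->spa_pool_props_object,
            zpool_prop_to_name(ZPOOL_PROP_COMPATIBILITY),
            1, sizeof (strval), strval) == 0) {
                if (spa->spa_compatibility != NULL)
                        spa_strfree(spa->spa_compatibility);
                spa->spa_compatibility = spa_strdup(strval);
        }
}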

@colmbuckley would you mind taking a crack at this fix? While this code isn't yet in a tagged release, we may still want to add some compatibility code to first check the DMU_OT_POOL_PROPS object for the property and then fall back to looking in the configuration object, just so we don't cause this property to be lost on any systems already using this feature.

@colmbuckley
Contributor

Gah; just bad luck that I picked the one exceptional property to base my extension on.

I'll have a look at this during the week, and will think about how we might recover from people using pre-production code with the incorrect location.

@behlendorf
Contributor

Well, upon second thought, maybe this actually is a better location than with the other properties. By storing it in the configuration object, the information is available without needing to fully import the pool. That might be handy for getting an idea of what level of compatibility is required.

I've gone ahead and opened PR #12276, which fixes this by simply making sure the cache file gets updated as well when the compatibility property is modified. That avoids any ugliness around moving where this value is stored. If you somehow went out of your way to use a stale cache file you could still reproduce this, but that seems pretty unlikely.
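In outline (see the PR for the exact diff; this is a sketch of the approach rather than the verbatim patch), the compatibility case in spa_sync_props() keeps its current location but gains an async config-update request alongside the existing vdev_config_dirty() call:

case ZPOOL_PROP_COMPATIBILITY:
        strval = fnvpair_value_string(elem);
        if (spa->spa_compatibility != NULL)
                spa_strfree(spa->spa_compatibility);
        spa->spa_compatibility = spa_strdup(strval);
        /*
         * Dirty the configuration on vdevs as above so the MOS copy
         * of the config is rewritten, and also request an async
         * config update so the cache file picks up the new value.
         */
        if (tx->tx_txg != TXG_INITIAL) {
                vdev_config_dirty(spa->spa_root_vdev);
                spa_async_request(spa, SPA_ASYNC_CONFIG_UPDATE);
        }
        spa_history_log_internal(spa, "set", tx,
            "%s=%s", nvpair_name(elem), strval);
        break;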

@colmbuckley if you've got a few minutes to take a look please do.

@behlendorf removed the "Status: Triage Needed" label on Jun 23, 2021
@colmbuckley
Contributor

What, you mean all the time I spent yesterday learning about how the ZAP works was wasted?!?!!

I was going to suggest that, if we move compatibility into the pool props object, we should probably do the same for comment; the logic is fairly straightforward: on import, look in both locations but prefer the props object to the config object; on write/sync/export, delete these from the config object and make sure they're in the props object.

But I don't have any objections to including them in config instead - I see some benefit to having properties which only affect userland being there, so that they can be inspected before a full import.

Will take a look at yours now.

@behlendorf
Contributor

Not necessarily! My initial inclination was to do exactly what you're proposing. However, after a little reflection keeping both properties in the configuration didn't seem so unreasonable. Plus it avoided the (minimal) complexity of needing to check two locations and maybe even has some advantages. So I opted to make the smaller fix for now.

I wouldn't be against moving both the compatibility and comment properties into the props object. That's what it's for, after all, and it would be nice to keep all of this information stored in just one place. So if you happen to have put together a patch which does this instead, I'd love to take a look!

@behlendorf
Contributor

I've merged #12276, which is really the minimal change needed to resolve the immediate bug.

behlendorf added a commit that referenced this issue Jun 24, 2021
Unlike most other properties the 'compatibility' property is stored
in the pool config object and not the DMU_OT_POOL_PROPS object.

This had the advantage that the compatibility information is available
without needing to fully import the pool (it can be read with zdb).
However, this means we need to make sure to update both the copy of
the config in the MOS and the cache file.  This wasn't being done.

This commit adds a call to spa_async_request() to ensure the copy of
the config in the cache file gets updated as well as the one stored
in the pool.  This same change is made for the 'comment' property
which suffers from the same inconsistency.

Reviewed-by: Sean Eric Fagan <sef@ixsystems.com>
Reviewed-by: Colm Buckley <colm@tuatha.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12261 
Closes #12276