zdb <dataset> fails with Remote I/O error when Multihost is enabled #7797

Closed
ofaaland opened this issue Aug 17, 2018 · 4 comments · Fixed by #7801

Comments

@ofaaland
Contributor

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | |
| Distribution Version | |
| Linux Kernel | |
| Architecture | |
| ZFS Version | master |
| SPL Version | master |

Describe the problem you're observing

The command `zdb <dataset>` fails with "failed to own dataset ...: Remote I/O error" if multihost is enabled. First reported in #6464.

Describe how to reproduce the problem

zpool create poolname /dev/some/device
zpool set multihost=on poolname
zdb poolname/

Include any warning/errors/backtraces from the system logs

```
[faaland1@per zfs]$ sudo cmd/zpool/zpool set multihost=on mypool
[faaland1@per zfs]$ cmd/zdb/zdb mypool/ 2>&1 | head -n4
failed to own dataset 'mypool': Remote I/O error
zdb: can't open 'mypool': Remote I/O error

ZFS_DBGMSG(zdb):
```

@ofaaland
Contributor Author

@KnightKu proposed the following patch in #6464 (comment):

diff --git a/cmd/zdb/zdb.c b/cmd/zdb/zdb.c
index 142968d..97adab1 100644
--- a/cmd/zdb/zdb.c
+++ b/cmd/zdb/zdb.c
@@ -5801,6 +5801,15 @@ main(int argc, char **argv)
                                }
                        }
                } else {
+                       /*
+                        * Disable the activity check to allow examination of
+                        * active pools.
+                        */
+                       mutex_enter(&spa_namespace_lock);
+                       if ((spa = spa_lookup(target)) != NULL) {
+                               spa->spa_import_flags |= ZFS_IMPORT_SKIP_MMP;
+                       }
+                       mutex_exit(&spa_namespace_lock);
                        error = open_objset(target, DMU_OST_ANY, FTAG, &os);
                        if (error == 0)
                                spa = dmu_objset_spa(os);

@ofaaland
Contributor Author

ofaaland commented Aug 17, 2018

@KnightKu, that would work, but the current requirement that every open/import path set the ZFS_IMPORT_SKIP_MMP flag seems too easy to miss, as I did here.

For example, if zdb is used to open a checkpoint with the current code, it will fail in the same way:

[faaland1@per zfs]$ cmd/zdb/zdb -k mypool 2>&1 | head -n4
zdb: Tried to read config of pool "mypool" but spa_get_stats() failed with error 121

I think some other method is needed that automatically covers user-space opens/imports in zdb and zhack.
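
For readers matching up the two reports: on most Linux architectures errno 121 is `EREMOTEIO`, whose message text is "Remote I/O error", so the `zdb -k` failure above is very likely the same error as the `zdb <dataset>` failure in the original report. The following is a minimal sketch for checking that mapping on a given system. The helper names `is_remote_io_error` and `describe_errno` are hypothetical and are not part of zdb or libzpool.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Return nonzero if err is EREMOTEIO.
 * EREMOTEIO is Linux-specific, so it is guarded to keep the sketch
 * building on platforms that do not define it.
 */
static int
is_remote_io_error(int err)
{
#ifdef EREMOTEIO
	return (err == EREMOTEIO);
#else
	(void) err;
	return (0);
#endif
}

/* Format "<number> (<message>)" for an errno value into buf. */
static const char *
describe_errno(int err, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	(void) snprintf(buf, len, "%d (%s)", err, strerror(err));
	return (buf);
}
```

Calling `describe_errno(121, buf, sizeof (buf))` on the affected host and comparing the result with the text in the first report should show whether both messages come from the same errno.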

@ofaaland
Contributor Author

@behlendorf, what would you think of a new global variable, spa_open_global_flags, whose contents are OR'd into spa->spa_import_flags? zdb and zhack could set it to ZFS_IMPORT_SKIP_MMP once, and all opens there would then skip the MMP activity test. A FUSE implementation would not set this global (or at least not this flag), so the activity test would still run when appropriate.

@ofaaland
Contributor Author

@behlendorf points out that my proposal has the side-effect of making it easier to accidentally disable the MMP activity test when it's not safe to do so, for example in zhack which opens the pool R/W.

behlendorf pushed a commit that referenced this issue Aug 20, 2018
Since zdb opens the pools read-only, it cannot damage the pool in the
event the pool is already imported either on the same host or on
another one.

If the pool vdev structure is changing while zdb is importing the
pool, it may cause zdb to crash.  However this is unlikely, and in any
case it's a user space process and can simply be run again.

For this reason, zdb should disable the multihost activity test on
import that is normally run.

This commit fixes a few zdb code paths where that had been overlooked.
It also adds tests to ensure that several common use cases handle this
properly in the future.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Gu Zheng <guzheng2331314@163.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes #7797 
Closes #7801
tonyhutter pushed a commit to tonyhutter/zfs that referenced this issue Oct 31, 2018
tonyhutter pushed a commit to tonyhutter/zfs that referenced this issue Nov 5, 2018
tonyhutter pushed a commit that referenced this issue Nov 13, 2018