Panic on zpool create #2 #2

Closed
lundman opened this issue Mar 3, 2013 · 2 comments

Comments

@lundman
Contributor

lundman commented Mar 3, 2013

#./zpool.sh create -f BOOM pool-image.bin
[zfs] ioctl done 2
[zfs] Yay, got ioctl 0
[zfs] vdev_alloc_common top 0
[vdev] vdev_alloc vd top 0 parent top
[vdev] alloc parent top 0
[zfs] vdev_alloc_common top 0
[vdev] vdev_alloc vd top 0 parent top

(gdb) bt
#0  0xffffff80003105de in current_rootdir () at /SourceCache/xnu/xnu-2050.18.24/bsd/vfs/kpi_vfs.c:2195
#1  0xffffff7f80e8d50a in VOP_GETATTR ()
#2  0xffffff7f80f7e786 in vdev_file_open (vd=0xffffff80079f4800, psize=0xffffff8046b4beb8, max_psize=0xffffff8046b4beb0, ashift=0xffffff8046b4be90) at vdev_file.c:113
#3  0xffffff7f80f74964 in vdev_open (vd=0xffffff80079f4800) at vdev.c:1178
#4  0xffffff7f80f74280 in vdev_open_child (arg=0xffffff80079f4800) at vdev.c:1080
#5  0xffffff7f80e8b0d3 in taskq_thread ()

(gdb) p vf->vf_vnode
$1 = (struct vnode *) 0x40

@wca
Contributor

wca commented Mar 3, 2013

This code path has commented out the vn_openat() call that would initialize vp (aka vf->vf_vnode).

@lundman
Contributor Author

lundman commented Mar 3, 2013

Yep, that was it. now it dies somewhere random :)

179856a

@lundman lundman closed this as completed Mar 3, 2013
ryao added a commit that referenced this issue Feb 25, 2014
ZoL commit 1421c89 unintentionally changed the disk format in a forward-
compatible, but not backward compatible way. This was accomplished by
adding an entry to zbookmark_t, which is included in a couple of
on-disk structures. That led to the creation of pools with incorrect
dsl_scan_phys_t objects that could only be imported by versions of ZoL
containing that commit.  Such pools cannot be imported by other versions
of ZFS or past versions of ZoL.

The additional field has been removed by the previous commit.  However,
affected pools must be imported and scrubbed using a version of ZoL with
this commit applied.  This will return the pools to a state in which they
may be imported by other implementations.

The 'zpool import' or 'zpool status' command can be used to determine if
a pool is impacted.  A message similar to one of the following means your
pool must be scrubbed to restore compatibility.

$ zpool import
   pool: zol-0.6.2-173
     id: 1165955789558693437
  state: ONLINE
 status: Errata #1 detected.
 action: The pool can be imported using its name or numeric identifier,
         however there is a compatibility issue which should be corrected
         by running 'zpool scrub'
    see: http://zfsonlinux.org/msg/ZFS-8000-ER
 config:
 ...

$ zpool status
  pool: zol-0.6.2-173
 state: ONLINE
  scan: pool compatibility issue detected.
   see: openzfs/zfs#2094
action: To correct the issue run 'zpool scrub'.
config:
...

If there was an async destroy in progress 'zpool import' will prevent
the pool from being imported.  Further advice on how to proceed will be
provided by the error message as follows.

$ zpool import
   pool: zol-0.6.2-173
     id: 1165955789558693437
  state: ONLINE
 status: Errata #2 detected.
 action: The pool can not be imported with this version of ZFS due to an
         active asynchronous destroy.  Revert to an earlier version and
         allow the destroy to complete before updating.
         see: http://zfsonlinux.org/msg/ZFS-8000-ER
 config:
 ...

Pools affected by the damaged dsl_scan_phys_t can be detected prior to
an upgrade by running the following command as root:

zdb -dddd poolname 1 | grep -P '^\t\tscan = ' | sed -e 's;scan = ;;' | wc -w

Note that `poolname` must be replaced with the name of the pool you wish
to check. A value of 25 indicates the dsl_scan_phys_t has been damaged.
A value of 24 indicates that the dsl_scan_phys_t is normal. A value of 0
indicates that there has never been a scrub run on the pool.

The regression caused by the change to zbookmark_t never made it into a
tagged release, Gentoo backports, Ubuntu, Debian, Fedora, or EPEL
stable repositories.  Only those using the HEAD version directly from
GitHub after the 0.6.2 tag but before the 0.6.3 tag are affected.

This patch does have one limitation that should be mentioned.  It will not
detect errata #2 on a pool unless errata #1 is also present.  It is expected
this will not be a significant problem because pools impacted by errata #2
have a high probability of being impacted by errata #1.

End users can ensure they do not hit this unlikely case by waiting for all
asynchronous destroy operations to complete before updating ZoL.  The
presence of any background destroys on any imported pools can be checked
by running `zpool get freeing` as root.  This will display a non-zero
value for any pool with an active asynchronous destroy.

Lastly, it is expected that no user data has been lost as a result of
this erratum.

Original-patch-by: Tim Chase <tim@chase2k.com>
Reworked-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2094
lundman pushed a commit that referenced this issue Apr 8, 2014
Tagging each zevent with a unique monotonically increasing EID
(Event IDentifier) provides the required infrastructure for a user
space daemon to reliably process zevents.  By writing the EID to
persistent storage the daemon can safely resume where it left off
in the event stream when it's restarted.
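
The resume-by-EID idea described above can be sketched roughly as follows. This is only an illustration, not the daemon's actual code: the helper names (save_last_eid, load_last_eid, eid_is_new) and the one-number state-file format are hypothetical. A real daemon would also have to handle the EID starting over when the kernel module is reloaded.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* All helpers and the state-file format below are hypothetical. */

/* Persist the last processed EID; returns 0 on success, -1 on error. */
static int
save_last_eid(const char *path, uint64_t eid)
{
	FILE *fp = fopen(path, "w");
	int rv;

	if (fp == NULL)
		return (-1);
	rv = (fprintf(fp, "%" PRIu64 "\n", eid) < 0) ? -1 : 0;
	if (fclose(fp) != 0)
		rv = -1;
	return (rv);
}

/* Load the saved EID; returns 0 if no usable state exists. */
static uint64_t
load_last_eid(const char *path)
{
	FILE *fp = fopen(path, "r");
	uint64_t eid = 0;

	if (fp == NULL)
		return (0);
	if (fscanf(fp, "%" SCNu64, &eid) != 1)
		eid = 0;
	(void) fclose(fp);
	return (eid);
}

/* An event still needs processing if its EID is past the saved one. */
static int
eid_is_new(uint64_t last_eid, uint64_t eid)
{
	return (eid > last_eid);
}
```

On startup such a daemon would call load_last_eid(), skip events for which eid_is_new() is false, and call save_last_eid() after each event it finishes handling.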

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Issue #2
lundman pushed a commit that referenced this issue Apr 8, 2014
The ZFS_IOC_EVENTS_SEEK ioctl was added to allow user space callers
to seek around the zevent file descriptor by EID.  When a specific
EID is passed and it exists the cursor will be positioned there.
If the EID is no longer cached by the kernel ENOENT is returned.
The caller may also pass ZEVENT_SEEK_START or ZEVENT_SEEK_END to seek
to those respective locations.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Issue #2
lundman pushed a commit that referenced this issue Apr 8, 2014
Due to the very poorly chosen argument name 'cleanup_fd' it was
completely unclear that this file descriptor is used to track the
current cursor location.  When the file descriptor is created by
opening ZFS_DEV a private cursor is created in the kernel for the
returned file descriptor.  Subsequent calls to zpool_events_next()
and zpool_events_seek() then require the file descriptor as an
argument to reposition the cursor.  When the file descriptor is
closed the kernel state tracking the cursor is destroyed.

This patch contains no functional change; it just renames a
few variables and clarifies the documentation.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Issue #2
lundman pushed a commit that referenced this issue Apr 8, 2014
Add macro definitions to AM_CPPFLAGS to propagate makefile installation
directory variables for libexecdir, runstatedir, sbindir, and
sysconfdir.

https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Installation-Directory-Variables.html

  A corollary is that you should not use these variables except
  in makefiles. For instance, instead of trying to evaluate
  datadir in configure and hard-coding it in makefiles using e.g.,
  'AC_DEFINE_UNQUOTED([DATADIR], ["$datadir"], [Data directory.])',
  you should add -DDATADIR='$(datadir)' to your makefile's definition
  of CPPFLAGS (AM_CPPFLAGS if you are also using Automake).
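
As a rough sketch of how C code can consume a directory passed this way: the macro name SYSCONFDIR, the fallback value, and the config_path() helper are assumptions made for illustration, not names taken from the commit.

```c
#include <stdio.h>

/*
 * Assumed to be supplied by the build through a -D flag in AM_CPPFLAGS.
 * The fallback exists only so that this sketch compiles on its own.
 */
#ifndef SYSCONFDIR
#define	SYSCONFDIR "/etc"
#endif

/*
 * Hypothetical helper: write "<sysconfdir>/zfs/<name>" into buf.
 * Returns 0 on success, -1 if the path did not fit.
 */
static int
config_path(char *buf, size_t len, const char *name)
{
	int n = snprintf(buf, len, "%s/zfs/%s", SYSCONFDIR, name);

	return ((n < 0 || (size_t)n >= len) ? -1 : 0);
}
```

Because the value is expanded by make rather than by configure, overriding the directory at make time is picked up without re-running configure.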

The runstatedir directory is for "installing data files which the
programs modify while they run, that pertain to one specific machine,
and which need not persist longer than the execution of the program".

https://www.gnu.org/prep/standards/html_node/Directory-Variables.html

It will be defined by autoconf 2.70 or later, and default to
"$(localstatedir)/run".

http://git.savannah.gnu.org/gitweb/?p=autoconf.git;a=commit;h=a197431414088a417b407b9b20583b2e8f7363bd

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2
lundman pushed a commit that referenced this issue Apr 8, 2014
zpool_events_next() can be called in blocking mode by specifying a
non-zero value for the "block" parameter.  However, the design of
the ZFS Event Daemon (zed) requires additional functionality from
zpool_events_next().  Instead of adding additional arguments to the
function, it makes more sense to use flags that can be bitwise-or'd
together.

This commit replaces the zpool_events_next() int "block" parameter with
an unsigned bitwise "flags" parameter.  It also defines ZEVENT_NONE
to specify the default behavior.  Since non-blocking mode can be
specified with the existing ZEVENT_NONBLOCK flag, the default behavior
becomes blocking mode.  This, in effect, inverts the previous use
of the "block" parameter.  Existing callers of zpool_events_next()
have been modified to check for the ZEVENT_NONBLOCK flag.
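
A minimal sketch of the inverted default, assuming the flag values shown here (only the names ZEVENT_NONE and ZEVENT_NONBLOCK come from the commit message; events_should_block() is a hypothetical stand-in for the check made on the flags argument):

```c
#include <stdbool.h>

/* Flag names are from the commit message; the values are assumed. */
#define	ZEVENT_NONE	0x0
#define	ZEVENT_NONBLOCK	0x1

/*
 * Hypothetical stand-in for the callee's check: blocking is the default
 * and is turned off only when ZEVENT_NONBLOCK is set.
 */
static bool
events_should_block(unsigned int flags)
{
	return ((flags & ZEVENT_NONBLOCK) == 0);
}
```

A caller that previously passed block=0 would now pass ZEVENT_NONBLOCK, and any later flags can be OR'd in without changing the function signature.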

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2
lundman pushed a commit that referenced this issue Apr 8, 2014
zed monitors ZFS events.  When a zevent is posted, zed will run any
scripts that have been enabled for the corresponding zevent class.
Multiple scripts may be invoked for a given zevent.  The zevent
nvpairs are passed to the scripts as environment variables.

Events are processed synchronously by the single thread, and there is
no maximum timeout for script execution.  Consequently, a misbehaving
script can delay (or forever block) the processing of subsequent
zevents.  Plans are to address this in future commits.

Initial scripts have been developed to log events to syslog
and send email in response to checksum/data/io errors and
resilver.finish/scrub.finish events.  By default, email will only
be sent if the ZED_EMAIL variable is configured in zed.rc (which is
serving as a config file of sorts until a proper configuration file
is implemented).

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2
lundman pushed a commit that referenced this issue Apr 8, 2014
This commit adds a systemd unit file for zed.service and integrates
it into the zfs.target from commit 881f45c.

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2108
Issue #2
lundman pushed a commit that referenced this issue Apr 8, 2014
Several of the zfs utilities allow you to pass a vdev's guid rather
than the device name.  However, the utilities are not consistent in
how they parse that guid.  For example, 'zinject' expects the guid
to be passed as a hex value while 'zpool replace' wants it as a
decimal.  The user is forced to just know what format to use.

This patch improves things by making the parsing more tolerant.
When strtol(3) is called using 0 for the base, rather than say
10 or 16, it will then accept hex, decimal, or octal input based
on the prefix.  From the man page:

    If base is zero or 16, the string may then include a "0x"
    prefix, and  the number  will  be read in base 16; otherwise,
    a zero base is taken as 10 (decimal) unless the next character
    is '0', in which case it  is  taken as 8 (octal).

NOTE: There may be additional conversions not caught by this patch.
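
The base-0 behaviour quoted above can be illustrated with a small sketch. parse_guid() is a hypothetical helper, not a function from the patch, and it uses strtoull(3) instead of strtol(3) on the assumption that a guid needs the full unsigned 64-bit range; the prefix rules for base 0 are the same.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a guid given as decimal, hex ("0x" prefix),
 * or octal (leading "0").  Returns 0 on success, -1 on malformed input.
 */
static int
parse_guid(const char *str, uint64_t *guid)
{
	char *end = NULL;
	unsigned long long val;

	assert(str != NULL && guid != NULL);

	errno = 0;
	/* Base 0 lets the prefix of the string select the radix. */
	val = strtoull(str, &end, 0);

	/* Reject overflow, empty input, and trailing garbage. */
	if (errno != 0 || end == str || *end != '\0')
		return (-1);

	*guid = (uint64_t)val;
	return (0);
}
```

With this, "255", "0xff", and "0377" all parse to the same value, so the user no longer has to know which radix a given utility expects.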

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Issue #2
lundman pushed a commit that referenced this issue Apr 8, 2014
This functionality has always been missing, but until now there
were no zevents which included an array of strings, so it wasn't
missed.  That has now changed, so to ensure this information
is output correctly by 'zpool events -v', support for
DATA_TYPE_STRING_ARRAY has been implemented.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Issue #2
lundman pushed a commit that referenced this issue Apr 8, 2014
When a vdev starts getting I/O or checksum errors it is now
possible to automatically rebuild to a hot spare device.

To cleanly support this functionality in a shell script some
additional information was added to all zevent ereports which
include a vdev.  This covers both io and checksum zevents but
may be used by other scripts.

In the Illumos FMA solution the same information is required
but it is retrieved through the libzfs library interface.
Specifically the following members were added:

  vdev_spare_paths  - List of vdev paths for all hot spares.
  vdev_spare_guids  - List of vdev guids for all hot spares.
  vdev_read_errors  - Read errors for the problematic vdev
  vdev_write_errors - Write errors for the problematic vdev
  vdev_cksum_errors - Checksum errors for the problematic vdev.

By default the required hot spare scripts are installed but this
functionality is disabled.  To enable hot sparing uncomment the
ZED_SPARE_ON_IO_ERRORS and ZED_SPARE_ON_CHECKSUM_ERRORS in the
/etc/zfs/zed.d/zed.rc configuration file.

These scripts do not add support for the autoexpand property. At
a minimum this requires adding a new udev rule to detect when
a new device is added to the system.  It also requires that the
autoexpand policy be ported from Illumos, see:

  https://github.com/illumos/illumos-gate/blob/master/usr/src/cmd/syseventd/modules/zfs_mod/zfs_mod.c

Support for detecting the correct name of a vdev when it's not
a whole disk was added by Turbo Fredriksson.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Issue #2
lundman pushed a commit that referenced this issue Apr 8, 2014
zed supports a '-M' cmdline opt to lock all pages in memory via
mlockall().  The _POSIX_MEMLOCK define is checked to determine whether
this function is supported.  The current test assumes mlockall()
is supported if _POSIX_MEMLOCK is non-zero.  However, this test is
insufficient according to mlock(2) and sysconf(3).  If _POSIX_MEMLOCK
is -1, mlockall() is not supported; but if _POSIX_MEMLOCK is 0,
availability must be checked at runtime.

This commit adds an autoconf check for mlockall() to user.m4.  The zed
code block for mlockall() is now guarded with a test for HAVE_MLOCKALL.
If defined, mlockall() will be called and its runtime availability
checked via its return value.
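
A rough sketch of the guard described above. HAVE_MLOCKALL is the macro named in the commit message and would normally come from the configure-generated header; lock_memory() and its messages are hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: returns 0 if all pages were locked, -1 otherwise.
 * Build-time availability is decided by HAVE_MLOCKALL, runtime
 * availability by the return value of mlockall() itself.
 */
static int
lock_memory(void)
{
#ifdef HAVE_MLOCKALL
	if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0) {
		/* Can fail at runtime, e.g. for lack of privileges. */
		fprintf(stderr, "mlockall failed: %s\n", strerror(errno));
		return (-1);
	}
	return (0);
#else
	fprintf(stderr, "mlockall is not supported on this platform\n");
	return (-1);
#endif
}
```

Checking the return value covers the case where _POSIX_MEMLOCK is 0, which a compile-time test alone cannot resolve.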

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2
lundman pushed a commit that referenced this issue Apr 8, 2014
zed monitors ZFS events. When a zevent is posted, zed will run any
scripts that have been enabled for the corresponding zevent class.
Multiple scripts may be invoked for a given zevent. The zevent nvpairs
are passed to the scripts as environment variables. Refer to the zed(8)
manpage for details.

Events are processed synchronously by the single thread, and there is
no maximum timeout for script execution. Consequently, a misbehaving
script can delay (or forever block) the processing of subsequent
zevents. Plans are to address this in future commits.

An EID (Event IDentifier) has been added to each event to uniquely
identify it throughout the lifetime of the loaded ZFS kernel module;
it is a monotonically increasing integer that resets to 1 each time
the module is loaded.

Initial scripts have been developed to log zevents to syslog,
automatically rebuild to a hot spare device, and send email in
response to checksum / data / io / resilver.finish / scrub.finish
zevents. To enable email notifications, configure ZED_EMAIL in zed.rc
(which is serving as a config file of sorts until a proper
configuration file is implemented). To enable hot sparing, uncomment
ZED_SPARE_ON_IO_ERRORS and ZED_SPARE_ON_CHECKSUM_ERRORS in zed.rc;
note that the autoexpand property is not yet supported.

zed is a work-in-progress.

Signed-off-by: Chris Dunlap <cdunlap@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2
lundman added a commit that referenced this issue Sep 2, 2015
With the stack like:
    frame #2: 0xffffff800292e4a5 kernel`cache_purge
    frame #3: 0xffffff7f83281ee8 zfs`zfs_vnop_reclaim + 152
    frame #4: 0xffffff80029468f0 kernel`vclean
    frame #6: 0xffffff80029463cb kernel`vnode_reclaim_internal
    frame #9: 0xffffff800293f4b6 kernel`vnode_create
    frame #10: 0xffffff7f83283c41 zfs`zfs_znode_getvnode + 513
    frame #11: 0xffffff7f83288d78 zfs`zfs_zget_internal + 984
    frame #12: 0xffffff7f832764c9 zfs`zfs_vfs_vget + 329
    frame #13: 0xffffff800293979d kernel`namei

It would seem vnop_reclaim cannot call into VFS again as it already
holds locks; checking hfs_vnop_reclaim shows that it does not
call cache_purge().
lundman added the same commit again on Sep 3, 2015, Oct 21, 2015, and Jan 23, 2017.
lundman pushed a commit that referenced this issue Jun 17, 2019
The issue is caused by an incorrect usage of the sizeof() operator
in vdev_obsolete_sm_object(): on 64-bit systems this is not an issue
since both "uint64_t" and "uint64_t*" are 8 bytes in size. However on
32-bit systems pointers are 4 bytes long which is not supported by
zap_lookup_impl(). Trying to remove a top-level vdev on a 32-bit system
will cause the following failure:

VERIFY3(0 == vdev_obsolete_sm_object(vd, &obsolete_sm_object)) failed (0 == 22)
PANIC at vdev_indirect.c:833:vdev_indirect_sync_obsolete()
Showing stack for process 1315
CPU: 6 PID: 1315 Comm: txg_sync Tainted: P           OE   4.4.69+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
 c1abc6e7 0ae10898 00000286 d4ac3bc0 c14397bc da4cd7d8 d4ac3bf0 d4ac3bd0
 d790e7ce d7911cc1 00000523 d4ac3d00 d790e7d7 d7911ce4 da4cd7d8 00000341
 da4ce664 da4cd8c0 da33fa6e 49524556 28335946 3d3d2030 65647620 626f5f76
Call Trace:
 [<>] dump_stack+0x58/0x7c
 [<>] spl_dumpstack+0x23/0x27 [spl]
 [<>] spl_panic.cold.0+0x5/0x41 [spl]
 [<>] ? dbuf_rele+0x3e/0x90 [zfs]
 [<>] ? zap_lookup_norm+0xbe/0xe0 [zfs]
 [<>] ? zap_lookup+0x57/0x70 [zfs]
 [<>] ? vdev_obsolete_sm_object+0x102/0x12b [zfs]
 [<>] vdev_indirect_sync_obsolete+0x3e1/0x64d [zfs]
 [<>] ? txg_verify+0x1d/0x160 [zfs]
 [<>] ? dmu_tx_create_dd+0x80/0xc0 [zfs]
 [<>] vdev_sync+0xbf/0x550 [zfs]
 [<>] ? mutex_lock+0x10/0x30
 [<>] ? txg_list_remove+0x9f/0x1a0 [zfs]
 [<>] ? zap_contains+0x4d/0x70 [zfs]
 [<>] spa_sync+0x9f1/0x1b10 [zfs]
 ...
 [<>] ? kthread_stop+0x110/0x110

This commit simply corrects the "integer_size" parameter used to lookup
the vdev's ZAP object.
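
The class of bug can be sketched as below. This is an assumption about its general shape, not the actual vdev_obsolete_sm_object() code: mock_lookup_uint64() is a simplified, hypothetical stand-in for the ZAP lookup that accepts only an 8-byte integer size. The 22 in the VERIFY3 message above is EINVAL on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a lookup of a single uint64_t entry. */
static int
mock_lookup_uint64(size_t integer_size, uint64_t *out)
{
	assert(out != NULL);

	if (integer_size != sizeof (uint64_t))
		return (EINVAL);
	*out = 1234;
	return (0);
}

static int
lookup_buggy(uint64_t *sm_obj)
{
	/* Size of the pointer: 8 on 64-bit systems, 4 on 32-bit ones. */
	return (mock_lookup_uint64(sizeof (sm_obj), sm_obj));
}

static int
lookup_fixed(uint64_t *sm_obj)
{
	/* Size of the pointed-to uint64_t: 8 on every system. */
	return (mock_lookup_uint64(sizeof (*sm_obj), sm_obj));
}
```

On a 64-bit build both variants succeed, which is why the mistake goes unnoticed there; on a 32-bit build lookup_buggy() returns EINVAL.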

Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #8790