New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Race condition between 'zpool export' and 'zfs create' can crash the latter #6096
Conversation
There's nothing stopping us from adding test cases which require systemtap. There are already a few test cases we've added from Illumos which require dtrace. It simply that whenever possible we want to keep the test cases as portable as possible.
Definitely. For a non controversial change like this you can open a matching PR against OpenZFS on GitHub. |
A race condition between 'zpool export' and 'zfs create' can crash the latter: this is because we never check libzfs`zpool_open() return value in libzfs`zfs_create(). Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
e9d73dc
to
8ff0265
Compare
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com>
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com> MFC after: 2 weeks git-svn-id: svn+ssh://svn.freebsd.org/base/head@319751 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com> MFC after: 2 weeks
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com> MFC after: 2 weeks git-svn-id: https://svn.freebsd.org/base/head@319751 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com> git-svn-id: https://svn.freebsd.org/base/vendor/illumos/dist@319740 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com> MFC after: 2 weeks git-svn-id: svn+ssh://svn.freebsd.org/base/head@319751 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com> MFC after: 2 weeks
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com>
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com> git-svn-id: https://svn.freebsd.org/base/stable/11@321576 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com>
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com> Former-commit-id: e4f88714877cac6ccb64883311362657477ecf58
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com>
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com>
illumos/illumos-gate@690031d illumos/illumos-gate@690031d https://www.illumos.org/issues/8168 If we manage to export the pool on which we are creating a dataset (filesystem or zvol) between entering libzfs`zfs_create() and libzfs`zpool_open() call (for which we never check the return value) we end up dereferencing a NULL pointer in libzfs`zpool_close(). This was discovered on ZFS on Linux. The same issue can be reproduced on Illumos running in parallel: while :; do zpool import -d /tmp testpool ; zpool export testpool ; done while :; do zfs create testpool/fs; zfs destroy testpool/fs ; done Eventually this will result in several core dumps like this one: [root@52-54-00-d3-7a-01 /cores]# mdb core.zfs.4244 Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ] > ::stack libzfs.so.1`zpool_close+0x17(0, 0, 0, 8047450) libzfs.so.1`zfs_create+0x1bb(8090548, 8047e6f, 1, 808cba8) zfs_do_create+0x545(2, 8047d74, 80778a0, 801, 0, 3) main+0x22c(8047d2c, fef5c6e8, 8047d64, 8055a17, 3, 8047d70) _start+0x83(3, 8047e64, 8047e68, 8047e6f, 0, 8047e7b) > Fix and reproducer (systemtap): openzfs/zfs#6096 Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: loli10K <ezomori.nozomu@gmail.com>
Description
Discovered while investigating another issue: if we manage to export the pool on which we are creating a dataset (filesystem or zvol) between
libzfs`zfs_create()
entrypoint andlibzfs`zpool_open()
call (for which we never check the return value) we end up dereferencing a NULL pointer inlibzfs`zpool_close()
Motivation and Context
Minimal reproducer (requires systemtap):
It would be nice if we could leverage systemtap (or any other equivalent technology) to add more regression tests since it's much easier to simulate race conditions this way.
Also i wonder if we should try to bring this change to Illumos since they are affected too:
How Has This Been Tested?
Tested manually (with systemtap to reproduce the race condition)
Types of changes
Checklist:
Signed-off-by
.