building on gentoo hardened #484

Closed
user318 opened this Issue Dec 8, 2011 · 2 comments

Comments

Projects
None yet
3 participants

user318 commented Dec 8, 2011

Not an bug, I think. But may be interesting to somebody. When I have tried to compile zfs on gentoo hardened system, I faced a number of problems. They are using some patches to kernel and plugins to gcc, that makes some structs const by default, as I understand.
This patches I made to make it compile successfully:

Owner

behlendorf commented Dec 13, 2011

Interesting, yes to support the hardened Gentoo kernel we'd have to add autoconf checks for this sort of thing so we could just do the right thing. I'm not opposed to if someone proposes patches.

Member

ryao commented Jun 17, 2012

Pull request #767 is now open with a patch that resolves this.

ryao added a commit to ryao/zfs that referenced this issue Jun 29, 2012

Fix build failures on PaX/GRSecurity patched kernels
Gentoo Hardened kernels include the PaX/GRSecurity patches. They use a
dialect of C that relies on a GCC plugin. In particular, struct
file_operations has been marked do_const in the PaX/GRSecurity dialect,
which causes GCC to consider all instances of it as const. This caused
failures in the autotools checks and the ZFS source code.

To address this, we modify the autotools checks to take into account
differences between the PaX C dialect and the regular C dialect. We also
modify struct zfs_acl's z_ops member to be a pointer to a function
pointer table. Lastly, we modify zpl_put_link() to address a PaX change
to the function prototype of nd_get_link().  This avoids compiler errors
in the PaX/GRSecurity dialect.

Note that the change in zpl_put_link() causes a warning that becomes a
build failure when debugging is enabled. Fixing that warning requires
ryao/spl@5ca50ef.

Closes zfsonlinux/zfs#484

Signed-off-by: Richard Yao <ryao@cs.stonybrook.edu>

ryao added a commit to ryao/zfs that referenced this issue Oct 9, 2014

vdev_raidz_io_start: Ignore empty writes
The below excerpt of a backtrace is from a ztest failure when running
ZoL's ztest.

/#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The backtrace was an infinite loop of `vdev_queue_io_to_issue()` invoking
`zio_execute()` until it overran the stack. vdev_queue_io_to_issue() will ony
invoke `zio_execute()` on raidz vdevs when aggregation I/Os are generated to
improve aggregation continuity. These I/Os do not trigger any writes. However,
it appears that they can be generated in such a way that they recurse
infinitely upon return to `vdev_queue_io_to_issue()`. As a consequence, we see
the number of parents by 1 each time the recursion returns to
`vdev_raidz_io_start()`.

Signed-off-by: Richard Yao <ryao@gentoo.org>

ryao added a commit to ryao/zfs that referenced this issue Oct 9, 2014

vdev_raidz_io_start: Ignore empty writes
The below excerpt of a backtrace is from a ztest failure when running
ZoL's ztest.

/#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The backtrace was an infinite loop of `vdev_queue_io_to_issue()` invoking
`zio_execute()` until it overran the stack. vdev_queue_io_to_issue() will ony
invoke `zio_execute()` on raidz vdevs when aggregation I/Os are generated to
improve aggregation continuity. These I/Os do not trigger any writes. However,
it appears that they can be generated in such a way that they recurse
infinitely upon return to `vdev_queue_io_to_issue()`. As a consequence, we see
the number of parents by 1 each time the recursion returns to
`vdev_raidz_io_start()`.

Signed-off-by: Richard Yao <richard.yao@clusterhq.com>

ryao added a commit to ryao/zfs that referenced this issue Oct 9, 2014

vdev_raidz_io_start: Ignore empty writes
The below excerpt of a backtrace is from a ztest failure when running
ZoL's ztest.

/#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The backtrace was an infinite loop of `vdev_queue_io_to_issue()` invoking
`zio_execute()` until it overran the stack. vdev_queue_io_to_issue() will ony
invoke `zio_execute()` on raidz vdevs when aggregation I/Os are generated to
improve aggregation continuity. These I/Os do not trigger any writes. However,
it appears that they can be generated in such a way that they recurse
infinitely upon return to `vdev_queue_io_to_issue()`.

Signed-off-by: Richard Yao <richard.yao@clusterhq.com>

ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014

Redispatch ZIOs in deep notification call graphs
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>

ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014

Redispatch ZIOs in deep call graphs
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>

ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014

Redispatch ZIOs in deep call graphs
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>

ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014

Redispatch ZIOs in deep call graphs
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>

ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014

Redispatch ZIOs in deep call graphs
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>

ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014

Redispatch ZIOs in deep call graphs
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>

ryao added a commit to ryao/zfs that referenced this issue Oct 11, 2014

Redispatch ZIOs in deep call graphs
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>

ryao added a commit to ryao/zfs that referenced this issue Oct 11, 2014

Redispatch ZIOs in deep call graphs
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment