Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iotop ascribes app disk activity to ZFS kernel processes #472

Closed
Rudd-O opened this issue Nov 29, 2011 · 6 comments
Closed

iotop ascribes app disk activity to ZFS kernel processes #472

Rudd-O opened this issue Nov 29, 2011 · 6 comments
Milestone

Comments

@Rudd-O
Copy link
Contributor

Rudd-O commented Nov 29, 2011

run iotop

run a command that generates IO activity like find

notice how iotop shows ZFS kernel processes rather than find itself.

@behlendorf
Copy link
Contributor

At least for writes this is what I'd expect. The user space process basically just copies its data quickly in to the ARC and then the kernel threads are responsible for moving that dirty data to disk. Since those kernel threads are the ones performing the actual IO the activity get's attributed to them. For reads the story is a little bit different since it's a synchronous operation either from cache or disk.

@Rudd-O
Copy link
Contributor Author

Rudd-O commented Nov 29, 2011

But then iotop is useless to monitor writes -- to the extent that I know other filesystems do it, they work appropriately. Also, the workload I tested with was a purely read workload, and it still showed READS coming from ZFS processes...

@behlendorf
Copy link
Contributor

You would see the same effect for reads assuming they aren't cached in the ARC since the actual IO get's delegated to the zfs thread pool. That said, I'll leave this as an open issue to see if we can't find some clever way to reattribute the I/O to a different process.

@Rudd-O
Copy link
Contributor Author

Rudd-O commented Nov 29, 2011

For the record ext4 gets it right, even though it also has kernel threads (ext4-dio-write and the like). So they must be preserving the PID and associating it with the inflight requests.

@devsk
Copy link

devsk commented Dec 29, 2011

AFAIK, iotop only works right if CONFIG_TASK_IO_ACCOUNTING is set.

Apart from that, the dirty IO is written to disk by the process or the background flusher thread based on dirty_* parameters. If you make dirty_background_bytes/ratio smaller than dirty_bytes/ratio, I think the IO will be done by background flusher thread ahead of the process and you will see them in the iotop instead of the process. I am not sure if the IOs in this case are also tagged and tracked with task ID if CONFIG_TASK_IO_ACCOUNTING is set.

Does ZFS have parameters which put the responsibility of the write-out on the process itself depending on how much dirty data the process is generating?

@behlendorf
Copy link
Contributor

This was addressed by commit e3dc14b which was part of 0.6.3.

@behlendorf behlendorf modified the milestones: 0.6.3, 0.7.0 Oct 3, 2014
ryao added a commit to ryao/zfs that referenced this issue Oct 9, 2014
The below excerpt of a backtrace is from a ztest failure when running
ZoL's ztest.

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The backtrace was an infinite loop of `vdev_queue_io_to_issue()` invoking
`zio_execute()` until it overran the stack. vdev_queue_io_to_issue() will ony
invoke `zio_execute()` on raidz vdevs when aggregation I/Os are generated to
improve aggregation continuity. These I/Os do not trigger any writes. However,
it appears that they can be generated in such a way that they recurse
infinitely upon return to `vdev_queue_io_to_issue()`. As a consequence, we see
the number of parents by 1 each time the recursion returns to
`vdev_raidz_io_start()`.

Signed-off-by: Richard Yao <ryao@gentoo.org>
ryao added a commit to ryao/zfs that referenced this issue Oct 9, 2014
The below excerpt of a backtrace is from a ztest failure when running
ZoL's ztest.

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The backtrace was an infinite loop of `vdev_queue_io_to_issue()` invoking
`zio_execute()` until it overran the stack. vdev_queue_io_to_issue() will ony
invoke `zio_execute()` on raidz vdevs when aggregation I/Os are generated to
improve aggregation continuity. These I/Os do not trigger any writes. However,
it appears that they can be generated in such a way that they recurse
infinitely upon return to `vdev_queue_io_to_issue()`. As a consequence, we see
the number of parents by 1 each time the recursion returns to
`vdev_raidz_io_start()`.

Signed-off-by: Richard Yao <richard.yao@clusterhq.com>
ryao added a commit to ryao/zfs that referenced this issue Oct 9, 2014
The below excerpt of a backtrace is from a ztest failure when running
ZoL's ztest.

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The backtrace was an infinite loop of `vdev_queue_io_to_issue()` invoking
`zio_execute()` until it overran the stack. vdev_queue_io_to_issue() will ony
invoke `zio_execute()` on raidz vdevs when aggregation I/Os are generated to
improve aggregation continuity. These I/Os do not trigger any writes. However,
it appears that they can be generated in such a way that they recurse
infinitely upon return to `vdev_queue_io_to_issue()`.

Signed-off-by: Richard Yao <richard.yao@clusterhq.com>
ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>
ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>
ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>
ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>
ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>
ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>
ryao added a commit to ryao/zfs that referenced this issue Oct 11, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>
ryao added a commit to ryao/zfs that referenced this issue Oct 11, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <ryao@gentoo.org>
behlendorf pushed a commit to behlendorf/zfs that referenced this issue May 21, 2018
When dynamic taskq is enabled and all threads for a taskq are occupied,
a recursive dispatch can cause a deadlock if calling thread depends on
the recursively-dispatched thread for its return condition.

This patch attempts to create a new thread for recursive dispatch when
none are available.

Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#472
behlendorf added a commit to behlendorf/zfs that referenced this issue May 21, 2018
…ary"

This reverts commit 076821e due to a locking issue uncovered in
subsequent testing.  An ASSERT is hit due to tq->tq_nspawn being
updated outside the lock.  The patch will need to be reworked.

VERIFY3(0 == tq->tq_nspawn) failed (0 == -1)

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#472
behlendorf pushed a commit to behlendorf/zfs that referenced this issue May 21, 2018
When dynamic taskq is enabled and all threads for a taskq are occupied,
a recursive dispatch can cause a deadlock if calling thread depends on
the recursively-dispatched thread for its return condition.

This patch attempts to create a new thread for recursive dispatch when
none are available.

Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#472
ahrens added a commit to ahrens/zfs that referenced this issue Sep 29, 2021
Change the way that the ObjectAccess layer lists and deletes objects to
use Streams (which are like async iterators).  This allows entries to be
processed as they become available, and allows for more elegantly
manipuating the list/stream of object names (keys).  Note that this
doesn't introduce additional concurrency (except as noted below), but it
reduces copying around of the lists of keys (i.e. there's no
Vec<String>, just a series of String's that are moved from producer to
consumer).

Consolidate the consumers of the various listing helper functions around
just 2 calls: list_objects() and list_prefixes().

The only delimiter used is "/"; change the parameter to be a boolean
indicating whether it should be used.

Note: the concurrency in `cleanup_data_objects()` (listing all prefixes
at the same time) is maintained by use of `select_all()`.  Additionally,
we now do a separate, concurrent DeleteObjects call for the contents of
each prefix.

Side benefit: "list pools" command is 10x faster, by virtue of reading
all the pools uberblocks concurrently, using
`StreamExt::for_each_concurrent()`.  This required introducing a lock
around the results, since each pool may be added to the results
concurrently.  (However, the time for a scanning `zpool import` is often
dominated by serially opening each valid pool.)

Note: this probably introduces even more bugs with dealing with a
non-empty AWS_PREFIX.  A future PR will remove this functionality.
pcd1193182 pushed a commit to pcd1193182/zfs that referenced this issue Jun 27, 2022
Pure code reorganization, no change in behavior.

`struct Progress` is moved out of `fn merge_task`, its new full name is
`::zettacache::zettacache::merge::Progress`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants