Set spl_taskq_thread_dynamic=0 by default #484
Disable dynamic taskqs by default. They have been implicated in several lockups and disabling them typically resolves the issue. These lockups may only occur with certain kernel CONFIG_* options and versions but until the root cause is identified it's safest to disable this functionality by default. End users may opt to re-enable them if they have not observed any problems in their environment.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
@behlendorf I would tend to agree. Part of my recent testing regimen has been to try to stress dynamic thread creation but so far, I've not been able to reproduce any problems. I'm thinking of adding some more kstats to get a handle on the relationship between the number of sequentially launched tasks and those which actually get a new thread. Since these deadlocks don't all seem to be related to memory allocation, my current half-baked theory is that there may be some sort of dependency, maybe in the zio pipeline, which can cause deadlocks if certain tasks are run sequentially. I also wonder whether there may be some interaction with the taskq reconfiguration in
That sounds like my experience; I haven't been able to trigger any issues with them either. Issue #483 contains a warning from lockdep, but it looks like a false positive. The code just needs to have a subclass added to make lockdep happy. I like the idea of adding a kstat to provide visibility into the taskqs. Being able to easily spot check how they're behaving could be tremendously helpful when investigating performance issues. Since there are a relatively small number of them, you could track some stats for each of them and output them in a kstat, one taskq per line. As for this deadlock, I've been mulling it over and it sure seems like we're somehow taking the tq->tq_lock lock twice, although it's not clear to me how that's possible.
Merged to the release branch. This functionality has not been disabled in master; we'll fix it there.

4d3d716 Set spl_taskq_thread_dynamic=0 by default
@behlendorf FYI, I've got taskq kstats mostly working in https://github.com/dweeezil/spl/tree/taskq-kstat and am working on tuning the output. I'm also mulling over getting some per-tqent output as well. So far it looks kinda like this: I moved the sequential task counter into the taskq struct to give it visibility. I'll convert this to a pull request when I'm happy with it.
Bug Fixes
* Fix CPU hotplug openzfs/spl#482
* Disable dynamic taskqs by default to avoid deadlock openzfs/spl#484
* Don't import all visible pools in zfs-import init script openzfs/zfs#3777
* Fix use-after-free in vdev_disk_physio_completion openzfs/zfs#3920
* Fix avl_is_empty(&dn->dn_dbufs) assertion openzfs/zfs#3865

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I36347630be2506bee4ff0a05f1b236ba2ba7a0ae
Reviewed-on: http://review.whamcloud.com/16877
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>