Fix zio_change_priority() lock inversion #7301
Closed
Description
A deadlock in the I/O pipeline involving zio_change_priority() is possible because the zio->io_lock and vdev->vq_lock must be acquired in the opposite order from the rest of the pipeline. This patch resolves the lock inversion by using mutex_tryenter() to detect when the vq->vq_lock is contended in the zio_change_priority() call path. When contended, all the locks are released and acquiring them is retried.
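For readers unfamiliar with the pattern, here is a minimal sketch of the trylock/drop/retry approach described above. It is not the actual patch: it uses plain pthreads rather than the kernel/SPL mutex API, and the names io_lock, vq_lock, and change_priority() are hypothetical stand-ins for zio->io_lock, vdev->vq_lock, and zio_change_priority().

```c
/*
 * Illustrative sketch only (assumptions: pthreads, hypothetical names).
 * The rest of the pipeline takes vq_lock before io_lock, but
 * change_priority() arrives already needing io_lock first.  Rather than
 * block on vq_lock (risking an ABBA deadlock), it trylocks and, on
 * failure, drops io_lock and retries so the other thread can progress.
 */
#include <pthread.h>
#include <sched.h>

static pthread_mutex_t io_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t vq_lock = PTHREAD_MUTEX_INITIALIZER;

static int priority;

void
change_priority(int new_priority)
{
	for (;;) {
		pthread_mutex_lock(&io_lock);
		/* Opposite lock order from the rest of the pipeline. */
		if (pthread_mutex_trylock(&vq_lock) == 0)
			break;
		/*
		 * vq_lock is contended: release io_lock, yield, and
		 * retry so its holder (which wants io_lock) can finish.
		 */
		pthread_mutex_unlock(&io_lock);
		sched_yield();
	}

	priority = new_priority;	/* Both locks held here. */

	pthread_mutex_unlock(&vq_lock);
	pthread_mutex_unlock(&io_lock);
}
```

The key design point is that the thread holding io_lock never waits on vq_lock while another thread holding vq_lock waits on io_lock; one side always backs off.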
Motivation and Context
Any deadlock like this in the pipeline could manifest as a hang. It may explain issues like #7241 and #7059, which have been observed in master. This specific issue was introduced in a8b2e30 (zfs-0.7.0-223-ga8b2e30) and does not impact the 0.7 release branch.
How Has This Been Tested?
The issue was reliably reproducible with the sequential_reads test case from the perf-regression tests. With this patch applied the issue can no longer be reproduced. Initial performance results are promising, but there are several outliers with long latencies.
Types of changes
Checklist:
Signed-off-by.