Fix deadlock between zfs umount & snapentry_expire #7752

rohan-puri · 2018-07-28T13:32:00Z

zfs umount -> zfsctl_destroy() takes zfs_snapshot_lock WRITER &
calls zfsctl_snapshot_unmount_cancel() which waits for
snapentry_expire() if present(only when snap is automounted).
This snapentry_expire() itself then waits for zfs_snapshot_lock
READER, resulting in a deadlock.

Fix is to take READER in zfsctl_destroy() & upgrade to WRITER
after call to zfsctl_snapshot_unmount_cancel() before calling
zfsctl_snapshot_remove(). This allows snapentry_expire() to
progress too with the READER.

Signed-off-by: Rohan Puri rohan.puri15@gmail.com

Motivation and Context

This change fixes the deadlock between zfs umount and snapentry_expire() #7751

Description

Fix is to take READER in zfsctl_destroy() & upgrade to WRITER

after call to zfsctl_snapshot_unmount_cancel() before calling
zfsctl_snapshot_remove(). This allows snapentry_expire() to
progress too with the READER.

How Has This Been Tested?

I reproduced the problem by instrumenting the code, details are mentioned in the bug details. With the repro instrumentation of the code, I tested the fix to see that the deadlock does not happen again.

Types of changes

[x] Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Performance enhancement (non-breaking change which improves efficiency)
Code cleanup (non-breaking change which makes code smaller or more readable)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (a change to man pages or other documentation)

Checklist:

[x ] My code follows the ZFS on Linux code style requirements.
I have updated the documentation accordingly.
[x ] I have read the contributing document.
I have added tests to cover my changes.
All new and existing tests passed.
[x ] All commit messages are properly formatted and contain Signed-off-by.
Change has been approved by a ZFS on Linux member.

codecov · 2018-07-28T16:12:38Z

Codecov Report

Merging #7752 into master will increase coverage by 0.41%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #7752      +/-   ##
==========================================
+ Coverage   77.68%    78.1%   +0.41%     
==========================================
  Files         359      368       +9     
  Lines      111916   111981      +65     
==========================================
+ Hits        86945    87461     +516     
+ Misses      24971    24520     -451

Flag	Coverage Δ
#kernel	`78.7% <100%> (+16.47%)`	⬆️
#user	`67.03% <ø> (-0.3%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 492f64e...8e4d086. Read the comment docs.

behlendorf

Thanks for getting to the bottom on this! Your analysis looks good to me but I've suggested a slightly different approach to handle this case.

It'd be great if you could add the test case you came up with to the functional/snapshot/ tests. The more coverage we have of corner cases like this the better.

behlendorf · 2018-07-30T17:10:26Z

module/zfs/zfs_ctldir.c

 		if ((se = zfsctl_snapshot_find_by_objsetid(spa, objsetid))
 		    != NULL) {
 			zfsctl_snapshot_unmount_cancel(se);
+			if (!rw_tryupgrade(&zfs_snapshot_lock)) {
+				rw_exit(&zfs_snapshot_lock);


My concern with this is that dropping and reacquiring the lock here it leaves a small window where the zfs_snapentry_t can still be found on the _by_name and _by_objsetid list just before it's going to be destroyed.

Since the RW_WRITER lock is there to protect the lookup and and list removal. it makes sense to do both of those things first while holding the lock. Then drop the lock, and call zfsctl_snapshot_unmount_cancel. Since the validity of the se is being protected by the se->se_refcount. It looks like you will need to remove the ASSERT from zfsctl_snapshot_unmount_cancel.

Fair concern, reworked the change as suggested.

The test case is to reproduce actually, needs to add a sleep for few seconds in snapentry_expire(), before se->se_taskqid = TASKQID_INVALID; and during this sleep the zfs umount zfs_name( for which the snapshot existed/created) has to be fired.
But this sleep is only needed for testing purposes. But i do not see any flags under which I can add this sleep. Any idea?

behlendorf

But i do not see any flags under which I can add this sleep. Any idea?

Not really. None of the existing test infrastructure really allows for that. I know I suggested adding a test for this, but I'd OK with just manually verifying the fix.

behlendorf · 2018-07-31T19:09:44Z

module/zfs/zfs_ctldir.c

 			zfsctl_snapshot_remove(se);
 			zfsctl_snapshot_rele(se);
 		}
 		rw_exit(&zfs_snapshot_lock);
+		if (se) {
+			zfsctl_snapshot_unmount_cancel(se);


You want to release the reference taken in zfsctl_snapshot_find_by_objsetid() with zfsctl_snapshot_rele() after calling zfsctl_snapshot_unmount_cancel(). This way the object can't be freed out from underneath you.

Fair point, made the changes. Thanks for the clarification on testing. This latest updated patch is manually tested.

zfs umount -> zfsctl_destroy() takes zfs_snapshot_lock WRITER & calls zfsctl_snapshot_unmount_cancel() which waits for snapentry_expire() if present(only when snap is automounted). This snapentry_expire() itself then waits for zfs_snapshot_lock READER, resulting in a deadlock. Fix is, in zfsctl_destroy() do avl_tree lookup & removal with WRITER zfs_snapshot_lock & leave this lock before zfsctl_snapshot_unmount_cancel() call, since the validity of se is protected by se->se_refcount. Also remove the corresponding lock assertion from zfsctl_snapshot_unmount_cancel() too. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rohan Puri <rohan.puri15@gmail.com> Closes openzfs#7751

behlendorf requested changes Jul 30, 2018

View reviewed changes

behlendorf added the Status: Work in Progress Not yet ready for general review label Jul 30, 2018

rohan-puri force-pushed the snap_umount_deadlock_fix branch from 26d4de9 to 55ee838 Compare July 31, 2018 06:15

behlendorf reviewed Jul 31, 2018

View reviewed changes

rohan-puri force-pushed the snap_umount_deadlock_fix branch from 55ee838 to 8e4d086 Compare August 1, 2018 05:02

behlendorf added Reviewed and removed Status: Work in Progress Not yet ready for general review labels Aug 1, 2018

behlendorf approved these changes Aug 1, 2018

View reviewed changes

behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Aug 1, 2018

behlendorf closed this in fd7265c Aug 1, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix deadlock between zfs umount & snapentry_expire #7752

Fix deadlock between zfs umount & snapentry_expire #7752

rohan-puri commented Jul 28, 2018

codecov bot commented Jul 28, 2018 •

edited

Loading

behlendorf left a comment

behlendorf Jul 30, 2018

rohan-puri Jul 31, 2018

behlendorf left a comment

behlendorf Jul 31, 2018

rohan-puri Aug 1, 2018

Fix deadlock between zfs umount & snapentry_expire #7752

Fix deadlock between zfs umount & snapentry_expire #7752

Conversation

rohan-puri commented Jul 28, 2018

Motivation and Context

Description

How Has This Been Tested?

Types of changes

Checklist:

codecov bot commented Jul 28, 2018 • edited Loading

Codecov Report

behlendorf left a comment

Choose a reason for hiding this comment

behlendorf Jul 30, 2018

Choose a reason for hiding this comment

rohan-puri Jul 31, 2018

Choose a reason for hiding this comment

behlendorf left a comment

Choose a reason for hiding this comment

behlendorf Jul 31, 2018

Choose a reason for hiding this comment

rohan-puri Aug 1, 2018

Choose a reason for hiding this comment

codecov bot commented Jul 28, 2018 •

edited

Loading