Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deadlock between zfs umount & snapentry_expire #7752

Closed

Conversation

rohan-puri
Copy link
Contributor

zfs umount -> zfsctl_destroy() takes zfs_snapshot_lock WRITER &
calls zfsctl_snapshot_unmount_cancel() which waits for
snapentry_expire() if present(only when snap is automounted).
This snapentry_expire() itself then waits for zfs_snapshot_lock
READER, resulting in a deadlock.

Fix is to take READER in zfsctl_destroy() & upgrade to WRITER
after call to zfsctl_snapshot_unmount_cancel() before calling
zfsctl_snapshot_remove(). This allows snapentry_expire() to
progress too with the READER.

Signed-off-by: Rohan Puri rohan.puri15@gmail.com

Motivation and Context

This change fixes the deadlock between zfs umount and snapentry_expire() #7751

Description

Fix is to take READER in zfsctl_destroy() & upgrade to WRITER
after call to zfsctl_snapshot_unmount_cancel() before calling
zfsctl_snapshot_remove(). This allows snapentry_expire() to
progress too with the READER.

How Has This Been Tested?

I reproduced the problem by instrumenting the code, details are mentioned in the bug details. With the repro instrumentation of the code, I tested the fix to see that the deadlock does not happen again.

Types of changes

  • [x] Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • [x ] My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • [x ] I have read the contributing document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • [x ] All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

@codecov
Copy link

codecov bot commented Jul 28, 2018

Codecov Report

Merging #7752 into master will increase coverage by 0.41%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7752      +/-   ##
==========================================
+ Coverage   77.68%    78.1%   +0.41%     
==========================================
  Files         359      368       +9     
  Lines      111916   111981      +65     
==========================================
+ Hits        86945    87461     +516     
+ Misses      24971    24520     -451
Flag Coverage Δ
#kernel 78.7% <100%> (+16.47%) ⬆️
#user 67.03% <ø> (-0.3%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 492f64e...8e4d086. Read the comment docs.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting to the bottom on this! Your analysis looks good to me but I've suggested a slightly different approach to handle this case.

It'd be great if you could add the test case you came up with to the functional/snapshot/ tests. The more coverage we have of corner cases like this the better.

if ((se = zfsctl_snapshot_find_by_objsetid(spa, objsetid))
!= NULL) {
zfsctl_snapshot_unmount_cancel(se);
if (!rw_tryupgrade(&zfs_snapshot_lock)) {
rw_exit(&zfs_snapshot_lock);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern with this is that dropping and reacquiring the lock here it leaves a small window where the zfs_snapentry_t can still be found on the _by_name and _by_objsetid list just before it's going to be destroyed.

Since the RW_WRITER lock is there to protect the lookup and and list removal. it makes sense to do both of those things first while holding the lock. Then drop the lock, and call zfsctl_snapshot_unmount_cancel. Since the validity of the se is being protected by the se->se_refcount. It looks like you will need to remove the ASSERT from zfsctl_snapshot_unmount_cancel.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair concern, reworked the change as suggested.

The test case is to reproduce actually, needs to add a sleep for few seconds in snapentry_expire(), before se->se_taskqid = TASKQID_INVALID; and during this sleep the zfs umount zfs_name( for which the snapshot existed/created) has to be fired.
But this sleep is only needed for testing purposes. But i do not see any flags under which I can add this sleep. Any idea?

@behlendorf behlendorf added the Status: Work in Progress Not yet ready for general review label Jul 30, 2018
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But i do not see any flags under which I can add this sleep. Any idea?

Not really. None of the existing test infrastructure really allows for that. I know I suggested adding a test for this, but I'd OK with just manually verifying the fix.

zfsctl_snapshot_remove(se);
zfsctl_snapshot_rele(se);
}
rw_exit(&zfs_snapshot_lock);
if (se) {
zfsctl_snapshot_unmount_cancel(se);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want to release the reference taken in zfsctl_snapshot_find_by_objsetid() with zfsctl_snapshot_rele() after calling zfsctl_snapshot_unmount_cancel(). This way the object can't be freed out from underneath you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point, made the changes. Thanks for the clarification on testing. This latest updated patch is manually tested.

zfs umount -> zfsctl_destroy() takes zfs_snapshot_lock WRITER &
calls zfsctl_snapshot_unmount_cancel() which waits for
snapentry_expire() if present(only when snap is automounted).
This snapentry_expire() itself then waits for zfs_snapshot_lock
READER, resulting in a deadlock.

Fix is, in zfsctl_destroy() do avl_tree lookup & removal with
WRITER zfs_snapshot_lock & leave this lock before
zfsctl_snapshot_unmount_cancel() call, since the validity of se
is protected by se->se_refcount.
Also remove the corresponding lock assertion from
zfsctl_snapshot_unmount_cancel() too.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rohan Puri <rohan.puri15@gmail.com>
Closes openzfs#7751
@behlendorf behlendorf added Reviewed and removed Status: Work in Progress Not yet ready for general review labels Aug 1, 2018
@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Aug 1, 2018
@behlendorf behlendorf closed this in fd7265c Aug 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants