Fix zfs_write() / mmap update_time() lock inversion #7942
module/zfs/zfs_vnops.c
Outdated
*/
if (flags == I_DIRTY_TIME) {
if (flags & I_DIRTY_TIME &
&&
Fixed
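For readers following along, here is a minimal sketch of why the operator matters in a condition like the one quoted above. The flag values and the `other` operand are hypothetical stand-ins: the real value of I_DIRTY_TIME and the second operand of the condition are not shown in this thread.

```c
#include <stdbool.h>

/* Hypothetical flag values, chosen only for illustration. */
#define	SKETCH_DIRTY_SYNC	(1u << 0)
#define	SKETCH_DIRTY_TIME	(1u << 11)

/* Equality only matches when no other flag bit is set. */
static bool
time_flag_equals(unsigned flags)
{
	return (flags == SKETCH_DIRTY_TIME);
}

/* A mask test matches whenever the time bit is set. */
static bool
time_flag_set(unsigned flags)
{
	return ((flags & SKETCH_DIRTY_TIME) != 0);
}

/*
 * Bitwise AND with a boolean operand: `other` converts to 0 or 1, and
 * (1u << 11) & 1 is 0, so this is false for every input.
 */
static bool
combined_bitwise(unsigned flags, bool other)
{
	return ((flags & SKETCH_DIRTY_TIME & other) != 0);
}

/* Logical AND: true when the time bit is set and `other` is true. */
static bool
combined_logical(unsigned flags, bool other)
{
	return ((flags & SKETCH_DIRTY_TIME) && other);
}
```

With these stand-ins, `combined_bitwise(SKETCH_DIRTY_TIME, true)` is false while `combined_logical(SKETCH_DIRTY_TIME, true)` is true, which is the kind of difference a single `&` can hide.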
module/zfs/zfs_vnops.c
Outdated
* cannot be assigned set z_atime_dirty=1 so at least the times will
* be updated when the file is closed.
*/
boolean_t waited = B_FALSE;
I'm not sure I understand this. Is this initialization going to take effect once or every time?
We should move this to the beginning of the function.
Yes, of course, thank you!
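To illustrate the question about the initialization, here is a small standalone sketch, not the patch itself. It assumes a retry label that sits before the declaration; the function names and the `limit` parameter are hypothetical. In C, the initializer of an automatic variable runs each time execution reaches the declaration, so a flag declared after the label is reset on every jump back.

```c
#include <stdbool.h>

/* Flag declared after the label: the initializer runs on every pass. */
static int
passes_with_late_init(int limit)
{
	int passes = 0;
top:
	;	/* a label must be followed by a statement, not a declaration */
	bool waited = false;
	passes++;
	if (!waited && passes < limit) {
		waited = true;	/* lost as soon as we jump back to top */
		goto top;
	}
	return (passes);
}

/* Flag declared at the start of the function: initialized exactly once. */
static int
passes_with_early_init(int limit)
{
	bool waited = false;
	int passes = 0;
top:
	passes++;
	if (!waited && passes < limit) {
		waited = true;	/* survives the jump, so only one retry */
		goto top;
	}
	return (passes);
}
```

With a limit of 5, the late-initialized version keeps retrying until the limit stops it, while the early-initialized version retries once and returns after two passes.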
Force-pushed from f7a900f to cc20bb3.
* cannot be assigned set z_atime_dirty=1 so at least the times will
* be updated when the file is closed.
*/
error = dmu_tx_assign(tx, (waited ? TXG_NOTHROTTLE : 0) | TXG_NOWAIT);
In dmu_tx_try_assign:

    if (dn->dn_assigned_txg == tx->tx_txg - 1) {
        mutex_exit(&dn->dn_mtx);
        tx->tx_needassign_txh = txh;
        DMU_TX_STAT_BUMP(dmu_tx_group);
        return (SET_ERROR(ERESTART));
    }

It doesn't seem that TXG_NOTHROTTLE will prevent dmu_tx_assign from failing? Perhaps we should bail out on waited && error == ERESTART?
Good idea. You're right, it won't prevent it from failing under all circumstances, only due to the write throttle. I assume your specific concern is that we could end up spinning here, which would be just as bad.
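A rough sketch of the bail-out being discussed, written as a self-contained simulation rather than the actual patch. Everything prefixed `sketch_` or `SKETCH_` is a hypothetical stand-in for the real transaction object, flags, and error code; the real code would also have to abort and re-create the transaction between attempts, which is omitted here.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for TXG_NOWAIT, TXG_NOTHROTTLE and ERESTART. */
enum {
	SKETCH_TXG_NOWAIT = 1 << 0,
	SKETCH_TXG_NOTHROTTLE = 1 << 1
};
#define	SKETCH_ERESTART	85

/* Mock transaction: records why an assignment attempt would fail. */
struct sketch_tx {
	bool throttled;		/* failure that NOTHROTTLE can bypass */
	bool group_conflict;	/* failure that NOTHROTTLE cannot bypass */
	int assign_calls;	/* number of assignment attempts made */
};

/* Mock assign: never blocks, mirrors the two failure modes above. */
static int
sketch_tx_assign(struct sketch_tx *tx, int flags)
{
	tx->assign_calls++;
	if (tx->group_conflict)
		return (SKETCH_ERESTART);
	if (tx->throttled && !(flags & SKETCH_TXG_NOTHROTTLE))
		return (SKETCH_ERESTART);
	return (0);
}

/*
 * Try once normally, retry once with the throttle bypassed, then give
 * up and mark the times for a deferred update instead of spinning.
 */
static int
sketch_update_times(struct sketch_tx *tx, bool *atime_dirty)
{
	bool waited = false;
	int error;
top:
	error = sketch_tx_assign(tx,
	    (waited ? SKETCH_TXG_NOTHROTTLE : 0) | SKETCH_TXG_NOWAIT);
	if (error == SKETCH_ERESTART && !waited) {
		waited = true;
		goto top;
	}
	if (error != 0) {
		/* waited && ERESTART (or any other error): bail out. */
		*atime_dirty = true;
		return (error);
	}
	return (0);
}
```

In this model a throttled transaction succeeds on the second attempt, while a group conflict makes exactly two attempts and then falls back to the deferred update, so the loop is bounded in both cases.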
dmu_tx_abort(tx);
goto out;
zp->z_atime_dirty = 1;
The problem with this is that we only lazily update atime, not mtime and ctime. We need to add that; otherwise we will lose those updates.
That's true. Unfortunately, we don't currently have a mechanism for that. Deferring the atime update to zfs_inactive already isn't really correct, but the machinery was already there, which I figured was better than nothing for this unlikely case. In order to handle all cases here I think we'd need to redirty the inode.
Codecov Report
@@ Coverage Diff @@
## master #7942 +/- ##
==========================================
+ Coverage 43.95% 43.96% +0.01%
==========================================
Files 318 318
Lines 103561 103566 +5
==========================================
+ Hits 45521 45535 +14
+ Misses 58040 58031 -9
When a page is faulted in by filemap_page_mkwrite() this function may be called by update_time() with the file's mmap_sem held. Therefore it's necessary to use TXG_NOWAIT since we cannot release the mmap_sem, and even if we could, it would be undesirable to delay the page fault. TXG_NOTHROTTLE will be set as needed to bypass the write throttle. In the unlikely case the transaction cannot be assigned, set z_atime_dirty=1 so at least the times will be updated when the file is closed.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Force-pushed from cc20bb3 to 4715f1a.
Closing in favor of the proposal in #7939.
Motivation and Context
Alternate approach to #7939, which is intended to address #7512.
Pushed for the purposes of review feedback, testing, and letting
the CI test it out.
Description
When a page is faulted in by filemap_page_mkwrite() this function
may be called by update_time() with the file's mmap_sem held.
Therefore it's necessary to use TXG_NOWAIT since we cannot release
the mmap_sem, and even if we could, it would be undesirable to
delay the page fault. TXG_NOTHROTTLE will be set as needed to
bypass the write throttle. In the unlikely case the transaction
cannot be assigned, set z_atime_dirty=1 so at least the times will
be updated when the file is closed.
How Has This Been Tested?
Locally built and tested using the reproducer provided in #7939.