making sure the last quiesced txg is synced #8239
Fixed a potential bug as described in openzfs#8233:
Consider this scenario (see [txg.c](https://github.com/zfsonlinux/zfs/blob/06f3fc2a4b097545259935d54634c5c6f49ed20f/module/zfs/txg.c)):
There is heavy write load when the pool exports. After `txg_sync_stop`'s call of `txg_wait_synced` returns, many more txgs get processed, but right before `txg_sync_stop` gets `tx_sync_lock`, the following happens:
- `txg_sync_thread` begins waiting on `tx_sync_more_cv`.
- `txg_quiesce_thread` gets done with `txg_quiesce(dp, txg)`.
- `txg_sync_stop` gets `tx_sync_lock` first, calls `cv_broadcast`s with `tx_exiting` == 1, and waits for exits.
- `txg_sync_thread` wakes up first and exits.
- Finally, `txg_quiesce_thread` gets `tx_sync_lock`, and calls `cv_broadcast(&tx->tx_sync_more_cv)`, but `txg_sync_thread` is already gone, and the txg in `txg_quiesce(dp, txg)` above never gets synced.
Signed-off-by: Leap Second <leapsecond@protonmail.com>
Codecov Report
@@ Coverage Diff @@
## master #8239 +/- ##
==========================================
- Coverage 78.57% 78.45% -0.12%
==========================================
Files 379 379
Lines 114924 114927 +3
==========================================
- Hits 90299 90166 -133
- Misses 24625 24761 +136
Fixed checkstyle complaints: ./module/zfs/txg.c: 558: line > 80 characters ./module/zfs/txg.c: 562: line > 80 characters Signed-off-by: Leap Second <leapsecond@protonmail.com>
Addressed checkstyle complaints: ./module/zfs/txg.c: 559: continuation should be indented 4 spaces ./module/zfs/txg.c: 564: continuation should be indented 4 spaces Signed-off-by: Leap Second <leapsecond@protonmail.com>
Addressed checkstyle complaints: ./module/zfs/txg.c: 559: spaces instead of tabs ./module/zfs/txg.c: 559: continuation should be indented 4 spaces ./module/zfs/txg.c: 564: spaces instead of tabs ./module/zfs/txg.c: 564: continuation should be indented 4 spaces Signed-off-by: Leap Second <leapsecond@protonmail.com>
Kernel.org Built-in x86_64 (BUILD) keeps failing with the following error in the make log:
Any thoughts why? EDIT:
Since this happened to my fork instead of the master, I am not sure about starting a new issue. I was making trivial edits to please checkstyle (hope I didn't rival others on an early Saturday morning ;) ). After three successful runs on
Here are some snippets from relevant logs: Last few lines from 7.2.tests of the failed run:
The last few lines from 7.3.log:
And the relevant part in 7.4.console:
Looks like @behlendorf can you take a look when you have time?
@seekfirstleapsecond thanks for opening the PR and looking into the failures. Don't worry about the kernel.org failures, they appear to be due to recent changes to an unreleased kernel and will need to be investigated independently. As for that last
Signed-off-by: Leap Second <leapsecond@protonmail.com>
Signed-off-by: Leap Second <leapsecond@protonmail.com>
ping
Motivation and Context
See #8233
Consider this scenario (see txg.c):
There is heavy write load when the pool exports.
After `txg_sync_stop`'s call of `txg_wait_synced` returns, many more txgs get processed, but right before `txg_sync_stop` gets `tx_sync_lock`, the following happens:
- `txg_sync_thread` begins waiting on `tx_sync_more_cv`.
- `txg_quiesce_thread` gets done with `txg_quiesce(dp, txg)`.
- `txg_sync_stop` gets `tx_sync_lock` first, calls `cv_broadcast`s with `tx_exiting` == 1, and waits for exits.
- `txg_sync_thread` wakes up first and exits.
- `txg_quiesce_thread` gets `tx_sync_lock`, and calls `cv_broadcast(&tx->tx_sync_more_cv)`, but `txg_sync_thread` is already gone, and the txg in `txg_quiesce(dp, txg)` above never gets synced.
Description
`txg_sync_thread` now waits for `txg_quiesce_thread` to exit and maybe run one more sync before exiting.
How Has This Been Tested?
Did not test.
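For readers who want to see the ordering from the Description in isolation, here is a minimal user-space sketch using POSIX threads instead of the kernel primitives in txg.c. It is not the patch itself: every `demo_*` name, the struct layout, the extra `state_cv`, and the txg number 42 are assumptions made for illustration. The quiesce thread is deliberately forced to hand off its txg only after the stop request, which is the interleaving described above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the txg state in txg.c. */
typedef struct demo_tx {
    pthread_mutex_t lock;         /* plays the role of tx_sync_lock */
    pthread_cond_t sync_more_cv;  /* plays the role of tx_sync_more_cv */
    pthread_cond_t state_cv;      /* assumed helper: announces the stop */
    bool exiting;                 /* plays the role of tx_exiting */
    bool quiesce_exited;          /* set once the quiesce thread is done */
    uint64_t quiesced_txg;        /* last txg handed to the sync thread */
    uint64_t synced_txg;          /* last txg the sync thread wrote out */
} demo_tx_t;

/* Forces the bad interleaving: hand off a txg only after the stop request. */
static void *
demo_quiesce_thread(void *arg)
{
    demo_tx_t *tx = arg;

    pthread_mutex_lock(&tx->lock);
    while (!tx->exiting)
        pthread_cond_wait(&tx->state_cv, &tx->lock);
    tx->quiesced_txg = 42;        /* arbitrary txg number */
    tx->quiesce_exited = true;
    pthread_cond_broadcast(&tx->sync_more_cv);
    pthread_mutex_unlock(&tx->lock);
    return (NULL);
}

static void *
demo_sync_thread(void *arg)
{
    demo_tx_t *tx = arg;

    pthread_mutex_lock(&tx->lock);
    for (;;) {
        /* Sleep until there is work or a stop request. */
        while (!tx->exiting && tx->quiesced_txg == tx->synced_txg)
            pthread_cond_wait(&tx->sync_more_cv, &tx->lock);
        /* On stop, wait for the quiesce thread to exit first. */
        while (tx->exiting && !tx->quiesce_exited)
            pthread_cond_wait(&tx->sync_more_cv, &tx->lock);
        /* "Sync" whatever has been quiesced so far. */
        tx->synced_txg = tx->quiesced_txg;
        if (tx->exiting)
            break;
    }
    pthread_mutex_unlock(&tx->lock);
    return (NULL);
}

/* Plays the role of txg_sync_stop; returns quiesced minus synced txg. */
uint64_t
demo_stop_and_count_unsynced(void)
{
    demo_tx_t tx;
    pthread_t sync_tid, quiesce_tid;
    int rc;

    memset(&tx, 0, sizeof (tx));
    pthread_mutex_init(&tx.lock, NULL);
    pthread_cond_init(&tx.sync_more_cv, NULL);
    pthread_cond_init(&tx.state_cv, NULL);

    rc = pthread_create(&sync_tid, NULL, demo_sync_thread, &tx);
    assert(rc == 0);
    rc = pthread_create(&quiesce_tid, NULL, demo_quiesce_thread, &tx);
    assert(rc == 0);
    (void) rc;

    /* Request exit and wake every waiter. */
    pthread_mutex_lock(&tx.lock);
    tx.exiting = true;
    pthread_cond_broadcast(&tx.sync_more_cv);
    pthread_cond_broadcast(&tx.state_cv);
    pthread_mutex_unlock(&tx.lock);

    pthread_join(sync_tid, NULL);
    pthread_join(quiesce_tid, NULL);

    pthread_cond_destroy(&tx.state_cv);
    pthread_cond_destroy(&tx.sync_more_cv);
    pthread_mutex_destroy(&tx.lock);
    return (tx.quiesced_txg - tx.synced_txg);
}
```

Calling `demo_stop_and_count_unsynced()` from a `main` returns 0 with this ordering, because the sync thread cannot exit until the quiesce thread has handed off its last txg. Removing the second wait loop in `demo_sync_thread` can reintroduce the window in which the sync thread exits before the hand-off.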