FIO reports verify: bad header rand_seed, in the RW mode when iodepth > 1 #1526
Comments
@KuribohG Thanks for your submission. I was able to reproduce your issue using the current tip of master. As you discovered, the issue occurs when an I/O request gets requeued. This occurs because of Lines 703 to 708 in b5904c0
Lines 50 to 63 in b5904c0
The fix will likely be to modify
On configurations that can cause I/Os to be internally requeued from FIO_Q_BUSY, such as '--iodepth_batch_complete_max', and where the workload has verify enabled, the subsequent verification of the data fails with a bad verify rand_seed. This happens because the pattern is generated twice for the same I/O, causing the seed to become out of sync when the verify is later performed. The seed is generated twice because do_io() handles the I/O twice: first when it originates the I/O, and again when it later gets the same I/O back from get_io_u() after it is pulled from the requeue list, which is where the first submission landed due to the workload reaching '--iodepth_batch_complete_max'. The fix is for do_io() to track when it has generated the verify pattern for an I/O via a new io_u flag 'IO_U_F_PATTERN_DONE', avoiding a second call to populate_verify_io_u() when that flag is detected. Note that IO_U_F_PATTERN_DONE is set to 1<<9 instead of the next available 1<<8 because there is already a commit pending that uses 1<<8 (IO_U_F_OVERLAP_LOCK). Link: axboe#1526 Signed-off-by: Adam Horshack (horshack@live.com)
On configurations that can cause I/Os to be internally requeued from FIO_Q_BUSY, such as '--iodepth_batch_complete_max', and where the workload has verify enabled, the subsequent verification of the data fails with a bad verify rand_seed. This happens because the pattern is generated twice for the same I/O, causing the seed to become out of sync when the verify is later performed. The seed is generated twice because do_io() handles the I/O twice: first when it originates the I/O, and again when it later gets the same I/O back from get_io_u() after it is pulled from the requeue list, which is where the first submission landed due to the workload reaching '--iodepth_batch_complete_max'. The fix is for do_io() to track when it has generated the verify pattern for an I/O via a new io_u flag 'IO_U_F_PATTERN_DONE', avoiding a second call to populate_verify_io_u() when that flag is detected. Link: axboe#1526 Signed-off-by: Adam Horshack (horshack@live.com)
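The guard described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not fio's code: the struct names, the sketch_ prefixed functions, the SKETCH_F_PATTERN_DONE constant, and the toy seed generator are all hypothetical stand-ins. Only the idea (skip the second pattern generation when a per-io_u flag is already set) is taken from the text above.

```c
#include <stdint.h>

/* Hypothetical flag bit standing in for fio's IO_U_F_PATTERN_DONE. */
#define SKETCH_F_PATTERN_DONE (1u << 9)

/* Hypothetical, heavily reduced stand-in for fio's io_u. */
struct sketch_io_u {
    unsigned int flags;
    uint64_t rand_seed;      /* seed recorded for later verification */
};

/* Hypothetical per-thread state: a seed generator and a draw counter. */
struct sketch_thread {
    uint64_t verify_state;
    unsigned int draws;      /* how many seeds have been consumed */
};

/* Toy generator (a plain LCG), NOT fio's real random number generator. */
static uint64_t sketch_next_seed(struct sketch_thread *td)
{
    td->verify_state = td->verify_state * 6364136223846793005ULL
                       + 1442695040888963407ULL;
    td->draws++;
    return td->verify_state;
}

/*
 * Stand-in for the submission path in do_io(). With use_flag == 0 it
 * behaves like the buggy path and draws a fresh seed on every call.
 * With use_flag != 0 it skips regeneration when the flag is already set.
 */
static void sketch_prepare(struct sketch_thread *td,
                           struct sketch_io_u *io_u, int use_flag)
{
    if (use_flag && (io_u->flags & SKETCH_F_PATTERN_DONE))
        return;                                 /* pattern already generated */

    io_u->rand_seed = sketch_next_seed(td);     /* "populate" the pattern */
    io_u->flags |= SKETCH_F_PATTERN_DONE;
}

/*
 * Simulate one I/O that is submitted, requeued once, and handled again.
 * Returns the number of seeds consumed and reports the seed recorded
 * after the first and after the second pass.
 */
static unsigned int sketch_simulate_requeue(int use_flag,
                                            uint64_t *first_seed,
                                            uint64_t *final_seed)
{
    struct sketch_thread td = { .verify_state = 1, .draws = 0 };
    struct sketch_io_u io_u = { .flags = 0, .rand_seed = 0 };

    sketch_prepare(&td, &io_u, use_flag);   /* original submission */
    *first_seed = io_u.rand_seed;

    sketch_prepare(&td, &io_u, use_flag);   /* same io_u back from requeue */
    *final_seed = io_u.rand_seed;

    return td.draws;
}
```

In this model, calling sketch_simulate_requeue(0, ...) consumes two seeds and leaves the io_u with a different seed than the first pass produced, while sketch_simulate_requeue(1, ...) consumes one seed and keeps it unchanged. A real implementation would presumably also need to clear the flag when the io_u is reused for a new I/O; that detail is an assumption and is not described in the thread.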
@KuribohG, the fix for this issue has been merged into master. Can you please verify that it fixes the problem on your config and, if so, close this issue? Thanks.
Thanks! The PR fixed my issue. I will close this.
Description of the bug:
FIO reports a verify error in RW mode when iodepth > 1.
After debugging, I found this may be the reason: in RW mode, some I/O requests must be requeued if there is a read/write switch while iodepth > 1. The rand_seed field in io_u changes when the request is requeued (which I think is not desirable), but fio still uses the original rand_seed for verifying.
I'm not sure if this problem is related to #1503.
Environment: Ubuntu 20.04
fio version: 3.32
Reproduction steps
Run fio with the configuration above, and the bug will reproduce consistently.