Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verify issue #1352

Closed
LamNguyen25 opened this issue Feb 24, 2022 · 5 comments
Closed

Verify issue #1352

LamNguyen25 opened this issue Feb 24, 2022 · 5 comments

Comments

@LamNguyen25
Copy link

LamNguyen25 commented Feb 24, 2022

When I replay the previous write job and then verify if we can read the same size of data on new volumes, I got the error below
throwing from here https://github.com/axboe/fio/blob/master/engines/windowsaio.c#L469. But if I dont use --do_verify=1, the error wont show up. The write job executes successfully:

write job:
cmd=['PowerShell.exe', '& $Env:ProgramFiles\fio\fio.exe', '--output=write_vol-096da7c71479c12cf_ev_writes.out', '--direct=1', '--ioengine=windowsaio', '--filename=\\.\PHYSICALDRIVE3', '--iodepth=32', '--verify=crc32c', '--rw=randwrite', '--verify_dump=1', '--verify_fatal=1', '--randseed=6387', '--do_verify=0', '--name=vol-096da7c71479c12cf_ev_writes', '--bsrange=4k-128k', '--time_based=1', '--runtime=10', '--verify_pattern=0xFF', '--norandommap=1', '--write_iolog=write_vol-096da7c71479c12cf_ev_writes.iolog']

verify job: (issue is from here)

cmd = ['PowerShell.exe', '& $Env:ProgramFiles\fio\fio.exe', '--output=verify_vol-096da7c71479c12cf_ev_writes.out', '--direct=1', '--ioengine=windowsaio', '--filename=\\.\PHYSICALDRIVE3', '--iodepth=32', '--verify=crc32c', '--rw=randwrite', '--
verify_dump=1', '--verify_fatal=1', '--randseed=6387', '--do_verify=1', '--name=vol-096da7c71479c12cf_ev_writes', '--bsrange=4k-128k', '--verify_pattern=0xFF', '--norandommap=1', '--replay_no_stall=1', '--read_iolog=read_vol-096da7c71479c12cf_ev_writes.iolog']

Do you know if this can be a bug/issue on windowsaio or this can be handled?

error:
windowsaio: failed to flush file buffers
fio: pid=636, err=13/file:ioengines.c:390, func=td_io_queue, error=Permission denied

Here is the output log:

fio: norandommap given for variable block sizes, verify limited
vol-096da7c71479c12cf_ev_writes: (g=0): rw=randwrite, bs=(R) 4096B-128KiB, (W) 4096B-128KiB, (T) 4096B-128KiB, ioengine=windowsaio, iodepth=32
fio-3.27
Starting 1 thread
windowsaio: failed to flush file buffers
fio: pid=636, err=13/file:ioengines.c:390, func=td_io_queue, error=Permission denied

vol-096da7c71479c12cf_ev_writes: (groupid=0, jobs=1): err=13 (file:ioengines.c:390, func=td_io_queue, error=Permission denied): pid=636: Sat Feb 19 06:36:34 2022
read: IOPS=1992, BW=128MiB/s (135MB/s)(75.3GiB/600917msec)
slat (nsec): min=1845, max=328282, avg=6763.18, stdev=6002.32
clat (usec): min=177, max=81673, avg=16029.83, stdev=1852.24
lat (usec): min=196, max=81681, avg=16036.60, stdev=1852.21
clat percentiles (usec):
| 1.00th=[12125], 5.00th=[13304], 10.00th=[13960], 20.00th=[14615],
| 30.00th=[15139], 40.00th=[15664], 50.00th=[16057], 60.00th=[16450],
| 70.00th=[16909], 80.00th=[17433], 90.00th=[18220], 95.00th=[18744],
| 99.00th=[20055], 99.50th=[20579], 99.90th=[21890], 99.95th=[24773],
| 99.99th=[44827]
bw ( KiB/s): min=130741, max=282812, per=100.00%, avg=131586.39, stdev=4779.09, samples=1184
iops : min= 1872, max= 4374, avg=1994.53, stdev=82.81, samples=1184
lat (usec) : 250=0.01%, 500=0.01%, 750=0.02%, 1000=0.03%
lat (msec) : 2=0.09%, 4=0.07%, 10=0.05%, 20=98.77%, 50=0.95%
lat (msec) : 100=0.01%
cpu : usr=1.83%, sys=0.67%, ctx=0, majf=0, minf=0
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=3.4%, 32=96.6%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.1%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=1197568,0,0,1 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
READ: bw=128MiB/s (135MB/s), 128MiB/s-128MiB/s (135MB/s-135MB/s), io=75.3GiB (80.9GB), run=600917-600917msec

@axboe
Copy link
Owner

axboe commented Feb 26, 2022

Sounds like Windows requires the file to be opened for write if we are flushing out buffers, which will only be done by fio if we're writing to the file. That's why it works from the write+verify job, but fails from a job that purely does reads.

Let me try and commit a fix for this.

axboe added a commit that referenced this issue Feb 26, 2022
It's not enough to just track writes, some operating systems require
a file to be opened for write to issue a file sync. Which does kind
of make sense...

Add such a flag and set it for iolog/blktrace replay, if we see a sync
in there.

This does mean we need to bump the IO engine version, as the engine
flags need to get shifted.

Link: #1352
Signed-off-by: Jens Axboe <axboe@kernel.dk>
@axboe axboe closed this as completed in c3773c1 Feb 26, 2022
@axboe
Copy link
Owner

axboe commented Feb 26, 2022

Pushed out two related commits, please give it a go. Reopen if it doesn't fix it for you.

@LamNguyen25
Copy link
Author

LamNguyen25 commented Feb 27, 2022

Hi, Thanks for taking a look at this issue.
I ran the tests again, but they still failed with the same error. I'm wondering if I need to update the Fio version on my windows instances or not to get the latest update?

I also try to read/verify the IO on a different disk by using --replay_redirect parameter, and I get a different error.
here is the command I use:
The write command is the same as above, and the verify command I add --replay_redirect:
Fio verify cmd: [‘PowerShell.exe‘, ‘& $Env:ProgramFiles\fio\fio.exe‘, ‘--output=verify_vol-0793e870233faa08e_0.out‘, ‘--direct=1‘, ‘--ioengine=windowsaio‘, ‘--filename=\\.\PHYSICALDRIVE3‘, ‘--iodepth=32‘, ‘--verify=crc32c‘, ‘--rw=randwrite‘, ‘--verify_dump=1‘, ‘--verify_fatal=1‘, ‘--do_verify=1‘, ‘--randseed=3847‘, ‘--bsrange=4k-128k‘, ‘--verify_pattern=0x0‘, ‘--norandommap=1‘, ‘--name=vol-0793e870233faa08e_0‘, ‘--replay_redirect=\\.\PHYSICALDRIVE4‘, ‘--replay_no_stall=1‘, ‘--read_iolog=read_vol-0793e870233faa08e_0.iolog’]
fio: norandommap given for variable block sizes, verify limited
This is the error:
Assertion failed: fio_file_open(f), file filesetup.c, line 1887

Here is the output file:
fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
fio: norandommap given for variable block sizes, verify limited
vol-083e35985cf30b6e5_0: (g=0): rw=randwrite, bs=(R) 4096B-128KiB, (W) 4096B-128KiB, (T) 4096B-128KiB, ioengine=windowsaio, iodepth=32
fio-3.27
Starting 1 thread

Do you know if this is also related to the issue above? Is there a way that I can ignore the 'windowsaio: failed to flush file buffers' error?

TIA!

@axboe
Copy link
Owner

axboe commented Feb 27, 2022

If you run:

fio --version
fio-3.29-112-gc3773

it'll tell you what you're running, the c3773 is the first part of the git sha from which fio was built. I'd double check if you're running the expected version first. But just looking at your output, you're running 3.27. There's no time travel possible with fio (yet), you need to ensure that your instances are updated first.

@LamNguyen25
Copy link
Author

Thank you!
I installed the latest build here and ran the test again with Fio-3.29. The error still happens.
Do I need to add any other flags?

Here is the output log:

fio: this platform does not support process shared mutexes, forcing use of threads. Use the 'thread' option to get rid of this warning.
fio: norandommap given for variable block sizes, verify limited
vol-04e684f552694eb87_ev_writes: (g=0): rw=randwrite, bs=(R) 4096B-128KiB, (W) 4096B-128KiB, (T) 4096B-128KiB, ioengine=windowsaio, iodepth=32
fio-3.29
Starting 1 thread
windowsaio: failed to flush file buffers
fio: pid=3340, err=13/file:ioengines.c:390, func=td_io_queue, error=Permission denied

vol-04e684f552694eb87_ev_writes: (groupid=0, jobs=1): err=13 (file:ioengines.c:390, func=td_io_queue, error=Permission denied): pid=3340: Sun Feb 27 22:09:45 2022
read: IOPS=1930, BW=124MiB/s (131MB/s)(15.1GiB/124553msec)
slat (nsec): min=1821, max=1983.6k, avg=7335.18, stdev=13755.44
clat (usec): min=36, max=1066.2k, avg=16536.87, stdev=31096.76
lat (usec): min=150, max=1066.2k, avg=16544.20, stdev=31096.75
clat percentiles (usec):
| 1.00th=[ 343], 5.00th=[ 553], 10.00th=[ 963],
| 20.00th=[ 3228], 30.00th=[ 7963], 40.00th=[ 14877],
| 50.00th=[ 15795], 60.00th=[ 16450], 70.00th=[ 17433],
| 80.00th=[ 19530], 90.00th=[ 32375], 95.00th=[ 38536],
| 99.00th=[ 50070], 99.50th=[ 57410], 99.90th=[ 299893],
| 99.95th=[1035994], 99.99th=[1052771]
bw ( KiB/s): min= 8600, max=169800, per=100.00%, avg=130211.38, stdev=18443.48, samples=244
iops : min= 122, max= 2574, avg=1971.64, stdev=280.51, samples=244
lat (usec) : 50=0.01%, 250=0.36%, 500=3.85%, 750=3.24%, 1000=2.91%
lat (msec) : 2=4.05%, 4=10.71%, 10=6.23%, 20=49.06%, 50=18.58%
lat (msec) : 100=0.69%, 250=0.16%, 500=0.08%, 2000=0.07%
cpu : usr=1.61%, sys=0.00%, ctx=0, majf=0, minf=0
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=14.6%, 32=85.4%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=99.9%, 8=0.1%, 16=0.1%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=240486,0,0,1 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
READ: bw=124MiB/s (131MB/s), 124MiB/s-124MiB/s (131MB/s-131MB/s), io=15.1GiB (16.3GB), run=124553-124553msec

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants