read stats for backlog verifies not reported for time-expired workloads #1515

Closed

horshack-dpreview opened this issue Feb 10, 2023 · 3 comments

horshack-dpreview (Contributor) commented Feb 10, 2023

Description of the bug:
When verify_backlog is used on a write-only workload with a runtime= value and the runtime expires before the workload has written its full dataset, the read stats for the backlog verifies are not reported. The result shows only the workload writes (i.e., the "read:" results section is completely missing from fio's stats output).

Environment: Ubuntu 22.04.1 LTS

fio version: 3.33, tip of master

Reproduction steps

sudo modprobe brd rd_size=1024000 rd_nr=1
sudo fio -name=test -rw=write -filename=/dev/ram0 -verify=xxhash -verify_backlog=1 -bs=512b -size=1G -runtime=1s 
test: (g=0): rw=write, bs=(R) 512B-512B, (W) 512B-512B, (T) 512B-512B, ioengine=psync, iodepth=1
fio-3.33-82-g0f85
Starting 1 process
Jobs: 1 (f=1)
test: (groupid=0, jobs=1): err= 0: pid=12560: Fri Feb 10 09:16:01 2023
  write: IOPS=93.3k, BW=45.5MiB/s (47.8MB/s)(45.6MiB/1001msec); 0 zone resets
    clat (nsec): min=908, max=304377, avg=1871.05, stdev=1496.67
     lat (nsec): min=1816, max=307240, avg=3202.44, stdev=1564.86
    clat percentiles (nsec):
     |  1.00th=[  980],  5.00th=[ 1400], 10.00th=[ 1400], 20.00th=[ 1400],
     | 30.00th=[ 1400], 40.00th=[ 1400], 50.00th=[ 1464], 60.00th=[ 1464],
     | 70.00th=[ 1464], 80.00th=[ 1880], 90.00th=[ 3760], 95.00th=[ 4256],
     | 99.00th=[ 5280], 99.50th=[ 7456], 99.90th=[ 9280], 99.95th=[10432],
     | 99.99th=[24192]
   bw (  KiB/s): min=45367, max=45367, per=97.27%, avg=45367.00, stdev= 0.00, samples=1
   iops        : min=90734, max=90734, avg=90734.00, stdev= 0.00, samples=1
  lat (nsec)   : 1000=7.07%
  lat (usec)   : 2=82.50%, 4=7.54%, 10=2.86%, 20=0.02%, 50=0.01%
  lat (usec)   : 100=0.01%, 500=0.01%
  cpu          : usr=10.70%, sys=89.10%, ctx=2, majf=0, minf=14
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=93372,93372,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=45.5MiB/s (47.8MB/s), 45.5MiB/s-45.5MiB/s (47.8MB/s-47.8MB/s), io=45.6MiB (47.8MB), run=1001-1001msec

Disk stats (read/write):
  ram0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

Output with the fix from #1514 applied:

test: (g=0): rw=write, bs=(R) 512B-512B, (W) 512B-512B, (T) 512B-512B, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=1)
test: (groupid=0, jobs=1): err= 0: pid=12838: Fri Feb 10 09:20:09 2023
  read: IOPS=75.3k, BW=36.8MiB/s (38.5MB/s)(36.8MiB/1001msec)
    clat (nsec): min=907, max=22769, avg=1520.47, stdev=345.04
     lat (nsec): min=1885, max=23746, avg=2736.71, stdev=490.13
    clat percentiles (nsec):
     |  1.00th=[  980],  5.00th=[ 1400], 10.00th=[ 1400], 20.00th=[ 1400],
     | 30.00th=[ 1400], 40.00th=[ 1400], 50.00th=[ 1464], 60.00th=[ 1464],
     | 70.00th=[ 1464], 80.00th=[ 1464], 90.00th=[ 1880], 95.00th=[ 1960],
     | 99.00th=[ 2448], 99.50th=[ 2864], 99.90th=[ 2928], 99.95th=[ 4768],
     | 99.99th=[11968]
  write: IOPS=75.3k, BW=36.8MiB/s (38.5MB/s)(36.8MiB/1001msec); 0 zone resets
    clat (nsec): min=838, max=357875, avg=2089.22, stdev=1742.54
     lat (nsec): min=1885, max=362205, avg=4257.19, stdev=1900.09
    clat percentiles (nsec):
     |  1.00th=[ 1400],  5.00th=[ 1400], 10.00th=[ 1400], 20.00th=[ 1464],
     | 30.00th=[ 1464], 40.00th=[ 1464], 50.00th=[ 1880], 60.00th=[ 1880],
     | 70.00th=[ 1880], 80.00th=[ 1960], 90.00th=[ 3856], 95.00th=[ 4320],
     | 99.00th=[ 5664], 99.50th=[ 7712], 99.90th=[10048], 99.95th=[11712],
     | 99.99th=[34048]
   bw (  KiB/s): min=36333, max=36333, per=96.53%, avg=36333.00, stdev= 0.00, samples=1
   iops        : min=72666, max=72666, avg=72666.00, stdev= 0.00, samples=1
  lat (nsec)   : 1000=1.12%
  lat (usec)   : 2=87.97%, 4=6.23%, 10=4.61%, 20=0.06%, 50=0.01%
  lat (usec)   : 100=0.01%, 500=0.01%
  cpu          : usr=22.70%, sys=77.00%, ctx=4, majf=0, minf=14
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=75354,75354,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=36.8MiB/s (38.5MB/s), 36.8MiB/s-36.8MiB/s (38.5MB/s-38.5MB/s), io=36.8MiB (38.6MB), run=1001-1001msec
  WRITE: bw=36.8MiB/s (38.5MB/s), 36.8MiB/s-36.8MiB/s (38.5MB/s-38.5MB/s), io=36.8MiB (38.6MB), run=1001-1001msec

Disk stats (read/write):
  ram0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

horshack-dpreview (Contributor Author) commented:

A secondary issue, sharing the same root cause as the one above, is incorrect read stats for non-timed executions. For example:

sudo modprobe brd rd_size=1024000 rd_nr=1
sudo fio -name=test -rw=write -filename=/dev/ram0 -verify=xxhash -verify_backlog=1 -bs=512b -size=100M

This produces wildly inaccurate read results: the reads issued for the backlog verifies aren't sampled for the stats, while the reads for the post-workload verify pass (do_verify) are. But the backlog has already completed all the verifies, so do_verify() has nothing to do and finishes in near-zero time, yielding an absurdly inflated read bandwidth:

test: (g=0): rw=write, bs=(R) 512B-512B, (W) 512B-512B, (T) 512B-512B, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=1)
test: (groupid=0, jobs=1): err= 0: pid=44852: Fri Feb 10 18:58:10 2023
  read: IOPS=205M, BW=97.7GiB/s (105GB/s)(100MiB/1msec)
    clat (nsec): min=838, max=21930, avg=1505.70, stdev=276.06
     lat (nsec): min=1885, max=23397, avg=2708.16, stdev=388.11
    clat percentiles (nsec):
     |  1.00th=[  980],  5.00th=[ 1400], 10.00th=[ 1400], 20.00th=[ 1400],
     | 30.00th=[ 1400], 40.00th=[ 1400], 50.00th=[ 1464], 60.00th=[ 1464],
     | 70.00th=[ 1464], 80.00th=[ 1464], 90.00th=[ 1880], 95.00th=[ 1880],
     | 99.00th=[ 2384], 99.50th=[ 2448], 99.90th=[ 2864], 99.95th=[ 3344],
     | 99.99th=[10176]
  write: IOPS=76.2k, BW=37.2MiB/s (39.0MB/s)(100MiB/2686msec); 0 zone resets
    clat (nsec): min=907, max=79268, avg=2063.13, stdev=1144.77
     lat (nsec): min=1886, max=81643, avg=4179.84, stdev=1237.45
    clat percentiles (nsec):
     |  1.00th=[ 1400],  5.00th=[ 1400], 10.00th=[ 1400], 20.00th=[ 1464],
     | 30.00th=[ 1464], 40.00th=[ 1816], 50.00th=[ 1880], 60.00th=[ 1880],
     | 70.00th=[ 1880], 80.00th=[ 1960], 90.00th=[ 4192], 95.00th=[ 4320],
     | 99.00th=[ 5280], 99.50th=[ 7136], 99.90th=[ 9536], 99.95th=[11328],
     | 99.99th=[29056]
   bw (  KiB/s): min=36357, max=38691, per=100.00%, avg=38153.40, stdev=1009.13, samples=5
   iops        : min=72714, max=77382, avg=76306.80, stdev=2018.25, samples=5
  lat (nsec)   : 1000=0.73%
  lat (usec)   : 2=91.31%, 4=2.88%, 10=5.04%, 20=0.04%, 50=0.01%
  lat (usec)   : 100=0.01%
  cpu          : usr=23.84%, sys=76.06%, ctx=5, majf=0, minf=13
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=204800,204800,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=97.7GiB/s (105GB/s), 97.7GiB/s-97.7GiB/s (105GB/s-105GB/s), io=100MiB (105MB), run=1-1msec
  WRITE: bw=37.2MiB/s (39.0MB/s), 37.2MiB/s-37.2MiB/s (39.0MB/s-39.0MB/s), io=100MiB (105MB), run=2686-2686msec

Disk stats (read/write):
  ram0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

Here's the output of the above run with the same correction from #1514:

test: (g=0): rw=write, bs=(R) 512B-512B, (W) 512B-512B, (T) 512B-512B, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=1)
test: (groupid=0, jobs=1): err= 0: pid=45042: Fri Feb 10 19:03:08 2023
  read: IOPS=77.1k, BW=37.7MiB/s (39.5MB/s)(100MiB/2656msec)
    clat (nsec): min=907, max=16762, avg=1478.62, stdev=250.07
     lat (nsec): min=1816, max=18228, avg=2683.73, stdev=379.07
    clat percentiles (nsec):
     |  1.00th=[  980],  5.00th=[ 1400], 10.00th=[ 1400], 20.00th=[ 1400],
     | 30.00th=[ 1400], 40.00th=[ 1400], 50.00th=[ 1464], 60.00th=[ 1464],
     | 70.00th=[ 1464], 80.00th=[ 1464], 90.00th=[ 1464], 95.00th=[ 1880],
     | 99.00th=[ 2384], 99.50th=[ 2448], 99.90th=[ 2864], 99.95th=[ 2928],
     | 99.99th=[ 9664]
  write: IOPS=77.1k, BW=37.7MiB/s (39.5MB/s)(100MiB/2656msec); 0 zone resets
    clat (nsec): min=489, max=164334, avg=2044.14, stdev=1140.45
     lat (nsec): min=1955, max=182981, avg=4150.91, stdev=1239.13
    clat percentiles (nsec):
     |  1.00th=[ 1400],  5.00th=[ 1400], 10.00th=[ 1400], 20.00th=[ 1464],
     | 30.00th=[ 1464], 40.00th=[ 1464], 50.00th=[ 1880], 60.00th=[ 1880],
     | 70.00th=[ 1880], 80.00th=[ 1960], 90.00th=[ 3920], 95.00th=[ 4320],
     | 99.00th=[ 5152], 99.50th=[ 5792], 99.90th=[ 8896], 99.95th=[10176],
     | 99.99th=[23936]
   bw (  KiB/s): min=36413, max=39190, per=100.00%, avg=38565.20, stdev=1204.30, samples=5
   iops        : min=72826, max=78380, avg=77130.40, stdev=2408.59, samples=5
  lat (nsec)   : 500=0.01%, 1000=1.09%
  lat (usec)   : 2=90.92%, 4=3.23%, 10=4.73%, 20=0.02%, 50=0.01%
  lat (usec)   : 100=0.01%, 250=0.01%
  cpu          : usr=25.86%, sys=74.06%, ctx=6, majf=0, minf=13
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=204800,204800,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=37.7MiB/s (39.5MB/s), 37.7MiB/s-37.7MiB/s (39.5MB/s-39.5MB/s), io=100MiB (105MB), run=2656-2656msec
  WRITE: bw=37.7MiB/s (39.5MB/s), 37.7MiB/s-37.7MiB/s (39.5MB/s-39.5MB/s), io=100MiB (105MB), run=2656-2656msec

Disk stats (read/write):
  ram0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

axboe pushed a commit that referenced this issue Feb 15, 2023
When verify_backlog is used on a write-only workload with a runtime= value
and the runtime expires before the workload has written its full dataset,
the read stats for the backlog verifies are not reported, resulting in a
stat result showing only the workload writes (i.e., the "read:" results
section is completely missing from fio's stats output)

The logic in thread_main() fails to call update_runtime() for DDIR_READ
because the existing call to update_runtime() for DDIR_READ on write-only
workloads is currently only done after do_verify() is complete, which won't
be called in this scenario because td->terminate is true due to the
expiration of the runtime.

Link: #1515

Signed-off-by: Adam Horshack (horshack@live.com)
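
To make the shape of the fix concrete, here is a minimal C sketch of the idea described in the commit message. It is illustrative only, not the actual merged patch; the names (struct thread_data, td_write(), update_runtime(), DDIR_READ, td->io_bytes, elapsed_us) are fio internals referenced above, and the exact placement and conditions inside thread_main() may differ from what was merged.

    /*
     * Sketch of the accounting path in thread_main(): if this is a
     * write job whose backlog verifies already issued reads (so
     * io_bytes[DDIR_READ] is non-zero), update the DDIR_READ runtime
     * here rather than only after do_verify(), which is skipped when
     * runtime expiry sets td->terminate.
     */
    if (td_write(td) && td->o.verify != VERIFY_NONE &&
        td->io_bytes[DDIR_READ])
            update_runtime(td, elapsed_us, DDIR_READ);
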
george-eldridge-wdc commented:

The only caveat with this fix is that when verify_backlog=n is used, the read workflow timer starts at the beginning of the write workflow, not when the first verify read actually occurs. For large values of verify_backlog this leads to reporting lower average read bandwidths than expected. For example, with verify_backlog=8m it can be several minutes before the first verify read occurs, but the read data rate is averaged over the entire run.
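
To put hypothetical numbers on that: with bs=512 and verify_backlog=8m, the first verify read is only issued after roughly 4 GiB has been written. If those writes take 200 seconds and the verify reads then complete in another 200 seconds, the read total is divided by the full ~400-second runtime, so fio would report about 10 MiB/s of read bandwidth even though the reads actually ran at about 20 MiB/s.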

horshack-dpreview (Contributor Author) commented Feb 25, 2023

@george-eldridge-wdc, Thanks. That aspect was discussed in the pull request thread that included a fix:

#1514

Btw, the scenario you described occurs in other areas as well. For example, --rw=rw -rwmixread=5 will report very low average read bandwidth, since reads are only 5% of the workload mix yet both reads and writes are averaged over the same workload time period.
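
As an illustration (hypothetical command, mirroring the repro above):

sudo fio -name=test -rw=rw -rwmixread=5 -filename=/dev/ram0 -bs=512b -size=100M

Only ~5 MiB of the 100 MiB is read, but that read total is divided by the whole job's runtime, so the reported read bandwidth lands near 1/19th of the write bandwidth even though individual reads complete just as quickly.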
