Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clat latency doesn't seem correct #1668

Closed
1 task done
raj-prince opened this issue Nov 6, 2023 · 1 comment
Closed
1 task done

Clat latency doesn't seem correct #1668

raj-prince opened this issue Nov 6, 2023 · 1 comment

Comments

@raj-prince
Copy link

raj-prince commented Nov 6, 2023

Please acknowledge the following before creating a ticket

Description of the bug:

4_thread: (g=0): rw=write, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=libaio, iodepth=64
...
fio-3.35-139-gc5d8
Starting 40 threads
Jobs: 1 (f=1): [W(1),_(39)][100.0%][eta 00m:00s]                                                                                                                 
4_thread: (groupid=0, jobs=40): err= 0: pid=562556: Mon Nov  6 04:07:30 2023
  write: IOPS=29, BW=986KiB/s (1010kB/s)(76.1MiB/79037msec); 0 zone resets
    slat (usec): min=306, max=735217, avg=59917.76, stdev=39694.55
    clat (usec): min=6, max=94396k, avg=41163568.65, stdev=21078844.63
     lat (msec): min=63, max=94396, avg=41197.21, stdev=21078.47
    clat percentiles (msec):
     |  1.00th=[  877],  5.00th=[10402], 10.00th=[12550], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
   bw (  KiB/s): min= 1270, max= 4705, per=100.00%, avg=3203.18, stdev=26.81, samples=919
   iops        : min=   78, max=  290, avg=198.16, stdev= 1.67, samples=919
  lat (usec)   : 10=0.09%
  lat (msec)   : 100=0.17%, 250=0.38%, 500=0.72%, 750=0.43%, 1000=0.47%
  lat (msec)   : 2000=0.72%, >=2000=204.26%
  fsync/fdatasync/sync_file_range:
    sync (msec): min=9709, max=94396, avg=43189.49, stdev=19686.84
    sync percentiles (msec):
     |  1.00th=[10402],  5.00th=[12416], 10.00th=[15368], 20.00th=[17113],
     | 30.00th=[17113], 40.00th=[17113], 50.00th=[17113], 60.00th=[17113],
     | 70.00th=[17113], 80.00th=[17113], 90.00th=[17113], 95.00th=[17113],
     | 99.00th=[17113], 99.50th=[17113], 99.90th=[17113], 99.95th=[17113],
     | 99.99th=[17113]
  cpu          : usr=0.00%, sys=0.00%, ctx=9399, majf=0, minf=0
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=199.9%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=98.3%, 8=0.0%, 16=0.0%, 32=0.0%, 64=1.7%, >=64=0.0%
     issued rwts: total=0,2350,0,2348 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=986KiB/s (1010kB/s), 986KiB/s-986KiB/s (1010kB/s-1010kB/s), io=76.1MiB (79.8MB), run=79037-79037msec

As per my understanding, lat = clat + slat
Based on the reported clat percentile latency average should be around 17 seconds. But the average clat latency is 41163568.65 (usec) which is equivalent to 41 second.

Environment:
Ubuntu-20

fio version:
latest - fio-3.35-139-gc5d8 (built from source - c5d8ce3)

Reproduction steps
Job spec:

[global]
ioengine=libaio
direct=1
fadvise_hint=0
verify=0
rw=read
bs=16K
iodepth=64
invalidate=1
ramp_time=10s
runtime=60s
time_based=1
nrfiles=1
thread=1
fsync=1
openfiles=1
group_reporting=1
allrandrepeat=1
filename_format=$jobname.$jobnum.$filenum

[4_thread]
stonewall
directory=gcs/5mb
filesize=256K
rw=write
numjobs=40
@raj-prince raj-prince changed the title Clat latency doesn't seems correct Clat latency doesn't seem correct Nov 6, 2023
@vincentkfu
Copy link
Collaborator

vincentkfu commented Nov 6, 2023

I believe the problem is that your workload produces latencies exceeding the upper bound of values that fio records accurately for the latency percentiles. For latency percentiles fio accurately records values up to 2^34 ns or about 17 seconds. See the discussion at

fio/stat.h

Line 39 in 2c0b784

/*

In stat.h line 34 try changing 29 to something like 31 in the line below.

#define FIO_IO_U_PLAT_GROUP_NR 29

Then build fio again and re-run your workload.

The IO depths line in the output does seem strange as the last category is listed at 200%.

  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=199.9%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants