eta is whack in fio version 2.1.+ #2

Closed
db-at-home opened this issue Oct 10, 2013 · 1 comment

Comments

@db-at-home

Jobs: 30 (f=30): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [0.0% done] [494.9MB/0KB/0KB /s] [63.4K/0/0 iops] [eta 1158047090d:11h:36m:31s]]

config for fio

[global]
direct=1
zonesize=1g
zoneskip=200g

runtime=120

write_bw_log

write_lat_log=lat.out

write_iops_log

ioengine=libaio
rw=read
iodepth=1
bs=8192,8192

[/dev/sda]
[/dev/sdaa]
[/dev/sdab]
[/dev/sdac]
[/dev/sdad]
[/dev/sdb]
[/dev/sdc]
[/dev/sdd]
[/dev/sde]
[/dev/sdf]
[/dev/sdg]
[/dev/sdh]
[/dev/sdi]
[/dev/sdj]
[/dev/sdk]
[/dev/sdl]
[/dev/sdm]
[/dev/sdn]
[/dev/sdo]
[/dev/sdp]
[/dev/sdq]
[/dev/sdr]
[/dev/sds]
[/dev/sdt]
[/dev/sdu]
[/dev/sdv]
[/dev/sdw]
[/dev/sdx]
[/dev/sdy]
[/dev/sdz]

@greenmang0

I faced the same issue with 1.59 and 2.2.9. I think this happens when direct=1 is used, since atomic and buffered worked as expected.

axboe pushed a commit that referenced this issue Sep 5, 2015
fio does not provide any way to verify the checksum of a block that carries meta
information. You can create a configuration that verifies the checksum
of random data, or you can verify meta information with some pattern or
random data, but not both.

Why can checksumming and meta together be useful? Meta helps to figure out internally
on the filesystem or storage which block was written in case of corruption, i.e. the offset
of the block and the block number explicitly tell us the virtual address of the block.
On the other hand, a checksum of random data helps to detect corruption. Using meta
and pattern together does not help much, since 'verify_interval' can be big enough
that the same sequence of pattern bytes will be indistinguishable internally on the filesystem
or storage.

Also, it seems to me that keeping meta header separately from generic verify header
does not make a lot of sense, since generic verify header can include all members
of meta header without any performance or other impact.

In this patch I move all members from the vhdr_meta structure to the generic verify_header,
always verifying meta with the possibility to checksum the following data: random
or pattern.

You are allowed to specify verify_pattern=str with any of the possible verification
methods and have also meta verification, i.e.

   verify=md5
   verify_pattern=0xfe

 or

   verify=sha1
   verify_pattern=0xff

 etc.

To keep everything compatible with old configurations it is still possible to specify

   verify=meta

but this option is marked as deprecated and kept only for compatibility reasons.

Before this patch, the verification layout for the specified options looks
as follows, e.g.:

 #1
    --
    verify=meta
    verify_pattern=0xff
    --

    result layout of each block: [hdr|meta|pattern]

 #2
    --
    verify_pattern=0xff
    --

    result layout of each block: [hdr|pattern]

 #3
    --
    verify=pattern
    verify_pattern=0xff
    --

    result layout of each block: [pattern]

After applying the patch, 'vhdr_meta' is always embedded into 'verify_header' and the layout
looks as follows, e.g.:

 #1
    --
    verify=meta
    verify_pattern=0xff
    --

    result layout of each block: [hdr+meta|pattern]
 #2
    --
    verify=md5|sha1|etc
    verify_pattern=0xff
    --

    result layout of each block: [hdr+meta|cksum|pattern]

 #3
    --
    verify_pattern=0xff
    --

    result layout of each block: [hdr+meta|pattern]

 #4
    --
    verify=pattern
    verify_pattern=0xff
    --

    result layout of each block: [pattern]
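
The [hdr+meta|cksum|pattern] layout above can be pictured with a small sketch. This is not fio's actual verify_header: the struct, its field names, the magic value and the toy checksum (standing in for md5, sha1, etc.) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAGIC 0xaccaU /* hypothetical marker, not fio's value */

/* Hypothetical header: the meta members live inside the generic header. */
struct demo_verify_header {
	uint32_t magic;    /* marks the start of a verify block */
	uint32_t len;      /* total block length, header included */
	uint64_t offset;   /* meta: where the block was written */
	uint64_t block_no; /* meta: sequence number of the block */
	uint32_t cksum;    /* checksum of the payload after the header */
};

/* Toy checksum standing in for a real digest. */
static uint32_t demo_cksum(const unsigned char *p, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum = sum * 31 + p[i];
	return sum;
}

/* Fill one block as [hdr+meta|cksum|pattern]. */
static void demo_fill(unsigned char *buf, size_t len, uint64_t offset,
		      uint64_t block_no, unsigned char pattern)
{
	struct demo_verify_header hdr;

	assert(len > sizeof(hdr));
	memset(buf + sizeof(hdr), pattern, len - sizeof(hdr));
	memset(&hdr, 0, sizeof(hdr));
	hdr.magic = DEMO_MAGIC;
	hdr.len = (uint32_t)len;
	hdr.offset = offset;
	hdr.block_no = block_no;
	hdr.cksum = demo_cksum(buf + sizeof(hdr), len - sizeof(hdr));
	memcpy(buf, &hdr, sizeof(hdr));
}

/* Returns 0 if the block is intact, -1 on a header or meta mismatch,
 * -2 when the payload checksum does not match. */
static int demo_check(const unsigned char *buf, size_t len, uint64_t offset,
		      uint64_t block_no)
{
	struct demo_verify_header hdr;

	if (len <= sizeof(hdr))
		return -1;
	memcpy(&hdr, buf, sizeof(hdr));
	if (hdr.magic != DEMO_MAGIC || hdr.len != len ||
	    hdr.offset != offset || hdr.block_no != block_no)
		return -1;
	if (hdr.cksum != demo_cksum(buf + sizeof(hdr), len - sizeof(hdr)))
		return -2;
	return 0;
}
```

With meta and checksum in one header, a block read back from the wrong offset and a block whose payload was corrupted produce different error codes, which is the distinction the commit message argues for.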

Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: fio@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
@axboe closed this as completed Dec 19, 2015
sitsofe added a commit to sitsofe/fio that referenced this issue Oct 15, 2017
fio on Windows with a large number of CPUs/cores frequently fails while running

./fio --cpuclock-test

even though "reliable_tsc: yes" is reported. Using clang's thread sanitizer via

CC=clang ./configure --extra-cflags="-fsanitize=thread"

and running the same on Linux also generates multiple warnings similar to the
following on a VM with 16 cores:

WARNING: ThreadSanitizer: data race (pid=23780)
  Atomic write of size 4 at 0x7ffecb865a3c by thread T15 (mutexes: write M169):
    #0 __tsan_atomic32_fetch_add /home/clang-3.9/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cc:591 (fio+0x000000471505)
    #1 atomic32_inc_return /home/fio/gettime.c:567:13 (fio+0x0000004c56c1)
    #2 clock_thread_fn /home/fio/gettime.c:607 (fio+0x0000004c56c1)

  Previous read of size 4 at 0x7ffecb865a3c by thread T4 (mutexes: write M147):
    #0 clock_thread_fn /home/fio/gettime.c:611:19 (fio+0x0000004c56e2)

  Location is stack of main thread.

  Mutex M169 (0x7d700000f6a0) created at:
    #0 pthread_mutex_init /home/clang-3.9/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1119 (fio+0x00000043b695)
    #1 fio_monotonic_clocktest /home/fio/gettime.c:694:3 (fio+0x0000004c4c12)
    #2 parse_cmd_line /home/fio/init.c:2710:15 (fio+0x0000004ce8e5)
    #3 parse_options /home/fio/init.c:2828:14 (fio+0x0000004cf3da)
    #4 main /home/fio/fio.c:47:6 (fio+0x00000054b991)

  Mutex M147 (0x7d700000f178) created at:
    #0 pthread_mutex_init /home/clang-3.9/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1119 (fio+0x00000043b695)
    #1 fio_monotonic_clocktest /home/fio/gettime.c:694:3 (fio+0x0000004c4c12)
    #2 parse_cmd_line /home/fio/init.c:2710:15 (fio+0x0000004ce8e5)
    #3 parse_options /home/fio/init.c:2828:14 (fio+0x0000004cf3da)
    #4 main /home/fio/fio.c:47:6 (fio+0x00000054b991)

  Thread T15 (tid=23796, running) created by main thread at:
    #0 pthread_create /home/clang-3.9/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:902 (fio+0x00000042c9a6)
    #1 fio_monotonic_clocktest /home/fio/gettime.c:697:7 (fio+0x0000004c4c38)
    #2 parse_cmd_line /home/fio/init.c:2710:15 (fio+0x0000004ce8e5)
    #3 parse_options /home/fio/init.c:2828:14 (fio+0x0000004cf3da)
    #4 main /home/fio/fio.c:47:6 (fio+0x00000054b991)

  Thread T4 (tid=23785, finished) created by main thread at:
    #0 pthread_create /home/clang-3.9/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:902 (fio+0x00000042c9a6)
    #1 fio_monotonic_clocktest /home/fio/gettime.c:697:7 (fio+0x0000004c4c38)
    #2 parse_cmd_line /home/fio/init.c:2710:15 (fio+0x0000004ce8e5)
    #3 parse_options /home/fio/init.c:2828:14 (fio+0x0000004cf3da)
    #4 main /home/fio/fio.c:47:6 (fio+0x00000054b991)

SUMMARY: ThreadSanitizer: data race /home/fio/gettime.c:567:13 in atomic32_inc_return

Avoid accessing t->seq directly and use __sync_val_compare_and_swap to
get at it. This shuts up the sanitizer, makes the test work on Windows
and hopefully means the appropriate memory fencing will be in place
preventing unwanted compiler or CPU reordering.

Fixes: axboe#479

Signed-off-by: Sitsofe Wheeler <sitsofe@yahoo.com>
sitsofe added a commit to sitsofe/fio that referenced this issue Oct 17, 2017
fio on Windows with 16 or 32 CPUs frequently fails while running

./fio --cpuclock-test

even though "reliable_tsc: yes" is reported. Using clang's thread sanitizer via

CC=clang ./configure --extra-cflags="-fsanitize=thread"

and running the same on Linux also generates multiple warnings similar to the
following on a VM with 16 cores:

WARNING: ThreadSanitizer: data race (pid=23780)
  Atomic write of size 4 at 0x7ffecb865a3c by thread T15 (mutexes: write M169):
    #0 __tsan_atomic32_fetch_add /home/clang-3.9/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cc:591 (fio+0x000000471505)
    #1 atomic32_inc_return /home/fio/gettime.c:567:13 (fio+0x0000004c56c1)
    #2 clock_thread_fn /home/fio/gettime.c:607 (fio+0x0000004c56c1)

  Previous read of size 4 at 0x7ffecb865a3c by thread T4 (mutexes: write M147):
    #0 clock_thread_fn /home/fio/gettime.c:611:19 (fio+0x0000004c56e2)

  Location is stack of main thread.

  Mutex M169 (0x7d700000f6a0) created at:
    #0 pthread_mutex_init /home/clang-3.9/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1119 (fio+0x00000043b695)
    #1 fio_monotonic_clocktest /home/fio/gettime.c:694:3 (fio+0x0000004c4c12)
    #2 parse_cmd_line /home/fio/init.c:2710:15 (fio+0x0000004ce8e5)
    #3 parse_options /home/fio/init.c:2828:14 (fio+0x0000004cf3da)
    #4 main /home/fio/fio.c:47:6 (fio+0x00000054b991)

  Mutex M147 (0x7d700000f178) created at:
    #0 pthread_mutex_init /home/clang-3.9/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1119 (fio+0x00000043b695)
    #1 fio_monotonic_clocktest /home/fio/gettime.c:694:3 (fio+0x0000004c4c12)
    #2 parse_cmd_line /home/fio/init.c:2710:15 (fio+0x0000004ce8e5)
    #3 parse_options /home/fio/init.c:2828:14 (fio+0x0000004cf3da)
    #4 main /home/fio/fio.c:47:6 (fio+0x00000054b991)

  Thread T15 (tid=23796, running) created by main thread at:
    #0 pthread_create /home/clang-3.9/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:902 (fio+0x00000042c9a6)
    #1 fio_monotonic_clocktest /home/fio/gettime.c:697:7 (fio+0x0000004c4c38)
    #2 parse_cmd_line /home/fio/init.c:2710:15 (fio+0x0000004ce8e5)
    #3 parse_options /home/fio/init.c:2828:14 (fio+0x0000004cf3da)
    #4 main /home/fio/fio.c:47:6 (fio+0x00000054b991)

  Thread T4 (tid=23785, finished) created by main thread at:
    #0 pthread_create /home/clang-3.9/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:902 (fio+0x00000042c9a6)
    #1 fio_monotonic_clocktest /home/fio/gettime.c:697:7 (fio+0x0000004c4c38)
    #2 parse_cmd_line /home/fio/init.c:2710:15 (fio+0x0000004ce8e5)
    #3 parse_options /home/fio/init.c:2828:14 (fio+0x0000004cf3da)
    #4 main /home/fio/fio.c:47:6 (fio+0x00000054b991)

SUMMARY: ThreadSanitizer: data race /home/fio/gettime.c:567:13 in atomic32_inc_return

Fix the above by doing the following:

- Add a configure check for __sync_val_compare_and_swap and add a helper
  atomic32_cas_return that uses it.
- Add comments noting that the atomic32_* functions act as full
  barriers.
- Don't access t->seq directly when protecting a critical region and
  instead use the atomic32_* helpers to update/read it.

The above fixes the sanitizer warnings and makes the test pass on
Windows.
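
A minimal sketch of the approach listed above, assuming a GCC or clang toolchain that provides the __sync builtins. The demo_* names are hypothetical and only modelled on the helper names in the commit message; this is not the code that went into gettime.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ITERS 1000
#define DEMO_MAX_THREADS 16

/* Atomically increment *seq and return the new value (full barrier). */
static inline uint32_t demo_atomic32_inc_return(uint32_t *seq)
{
	return __sync_add_and_fetch(seq, 1);
}

/* Compare-and-swap that returns the value *seq held before the call. */
static inline uint32_t demo_atomic32_cas_return(uint32_t *seq, uint32_t old_val,
						uint32_t new_val)
{
	return __sync_val_compare_and_swap(seq, old_val, new_val);
}

/* Read *seq through the CAS helper; swapping 0 for 0 never changes it. */
static inline uint32_t demo_atomic32_read(uint32_t *seq)
{
	return demo_atomic32_cas_return(seq, 0, 0);
}

static void *demo_worker(void *arg)
{
	uint32_t *seq = arg;

	for (int i = 0; i < DEMO_ITERS; i++)
		demo_atomic32_inc_return(seq);
	return NULL;
}

/* Run nthreads workers against one shared counter, return its final value. */
static uint32_t demo_run(int nthreads)
{
	pthread_t tids[DEMO_MAX_THREADS];
	uint32_t seq = 0;
	int i, rc = 0;

	assert(nthreads > 0 && nthreads <= DEMO_MAX_THREADS);
	for (i = 0; i < nthreads; i++) {
		rc = pthread_create(&tids[i], NULL, demo_worker, &seq);
		assert(rc == 0);
	}
	for (i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);
	(void)rc;
	return demo_atomic32_read(&seq);
}
```

Every access to the shared counter goes through an atomic helper, so no plain load or store is mixed with the atomic updates.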

Fixes: axboe#479

Signed-off-by: Sitsofe Wheeler <sitsofe@yahoo.com>
sitsofe added a commit to sitsofe/fio that referenced this issue Mar 7, 2018
Compiling with
CC=clang ./configure --extra-cflags='-fsanitize=thread'
make
and then running
./fio --cpuclock-test
generates warnings like

WARNING: ThreadSanitizer: unlock of an unlocked mutex (or by a wrong thread) (pid=324)
    #0 pthread_mutex_unlock <null> (fio+0x44ce3e)
    #1 clock_thread_fn gettime.c:604:2 (fio+0x4d16c6)

  Location is heap block of size 480 at 0x7b5000000000 allocated by main thread:
    #0 malloc <null> (fio+0x42ea4b)
    #1 fio_monotonic_clocktest gettime.c:690:13 (fio+0x4d0b1a)
    #2 parse_cmd_line init.c:2792:15 (fio+0x4dad0b)
    #3 parse_options init.c:2920:14 (fio+0x4db7b7)
    #4 main fio.c:47 (fio+0x4247fa)

  Mutex M142 (0x7b5000000038) created at:
    #0 pthread_mutex_init <null> (fio+0x42f6ba)
    #1 fio_monotonic_clocktest gettime.c:706:3 (fio+0x4d0c03)
    #2 parse_cmd_line init.c:2792:15 (fio+0x4dad0b)
    #3 parse_options init.c:2920:14 (fio+0x4db7b7)
    #4 main fio.c:47 (fio+0x4247fa)

SUMMARY: ThreadSanitizer: unlock of an unlocked mutex (or by a wrong thread) (fio+0x44ce3e) in __interceptor_pthread_mutex_unlock

valgrind --tool=helgrind ./fio --cpuclock-test
shows a similar warning:

==6607== Thread #3 unlocked lock at 0x639A730 currently held by thread #1
==6607==    at 0x4C3233B: mutex_unlock_WRK (hg_intercepts.c:1094)
==6607==    by 0x4C35CE7: pthread_mutex_unlock (hg_intercepts.c:1115)
==6607==    by 0x41B872: clock_thread_fn (gettime.c:604)
==6607==    by 0x4C349E1: mythread_wrapper (hg_intercepts.c:389)
==6607==    by 0x59A836C: start_thread (in /usr/lib64/libpthread-2.25.so)
==6607==    by 0x5ED4B4E: clone (in /usr/lib64/libc-2.25.so)
==6607==  Lock at 0x639A730 was first observed
==6607==    at 0x4C35CA3: pthread_mutex_init (hg_intercepts.c:787)
==6607==    by 0x41D2DA: fio_monotonic_clocktest (gettime.c:706)
==6607==    by 0x4232EC: parse_cmd_line (init.c:2792)
==6607==    by 0x424372: parse_options (init.c:2920)
==6607==    by 0x40E2EA: main (fio.c:47)
==6607==  Address 0x639a730 is 176 bytes inside a block of size 480 alloc'd
==6607==    at 0x4C2EF7B: malloc (vg_replace_malloc.c:299)
==6607==    by 0x41D227: fio_monotonic_clocktest (gettime.c:690)
==6607==    by 0x4232EC: parse_cmd_line (init.c:2792)
==6607==    by 0x424372: parse_options (init.c:2920)
==6607==    by 0x40E2EA: main (fio.c:47)
==6607==  Block was alloc'd by thread #1

This first issue ("unlock of an unlocked mutex (or by a wrong thread)
t->started") occurs because fio uses mutexes to arrange for all the
cycle measurement threads to start their timing together, but
http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_lock.html
warns: "If a thread attempts to unlock a mutex that it has not locked or
a mutex which is unlocked, undefined behavior results". Address this by
reworking fio to use a condition plus a condition variable to signal all
threads when it's safe to proceed.
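
The "condition plus a condition variable" pattern can be sketched as a start gate. The struct and function names below are hypothetical and do not come from fio. The point is that each thread only unlocks a mutex it locked itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_THREADS 16

/* Hypothetical start gate shared by the main thread and the workers. */
struct demo_gate {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool go;    /* the condition the workers wait for */
	int passed; /* how many workers got through the gate */
};

static void *demo_waiter(void *arg)
{
	struct demo_gate *g = arg;

	pthread_mutex_lock(&g->lock);
	while (!g->go) /* loop guards against spurious wakeups */
		pthread_cond_wait(&g->cond, &g->lock);
	g->passed++;
	pthread_mutex_unlock(&g->lock);
	return NULL;
}

/* Start nthreads waiters, open the gate once, return how many passed. */
static int demo_gate_run(int nthreads)
{
	struct demo_gate g = { .go = false, .passed = 0 };
	pthread_t tids[DEMO_MAX_THREADS];
	int i, rc = 0;

	assert(nthreads > 0 && nthreads <= DEMO_MAX_THREADS);
	pthread_mutex_init(&g.lock, NULL);
	pthread_cond_init(&g.cond, NULL);
	for (i = 0; i < nthreads; i++) {
		rc = pthread_create(&tids[i], NULL, demo_waiter, &g);
		assert(rc == 0);
	}

	pthread_mutex_lock(&g.lock);
	g.go = true;
	pthread_cond_broadcast(&g.cond); /* wake every waiter at once */
	pthread_mutex_unlock(&g.lock);

	for (i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);
	pthread_cond_destroy(&g.cond);
	pthread_mutex_destroy(&g.lock);
	(void)rc;
	return g.passed;
}
```

A worker that starts after the broadcast still passes, because it checks the go flag before waiting.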

ThreadSanitizer has a second warning too:

==================
WARNING: ThreadSanitizer: data race (pid=324)
  Read of size 4 at 0x7ffffceafdf4 by thread T2 (mutexes: write M143):
    #0 clock_thread_fn gettime.c:614:10 (fio+0x4d1743)

  Previous atomic write of size 4 at 0x7ffffceafdf4 by thread T1 (mutexes: write M141):
    #0 __tsan_atomic32_compare_exchange_val <null> (fio+0x479237)
    #1 atomic32_compare_and_swap gettime.c:576:9 (fio+0x4d1785)
    #2 clock_thread_fn gettime.c:619 (fio+0x4d1785)

  Location is stack of main thread.

  Mutex M143 (0x7b5000000088) created at:
    #0 pthread_mutex_init <null> (fio+0x42f6ba)
    #1 fio_monotonic_clocktest gettime.c:705:3 (fio+0x4d0bf7)
    #2 parse_cmd_line init.c:2792:15 (fio+0x4dad0b)
    #3 parse_options init.c:2920:14 (fio+0x4db7b7)
    #4 main fio.c:47 (fio+0x4247fa)

  Mutex M141 (0x7b5000000010) created at:
    #0 pthread_mutex_init <null> (fio+0x42f6ba)
    #1 fio_monotonic_clocktest gettime.c:705:3 (fio+0x4d0bf7)
    #2 parse_cmd_line init.c:2792:15 (fio+0x4dad0b)
    #3 parse_options init.c:2920:14 (fio+0x4db7b7)
    #4 main fio.c:47 (fio+0x4247fa)

  Thread T2 (tid=327, running) created by main thread at:
    #0 pthread_create <null> (fio+0x42f3c6)
    #1 fio_monotonic_clocktest gettime.c:708:7 (fio+0x4d0c20)
    #2 parse_cmd_line init.c:2792:15 (fio+0x4dad0b)
    #3 parse_options init.c:2920:14 (fio+0x4db7b7)
    #4 main fio.c:47 (fio+0x4247fa)

  Thread T1 (tid=326, running) created by main thread at:
    #0 pthread_create <null> (fio+0x42f3c6)
    #1 fio_monotonic_clocktest gettime.c:708:7 (fio+0x4d0c20)
    #2 parse_cmd_line init.c:2792:15 (fio+0x4dad0b)
    #3 parse_options init.c:2920:14 (fio+0x4db7b7)
    #4 main fio.c:47 (fio+0x4247fa)

SUMMARY: ThreadSanitizer: data race gettime.c:614:10 in clock_thread_fn

The second issue ("t->seq data race") seems to be because mixing atomic
and non-atomic operations on the same address might not be safe (e.g.
the compiler may be allowed to make dangerous optimisations). Fix this
warning by using a __sync_fetch_and_add() to do the read and remove the
no longer needed __sync_synchronize().
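
A minimal sketch of reading a shared counter with __sync_fetch_and_add(), assuming the GCC/clang __sync builtins. The helper names are hypothetical. Adding zero leaves the value untouched but makes the read an atomic operation with full-barrier semantics.

```c
#include <assert.h>
#include <stdint.h>

/* Atomic read: fetch-and-add of 0 returns the current value unchanged. */
static inline uint32_t demo_seq_read(uint32_t *seq)
{
	return __sync_fetch_and_add(seq, 0);
}

/* Try to move *seq from expect to expect + 1; returns 1 on success,
 * 0 if another thread changed the value first. */
static inline int demo_seq_try_advance(uint32_t *seq, uint32_t expect)
{
	return __sync_bool_compare_and_swap(seq, expect, expect + 1) ? 1 : 0;
}
```

With both the read and the update expressed as atomics, the sanitizer no longer sees a plain load racing with an atomic write on the same address.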

Signed-off-by: Sitsofe Wheeler <sitsofe@yahoo.com>
vishnuitta added a commit to vishnuitta/fio that referenced this issue Mar 17, 2019
bvanassche added a commit to bvanassche/fio that referenced this issue Jan 12, 2020
This patch fixes the following Coverity complaint:

23. zero_return: Function call utime_since(&s, &re) returns 0.
CID 280732 (#2 of 2): Division or modulo by zero (DIVIDE_BY_ZERO)
24. divide_by_zero: In expression bytes * 1000UL * 1000UL / utime_since(&s, &re), division by expression utime_since(&s, &re) which may be zero has undefined behavior.
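
One way to guard such an expression is to treat a zero elapsed time as "no rate yet". The sketch below is an assumption about how a fix could look, not the change made by this patch; demo_bytes_per_sec is a hypothetical helper and usec stands in for the value returned by utime_since().

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second from a byte count and an elapsed time in microseconds.
 * Returns 0 when no time has elapsed instead of dividing by zero. */
static uint64_t demo_bytes_per_sec(uint64_t bytes, uint64_t usec)
{
	if (usec == 0)
		return 0;
	return bytes * UINT64_C(1000000) / usec;
}
```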

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
vincentkfu pushed a commit to vincentkfu/fio that referenced this issue Jan 22, 2020
This patch fixes the following Coverity complaint:

23. zero_return: Function call utime_since(&s, &re) returns 0.
CID 280732 (#2 of 2): Division or modulo by zero (DIVIDE_BY_ZERO)
24. divide_by_zero: In expression bytes * 1000UL * 1000UL / utime_since(&s, &re), division by expression utime_since(&s, &re) which may be zero has undefined behavior.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
bvanassche added a commit to bvanassche/fio that referenced this issue Jun 13, 2020
This patch fixes the following Coverity complaint:

CID 184174 (#2 of 2): Double lock (LOCK)

Fixes: c06379a ("fio: enable overlap checking with offload submission")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
bvanassche added a commit to bvanassche/fio that referenced this issue Jun 13, 2020
The code in check_overlap() confuses Coverity so reorganize that code such
that it becomes easier to analyze. This patch fixes the following Coverity
complaint:

CID 184174 (#2 of 2): Double lock (LOCK)

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
bvanassche added a commit to bvanassche/fio that referenced this issue Jun 14, 2020
If the following happens:
* check_overlap() finds an overlap.
* All other threads finish after the overlap has been found and before
  the next iteration of the do/while loop starts.

Then the do/while loop in check_overlap() will iterate forever. Fix this
by rewriting check_overlap() such that this cannot happen.

This patch fixes the following Coverity complaint:

CID 184174 (#2 of 2): Double lock (LOCK)

Fixes: c06379a ("fio: enable overlap checking with offload submission")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
bvanassche added a commit to bvanassche/fio that referenced this issue Jul 2, 2020
If the following happens:
* check_overlap() finds an overlap.
* All other threads finish after the overlap has been found and before
  the next iteration of the do/while loop starts.

Then the do/while loop in check_overlap() will iterate forever. Fix this
by rewriting check_overlap() such that this cannot happen.

This patch fixes the following Coverity complaint:

CID 184174 (#2 of 2): Double lock (LOCK)

Fixes: c06379a ("fio: enable overlap checking with offload submission")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
bvanassche added a commit to bvanassche/fio that referenced this issue Jul 4, 2020
If the following happens:
* check_overlap() finds an overlap.
* All other threads finish after the overlap has been found and before
  the next iteration of the do/while loop starts.

Then the do/while loop in check_overlap() will iterate forever. Fix this
by rewriting check_overlap() such that this cannot happen.

This patch fixes the following Coverity complaint:

CID 184174 (#2 of 2): Double lock (LOCK)

Fixes: c06379a ("fio: enable overlap checking with offload submission")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
axboe pushed a commit that referenced this issue Sep 5, 2020
Parse "io_size=N%".

Semantics is "multiply whatever size= calculations result in".

Example #1:

	size=50%
	io_size=50%

will do 25% of a file.

Example #2:

	size=1G
	io_size=50%

will do 512M I/O.

As a side effect, fix a bug that caused an essentially infinite loop if both size=N%
and io_size=M% are given: io_size is set to 2^64-... in this case (a lot!).

Note: only values under 100% work currently.
Going for io_size=150% requires resetting workload generator state,
which is a whole separate endeavour.
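
The "multiply whatever size= calculations result in" semantics can be checked with a little arithmetic. The helper below is a hypothetical illustration, not fio's parser code, and it assumes percentages of at most 100.

```c
#include <assert.h>
#include <stdint.h>

/* Scale a byte count by a percentage without overflowing 64 bits. */
static uint64_t demo_apply_percent(uint64_t bytes, unsigned int percent)
{
	assert(percent <= 100); /* larger values are out of scope here */
	return bytes / 100 * percent + bytes % 100 * percent / 100;
}
```

For Example #2, demo_apply_percent(1 GiB, 50) gives 512 MiB. For Example #1, applying 50% twice to a file size gives a quarter of the file.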

Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
lukaszstolarczuk pushed a commit to lukaszstolarczuk/fio that referenced this issue Sep 15, 2020
Add dummy stub for rpmem server engine
rbates0119 pushed a commit to rbates0119/fio that referenced this issue Oct 28, 2020
If the following happens:
* check_overlap() finds an overlap.
* All other threads finish after the overlap has been found and before
  the next iteration of the do/while loop starts.

Then the do/while loop in check_overlap() will iterate forever. Fix this
by rewriting check_overlap() such that this cannot happen.

This patch fixes the following Coverity complaint:

CID 184174 (#2 of 2): Double lock (LOCK)

Fixes: c06379a ("fio: enable overlap checking with offload submission")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
sitsofe added a commit to sitsofe/fio that referenced this issue Jan 15, 2021
Google's OSS-fuzz turned up a heap overrun when substituting keywords in
job files. To reproduce, compile fio with address sanitizer options like
the following

LDFLAGS="-fsanitize=address" ./configure --disable-optimizations \
  --extra-cflags="-fsanitize=address"

The issue is demonstrated by the following job:

% printf '[t]\ndescription=$ncpus_' | fio --parse-only -
opt = 'description=$ncpus'
=================================================================
==22547==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000001863 at pc 0x000107a833c9 bp 0x7ffee82ac260 sp 0x7ffee82ac258
READ of size 1 at 0x603000001863 thread T0
    #0 0x107a833c8 in fio_keyword_replace options.c:5124
    #1 0x107a7c6ab in dup_and_sub_options options.c:5158
    #2 0x107a7bb4f in fio_options_parse options.c:5203
    #3 0x1079b2214 in __parse_jobs_ini init.c:2076
    #4 0x1079aff07 in parse_jobs_ini init.c:2127
    #5 0x1079b7501 in parse_options init.c:2989
    #6 0x107b876a4 in main fio.c:42
    #7 0x7fff702f1cc8 in start (libdyld.dylib:x86_64+0x1acc8)

Fix the thinko (because opt is pointing to a later position) and
rearrange some code to make it clearer that olen is being used as an
initial offset.

Signed-off-by: Sitsofe Wheeler <sitsofe@yahoo.com>
sitsofe added a commit to sitsofe/fio that referenced this issue Jan 16, 2021
Google's OSS-fuzz turned up a buffer overrun triggered by the value of the filename
option, which overflows a MAX_PATH sized buffer. To reproduce,
compile fio with address sanitizer options like the following

LDFLAGS="-fsanitize=address" ./configure --disable-optimizations \
      --extra-cflags="-fsanitize=address"

The issue is demonstrated by the following job:

% COUNT=$(getconf PATH_MAX /); printf "[t]\nfilename=%${COUNT}s" \
  | sed 's/ /@/g' | fio --parse-only -
=================================================================
==45748==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffee8e35780 at pc 0x00010735a343 bp 0x7ffee8e35270 sp 0x7ffee8e34a08
WRITE of size 1025 at 0x7ffee8e35780 thread T0
    #0 0x10735a342 in wrap_vsprintf (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x22342)
    #1 0x10735a9ac in wrap_sprintf (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x229ac)
    #2 0x106e83b01 in add_file filesetup.c:1656
    #3 0x106ee8c87 in str_filename_cb options.c:1320
    #4 0x106ee1b44 in __handle_option parse.c:792
    #5 0x106ed99ad in handle_option parse.c:1014
    #6 0x106eda07d in parse_option parse.c:1184
    #7 0x106ef10ea in fio_options_parse options.c:5199
    #8 0x106e27684 in __parse_jobs_ini init.c:2076
    #9 0x106e25377 in parse_jobs_ini init.c:2127
    #10 0x106e2c971 in parse_options init.c:2989
    #11 0x106ffc884 in main fio.c:42
    #12 0x7fff702f1cc8 in start (libdyld.dylib:x86_64+0x1acc8)

Address 0x7ffee8e35780 is located in stack of thread T0 at offset 1056 in frame
    #0 0x106e836ef in add_file filesetup.c:1644

  This frame has 1 object(s):
    [32, 1056) 'file_name' (line 1646) <== Memory access at offset 1056 overflows this variable

Return an error message to the user by doing the following:

- Allow "regular" string options to have a maxlen parameter
- Set the filename option to have a maxlen of MAX_PATH
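
The idea of a maxlen check on a string option can be sketched as follows. This is a hypothetical helper, not fio's option parser: it rejects a value that would not fit the destination buffer instead of letting sprintf() write past the end.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy value into out if it fits (terminator included); otherwise report
 * an error and leave out untouched. Returns 0 on success, -1 on error. */
static int demo_set_filename(char *out, size_t out_len, const char *value)
{
	size_t len = strlen(value);

	if (len >= out_len) {
		fprintf(stderr, "filename too long (%zu >= %zu)\n", len, out_len);
		return -1;
	}
	memcpy(out, value, len + 1);
	return 0;
}
```

Checking the length at parse time means the user gets an error message up front, rather than a silent overflow later in file setup.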

Signed-off-by: Sitsofe Wheeler <sitsofe@yahoo.com>
axboe pushed a commit that referenced this issue Jan 29, 2021
Several error conditions that are encountered during zone processing
in zbd_adjust_block() function cause it to return io_u_eof value.
This stops the i/o to the given file, but there is no error raised or
reported if this code is returned. For a few particular conditions,
just stopping the i/o is reasonable, but others are serious errors
that should be reported.

Add td_verror() calls to raise thread errors for a few abnormal
conditions during adjusting the i/o. The only test that needs to be
modified because of this change is test #2.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
axboe pushed a commit that referenced this issue Jan 29, 2021
With the preceding commit in place, fio gives an error if the user attempts
to run write I/O with a size that is larger than the zone size. Grep for that
message instead of checking that no write has happened.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
dmitry-fomichev added a commit to dmitry-fomichev/fio that referenced this issue Jan 29, 2021
Several error conditions that are encountered during zone processing
in zbd_adjust_block() function cause it to return io_u_eof value.
This stops the i/o to the given file, but there is no error raised or
reported if this code is returned. For a few particular conditions,
just stopping the i/o is reasonable, but others are serious errors
that should be reported.

Add td_verror() calls to raise thread errors for a few abnormal
conditions during adjusting the i/o. The only test that needs to be
modified because of this change is test #2.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
dmitry-fomichev added a commit to dmitry-fomichev/fio that referenced this issue Jan 29, 2021
With the preceding commit in place, fio gives an error if the user attempts
to run write I/O with a size that is larger than the zone size. Grep for that
message instead of checking that no write has happened.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
floatious added a commit to floatious/fio that referenced this issue Feb 18, 2021
Fix the following LeakSanitizer warnings:

Indirect leak of 224 byte(s) in 7 object(s) allocated from:
    #0 0x7f7377b21bc8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dbc8)
    #1 0x563951e5e09d in add_to_dump_list /home/nks/src/fio/parse.c:1135
    #2 0x563951e5e09d in add_to_dump_list /home/nks/src/fio/parse.c:1127
    #3 0x563951e5e09d in parse_cmd_option /home/nks/src/fio/parse.c:1162

Indirect leak of 43 byte(s) in 7 object(s) allocated from:
    #0 0x7f7377aaa3dd in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x963dd)
    #1 0x563951e5e0a8 in add_to_dump_list /home/nks/src/fio/parse.c:1136
    #2 0x563951e5e0a8 in add_to_dump_list /home/nks/src/fio/parse.c:1127
    #3 0x563951e5e0a8 in parse_cmd_option /home/nks/src/fio/parse.c:1162

Indirect leak of 36 byte(s) in 7 object(s) allocated from:
    #0 0x7f7377aaa3dd in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x963dd)
    #1 0x563951e5e0b9 in add_to_dump_list /home/nks/src/fio/parse.c:1138
    #2 0x563951e5e0b9 in add_to_dump_list /home/nks/src/fio/parse.c:1127
    #3 0x563951e5e0b9 in parse_cmd_option /home/nks/src/fio/parse.c:1162

by moving fio_dump_options_free() to options.h,
so that we can call it during exit.
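
The leak pattern in the report (one malloc plus two duplicated strings per entry) and its cleanup can be sketched like this. The list type and the demo_* functions are hypothetical and only mirror the shape suggested by the traces; they are not fio's dump list code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical dump list entry: two owned strings and a link. */
struct demo_dump_entry {
	char *name;
	char *value;
	struct demo_dump_entry *next;
};

/* Portable stand-in for strdup(). */
static char *demo_dup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

/* Push a copy of name/value onto the list. Returns 0 on success, -1 on OOM. */
static int demo_dump_add(struct demo_dump_entry **head, const char *name,
			 const char *value)
{
	struct demo_dump_entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->name = demo_dup(name);
	e->value = demo_dup(value);
	if (!e->name || !e->value) {
		free(e->name);
		free(e->value);
		free(e);
		return -1;
	}
	e->next = *head;
	*head = e;
	return 0;
}

/* Free every entry and both of its strings; returns the number freed. */
static int demo_dump_free(struct demo_dump_entry **head)
{
	int freed = 0;

	while (*head) {
		struct demo_dump_entry *e = *head;

		*head = e->next;
		free(e->name);
		free(e->value);
		free(e);
		freed++;
	}
	return freed;
}
```

Calling the free function once on the exit path releases all three allocations per entry, which is what clears the three groups of indirect leaks.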

Reproducer:
LD_PRELOAD=libasan.so.5 fio --name=test --filename=/dev/nullb0 --runtime=2

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>