FIO always reporting "Invalid or incomplete multibyte or wide character" #740

Open
chamarthy opened this issue Feb 8, 2019 · 5 comments

@chamarthy

I am consistently getting an "Invalid or incomplete multibyte or wide character" error on almost all filesystems with the following profile:

# fio --name=fileio --directory=/nas1_fs1_nfs1 --ioengine=libaio --direct=1 --size=8G --nrfiles=100 --openfiles=10 --file_service_type=random --bsrange=1k-16k --create_on_open=1 --rw=randrw --fallocate=native --refill_buffers --verify=crc32c --do_verify=1 --iodepth=128 --verify_fatal=1 --verify_dump=1 --verify_state_save=1 --exitall_on_error --error_dump=1 --time_based --runtime=99999
fileio: (g=0): rw=randrw, bs=(R) 1024B-16.0KiB, (W) 1024B-16.0KiB, (T) 1024B-16.0KiB, ioengine=libaio, iodepth=128
fio-3.12-117-gfda82
Starting 1 process
verify: bad header rand_seed 8653442502461740848, wanted 1277040692297970740 at file /nas1_fs1_nfs1/fileio.0.83 offset 27108352, length 1024 (requested block: offset=27108352, length=4096)
       hdr_fail data dumped as fileio.0.83.27108352.hdr_fail
fio: pid=7019, err=84/file:io_u.c:2074, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character

fileio: (groupid=0, jobs=1): err=84 (file:io_u.c:2074, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character): pid=7019: Fri Feb  8 02:38:13 2019
  read: IOPS=1015, BW=6194KiB/s (6343kB/s)(4100MiB/677811msec)
    slat (usec): min=3, max=1954, avg=23.19, stdev=12.83
    clat (usec): min=1407, max=412828, avg=62449.17, stdev=26753.09
     lat (usec): min=1417, max=412882, avg=62472.74, stdev=26753.86
    clat percentiles (msec):
     |  1.00th=[   37],  5.00th=[   41], 10.00th=[   44], 20.00th=[   47],
     | 30.00th=[   50], 40.00th=[   53], 50.00th=[   56], 60.00th=[   59],
     | 70.00th=[   64], 80.00th=[   72], 90.00th=[   88], 95.00th=[  108],
     | 99.00th=[  180], 99.50th=[  213], 99.90th=[  296], 99.95th=[  317],
     | 99.99th=[  355]
   bw (  KiB/s): min= 1388, max=11125, per=99.99%, avg=6193.28, stdev=1707.40, samples=1355
   iops        : min=  192, max= 1578, avg=1014.81, stdev=236.51, samples=1355
  write: IOPS=1013, BW=6182KiB/s (6330kB/s)(4092MiB/677807msec)
    slat (usec): min=12, max=238645, avg=949.54, stdev=1605.88
    clat (usec): min=1451, max=411650, avg=62734.13, stdev=26535.63
     lat (msec): min=2, max=413, avg=63.68, stdev=26.85
    clat percentiles (msec):
     |  1.00th=[   37],  5.00th=[   42], 10.00th=[   44], 20.00th=[   47],
     | 30.00th=[   50], 40.00th=[   53], 50.00th=[   56], 60.00th=[   59],
     | 70.00th=[   65], 80.00th=[   72], 90.00th=[   88], 95.00th=[  108],
     | 99.00th=[  180], 99.50th=[  209], 99.90th=[  292], 99.95th=[  317],
     | 99.99th=[  355]
   bw (  KiB/s): min= 1470, max=11602, per=100.00%, avg=6181.17, stdev=1677.05, samples=1355
   iops        : min=  200, max= 1606, avg=1013.58, stdev=231.95, samples=1355
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=31.68%
  lat (msec)   : 100=61.92%, 250=6.14%, 500=0.25%
  cpu          : usr=1.99%, sys=7.23%, ctx=651911, majf=0, minf=16187
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwts: total=688176,687235,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
   READ: bw=6194KiB/s (6343kB/s), 6194KiB/s-6194KiB/s (6343kB/s-6343kB/s), io=4100MiB (4299MB), run=677811-677811msec
  WRITE: bw=6182KiB/s (6330kB/s), 6182KiB/s-6182KiB/s (6330kB/s-6330kB/s), io=4092MiB (4291MB), run=677807-677807msec

Disk stats (read/write):
    dm-0: ios=1151899/1166196, merge=0/0, ticks=697327/831383, in_queue=1528681, util=96.74%, aggrios=1150446/1159076, aggrmerge=1453/7120, aggrticks=694809/824265, aggrin_queue=1518207, aggrutil=96.50%
  sda: ios=1150446/1159076, merge=1453/7120, ticks=694809/824265, in_queue=1518207, util=96.50%

I do not know whether this is an issue with fio or with certain filesystems. I tried the same profile on NFS and on ext4 and hit the issue both times. Can someone have a look and confirm whether there is an issue with fio here?

fio --version

fio-3.12-117-gfda82

@axboe
Owner

axboe commented Feb 8, 2019

Does it reproduce if you do --rw=randwrite instead?

@sitsofe added the needreporterinfo (Waiting on information from the issue reporter) label on Feb 9, 2019
@chamarthy
Author

If I specify --rw=randwrite, the verification phase does not start at all. I was expecting it to verify after each write phase.

[root@vm-d00620 ~]# fio --name=fileio --directory=/nas1_fs1_nfs1 --ioengine=libaio --direct=1 --size=8G --nrfiles=100 --openfiles=10 --file_service_type=random --bsrange=1k-16k --create_on_open=1 --rw=randwrite --fallocate=native --refill_buffers --verify=crc32c --do_verify=1 --iodepth=128 --verify_fatal=1 --verify_dump=1 --verify_state_save=1 --exitall_on_error --error_dump=1 --time_based --runtime=99999
fio: verification read phase will never start because write phase uses all of runtime
fileio: (g=0): rw=randwrite, bs=(R) 1024B-16.0KiB, (W) 1024B-16.0KiB, (T) 1024B-16.0KiB, ioengine=libaio, iodepth=128
fio-3.12-117-gfda82
Starting 1 process
Jobs: 1 (f=10): [w(1)][0.0%][w=5375KiB/s][w=895 IOPS][eta 01d:03h:46m:32s]

The error seems to occur only for time_based execution with runtime set. If I remove time_based, the job seems to pass for a single iteration. I tried with loops and it seems to work as expected.
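
For reference, this is roughly the variant that passes for me (a sketch only; the loop count below is just an example, the remaining options are unchanged from the original profile):

fio --name=fileio --directory=/nas1_fs1_nfs1 --ioengine=libaio --direct=1 --size=8G --nrfiles=100 --openfiles=10 --file_service_type=random --bsrange=1k-16k --create_on_open=1 --rw=randrw --fallocate=native --refill_buffers --verify=crc32c --do_verify=1 --iodepth=128 --verify_fatal=1 --verify_dump=1 --verify_state_save=1 --exitall_on_error --error_dump=1 --loops=5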

@sitsofe
Collaborator

sitsofe commented Feb 9, 2019

If I specify --rw=randwrite, the verification phase does not start at all. I was expecting it to verify after each write phase.

[root@vm-d00620 ~]# fio --name=fileio --directory=/nas1_fs1_nfs1 --ioengine=libaio --direct=1 --size=8G --nrfiles=100 --openfiles=10 --file_service_type=random --bsrange=1k-16k --create_on_open=1 --rw=randwrite --fallocate=native --refill_buffers --verify=crc32c --do_verify=1 --iodepth=128 --verify_fatal=1 --verify_dump=1 --verify_state_save=1 --exitall_on_error --error_dump=1 --time_based --runtime=99999
fio: verification read phase will never start because write phase uses all of runtime

^^^ I think this is trying to tell you your expectation is wrong! Ask yourself - when does the write phase of a time_based job finish (hint: time_based jobs don't stop when they reach the end of the disk because they wrap around)? Have you seen https://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-verify-backlog ?
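
As an untested sketch, something along these lines should interleave verification with the writes so a time_based job still verifies as it goes (the verify_backlog value of 1024 is only an example, tune it to taste):

fio --name=fileio --directory=/nas1_fs1_nfs1 --ioengine=libaio --direct=1 --size=8G --bsrange=1k-16k --rw=randwrite --refill_buffers --verify=crc32c --do_verify=1 --verify_backlog=1024 --verify_fatal=1 --iodepth=128 --time_based --runtime=99999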

fileio: (g=0): rw=randwrite, bs=(R) 1024B-16.0KiB, (W) 1024B-16.0KiB, (T) 1024B-16.0KiB, ioengine=libaio, iodepth=128
fio-3.12-117-gfda82
Starting 1 process
Jobs: 1 (f=10): [w(1)][0.0%][w=5375KiB/s][w=895 IOPS][eta 01d:03h:46m:32s]

The error seems to occur only for time_based execution with runtime set. If I remove time_based, the job seems to pass for a single iteration. I tried with loops and it seems to work as expected.

I think the problem is simply that the rules around what happens when you use randrw for a verify job are very complicated. We'd have to look to see if a time_based/randrw verify job verifies all data rather than only data written by that job (because there's no concept of a pass).

@chamarthy
Author

chamarthy commented Feb 11, 2019

If I specify --rw=randwrite, the verification phase does not start at all. I was expecting it to verify after each write phase.

[root@vm-d00620 ~]# fio --name=fileio --directory=/nas1_fs1_nfs1 --ioengine=libaio --direct=1 --size=8G --nrfiles=100 --openfiles=10 --file_service_type=random --bsrange=1k-16k --create_on_open=1 --rw=randwrite --fallocate=native --refill_buffers --verify=crc32c --do_verify=1 --iodepth=128 --verify_fatal=1 --verify_dump=1 --verify_state_save=1 --exitall_on_error --error_dump=1 --time_based --runtime=99999
fio: verification read phase will never start because write phase uses all of runtime

^^^ I think this is trying to tell you your expectation is wrong! Ask yourself - when does the write phase of a time_based job finish (hint: time_based jobs don't stop when they reach the end of the disk because they wrap around)? Have you seen https://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-verify-backlog ?

Yes. I tried with verify_backlog and it works as expected.

fileio: (g=0): rw=randwrite, bs=(R) 1024B-16.0KiB, (W) 1024B-16.0KiB, (T) 1024B-16.0KiB, ioengine=libaio, iodepth=128
fio-3.12-117-gfda82
Starting 1 process
Jobs: 1 (f=10): [w(1)][0.0%][w=5375KiB/s][w=895 IOPS][eta 01d:03h:46m:32s]

The error seems to occur only for time_based execution with runtime set. If I remove time_based, the job seems to pass for a single iteration. I tried with loops and it seems to work as expected.

I think the problem is simply that the rules around what happens when you use randrw for a verify job are very complicated. We'd have to look to see if a time_based/randrw verify job verifies all data rather than only data written by that job (because there's no concept of a pass).

As of now I think I will keep using verify_backlog.
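
In case it helps anyone else, the same idea expressed as a job file might look like this (a sketch only; the verify_backlog value is illustrative, not a recommendation):

[fileio]
directory=/nas1_fs1_nfs1
ioengine=libaio
direct=1
size=8G
rw=randwrite
bsrange=1k-16k
refill_buffers
verify=crc32c
do_verify=1
verify_backlog=1024
verify_fatal=1
iodepth=128
time_based
runtime=99999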

@gzhx106

gzhx106 commented Apr 3, 2019

This error occurred during script-based testing with the following command:
fio -filename=/dev/nvme2n1 -direct=1 -ioengine=libaio -continue_on_error=none -random_generator=tausworthe64 -rw=randrw -bs=3388K -numjobs=1 -iodepth=256 -runtime=20000 -loops=$tc_hour -time_based -name=ssd_rw -do_verify=1 -verify_dump=1 -verify=md5

error info:
(g=0): rw=randrw, bs=(R) 3388KiB-3388KiB, (W) 3388KiB-3388KiB, (T) 3388KiB-3388KiB, ioengine=libaio, iodepth=256
fio-3.12
Starting 1 process
fio: pid=8610, err=84/file:io_u.c:2074, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character

(groupid=0, jobs=1): err=84 (file:io_u.c:2074, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character): pid=8610:

Is there an issue with fio here?

KyleMahlkuch pushed a commit to KyleMahlkuch/avocado-misc-tests that referenced this issue Aug 5, 2020
Runs with filesystems require a different fio command than runs without filesystems. This patch creates a separate command for the no_fs case. It also corrects the path for the --filename variable in the fs case.

This change may expose a known issue with fio:
(see: axboe/fio#740)
There is temporary code to make this an avocado warning instead of an
error.

We should declare variables before the test cancels and goes to cleanup, because cleanup expects fs_create and lv_create. The current code throws an error when btrfs is not supported.

Signed-off-by: Kyle Mahlkuch <kmahlkuc@linux.vnet.ibm.com>
KyleMahlkuch pushed a commit to KyleMahlkuch/avocado-misc-tests that referenced this issue Aug 5, 2020
Runs with filesystems require a different fio command than runs without filesystems. This patch creates a separate command for the no_fs case. It also corrects the path for the --filename variable in the fs case.

This change may expose a known issue with fio:
(see: axboe/fio#740)
There is temporary code to make this an avocado warning instead of an
error because the test still completes.

We should declare variables before the test cancels and goes to cleanup, because cleanup expects fs_create and lv_create. The current code throws an error when btrfs is not supported.

Signed-off-by: Kyle Mahlkuch <kmahlkuc@linux.vnet.ibm.com>