
When I use numjobs=8, the performance is lower than numjobs=1 #1046

Closed
ZhenLiu94 opened this issue Jul 21, 2020 · 10 comments
Labels
needreporterinfo Waiting on information from the issue reporter

Comments

@ZhenLiu94

When I use fio to test my ceph cluster, some results are not as expected. For example, the command I use is
fio --filename=/dev/rbd0 -iodepth=128 -rw=read -ioengine=libaio -bs=4k -size=100G -numjobs=8 -group_reporting -direct=1 -name=iops_write -runtime=600
but the cluster's IOPS result is lower than with numjobs=1. So I want to ask: when I use fio to test a ceph cluster, how should I set fio's parameters?
thx~

@Martins3

I think it's unreasonable, but you should provide more details; please read how to report bugs.

@ZhenLiu94
Author

I think it's unreasonable, but you should provide more details; please read how to report bugs.

Thanks~

@sitsofe
Collaborator

sitsofe commented Jul 21, 2020

I would like to echo @Martins3's comment about the need for the minimum information.

At a glance, I will note that a depth of 128 is already high for a single job, and cranking up numjobs to 8 means you are effectively asking for a total depth of 1024. If your backend doesn't actually have a "depth" that deep, then all you are doing is creating contention for the depth it can service, and the more contention there is, the worse the speeds you achieve will be, because latency goes up.
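
As a rough illustration only (the numbers here are mine, not a recommendation), keeping the total outstanding I/O at the level your single job produced means dividing the depth across the jobs:

fio --filename=/dev/rbd0 --iodepth=16 --rw=read --ioengine=libaio --bs=4k --size=100G --numjobs=8 --group_reporting --direct=1 --name=iops_read --runtime=600

Eight jobs at a depth of 16 each gives the same 128 outstanding I/Os that your one job at iodepth=128 produced, so you would be comparing like with like.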

@sitsofe
Collaborator

sitsofe commented Jul 22, 2020

Let's make the next action clearer: @ZhenLiu94 what is the output of

grep . /sys/block/rbd0/device/{queue_depth,nr_requests}
grep . /sys/block/rbd0/queue/max_sectors_kb

@sitsofe added the needreporterinfo (Waiting on information from the issue reporter) label on Jul 22, 2020
@ZhenLiu94
Author

Let's make the next action clearer: @ZhenLiu94 what is the output of

grep . /sys/block/rbd0/device/{queue_depth,nr_requests}
grep . /sys/block/rbd0/queue/max_sectors_kb

My rbd0 is a cloud drive, so I don't have /sys/block/rbd0/device/ on my system, but I can cd to /sys/block/rbd0 and run "ls". The output is:

alignment_offset  bdi  capability  dev  discard_alignment  ext_range  hidden  holders  inflight  integrity  mq  power  queue  range  removable  ro  size  slaves  stat  subsystem  trace  uevent

I thought about this a lot last night; here's my idea:
When I start 8 or more jobs, the CPU cannot run that many threads in parallel at once, so the threads block. As a result, the latency becomes extremely high, making the test result worse than with a single thread.
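
A rough way to check that idea (just a sketch) would be to watch per-CPU utilisation while fio is running, for example:

mpstat -P ALL 1

If no core sits near 100% during the run, the bottleneck is probably not the CPU.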

@sitsofe
Collaborator

sitsofe commented Jul 22, 2020

@ZhenLiu94 Depending on your machine it could be threads, but it could just as well be in-flight I/Os; I would be more likely to suspect the latter. There will be some maximum number of I/Os that can be in flight at once, and then some maximum number of I/Os that the kernel is willing to queue up internally on top of that. A Google search (https://www.google.com/search?q=linux+rbd+queue_depth ) leads you to https://docs.ceph.com/docs/mimic/man/8/rbd/ which mentions a default queue depth of 128, which feels like a hint :-). Did you check what happened if you used just two jobs and then four? Did you check what iostat said the queue depth of your rbd device was while fio was running with all these permutations?
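
For example (treat this as a sketch; the exact column names differ between sysstat versions), running this alongside fio will show what each device is actually seeing:

iostat -x 1

In the rbd0 row, the avgqu-sz/aqu-sz column is the queue depth the device is really servicing, and await is the latency each I/O pays while it sits in that queue.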

@sitsofe
Collaborator

sitsofe commented Jul 22, 2020

I will also note my greps above were wrong and should have been:

grep . /sys/block/rbd0/queue/{max_sectors_kb,nr_requests}
grep . /sys/block/rbd0/device/queue_depth

Obviously you don't have a device/ directory in your case, but queue/ should be there...

@sitsofe
Collaborator

sitsofe commented Jul 23, 2020

@ZhenLiu94 any follow up on this one?

@Martins3

Martins3 commented Jul 26, 2020

I ran fio --size=10G --readwrite=rw --rwmixread=10 with different numjobs on an SSD; here are the results. I'm kind of confused too.

Tested with btrfs: [filesystem benchmark chart]
Tested with ext4: [filesystem benchmark chart]

1. /proc/cpuinfo
Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
2. uname -a
Linux n8-030-171 4.19.28.bsk.3-amd64 #4.19.28.bsk.3 SMP Debian 4.19.28.bsk.3 Sun Apr 14 11:46:25 UTC  x86_64 GNU/Linux
3. fio --version
root@n8-030-171:~/huxueshi/flatten# fio --version
fio-3.21-7-g5090
root@n8-030-171:~/huxueshi/flatten# grep . /sys/block/nvme0n1/queue/max_sectors_kb
128
root@n8-030-171:~/huxueshi/flatten# grep . /sys/block/nvme0n1/queue/nr_requests
511
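
The sweep itself was roughly of this shape (a sketch: --name and --group_reporting are added here so the command is complete, and the exact list of job counts may have differed):

for j in 1 2 4 8 16; do
    fio --name=mixrw --size=10G --readwrite=rw --rwmixread=10 --numjobs=$j --group_reporting
done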

@sitsofe
Collaborator

sitsofe commented Jul 26, 2020

@Martins3:

There is going to be some maximum number of I/Os that your hardware can keep in flight at any given time. If we assume that the I/Os reaching the disk are a uniform size, then once you have hit that limit, future I/Os have to sit and wait in a queue before they can be serviced. The deeper that queue becomes, the higher the latency per I/O (because the latency now includes the time it takes to get through that queue). Thus you can hit a throughput plateau but see latency steadily rise.

Btrfs is a copy-on-write filesystem and this workload will likely cause rapid fragmentation (a Google search for btrfs fragmentation quickly got me to https://btrfs.wiki.kernel.org/index.php/Gotchas#Fragmentation ), so I don't think that result is unexpected either.

Having said all that, I suspect there was more in the fio job than you posted (I notice the chart mentions iodepth, which isn't covered by the fio command line you gave, so I'll guide you towards https://github.com/axboe/fio/blob/master/REPORTING-BUGS#L14 ), so there could be more to this...
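
To see that plateau directly, a sweep along these lines (just a sketch: adjust the filename, and the exact output line formats differ a little between fio versions) shows IOPS flattening while latency keeps climbing as the depth rises:

for d in 1 4 16 64 256; do
    echo "=== iodepth=$d ==="
    fio --name=depth_sweep --filename=/path/to/testfile --size=10G --direct=1 --rw=randread --bs=4k --ioengine=libaio --iodepth=$d --runtime=30 --time_based | grep -E 'IOPS| lat \('
done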

However, from what's been posted I can't see anything that looks like a bug in fio, and the questions seem more of the form "How do I...?"/"Why is...?" (and this issue has now attracted two different setups, so it's becoming more unfocused). Such questions are better aimed at the fio mailing list (note that the list only accepts plain-text emails). I strongly recommend continuing any further discussion there, so I'll close the issue here.
