
Performance plateaus when testing more than 3 drives with multiple jobs per drive #703

Closed
rustyscottweber opened this issue Oct 9, 2018 · 5 comments
Labels
needreporterinfo Waiting on information from the issue reporter

Comments

@rustyscottweber

As the title suggests, using a single job file to run performance tests against multiple drives at the same time results in a performance plateau once more than three drives are added, and throughput does not scale to the expected performance of the drives being used. The result is the same regardless of whether the threading or process model is used. However, if the fio job file is split into multiple processes, with each drive getting its own process spawned from the command line, the results scale correctly. For now, we have been able to work around the issue by doing just that. Is there some type of global lock on each process that might be creating a bottleneck?

Example job attached.
ExampleFiles.zip
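
For context, the workaround amounts to something like the following, with one independent fio process per drive (the device and job-file names here are placeholders, not the actual contents of the attachment):

  # Spawn one fio process per drive from the shell, then wait for all of them.
  fio sdb_job.fio > sdb.log &
  fio sdc_job.fio > sdc.log &
  fio sdd_job.fio > sdd.log &
  wait

Run this way, each drive's jobs scale as expected.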

@sitsofe
Collaborator

sitsofe commented Oct 9, 2018

Hmm, I'm fairly sure fio doesn't share very much at all when jobs are forked off, but I wonder whether having them all come from the same parent influences how they are scheduled... Could you use one of the CPU options (e.g. https://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-cpus-allowed-policy ) that forcefully spreads them out?
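
As a rough sketch, assuming the jobs share a [global] section (the CPU range below is just an example and needs to match your machine):

  [global]
  # CPUs the jobs may use; adjust to the cores actually present
  cpus_allowed=0-31
  # give each job its own CPU instead of sharing the whole set
  cpus_allowed_policy=split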

@sitsofe
Collaborator

sitsofe commented Oct 10, 2018

Quick summary:

  • ioengine is posixaio
  • filename contains 15 disks
  • There are 256 jobs
  • Each job is working on a different 32 gigabyte region
  • iodepth is not set (so it will be at its default of 1)
  • end_fsync is 1

You appear to have plenty of CPU left so I'm a bit baffled. The only thing I wonder about is whether end_fsync is causing every file to be fsync'd every time 32 GBytes has been written by any job (see fio/backend.c, line 1122 in f1867a7:

if (should_fsync(td) && td->o.end_fsync) {

for where this might happen).
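
One quick way to test that guess (a sketch only; everything else stays as in your original job file) is to rerun with the end-of-job fsync disabled and compare how the numbers scale:

  [global]
  # same options as in the original job file, except:
  end_fsync=0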

However, as this isn't so much an issue in fio as a "How do I/Why is?" question, it is better aimed at the fio mailing list (note that the list only accepts plain-text emails)... Could you move this question there?

@axboe
Owner

axboe commented Oct 10, 2018

It's worth noting that posixaio is usually a horrible choice, particularly so with iodepth=1. This is because the underlying libc implementation is a thread pool, and not a very good one at that. Nobody should be using posixaio on Linux...
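
On Linux the usual alternative is something along these lines (the depth value is just illustrative):

  [global]
  # native Linux AIO instead of the glibc POSIX AIO thread pool
  ioengine=libaio
  # libaio is only truly asynchronous with non-buffered I/O
  direct=1
  # keep more than one I/O in flight per job
  iodepth=32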

@sitsofe sitsofe added the needreporterinfo Waiting on information from the issue reporter label Oct 16, 2018
@sitsofe
Collaborator

sitsofe commented Oct 16, 2018

Setting to needinfo pending a reply from @rustyscottweber ...

@sitsofe
Collaborator

sitsofe commented Nov 11, 2018

Closing because we need more information from the reporter to take this further.

@rustyscottweber - please re-open this issue if/when you're able to supply the information requested in #703 (comment). Thank you!
