Run FIO on iSCSI but can't reach the network speed limit #508
@WisQTuser Hi there! As phrased this is unactionable by the fio project because the problem is too vague and open ended: there are simply too many variables and not enough information to help. You've said it's slow, but why should I believe the slowness is within fio? :-) For example: is your kernel misconfigured? Did you give fio a bad job with settings that limit it? Are your iSCSI initiator settings good enough? Did tgtd or fio run out of CPU? Are you using an old version of fio? Are you using jumbo frames? Do other tools go faster, and if so how are they submitting I/O? And so forth. Even considering only that list of questions, the fio project would only have the time and capacity to help you with one of them ("Did you give fio a bad job?"), and could only do so if you included the information requested in https://github.com/axboe/fio/blob/fio-3.3/REPORTING-BUGS . Asking someone to debug and tune your entire system is too big a request for a GitHub issue ;-) I don't know what others will say, but I would suggest that you close this issue until you've narrowed your problem down to something provably within fio itself (and ideally reproducible without tgtd). Until then a better starting point might be the tgtd mailing list over on http://vger.kernel.org/vger-lists.html#stgt . Also note that, while extremely flexible, tgtd isn't the fastest iSCSI target I've seen.
Hi,
This is a user error; I know plenty of folks who are driving way more throughput, iSCSI included. psync with iodepth > 1 doesn't make sense, as the maximum depth for a sync engine is 1. I'd experiment with libaio if you want higher queue depths per thread, and potentially with reducing the massive buffer size.
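As an illustration of that advice (a sketch only, not the reporter's actual job; the device path is a placeholder), a libaio job where iodepth actually takes effect might look like:

```ini
; hypothetical fio job file: an async engine, so iodepth is meaningful
[global]
ioengine=libaio      ; psync/pvsync cap the effective depth at 1
direct=1             ; bypass the page cache
rw=read
bs=128k              ; far smaller than the original 16M buffers
iodepth=32           ; real outstanding I/Os, now that the engine is async

[seqread]
filename=/dev/sdX    ; placeholder for the iSCSI-attached device
runtime=30s
time_based
```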
@WisQTuser - There are all sorts of other issues that could be going on with iSCSI specifically, having nothing to do with fio. What I would ask first is: are you actually seeing completely different results from another tool or tools, or from testing via some other means? With iSCSI I would start with large sequential I/Os at a depth of 1, with a single job. Figure out whether you can push a single I/O stream, and instead of a block size of 16M, which I am quite sure is not ideal with pretty much any target out there, consider what iSCSI is meant to mimic, namely disks: start with an 8K block, then go to perhaps 64K, then 128K and maybe up to 1M, and see whether you observe a progression. Of course, whatever is on the other end of the iSCSI target matters. Being on the vendor side of this, I see these sorts of things all the time. Networks are rarely a problem.
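That block-size progression can be expressed as a single fio job file with stonewalled sections (a sketch; /dev/sdX again stands in for the iSCSI-attached device):

```ini
; hypothetical sweep: 8K -> 64K -> 128K -> 1M sequential reads, one at a time
[global]
ioengine=libaio
direct=1
rw=read
iodepth=1            ; a single outstanding I/O, as suggested above
filename=/dev/sdX    ; placeholder for the iSCSI-attached device
runtime=15s
time_based
stonewall            ; run each section in turn rather than in parallel

[bs8k]
bs=8k

[bs64k]
bs=64k

[bs128k]
bs=128k

[bs1m]
bs=1M
```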
@WisQTuser I can only second the comments people have posted already:
At this stage I think it's unlikely to be an issue in fio, but here's an experiment: run the following as root on the iSCSI client machine, and post back the full fio output.
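The command in question (quoted back verbatim by the reporter later in this thread) targets Linux's null block device, which requires loading the null_blk module first:

```shell
# Load the null block device driver (creates /dev/nullb0)
modprobe null_blk
# Compare a sync engine against libaio at depth 128 on a "disk" with no real latency
fio --filename=/dev/nullb0 --direct=1 --rw=read --bs=127k --stonewall \
    --runtime=10s --time_based \
    --name=test1psync --ioengine=pvsync \
    --name=test2libaio --ioengine=libaio --iodepth=128
```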
Hi @sitsofe ,
Ans: After building an iSCSI target backed by an NVMe SSD on Windows Server, the client's FIO storage performance meets the expected result (it reaches the NVMe SSD's performance of up to 3,100 MB/s through a Mellanox ConnectX-4 100Gbps network adapter), but with a Linux target it can't (it is always limited to about 1,500 MB/s).
Ans: We tried the command from your comment, "fio --filename=/dev/nullb0 --direct=1 --rw=read --bs=127k --stonewall --runtime=10s --time_based --name=test1psync --ioengine=pvsync --name=test2libaio --ioengine=libaio --iodepth=128", and we still see the problem.
OK so you're saying Windows iSCSI target <-> Linux iSCSI client reaches the speeds you expect but Linux tgtd iSCSI target <-> Linux iSCSI client doesn't?
That command was designed to be run explicitly against Linux's null block device, NOT your iSCSI target disk:
Can you make sure that you are running that fio command explicitly against Linux's null block device as above, and please post the full output of running those commands into this issue (i.e. don't summarise it; really copy and paste the output of running those commands into this GitHub issue).
@WisQTuser Just in case it's not clear what I mean by full output: take a look at #509 (comment) . There the reporter not only posts the command they ran (the line that starts …).
@WisQTuser your previous comment seems to suggest that whatever you're seeing isn't down to fio itself. If you're unable to post the full output of the null block device test, could you please close this issue, as it can't be meaningfully taken further here.
Hi @sitsofe FIO command on the client:
@WisQTuser Well (sigh)... the results are for a different device than the one I was asking about, and I was very explicit the last time [emphasis added]:
You also left out the answer to this question:
When loaded, the null block devices start at
So with the pvsync ioengine fio is pushing at least 27 gigabytes per second using a single core, and looking at the cpu line shows we're using near enough 100% CPU. The libaio result is even higher but uses more userspace time. Put another way: if your "disk" can go fast enough, on my system fio can push tens of gigabytes per second while using 20% or less CPU for itself on a single core, even at an I/O depth of 1! So this suggests fio itself is not your bottleneck, but as a last ditch effort let's analyze what you posted...

Looking at the rightmost terminal in the screenshots you posted, we see submission latencies (time to submit the I/O and have the kernel tell us it's queued it up for sending) in the 1-6 microsecond range, which is fine. Unfortunately your completion latencies (time from when the kernel accepted the I/O for queuing until it got a reply back from the underlying disk saying the I/O completed, and we then notice the kernel telling us it's done) are in the 1-108 millisecond range with an average of 11.2ms, which is not fine for fast speeds. In short, fio is telling you that I/Os to … In short, your problem is down to the latency of something in or below …
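A back-of-envelope check (my own illustration, not part of the original reply) shows why that completion latency is enough to explain the observed cap: with a synchronous engine the effective queue depth is 1, so throughput is bounded by block size divided by average completion latency, using the 16M block size and 11.2ms average from this thread.

```python
# Illustrative arithmetic: at queue depth 1, throughput <= block_size / avg_clat.
block_size = 16 * 1024 * 1024   # the original job's 16M blocks, in bytes
avg_clat_s = 11.2e-3            # 11.2 ms average completion latency (from the screenshots)
bound = block_size / avg_clat_s # bytes per second ceiling at depth 1
print(f"{bound / 1e6:.0f} MB/s")  # ~1498 MB/s, right at the observed ~1,500 MB/s limit
```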
Setup: a tgtd iSCSI target service on Linux, connected over 100Gbps broken out to 4× 25Gbps through a 100Gbps switch; client storage performance measured by FIO is limited to about 1.5 GB/s.