TCP bandwidth drops to zero #1021

Open
mellowcandle opened this issue Jun 16, 2020 · 25 comments

Comments

@mellowcandle

The behavior occurs on Linux 4.19.124-RT.
UDP works fine, but with TCP the bandwidth drops to zero after a few seconds.
The Ethernet driver is a virtual driver between two PCIe machines.
The driver is quite stable, and there are no observed problems with other software (scp, ssh, etc.).
No packets are dropped by the Ethernet driver.

I would love to understand why iperf3 stops sending packets. Is there a way to enable more debugging, or somewhere I could look to debug it further?

  • Version of iperf3:
    3.7

  • Hardware:
    MIPS based Embedded Linux board

  • Operating system (and distribution, if any):
    Buildroot 20.02

  • Other relevant information (for example, non-default compilers,
    libraries, cross-compiling, etc.):
    Cross compiler:
    mips-img-linux-gnu-gcc (Codescape.GNU.Tools.Package.2019.02-01.for.MIPS.IMG.Linux.CentOS-6.x86_64.tar.gz)

Bug Report

  • Expected Behavior
    Bandwidth is stable
  • Actual Behavior
    Bandwidth drops to zero, no packets are sent from iperf.
  • Steps to Reproduce
    Running on the HW
@PreetiMSobarad

PreetiMSobarad commented Jul 1, 2020

I am also facing a similar issue. I have debugged it extensively but cannot figure it out. The iperf3 version on the Windows client is 3.1.3 and the iperf3 version on the MacBook server is 3.7. Kindly help me out!

C:\Users\psobarad\Downloads\iperf-3.1.3-win64>iperf3.exe -c 192.168.200.2
Connecting to host 192.168.200.2, port 5201
[  4] local 192.168.100.2 port 55189 connected to 192.168.200.2 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   256 KBytes  2.10 Mbits/sec
[  4]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec   256 KBytes   210 Kbits/sec                  sender
[  4]   0.00-10.00  sec  18.5 KBytes  15.2 Kbits/sec                  receiver

iperf Done.

@davidBar-On
Contributor

@PreetiMSobarad, can you also list the server's iperf3 command?
Also, I suggest adding the following to the client's (and maybe also the server's) command:

  • --debug and --verbose to get more information about the problem
  • "-t 180" to increase the test time to 3 minutes and see if/when TCP times out

@PreetiMSobarad

Hey David, thank you so much for responding. I have attached iperf3 sessions from the server side and from the client side. There are two sessions from the client side: one where I am facing the issue, with Windows as client and the MacBook as server, and another with the -R flag, which works fine with the MacBook as client and Windows as server. Please take a look and kindly help me out. I look forward to hearing back from you.
iperf3client.txt
iperf3client180seconds.txt
iperf3server.txt

@PreetiMSobarad

@davidBar-On Kindly help me out!

@davidBar-On
Contributor

@PreetiMSobarad unfortunately I don't see much additional useful information in the new log files. I am surprised that the 180-second test did not time out, as it seems that the two 128KB buffers (131,072 bytes each, the default size) that were sent by the client (the 256KB) may not even have left the Windows machine.

It seems that this issue is similar to #839. As @bmah888 noted there, version 3.1.3 used on Windows is 4 years old, so it might help to use a newer version of iperf3.

If a newer version doesn't help, or if you want to try other options first:

  • Reduce the buffer length using the -l parameter, in case the problem is related to big buffers. I suggest starting with very small buffers, e.g. "-l 512".

  • Limit the bandwidth using -b to set a target bitrate, e.g. "-b 10K". That also affects the buffers used, in case this is the reason for the problem.

  • If you know how to use Wireshark, log the related network messages. That may give a hint about what the problem is and where it occurs.
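
The test variations suggested in this comment can be sketched as a small helper that assembles the iperf3 command lines. This is only an illustration: the helper name `iperf3_client_args` and the use of Python are assumptions, the server address is the one from the client log earlier in the thread, and the flags are the ones already named in this thread (-c, -t, -l, -b, --debug, --verbose).

```python
import shlex


def iperf3_client_args(server, seconds=180, length=None, bitrate=None, debug=True):
    """Build an iperf3 client argument list (hypothetical helper, not part of iperf3)."""
    args = ["iperf3", "-c", server, "-t", str(seconds)]
    if length is not None:
        # -l sets the read/write buffer length in bytes.
        args += ["-l", str(length)]
    if bitrate is not None:
        # -b sets the target bitrate, e.g. "10K".
        args += ["-b", str(bitrate)]
    if debug:
        args += ["--debug", "--verbose"]
    return args


if __name__ == "__main__":
    variants = [
        iperf3_client_args("192.168.200.2", length=512),
        iperf3_client_args("192.168.200.2", bitrate="10K"),
    ]
    for argv in variants:
        # Print a copy-pasteable command line; run each one by hand and compare.
        print(shlex.join(argv))
```

Changing one parameter per run makes it easier to tell whether the buffer length or the bitrate limit is what lets traffic through.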

@PreetiMSobarad

Hello @bmah888, according to the link below, the latest version of iperf3 for Windows is 3.1.3. Do you know where I can get version 3.7 for Windows? https://iperf.fr/iperf-download.php#windows

@PreetiMSobarad

Hello @davidBar-On, I did as you said. Traffic goes through when the buffer is kept small and the bandwidth is kept low. Do you know what's wrong when we don't pass any flag? I have attached two files: one is iperf3 with the buffer flag and the other is iperf3 with the bandwidth flag.
iperf3withbandwidth.txt
iperf3withbuffer.txt

I connected just my Windows machine and the intermediate server. I think something is wrong with the way Windows is sending data. In the same setup, it works when I use the MacBook. Do you know of any particular Windows iperf3 setting that could be causing this issue?

I am unable to attach the Wireshark capture files because the format is not supported on GitHub, but the captures seem fine.

@davidBar-On
Contributor

Hi @PreetiMSobarad, from the logs it is quite clear that for some reason the Windows computer's Tx bandwidth is very limited: a few tens of KB/sec. The reason Tx from the Macintosh works is that in that case Windows only sends ACKs, which require very low bandwidth.

It seems that in the 128KB (default) buffer case almost no data is reported as received because TCP gets into heavy cycles of retries, so most of the data received at the other end consists of retries. That can be seen in the "BUFFER SIZE OF 16384 BYTES" case, where it takes about 4 sec to transfer each of the 16KB buffers. The retries are probably the reason for the average of 4KB/sec in this case, compared to about 10KB/sec with the short buffer. I assume this gets worse with the 128KB buffer.

It doesn't seem that the problem is in iperf3. Wireshark logs from the Windows side for the default buffer length case (and maybe also the 16384 BYTES case) may allow me or others to help. I think you can attach a Zip of the Wireshark log.
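
The rates quoted in this thread follow directly from bytes transferred over elapsed time. A minimal sketch of that arithmetic, assuming iperf3's reporting convention of 1024-based KBytes for transfer sizes and 1000-based bits/sec for rates (the function name is hypothetical):

```python
def rate(bytes_sent, seconds):
    """Return (bytes per second, bits per second) for a transfer."""
    bytes_per_sec = bytes_sent / seconds
    return bytes_per_sec, bytes_per_sec * 8


KIB = 1024

# One 16384-byte buffer roughly every 4 seconds, as in the 16 KB buffer log.
bps_16k, _ = rate(16 * KIB, 4)
print(f"{bps_16k / KIB:.1f} KB/sec")          # 4.0 KB/sec

# The default-buffer run: 256 KBytes in the first second, then nothing for 9 s.
_, first_second = rate(256 * KIB, 1)
_, whole_test = rate(256 * KIB, 10)
print(f"{first_second / 1e6:.2f} Mbits/sec")  # 2.10 Mbits/sec
print(f"{whole_test / 1e3:.0f} Kbits/sec")    # 210 Kbits/sec
```

The last two values match the first interval line and the sender summary line of the Windows client output posted earlier.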

@PreetiMSobarad

@davidBar-On I see. I tried to increase the buffer size on my Windows machine, but that still did not work. I followed this link to do so. Do you have any other suggestions?

I am attaching Wireshark captures. Wireshark.pcapng is the one that shows the issue; wiresharkbuff.pcapng is the one taken with the buffer flag.

Kindly let me know if you want any other captures.

wireshark.zip

@PreetiMSobarad

@davidBar-On The link I used to change the buffer size of the Windows machine: http://smallvoid.com/article/winnt-winsock-buffer.html

@davidBar-On
Contributor

@PreetiMSobarad, from the Wireshark logs it seems that the main issue is related to fragmentation. In both cases, TCP sends packets of up to about 1460*10 bytes with "don't fragment" set. A device with IP address 192.168.100.4 (a router, a switch, the interface card in the Windows machine, ...) sends back an ICMP message saying that the MTU of the next hop (the Macintosh? a router? ...) is 1500 and that fragmentation is needed.

What I don't understand is why, when TCP already knows that the path MTU is limited to 1500, it still continues to send larger packets. Only after receiving the ICMP message does it retransmit the data with the proper packet size.

From what I have read, it may not be a good idea to allow TCP packets to be fragmented. Therefore, try to reduce the MTU of the relevant interface on the Windows machine to 1500 (I would expect that it is currently about 14KB). Try first by using the iperf3 option "-M 1460".

Another option is to set the Windows machine's MTU to 1500. Example of instructions for how to do that on Windows 10: https://myrandomtechblog.com/cryptomining/change-mtu-size-in-windows-10/. Even if one of the options works, please report what the current MTU size is.
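
For reference, the relation between the numbers in this comment: with IPv4 and no IP or TCP options, the TCP MSS is the MTU minus 40 bytes of headers, which is why "-M 1460" corresponds to an MTU of 1500. A minimal sketch of that arithmetic (the function name is hypothetical; options such as TCP timestamps reduce the usable payload further):

```python
IPV4_HEADER = 20  # bytes, without IP options
TCP_HEADER = 20   # bytes, without TCP options


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


print(mss_for_mtu(1500))       # 1460
print(10 * mss_for_mtu(1500))  # 14600, the "about 1460*10" packet size seen in the capture
```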

@PreetiMSobarad

Hello @davidBar-On, I checked the interfaces of all the devices the data traverses. The MTU is set to 1500 on all of them. I tried to change the MTU from the command line using the flag, but I got an error. On the Windows machine, too, the MTU is set to 1500.

interfaces
mtuiperf3

@davidBar-On
Contributor

Hi, searching the internet, it seems that setting the MSS on Windows is not supported. In any case, since all interfaces have an MTU of 1500, that wouldn't help anyway. Also, I see that both the SYN and the SYN ACK exchanged between Windows and the Macintosh during TCP initialization report MSS=1460.

I now believe that the issue is related to NIC TCP offload. This would also explain why TCP continues to send large packets. TCP offload is when some TCP functions are offloaded to the NIC to reduce the load on the main CPU. It mainly concerns segmentation and reassembly of large packets and checksum computation. For the sending side, the best-known term is TCP Segmentation Offload (TSO); on Windows the main term used is Large Send Offload (LSO).

If LSO is indeed active, then for some reason it doesn't take the reported MSS into account. Therefore, as a next step I suggest disabling LSO, at least on the relevant port. See for example: https://atomicit.ca/kb/articles/slow-network-speed-windows-10/.

If that does not help, the only other suggestion I have is to try disabling the "Don't Fragment" flag of TCP. See for example: https://support.microsoft.com/en-us/help/900926/recommended-tcp-ip-settings-for-wan-links-with-a-mtu-size-of-less-than.
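
To illustrate what segmentation offload is expected to do, the sketch below splits one large send into MSS-sized segments, which is the job that TSO/LSO hands to the NIC. It is a conceptual model only, not how any particular driver or NIC implements it, and the function name is hypothetical.

```python
def segment_sizes(total_bytes, mss):
    """Split a send of total_bytes into payload sizes of at most mss bytes."""
    full, remainder = divmod(total_bytes, mss)
    sizes = [mss] * full
    if remainder:
        sizes.append(remainder)
    return sizes


# One default 128 KB iperf3 buffer with the negotiated MSS of 1460.
sizes = segment_sizes(131072, 1460)
print(len(sizes), sizes[0], sizes[-1])  # 90 1460 1132
```

If the offload path did not honor the MSS, oversized packets like the roughly 14KB ones seen in the capture would reach the wire instead of segments of this size.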

@PreetiMSobarad

Hey @davidBar-On, I tried to disable LSO on Windows. It did not fix the initial issue, but I do see an improvement in speed when I use small buffers. That's the only improvement I have been able to see so far. The bandwidth again drops to zero when the buffer size reaches 8KB. I have attached a text document that shows the results for different buffer lengths and bandwidths.

Using the second link, when I enabled PMTU (Don't Fragment bit off), the bandwidth dropped compared to the run with only LSO disabled. I am attaching Wireshark captures: one with LSO disabled, and another with LSO disabled and PMTU enabled (Don't Fragment bit set off).

wiresharkmtu.zip
iperfLSOdisabled.txt

@davidBar-On
Contributor

Hi @PreetiMSobarad, the "LSOdisabled" Wireshark log shows that the test was successful: over 330MB were successfully sent during the 10 seconds (for some reason each chunk sent is 1536 bytes, which is split into 1460+73 bytes packets). Could it be that it really worked, or did you attach the Wireshark log for one of the small-buffer tests by mistake?

Regarding the don't-fragment test: the don't fragment bit is still set, but the packet size is reduced to 536, i.e. instead of unsetting the flag, the TCP stack reduces the packet size to almost the minimum. That should have worked, but according to the ICMP "fragmentation needed" errors, the packets are combined into 2-3KB packets before sending. Try using the "--no-delay" option, which can help prevent combining the packets. If that doesn't help, it may be that LSO was not actually disabled.

@bmah888
Contributor

bmah888 commented Jul 6, 2020

Hello @bmah888, according to the link below, the latest version of iperf3 for Windows is 3.1.3. Do you know where I can get version 3.7 for Windows? https://iperf.fr/iperf-download.php#windows

ESnet doesn't produce binaries for Windows or any other platform; we only release source code. I'm sure that if you use your favorite search engine you can find a Windows binary somewhere (just be aware that Windows is not an officially supported platform for iperf3, and we can't take any responsibility for binaries that a third party has made available).

@davidBar-On
Contributor

@PreetiMSobarad, can you update about the status of the problem? Did it help to disable the LSO? Did anything else help?

@PreetiMSobarad

Hello @davidBar-On, disabling LSO did help a little in pushing more data, but not to the level I was expecting. I think there's something wrong with my Windows machine for it to behave like this. I'm marking this issue as solved. You have given many debugging steps, and anyone facing this issue can use them to resolve it.

@davidBar-On
Contributor

@PreetiMSobarad, thanks. I have created issue #1029 with suggested documentation based on the debug steps suggested for this issue. You can close this issue.

@wongex23

Hello,

I am having the same issue here.

What is the final solution?

@davidBar-On
Contributor

@wongex23, there are several potential causes for this issue. Please send the commands you used and the client output. Also specify which iperf3 version you are using and on what OS.

@FudiHub

FudiHub commented May 14, 2023

I have a similar issue. 10 Gbit in one direction, 0 in the other.

  • MTU is set to 9000.
  • OS is OPNsense on one side and Unraid on the other.
root@Host:~# iperf3 -c 10.10.20.1 --bidir
Connecting to host 10.10.20.1, port 5201
[  5] local 10.10.20.10 port 34956 connected to 10.10.20.1 port 5201
[  7] local 10.10.20.10 port 34970 connected to 10.10.20.1 port 5201
[ ID][Role] Interval           Transfer     Bitrate         Retr  Cwnd
[  5][TX-C]   0.00-1.00   sec   489 KBytes  4.01 Mbits/sec    3   8.74 KBytes       
[  7][RX-C]   0.00-1.00   sec  1.08 GBytes  9.30 Gbits/sec                  
[  5][TX-C]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    1   8.74 KBytes       
[  7][RX-C]   1.00-2.00   sec  1.12 GBytes  9.65 Gbits/sec                  
[  5][TX-C]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    0   8.74 KBytes       
[  7][RX-C]   2.00-3.00   sec  1.10 GBytes  9.48 Gbits/sec                  
[  5][TX-C]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    1   8.74 KBytes       
[  7][RX-C]   3.00-4.00   sec  1.13 GBytes  9.69 Gbits/sec                  
[  5][TX-C]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    0   8.74 KBytes       
[  7][RX-C]   4.00-5.00   sec  1.14 GBytes  9.76 Gbits/sec                  
[  5][TX-C]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    0   8.74 KBytes       
[  7][RX-C]   5.00-6.00   sec  1.13 GBytes  9.72 Gbits/sec                  
[  5][TX-C]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   8.74 KBytes       
[  7][RX-C]   6.00-7.00   sec  1.11 GBytes  9.57 Gbits/sec                  
[  5][TX-C]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    0   8.74 KBytes       
[  7][RX-C]   7.00-8.00   sec  1.11 GBytes  9.54 Gbits/sec                  
[  5][TX-C]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   8.74 KBytes       
[  7][RX-C]   8.00-9.00   sec  1.14 GBytes  9.82 Gbits/sec                  
[  5][TX-C]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   8.74 KBytes       
[  7][RX-C]   9.00-10.00  sec  1.14 GBytes  9.77 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval           Transfer     Bitrate         Retr
[  5][TX-C]   0.00-10.00  sec   489 KBytes   401 Kbits/sec    6             sender
[  5][TX-C]   0.00-10.00  sec  0.00 Bytes  0.00 bits/sec                  receiver
[  7][RX-C]   0.00-10.00  sec  11.2 GBytes  9.63 Gbits/sec  247             sender
[  7][RX-C]   0.00-10.00  sec  11.2 GBytes  9.63 Gbits/sec                  receiver

iperf Done.
  • Setting MTU to 4000 gives following:
root@Host:~# iperf3 -c 10.10.20.1 --bidir
Connecting to host 10.10.20.1, port 5201
[  5] local 10.10.20.10 port 52932 connected to 10.10.20.1 port 5201
[  7] local 10.10.20.10 port 52944 connected to 10.10.20.1 port 5201
[ ID][Role] Interval           Transfer     Bitrate         Retr  Cwnd
[  5][TX-C]   0.00-1.00   sec   188 MBytes  1.58 Gbits/sec    0    301 KBytes       
[  7][RX-C]   0.00-1.00   sec   487 MBytes  4.08 Gbits/sec                 
[  5][TX-C]   1.00-2.00   sec  82.5 MBytes   692 Mbits/sec    0    293 KBytes       
[  7][RX-C]   1.00-2.00   sec   709 MBytes  5.95 Gbits/sec                 
[  5][TX-C]   2.00-3.00   sec   122 MBytes  1.03 Gbits/sec    0    254 KBytes       
[  7][RX-C]   2.00-3.00   sec   602 MBytes  5.05 Gbits/sec                 
[  5][TX-C]   3.00-4.00   sec   151 MBytes  1.27 Gbits/sec    0    270 KBytes       
[  7][RX-C]   3.00-4.00   sec   584 MBytes  4.90 Gbits/sec                 
[  5][TX-C]   4.00-5.00   sec   145 MBytes  1.22 Gbits/sec    0    285 KBytes       
[  7][RX-C]   4.00-5.00   sec   602 MBytes  5.05 Gbits/sec                 
[  5][TX-C]   5.00-6.00   sec   121 MBytes  1.01 Gbits/sec    0    278 KBytes       
[  7][RX-C]   5.00-6.00   sec   658 MBytes  5.52 Gbits/sec                 
[  5][TX-C]   6.00-7.00   sec   144 MBytes  1.21 Gbits/sec    0    270 KBytes       
[  7][RX-C]   6.00-7.00   sec   592 MBytes  4.97 Gbits/sec                 
[  5][TX-C]   7.00-8.00   sec   150 MBytes  1.25 Gbits/sec    0    316 KBytes       
[  7][RX-C]   7.00-8.00   sec   607 MBytes  5.09 Gbits/sec                 
[  5][TX-C]   8.00-9.00   sec   129 MBytes  1.09 Gbits/sec    0    293 KBytes       
[  7][RX-C]   8.00-9.00   sec   616 MBytes  5.17 Gbits/sec                 
[  5][TX-C]   9.00-10.00  sec   108 MBytes   908 Mbits/sec    0    320 KBytes       
[  7][RX-C]   9.00-10.00  sec   635 MBytes  5.32 Gbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval           Transfer     Bitrate         Retr
[  5][TX-C]   0.00-10.00  sec  1.31 GBytes  1.13 Gbits/sec    0             sender
[  5][TX-C]   0.00-10.00  sec  1.31 GBytes  1.12 Gbits/sec                  receiver
[  7][RX-C]   0.00-10.00  sec  5.95 GBytes  5.11 Gbits/sec   43             sender
[  7][RX-C]   0.00-10.00  sec  5.95 GBytes  5.11 Gbits/sec                  receiver

iperf Done.

Any idea how I can improve this? Why is TCP bandwidth dropping to 0 in one direction when the MTU is 9000?

@davidBar-On
Contributor

@FudiHub, a few questions:

  1. Did you try running a unidirectional test, i.e. without --bidir, both from client to server and from server to client (using -R)?
  2. I assume you referred to the client's MTU. What is the MTU on the server side?
  3. One possible cause of such an issue is that fragmentation is not allowed/supported and that 9K packets are too large for the server or for a router on the path. Can this be the reason?
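
One observation on the MTU 9000 output that may help narrow this down: the reported Cwnd of 8.74 KBytes is about the size of a single jumbo-frame segment. The sketch below checks that arithmetic, assuming IPv4, 20-byte IP and TCP headers, and 12 bytes of TCP timestamp option per segment (whether timestamps are enabled on these hosts is an assumption), with hypothetical names throughout.

```python
def tcp_payload(mtu, tcp_options=12):
    """Payload bytes per segment for IPv4 with the given TCP option bytes."""
    return mtu - 20 - 20 - tcp_options


payload = tcp_payload(9000)
print(payload)                         # 8948
print(f"{payload / 1024:.2f} KBytes")  # 8.74 KBytes
```

A congestion window stuck at roughly one segment, together with the retransmissions shown, would be consistent with full-size segments being lost in the client-to-server direction, though the capture suggested in this thread is needed to confirm that.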

@FudiHub

FudiHub commented May 14, 2023

Thanks for your questions:

  1. I tried unidirectional tests as well. I just posted a bidirectional result here to keep it more compact.
  2. Both client and server are set to MTU 9000.
  3. Maybe. I set the client to MTU 4000 and got the following results:
root@Host:~# iperf3 -c 10.10.20.1 --bidir
Connecting to host 10.10.20.1, port 5201
[  5] local 10.10.20.10 port 38996 connected to 10.10.20.1 port 5201
[  7] local 10.10.20.10 port 39004 connected to 10.10.20.1 port 5201
[ ID][Role] Interval           Transfer     Bitrate         Retr  Cwnd
[  5][TX-C]   0.00-1.00   sec   133 MBytes  1.12 Gbits/sec    0    278 KBytes       
[  7][RX-C]   0.00-1.00   sec   653 MBytes  5.48 Gbits/sec                  
[  5][TX-C]   1.00-2.00   sec   118 MBytes   986 Mbits/sec    0    270 KBytes       
[  7][RX-C]   1.00-2.00   sec   686 MBytes  5.75 Gbits/sec                  
[  5][TX-C]   2.00-3.00   sec   148 MBytes  1.24 Gbits/sec    0    293 KBytes       
[  7][RX-C]   2.00-3.00   sec   731 MBytes  6.14 Gbits/sec                  
[  5][TX-C]   3.00-4.00   sec   145 MBytes  1.22 Gbits/sec    0    270 KBytes       
[  7][RX-C]   3.00-4.00   sec   756 MBytes  6.34 Gbits/sec                  
[  5][TX-C]   4.00-5.00   sec   478 MBytes  4.01 Gbits/sec    0    316 KBytes       
[  7][RX-C]   4.00-5.00   sec   583 MBytes  4.89 Gbits/sec                  
[  5][TX-C]   5.00-6.00   sec   506 MBytes  4.25 Gbits/sec    0    301 KBytes       
[  7][RX-C]   5.00-6.00   sec   582 MBytes  4.88 Gbits/sec                  
[  5][TX-C]   6.00-7.00   sec   412 MBytes  3.46 Gbits/sec    0    308 KBytes       
[  7][RX-C]   6.00-7.00   sec   638 MBytes  5.36 Gbits/sec                  
[  5][TX-C]   7.00-8.00   sec   470 MBytes  3.94 Gbits/sec    0    270 KBytes       
[  7][RX-C]   7.00-8.00   sec   599 MBytes  5.03 Gbits/sec                  
[  5][TX-C]   8.00-9.00   sec   446 MBytes  3.74 Gbits/sec    0    316 KBytes       
[  7][RX-C]   8.00-9.00   sec   587 MBytes  4.92 Gbits/sec                  
[  5][TX-C]   9.00-10.00  sec   441 MBytes  3.70 Gbits/sec    0    308 KBytes       
[  7][RX-C]   9.00-10.00  sec   601 MBytes  5.04 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval           Transfer     Bitrate         Retr
[  5][TX-C]   0.00-10.00  sec  3.22 GBytes  2.77 Gbits/sec    0             sender
[  5][TX-C]   0.00-10.00  sec  3.22 GBytes  2.76 Gbits/sec                  receiver
[  7][RX-C]   0.00-10.00  sec  6.27 GBytes  5.38 Gbits/sec  1668             sender
[  7][RX-C]   0.00-10.00  sec  6.27 GBytes  5.38 Gbits/sec                  receiver

iperf Done.

But I am not sure why my OPNsense should not be able to handle 9K packets. It's a DEC740.
tbh I am very new to this whole 10 Gbit space....

Edit: Reading the iperf results with MTU 4000 on the client side, 1668 packets had to be retransmitted... that's also not a good result, right?

@davidBar-On
Contributor

A few tests that may help to better understand the problem (when the MTU is 9000):

  1. Find the Path MTU for both directions. This may be done using ping with and without the -f option and varying -l packet sizes (these are the Windows ping flags; on Linux the equivalents are -M do and -s).
  2. Use Wireshark or tcpdump on both the client and the server to find the actual sizes of the packets/fragments sent and received, whether and what is received from the server on the client's machine, etc. By the way, using UDP with -l 8K instead of TCP may be easier to analyze.
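
The ping-based search in step 1 can be sketched as a binary search over packet sizes. The `probe` callback is hypothetical: it stands for "send one ping with don't-fragment set and this total IP packet size, and report whether a reply came back". With IPv4, the ping payload size is the packet size minus 28 bytes (20-byte IP header plus 8-byte ICMP header).

```python
ICMP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP echo header


def ping_payload(packet_size):
    """Payload size to pass to ping for a given total IPv4 packet size."""
    return packet_size - ICMP_OVERHEAD


def find_path_mtu(probe, low=576, high=9000):
    """Largest packet size in [low, high] for which probe(size) succeeds.

    Assumes probe(low) succeeds and that success is monotonic in size.
    """
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid        # mid fits, search upwards
        else:
            high = mid - 1   # mid is too big, search downwards
    return low


if __name__ == "__main__":
    # Simulated path whose smallest hop MTU is 4000 (made-up value for the demo).
    simulated = lambda size: size <= 4000
    mtu = find_path_mtu(simulated)
    print(mtu, ping_payload(mtu))  # 4000 3972
```

Running the search once in each direction matters here, because the problem in this report is asymmetric.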
