high number of unneeded retransmissions in bulk transfer #253

@pabeni

When comparing MPTCP single-subflow bulk transfer performance to plain TCP, a large number of TCP-level retransmissions occur in the MPTCP case:

mptcpize run taskset 0x1 iperf3 -t 60 -i 60 -c 192.168.255.1
Connecting to host 192.168.255.1, port 5201
[  5] local 192.168.255.2 port 48820 connected to 192.168.255.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-60.00  sec   133 GBytes  19.1 Gbits/sec  36519   1.02 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-60.00  sec   133 GBytes  19.1 Gbits/sec  36519             sender
[  5]   0.00-60.00  sec   133 GBytes  19.1 Gbits/sec                  receiver

That is several orders of magnitude more than plain TCP in the same scenario:

taskset 0x1 iperf3 -t 60 -i 60 -c 192.168.255.1
Connecting to host 192.168.255.1, port 5201
[  5] local 192.168.255.2 port 48824 connected to 192.168.255.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-4.05   sec  10.8 GBytes  22.9 Gbits/sec   68    971 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-4.05   sec  10.8 GBytes  22.9 Gbits/sec   68             sender
[  5]   0.00-4.05   sec  0.00 Bytes  0.00 bits/sec                  receiver

The retransmissions are fast retransmits:

taskset 0x1 iperf3 -t 60 -i 60 -c 192.168.255.1 & sleep 1; nstat >/dev/null ; sleep 1 ; nstat Tcp* MPTcp*
[1] 8321
Connecting to host 192.168.255.1, port 5201
[  5] local 192.168.255.2 port 48828 connected to 192.168.255.1 port 5201
#kernel
TcpInSegs                       21415              0.0
TcpOutSegs                      1990105            0.0
TcpExtTCPPureAcks               13718              0.0
TcpExtTCPHPAcks                 7694               0.0
TcpExtTCPBacklogCoalesce        6                  0.0
TcpExtTCPAutoCorking            4589               0.0
TcpExtTCPOrigDataSent           1990487            0.0
TcpExtTCPDelivered              1990487            0.0

perf probes show that the code triggering the fast retransmit is:

        if (tcp_ack_is_dubious(sk, flag)) {
                if (!(flag & (FLAG_SND_UNA_ADVANCED | FLAG_NOT_DUP))) {
                        num_dupack = 1;
                        /* Consider if pure acks were aggregated in tcp_add_backlog() */
                        if (!(flag & FLAG_DATA))
                                num_dupack = max_t(u16, 1, skb_shinfo(skb)->gso_segs);
                }
                tcp_fastretrans_alert(sk, prior_snd_una, num_dupack, &flag,
                                      &rexmit);

in tcp_ack(), due to 'flag' being 0x4100. This is possibly caused by MPTCP-level acks generated on the other side by mptcp_cleanup_rbuf(), which carry an unmodified TCP-level ack seq, an unmodified window size, and an increased MPTCP-level ack seq.

From the plain TCP point of view, such acks are duplicates and trigger the fast retransmit logic.
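
To make the 0x4100 observation easier to check, here is a minimal sketch that decodes the flag word and replays the dupack computation from the tcp_ack() snippet quoted above. The FLAG_* values are an assumption, written from memory of net/ipv4/tcp_input.c in 5.x kernels, and should be verified against the tree under test. dupacks_for() is a hypothetical helper, not a kernel function, and it models only the inner condition, assuming tcp_ack_is_dubious() has already returned true.

```c
#include <stdint.h>

/* Assumed values, as recalled from net/ipv4/tcp_input.c (Linux 5.x);
 * verify them against the kernel tree actually under test. */
#define FLAG_DATA              0x01   /* incoming segment carried data */
#define FLAG_WIN_UPDATE        0x02   /* ack updated the send window */
#define FLAG_DATA_ACKED        0x04   /* ack acknowledged new data */
#define FLAG_SYN_ACKED         0x10   /* ack acknowledged our SYN */
#define FLAG_SLOWPATH          0x100  /* header prediction fast path not taken */
#define FLAG_SND_UNA_ADVANCED  0x400  /* snd_una moved forward */
#define FLAG_UPDATE_TS_RECENT  0x4000 /* ts_recent update requested */

#define FLAG_ACKED    (FLAG_DATA_ACKED | FLAG_SYN_ACKED)
#define FLAG_NOT_DUP  (FLAG_DATA | FLAG_WIN_UPDATE | FLAG_ACKED)

/* Hypothetical helper: returns the num_dupack value the quoted tcp_ack()
 * branch would compute for a given flag word and gso_segs count.
 * A return value of 0 means the ack is not counted as a duplicate. */
static unsigned int dupacks_for(uint32_t flag, uint16_t gso_segs)
{
    unsigned int num_dupack = 0;

    if (!(flag & (FLAG_SND_UNA_ADVANCED | FLAG_NOT_DUP))) {
        num_dupack = 1;
        /* Pure acks may have been aggregated in tcp_add_backlog() */
        if (!(flag & FLAG_DATA))
            num_dupack = gso_segs > 1 ? gso_segs : 1;
    }
    return num_dupack;
}
```

Under these assumed values, 0x4100 is FLAG_UPDATE_TS_RECENT | FLAG_SLOWPATH: neither FLAG_SND_UNA_ADVANCED nor any FLAG_NOT_DUP bit is set, so dupacks_for(0x4100, 1) returns 1 and the ack is counted as a duplicate. Setting FLAG_WIN_UPDATE in the same flag word makes the helper return 0, which is consistent with the "unmodified window size" part of the hypothesis.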
