Skip to content

Connection is RST if data (< init window) with MPTCP options received, then MPTCP options are dropped #518

@matttbe

Description

@matttbe

@cpaasch reported me an issue seen in the field, due to a middlebox stripping MPTCP options later on after the 3WHS.

This packetdrill test is what we are supposed to have:

--tolerance_usecs=100000
`../common/defaults.sh`

0.0    socket(..., SOCK_STREAM, IPPROTO_MPTCP) = 3
+0     setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0

+0     bind(3, ..., ...) = 0
+0     listen(3, 1) = 0

// 3WHS is OK
+0.0     <  S   0:0(0)                    win 65535  <mss 1460, sackOK, nop, nop, nop, wscale 6, mpcapable v1 flags[flag_h] nokey>
+0.0     >  S.  0:0(0)           ack 1               <mss 1460, nop, nop, sackOK, nop, wscale 8, mpcapable v1 flags[flag_h] key[skey]>
+0.1     <   .  1:1(0)           ack 1    win 2048                                              <mpcapable v1 flags[flag_h] key[ckey=2, skey]>
+0     accept(3, ..., ...) = 4

+0.01    <  P.  1:501(500)       ack 1    win 2048                                              <mpcapable v1 flags[flag_h] key[skey, ckey] mpcdatalen 500, nop, nop>
// From here, the MPTCP options will be dropped by a middlebox
+0.0     >   .  1:1(0)           ack 501                                                        <dss dack8=501 dll=0 nocs>
+0.01  read(4, ..., 500) = 500
+0     write(4, ..., 100) = 100

// The server replies with data, still thinking MPTCP is being used
+0       >  P.  1:101(100)       ack 501                                                        <dss dack8=501 dsn8=1 ssn=1 dll=100 nocs, nop, nop>
// But the client already did a fallback to TCP, because the two previous packets have been received without MPTCP options
+0.01    <   .  501:501(0)       ack 101  win 2048

+0.0     <  P.  501:601(100)     ack 101  win 2048
+0.01  `nstat -s | grep MPTcp`  // in case of issue, MPRstTx counter is 1
+0.0     >   .  101:101(0)       ack 601

But instead of sending a "plain TCP" ACK for the last packet here, we can see a RST packet with MP_TCPRST set to 1:

0.131450 R. 101:101(0) ack 601 win 1049 <mp_reset 1>

The server sends a RST, because when it detects the errors, it first checks if a fallback is possible [1] using subflow_can_fallback() which returns false due to:

return !subflow->fully_established;

Technically, the server should not be in fully_established mode here, because it didn't receive a valid DATA_ACK from the client, nor data for more than the initial window (implying a DATA_ACK has been received from the other side).

But not to change the whole behaviour, maybe easier to do something like this if it is OK with the RFC:

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index a1e28e1d8b4e..61482f8b425b 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1282,7 +1282,7 @@ static bool subflow_can_fallback(struct mptcp_subflow_context *subflow)
        else if (READ_ONCE(msk->csum_enabled))
                return !subflow->valid_csum_seen;
        else
-               return !READ_ONCE(subflow->fully_established);
+               return READ_ONCE(msk->allow_infinite_fallback);
 }
 
 static void mptcp_subflow_fail(struct mptcp_sock *msk, struct sock *ssk)

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions