-
-
Notifications
You must be signed in to change notification settings - Fork 51
Description
@cpaasch reported me an issue seen in the field, due to a middlebox stripping MPTCP options later on after the 3WHS.
This packetdrill test is what we are supposed to have:
--tolerance_usecs=100000
`../common/defaults.sh`
0.0 socket(..., SOCK_STREAM, IPPROTO_MPTCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
// 3WHS is OK
+0.0 < S 0:0(0) win 65535 <mss 1460, sackOK, nop, nop, nop, wscale 6, mpcapable v1 flags[flag_h] nokey>
+0.0 > S. 0:0(0) ack 1 <mss 1460, nop, nop, sackOK, nop, wscale 8, mpcapable v1 flags[flag_h] key[skey]>
+0.1 < . 1:1(0) ack 1 win 2048 <mpcapable v1 flags[flag_h] key[ckey=2, skey]>
+0 accept(3, ..., ...) = 4
+0.01 < P. 1:501(500) ack 1 win 2048 <mpcapable v1 flags[flag_h] key[skey, ckey] mpcdatalen 500, nop, nop>
// From here, the MPTCP options will be dropped by a middlebox
+0.0 > . 1:1(0) ack 501 <dss dack8=501 dll=0 nocs>
+0.01 read(4, ..., 500) = 500
+0 write(4, ..., 100) = 100
// The server replies with data, still thinking MPTCP is being used
+0 > P. 1:101(100) ack 501 <dss dack8=501 dsn8=1 ssn=1 dll=100 nocs, nop, nop>
// But the client already did a fallback to TCP, because the two previous packets have been received without MPTCP options
+0.01 < . 501:501(0) ack 101 win 2048
+0.0 < P. 501:601(100) ack 101 win 2048
+0.01 `nstat -s | grep MPTcp` // in case of issue, MPRstTx counter is 1
+0.0 > . 101:101(0) ack 601
But instead of sending a "plain TCP" ACK for the last packet here, we can see a RST packet with MP_TCPRST set to 1:
0.131450 R. 101:101(0) ack 601 win 1049 <mp_reset 1>
The server sends a RST, because when it detects the errors, it first checks if a fallback is possible [1] using subflow_can_fallback() which returns false due to:
return !subflow->fully_established;Technically, the server should not be in fully_established mode here, because it didn't receive a valid DATA_ACK from the client, nor data for more than the initial window (implying a DATA_ACK has been received from the other side).
But not to change the whole behaviour, maybe easier to do something like this if it is OK with the RFC:
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index a1e28e1d8b4e..61482f8b425b 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1282,7 +1282,7 @@ static bool subflow_can_fallback(struct mptcp_subflow_context *subflow)
else if (READ_ONCE(msk->csum_enabled))
return !subflow->valid_csum_seen;
else
- return !READ_ONCE(subflow->fully_established);
+ return READ_ONCE(msk->allow_infinite_fallback);
}
static void mptcp_subflow_fail(struct mptcp_sock *msk, struct sock *ssk)