Skip to content

Commit

Permalink
net-tcp_bbr: v2: adjust skb tx.in_flight upon merge in tcp_shifted_skb()
Browse files Browse the repository at this point in the history
When tcp_shifted_skb() updates state as adjacent SACKed skbs are
coalesced, previously the tx.in_flight was not adjusted, so we could
get contradictory state where the skb's recorded pcount was bigger
than the tx.in_flight (the number of segments that were in_flight
after sending the skb).

Normally have a SACKed skb with contradictory pcount/tx.in_flight
would not matter. However, with SACK reneging, the SACKed bit is
removed, and an skb once again becomes eligible for retransmitting,
fragmenting, SACKing, etc. Packetdrill testing verified the following
sequence is possible in a kernel that does not have this commit:

 - skb N is SACKed
 - skb N+1 is SACKed and combined with skb N using tcp_shifted_skb()
   - tcp_shifted_skb() will increase the pcount of prev,
     but leave tx.in_flight as-is
   - so prev skb can have pcount > tx.in_flight
 - RTO, tcp_timeout_mark_lost(), detect reneg,
   remove "SACKed" bit, mark skb N as lost
   - find pcount of skb N is greater than its tx.in_flight

I suspect this issue iw what caused the bbr2_inflight_hi_from_lost_skb():
  WARN_ON_ONCE(inflight_prev < 0)
to fire in production machines using bbr2.

Tested: See last commit in series for sponge link.

Effort: net-tcp_bbr
Origin-9xx-SHA1: 1a3e997e613d2dcf32b947992882854ebe873715
Change-Id: I1b0b75c27519953430c7db51c6f358f104c7af55
  • Loading branch information
nealcardwell authored and xanmod committed Feb 16, 2021
1 parent b0dfecf commit c9a1234
Showing 1 changed file with 11 additions and 0 deletions.
11 changes: 11 additions & 0 deletions net/ipv4/tcp_input.c
Expand Up @@ -1425,6 +1425,17 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *prev,
WARN_ON_ONCE(tcp_skb_pcount(skb) < pcount);
tcp_skb_pcount_add(skb, -pcount);

/* Adjust tx.in_flight as pcount is shifted from skb to prev. */
if (WARN_ONCE(TCP_SKB_CB(skb)->tx.in_flight < pcount,
"prev in_flight: %u skb in_flight: %u pcount: %u",
TCP_SKB_CB(prev)->tx.in_flight,
TCP_SKB_CB(skb)->tx.in_flight,
pcount))
TCP_SKB_CB(skb)->tx.in_flight = 0;
else
TCP_SKB_CB(skb)->tx.in_flight -= pcount;
TCP_SKB_CB(prev)->tx.in_flight += pcount;

/* When we're adding to gso_segs == 1, gso_size will be zero,
* in theory this shouldn't be necessary but as long as DSACK
* code can come after this skb later on it's better to keep
Expand Down

0 comments on commit c9a1234

Please sign in to comment.