Skip to content

Adaptive reordering thresholds#470

Merged
ghedo merged 1 commit intomasterfrom
adaptive_reordering_thresh
Jul 7, 2021
Merged

Adaptive reordering thresholds#470
ghedo merged 1 commit intomasterfrom
adaptive_reordering_thresh

Conversation

@ghedo
Copy link
Copy Markdown
Member

@ghedo ghedo commented Apr 20, 2020

This implements adaptive reordering threshold as suggested in:
https://quicwg.org/base-drafts/draft-ietf-quic-recovery.html#section-5.1.1.

The algorithm is similar to what Chrome uses as well.

I did a quick test on my laptop with netem delay 10ms 1ms 10% and this seems to improve things a fair amount:

Before:

finished in 31.92s, 0.03 req/s, 321.44KB/s
requests: 1 total, 1 started, 1 done, 1 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 1 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 10.02MB (10506459) total, 97B (97) headers (space savings 47.57%), 10.00MB (10485760) data

After:

finished in 1.85s, 0.54 req/s, 5.40MB/s
requests: 1 total, 1 started, 1 done, 1 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 1 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 10.00MB (10488607) total, 97B (97) headers (space savings 47.57%), 10.00MB (10485760) data

Though we will need more rigorous tests in lab. In particular, I'm not super confident the code to undo the cwnd update is correct. We will also need a time reordering unit test.

This depends on #468 and #469.

@ghedo ghedo requested a review from a team as a code owner April 20, 2020 14:11
Comment thread src/recovery.rs Outdated
@ghedo ghedo force-pushed the recovery-refactor-sent branch from 0808c46 to 3043f90 Compare April 21, 2020 13:38
Copy link
Copy Markdown
Contributor

@junhochoi junhochoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to add a new hook (e.g. cwnd_undo()) at CongestionControlOps because when we need to undo cwnd, there can be CC-dependent state changes as well. e.g. in CUBIC I'd like to reset w_max when it happens. e.g. linux tcp has undo_cwnd as a cc hook.

Copy link
Copy Markdown
Contributor

@junhochoi junhochoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use some of existing experiences -- I was working on a similar work (better to coordinate)

  • packet threshold: TCP-NCR(RFC4653) defines a new Limited Transmit. (mentioned in quicwg/base-drafts#3572) While I think TCP-NCR itself is not much useful for QUIC, but the idea of changing dupthresh up to flightsize/MSS -- which means we can remove hard coded MAX_PACKET_THRESHOLD), allowing pkt_thresh up to flight_size/MAX_DATAGRAM_SIZE
  • delay threshold: RACK (https://tools.ietf.org/html/draft-ietf-tcpm-rack-08) mentions how to detect and define a delay threshold for reordering. I'd prefer to implement RACK's RACK_update_reo_wnd() which is gradually increasing reordering threshold delay instead of using fixed 5/4 x RTT. QUIC recovery draft already includes many idea of RACK but not this part.

@junhochoi
Copy link
Copy Markdown
Contributor

I did a quick test on my laptop with netem delay 10ms 1ms 10% and this seems to improve things a fair amount:

Have you tested with netem reorder..?

@ghedo ghedo force-pushed the recovery-refactor-sent branch 3 times, most recently from 0b7493c to ff6719e Compare April 24, 2020 17:33
@ghedo ghedo force-pushed the adaptive_reordering_thresh branch from 1a0503e to 3399c85 Compare April 24, 2020 17:50
@ghedo ghedo force-pushed the recovery-refactor-sent branch 2 times, most recently from 0415612 to 1135e74 Compare May 12, 2020 10:36
@ghedo ghedo force-pushed the recovery-refactor-sent branch 4 times, most recently from f2049a6 to c1bd7eb Compare June 2, 2020 10:25
Base automatically changed from recovery-refactor-sent to master June 3, 2020 13:14
@ghedo ghedo force-pushed the adaptive_reordering_thresh branch from 3399c85 to 3f4c50b Compare June 3, 2020 13:35
@ghedo ghedo force-pushed the adaptive_reordering_thresh branch from 3f4c50b to 55935d0 Compare June 29, 2020 12:57
@ghedo ghedo force-pushed the adaptive_reordering_thresh branch from 55935d0 to 8f98857 Compare April 7, 2021 13:24
@ghedo
Copy link
Copy Markdown
Member Author

ghedo commented Apr 7, 2021

Rebased and updated based on #893.

@ghedo ghedo changed the base branch from master to cc-checkpoint April 7, 2021 15:49
@ghedo ghedo force-pushed the adaptive_reordering_thresh branch from 8f98857 to 150ddce Compare April 9, 2021 16:35
@ghedo ghedo changed the base branch from cc-checkpoint to master April 9, 2021 16:35
@ghedo ghedo force-pushed the adaptive_reordering_thresh branch from 150ddce to 36ecfbe Compare April 15, 2021 14:28
@ghedo ghedo changed the title WIP: Adaptive reordering thresholds Adaptive reordering thresholds Apr 15, 2021
@ghedo ghedo force-pushed the adaptive_reordering_thresh branch from 36ecfbe to 072bb64 Compare June 18, 2021 14:18
The idea is to increase packet and time reordering thresholds when
spurious losses are detected (i.e. when a packet previously declared
lost is acked).

In addition, the congestion control state is rolled back to its state
before the last congestion event when a spurious loss is detected.

This is similar to what Chrome currently implements.
@ghedo ghedo force-pushed the adaptive_reordering_thresh branch from 072bb64 to d5da22f Compare July 5, 2021 13:25
Copy link
Copy Markdown
Contributor

@junhochoi junhochoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still need to tune the algorithm (or threshold. ideally we need to increase a time/packet threshold based on the previous state) for some case (e.g. in a high bandwidth, the benefit is lower), but still better when there is a reodering happening. 👍

@ghedo ghedo merged commit acacf86 into master Jul 7, 2021
@ghedo ghedo deleted the adaptive_reordering_thresh branch July 7, 2021 14:37
toidiu added a commit that referenced this pull request Aug 28, 2025
Dynamic time threshold logic from
#470 was accidentally removed.

The goal is to make time-based detection less sensitive when a spurious
loss is detected.
toidiu added a commit that referenced this pull request Aug 28, 2025
Dynamic time threshold logic from
#470 was accidentally removed.

The goal is to make time-based detection less sensitive when a spurious
loss is detected.
toidiu added a commit that referenced this pull request Aug 28, 2025
Dynamic time threshold logic from
#470 was accidentally removed.

The goal is to make time-based detection less sensitive when a spurious
loss is detected.
toidiu added a commit that referenced this pull request Sep 2, 2025
Dynamic time threshold logic from
#470 was accidentally removed.

The goal is to make time-based detection less sensitive when a spurious
loss is detected.
toidiu added a commit that referenced this pull request Sep 2, 2025
* fix: adjust time-based loss detection threshold on packet reordering

Dynamic time threshold logic from
#470 was accidentally removed.

The goal is to make time-based detection less sensitive when a spurious
loss is detected.

* refactor: cleanup test

* fix: fix spurious count logic in gcongestion and add test.

Recovery logic was returning the wrong spurious loss count if no
new packets were acked. This fixes the bug.

I also added a new test for time-based loss detection, specifically
testing that the value increases after a spurious loss event.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants