
Contribution to buffer bloat and slower convergence due to larger decrease factor #89

Closed
goelvidhi opened this issue Aug 31, 2021 · 9 comments · Fixed by #123

@goelvidhi
Contributor

goelvidhi commented Aug 31, 2021

Markku Kojo said,

This draft uses a larger cwnd decrease factor, resulting in a larger
average cwnd and buffer occupation. This means it is likely to
contribute significantly to buffer bloat, particularly when also
considering the concave increase function at the beginning of
congestion avoidance, which keeps the cwnd close to its maximum most
of the time, as carefully explained in the draft. In other words,
CUBIC also keeps buffer-bloated router queues very efficiently full
at all times.

Currently the draft mentions slower convergence speed as the only
side effect of the larger decrease factor and does not discuss the
contribution to buffer bloat. It would be important to assess this,
together with measurement data to back up any observations.

Do we have data from different environments, including buffer-bloated
environments, that shows how much effect CUBIC has compared to
AIMD TCP?
And how does the larger decrease factor impact convergence speed,
particularly in buffer-bloated environments?
Many people have complained that window-based (TCP) congestion
control drives buffer bloat. Of course, the current standard
AIMD TCP also tends to fill buffer-bloated queues, but it likely
does not do so as effectively as CUBIC. This would be good to
understand better.
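
As a rough, back-of-the-envelope illustration of the "larger average cwnd" point (my own sketch, not from the draft): for an idealized linear sawtooth the time-averaged cwnd is (1 + β)/2 of W_max, and CUBIC's concave growth keeps the average even closer to W_max than that approximation suggests.

```python
# Hypothetical illustration: average cwnd over one idealized linear sawtooth,
# i.e. cwnd grows linearly from beta * W_max back up to W_max.
def avg_cwnd_linear_sawtooth(w_max: float, beta: float) -> float:
    return (1 + beta) / 2 * w_max

w_max = 100.0
print(avg_cwnd_linear_sawtooth(w_max, beta=0.5))  # Reno  (beta = 0.5): 75.0
print(avg_cwnd_linear_sawtooth(w_max, beta=0.7))  # CUBIC (beta = 0.7): 85.0
```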

@lisongxu lisongxu self-assigned this Sep 1, 2021
@goelvidhi
Contributor Author

goelvidhi commented Sep 14, 2021

@markkukojo,
@bbriscoe pointed me to this paper https://www.simula.no/sites/default/files/publications/Simula.simula.618.pdf. If you look at Figure 7, CUBIC has a much lower queuing delay variation than New Reno which means that CUBIC has a shallower saw-teeth than New Reno. But as you said, due to higher β (0.7) and concave increase function, CUBIC can reach W_max faster. This means that although in this paper, CUBIC shows a higher queuing delay than New Reno, this can be easily solved by configuring a lower threshold/interval for loss / ECN at the bottleneck AQM which would make CUBIC perform decreases more often without losing link utilization as significantly (as compared to New Reno). This would also help to solve buffer bloat better than New Reno.

@bbriscoe
Contributor

I would add to Vidhi's comment by explaining that network operators are not expected to 'solve' this issue by setting a lower AQM threshold wherever Cubic is used (which is clearly impractical, but might be how someone awkward could interpret Vidhi's words). Nonetheless, widespread deployment of Cubic (2006+ timeframe) gives more scope for setting AQM thresholds lower for the same utilization. Indeed, the subsequent round of AQM implementations (2012+ timeframe) could set AQM thresholds to a lower default than if Reno had still been widely deployed.

I don't think 'bufferbloat' is even the right word for an AQM that is set for the amplitude of Reno's sawteeth rather than Cubic's. Bloat implies excessively large, not just a little too large.

The lesson here is that we need to be careful attributing blame. Once AQMs are deployed to address real bufferbloat, the root cause of the residual 'bufferbloat' is not in the buffer, it's in the large variations of congestion control sawteeth (in slow-start as well as congestion avoidance). It is inappropriate to blame Cubic for squeezing the sawteeth up nearer to the threshold - the blame for the threshold needing to be that high in the first place falls on the predecessor to Cubic (Reno).

Should the draft say something about this? I think it should (briefly - to counter any future criticism similar to Markku's). But that would require a new section. It doesn't fit under any of the headings in the existing 'Discussion' section, which follows the structure suggested by RFC5033. But not saying anything would also be OK for me.

@bbriscoe
Contributor

Regarding slower convergence, the same section could also say that the smaller reduction per round (larger β) means that it takes more rounds to reduce in response to continuing congestion or to another flow trying to push in. Nonetheless, convergence speed prior to Cubic was primarily limited by Reno's slow additive increase, which Cubic can exceed once it gets into its true cubic growth mode.
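
As a rough illustration of the "more rounds to reduce" point (my own arithmetic, not from the draft): if each congestion event multiplies cwnd by β, reaching a given fraction of the starting window takes about log(fraction)/log(β) events.

```python
import math

# Hypothetical helper: congestion events needed to shrink cwnd to `fraction`
# of its starting value when each event multiplies cwnd by `beta`.
def events_to_reach(fraction: float, beta: float) -> int:
    return math.ceil(math.log(fraction) / math.log(beta))

print(events_to_reach(0.5, beta=0.5))  # Reno-style: 1 event to halve cwnd
print(events_to_reach(0.5, beta=0.7))  # CUBIC:      2 events to halve cwnd
print(events_to_reach(0.1, beta=0.5))  # 4 events to reach 10% ...
print(events_to_reach(0.1, beta=0.7))  # ... versus 7 events with beta = 0.7
```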

This was referenced Sep 15, 2021
@bbriscoe
Contributor

Any new text about slower convergence obviously ought to mention Cubic's optional fast convergence mechanism.

A good paper that is mostly an evaluation of Cubic's convergence (with the fast convergence mechanism enabled) is:
Leith, D. J.; Shorten, R. N. & McCullagh, G., "Experimental evaluation of Cubic-TCP", Proc. Int'l Workshop on Protocols for Future, Large-scale & Diverse Network Transports (PFLDNeT 2007), 2007.
It's not particularly complimentary about Cubic's convergence.

On that subject, this text in the fast convergence section is infeasible to comply with:

Fast Convergence is designed for network environments with multiple CUBIC flows. In network
environments with only a single CUBIC flow and without any other traffic, Fast Convergence SHOULD
be disabled.

This ought to say whether fast convergence is recommended for use over the public Internet, or not (given the public Internet is designed for both single flows and multiple flows). I believe fast convergence is generally enabled, so this is the sort of thing that ought to be recommended in an RFC that wraps up more than a decade of experience using Cubic.
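
For context, the fast convergence heuristic under discussion works roughly as follows (a sketch of the mechanism described in RFC 8312, Section 4.6; the function and variable names are mine):

```python
BETA = 0.7  # CUBIC multiplicative decrease factor

def fast_convergence_update(w_max: float, w_last_max: float):
    """At a congestion event, possibly shrink the W_max CUBIC will grow back toward.

    If the window maximum at this event is lower than at the previous one,
    assume another flow is ramping up and release extra capacity.
    Returns (new_w_max, new_w_last_max).
    """
    if w_max < w_last_max:
        return w_max * (1.0 + BETA) / 2.0, w_max
    return w_max, w_max
```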

@larseggert
Contributor

@lisongxu since you self-assigned this some time ago, would you please prepare a PR to close this issue?

@lisongxu
Contributor

Thanks, @larseggert. I will.

@lisongxu
Contributor

@larseggert This issue overlaps mainly with two other issues. The buffer bloat part overlaps with issue #94, and the convergence part overlaps with issue #96.

@larseggert larseggert linked a pull request Oct 18, 2021 that will close this issue
@larseggert
Contributor

@lisongxu do you think anything more needs to be done here after those issues are closed?

@lisongxu
Contributor

@larseggert No, thank you
