
Pad final GSO segment for CPU efficiency #4860


Draft
wants to merge 4 commits into
base: main
from

Conversation

mtfriesen
Contributor

Description

Describe the purpose of and changes within this Pull Request.

Pad all GSO segments, including the final segment, trading some goodput for a potential gain in CPU efficiency.

Testing

Do any existing tests cover this change? Are new tests needed?

CI.

Documentation

Is there any documentation impact for this change?

No.


codecov bot commented Feb 26, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.06%. Comparing base (dcb20cf) to head (51347fc).
Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4860      +/-   ##
==========================================
- Coverage   86.16%   86.06%   -0.10%     
==========================================
  Files          56       56              
  Lines       17634    17630       -4     
==========================================
- Hits        15195    15174      -21     
- Misses       2439     2456      +17     


Contributor Author

mtfriesen commented May 2, 2025

Context:

  1. The perf results were ambiguous: the dashboard showed only one comparison point, and three iterations deviated wildly from it. The perf benchmark output gives no statistical confidence.
  2. No VMs in Azure currently support hardware USO, so the practical benefit is limited to possibly better interaction with URO, which requires consistent datagram sizes. The cost is padding bytes on the wire: in the worst case (a large datagram followed by a 1-byte datagram), goodput is cut roughly in half. A heuristic could limit the padding loss by deciding whether to pad the final datagram based on the total send size.
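
A minimal sketch of what such a heuristic might look like, not the code in this PR. The function names, the percentage threshold, and the 1200-byte segment size used in the usage note are all assumptions made for illustration. The idea is to compute how many padding bytes the final segment would need and to pad only when that padding is a small enough fraction of the padded send.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: number of padding bytes needed so that the final
 * segment of a GSO send is as long as every other segment. */
static uint32_t final_segment_padding(uint32_t total_len, uint16_t segment_size)
{
    assert(segment_size > 0);
    uint32_t remainder = total_len % segment_size;
    return remainder == 0 ? 0 : (uint32_t)segment_size - remainder;
}

/* Hypothetical heuristic: pad the final segment only if the padding would be
 * at most max_overhead_pct percent of the padded send size. */
static bool should_pad_final_segment(
    uint32_t total_len,
    uint16_t segment_size,
    uint32_t max_overhead_pct)
{
    uint32_t padding = final_segment_padding(total_len, segment_size);
    if (padding == 0) {
        return false; /* segments are already uniform; nothing to pad */
    }
    uint64_t padded_len = (uint64_t)total_len + padding;
    /* Integer form of: padding / padded_len <= max_overhead_pct / 100 */
    return (uint64_t)padding * 100 <= padded_len * max_overhead_pct;
}
```

With an assumed 1200-byte segment size and a 10% threshold, a 1201-byte send (the worst case above) would need 1199 padding bytes out of 2400 on the wire, so it is left unpadded, while a 75601-byte send needs the same 1199 bytes out of 76800 and is padded.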
