
Pass enable-udp-aggregation=true to ovn-kubernetes #1533

Merged
openshift-merge-robot merged 2 commits into openshift:master from udp-gro-again on Sep 29, 2022

Conversation

danwinship
Contributor

Re-push of #1489, which was reverted (#1510) because QE had reported that it caused performance regressions. However:

  1. Most of the problem turned out to be that unlabeled numbers in the perf results had been interpreted as milliseconds when they were actually microseconds; there was an increase in latency, but it was tiny and expected, not large and problematic.
  2. The rest of the problem is with a dubious test scenario, which tests how fast the network is when using a protocol that is, essentially, maximally "optimized" for slowness: fully serialized request / response / request / response / request / response... with tiny packets (see the sketch below).
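To make that concrete, here is a minimal sketch of such a fully serialized UDP ping-pong loop (the address and echo server are hypothetical; the 64-byte size matches the test's smallest packet size):

```go
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	// Hypothetical echo server in a pod on another node.
	conn, err := net.Dial("udp", "10.128.2.15:5001")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	req := make([]byte, 64)  // tiny request
	resp := make([]byte, 64) // tiny response
	start := time.Now()
	rounds := 0
	for time.Since(start) < time.Second {
		// Each iteration waits for the reply before sending the next
		// request, so throughput is bounded entirely by round-trip
		// latency; UDP aggregation can't help a workload like this.
		if _, err := conn.Write(req); err != nil {
			panic(err)
		}
		if _, err := conn.Read(resp); err != nil {
			panic(err)
		}
		rounds++
	}
	fmt.Printf("%d serialized round trips in 1s\n", rounds)
}
```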

No one has been able to suggest a real-world use case that would see the sort of drastic slowdown seen in the "Request/Response operations" test:

  1. RTP/RTSP/WebRTC and other streaming protocols don't look like that, because they just keep sending packets rather than sending one response packet for each request packet. (And of course, streaming protocols are what the UDP aggregation feature was intended to speed up, and it did speed them up; streaming throughput with 64-byte packets nearly doubled.)
  2. DNS doesn't look like that, because (a) no one ever does 10,000 DNS requests all at once, (b) if they did, they'd parallelize them rather than doing them one by one, and (c) the DNS protocol has specific design features to minimize the need for serialized lookups (e.g., when you get a CNAME record, it automatically includes the corresponding A record as well, so you don't have to make a second request), because people figured out in the 1980s that you shouldn't design protocols that require lots of small serialized requests and responses.
  3. HTTP/3 doesn't look like that because (a) it uses large packets when sending large data, (b) as in the streaming case, a single request can result in many response packets, rather than being 1-to-1, (c) a single HTTP/3 connection can have multiple concurrent request/response streams.

etc.

Additionally, testing a protocol like this inside a Kubernetes cluster is an implausibly good best-case scenario; in most normal environments, the client and server would be farther apart from each other (in terms of network topology), and thus a protocol as extremely latency-sensitive as the "Request/Response operations" test would behave much worse, making it even more likely that the developers would change the way it worked.

So, this PR re-enables the feature.
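For reference, the rendered ovnkube.conf fragment ends up looking roughly like this (a sketch: the option name comes from the PR title, but the exact section placement is an assumption):

```ini
# Sketch of the relevant ovnkube.conf fragment; section placement assumed.
[default]
enable-udp-aggregation=true
```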

@openshift-ci openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files) on Aug 23, 2022
@danwinship
Contributor Author

/retest-required

1 similar comment
@danwinship
Contributor Author

/retest-required

@trozet
Contributor

trozet commented Aug 29, 2022

@danwinship, about the request/response test you mentioned in your 2nd point: it looks like the latency has increased for handling single UDP packets. I'm looking at https://bugzilla.redhat.com/show_bug.cgi?id=2085089#c37

However, I think that is the comment where the units are wrong, so now we are focusing on the number of requests/responses per minute. What I don't understand is: if the latency is lower now that we know the real unit is microseconds, how is it possible that the number of requests/responses is so much worse?

@mffiedler do you agree with re-enabling this now?

@danwinship
Contributor Author

> What I don't understand is: if the latency is lower now that we know the real unit is microseconds, how is it possible that the number of requests/responses is so much worse?

Because "the number of requests/responses" is just 1 second divided by the latency of a single request/response. The units don't matter; if you make a single request/response take twice as long, then the total number of serialized requests/responses you can do in a fixed amount of time will be halved.

(E.g., for 64-byte packets, the test reports 76us latency before and 137us latency after, with 14,447 round trips before and 7,574 after, which is within 10% of what you get if you just divide 1s/76us and 1s/137us.)
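A throwaway sketch checking that arithmetic against the numbers quoted above:

```go
package main

import "fmt"

func main() {
	cases := []struct {
		name      string
		latencyUS float64 // measured latency per round trip, in microseconds
		reported  float64 // round trips the test actually reported
	}{
		{"before (76us)", 76, 14447},
		{"after (137us)", 137, 7574},
	}
	for _, c := range cases {
		// A fully serialized loop can do at most 1s / latency round trips.
		predicted := 1e6 / c.latencyUS
		fmt.Printf("%s: predicted %.0f rt/s, reported %.0f (%+.1f%%)\n",
			c.name, predicted, c.reported,
			100*(c.reported-predicted)/predicted)
	}
}
```

Both cases come out within 10% of the simple 1s-divided-by-latency prediction.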

But the relevant questions are:

  1. Do we think the 76us latency for cross-node pod-to-pod UDP packets measured in the (before) perfscale test is actually a good estimate of real-world use cases? (The latency increase from this PR is fixed at 50us, so if customers have much higher baseline latency than in our test case, then +50us might just be noise. But if they had much lower baseline latency, then +50us would be much more of a performance killer.)
  2. Assuming the answer to question 1 is "yes, it's realistic", are we OK with 137us latency rather than 76us latency for a single small UDP request? (If not, then why? It's not reasonable to say "they might just need the absolute lowest latency possible" because in that case they probably wouldn't be using the pod network at all.)
  3. Assuming the answer to question 2 is "yes, it's fine for a single request", then what is the N such that 76us * N was acceptable, but 137us * N is too slow, and is there any plausible scenario where a customer would actually be sending N serialized UDP request/response pairs? (We talked in the meeting about how DNS might end up sending 5 serialized requests because of ndots, but presumably if your application can handle spending 1/3 of a millisecond doing DNS, then you can also handle spending 2/3 of a millisecond doing DNS; and that's assuming an instantaneous response from the DNS server anyway. The arithmetic is worked out below.)
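(For concreteness, the ndots arithmetic from question 3: 5 serialized lookups cost 5 × 76us = 380us, roughly 1/3 of a millisecond, before this change, versus 5 × 137us = 685us, roughly 2/3 of a millisecond, after.)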

@danwinship
Contributor Author

/hold

Ben suggests adding a "chicken flag" to disable this in case of emergency.

@openshift-ci openshift-ci bot added the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command) on Aug 30, 2022
@danwinship
Contributor Author

/retest

@danwinship
Contributor Author

/hold cancel
/retest-required

@openshift-ci openshift-ci bot removed the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command) on Sep 26, 2022
@trozet
Contributor

trozet commented Sep 27, 2022

/lgtm

@openshift-ci openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged) on Sep 27, 2022
@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD f4018b0 and 2 for PR HEAD f1dbb51 in total

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD b84ba70 and 1 for PR HEAD f1dbb51 in total

Always return a nil OVNConfigBootstrapResult on error (since the
caller ignores it anyway).

If there is no dpu-mode-config override configmap, don't log a
message; that's the normal expected case. If dpu-mode-config parsing
fails, don't bail out of bootstrapOVNConfig completely. Just skip
overriding the NodeMode.
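The two commit messages above describe an error-handling cleanup; here is a minimal sketch of the resulting control flow (bootstrapOVNConfig, OVNConfigBootstrapResult, and NodeMode are named in the messages, while the lookup helper and field layout are assumptions for illustration):

```go
package main

import (
	"errors"
	"fmt"
)

// OVNConfigBootstrapResult is named in the commit message; the single
// NodeMode field here is an assumption for illustration.
type OVNConfigBootstrapResult struct {
	NodeMode string
}

// errNotFound and lookupDPUModeConfig are hypothetical stand-ins for
// the real dpu-mode-config configmap lookup.
var errNotFound = errors.New("not found")

func lookupDPUModeConfig() (map[string]string, error) {
	return nil, errNotFound // pretend the override configmap is absent
}

func bootstrapOVNConfig() (*OVNConfigBootstrapResult, error) {
	result := &OVNConfigBootstrapResult{NodeMode: "full"}

	data, err := lookupDPUModeConfig()
	if errors.Is(err, errNotFound) {
		// No dpu-mode-config override is the normal, expected case,
		// so don't log a message.
		return result, nil
	} else if err != nil {
		// On a real error, return a nil result (the caller ignores
		// it anyway).
		return nil, fmt.Errorf("failed to get dpu-mode-config: %w", err)
	}

	mode, ok := data["mode"]
	if !ok {
		// If parsing fails, just skip overriding NodeMode instead of
		// bailing out of bootstrapOVNConfig completely.
		fmt.Println("ignoring malformed dpu-mode-config")
		return result, nil
	}
	result.NodeMode = mode
	return result, nil
}

func main() {
	res, err := bootstrapOVNConfig()
	fmt.Println(res, err)
}
```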
@danwinship
Contributor Author

Rebased, and updated the new unit test to include egressip-node-healthcheck-port=9107.

@openshift-ci openshift-ci bot removed the lgtm label (Indicates that a PR is ready to be merged) on Sep 28, 2022
@trozet
Contributor

trozet commented Sep 28, 2022

/lgtm

@openshift-ci openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged) on Sep 28, 2022
@openshift-ci
Contributor

openshift-ci bot commented Sep 28, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, trozet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD b84ba70 and 2 for PR HEAD b485074 in total

@danwinship
Contributor Author

/retest-required

@openshift-ci
Contributor

openshift-ci bot commented Sep 28, 2022

@danwinship: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-azure-ovn | b485074 | link | false | /test e2e-azure-ovn |
| ci/prow/e2e-network-mtu-migration-sdn-ipv4 | b485074 | link | false | /test e2e-network-mtu-migration-sdn-ipv4 |
| ci/prow/e2e-openstack-ovn | b485074 | link | false | /test e2e-openstack-ovn |
| ci/prow/e2e-vsphere-ovn | b485074 | link | false | /test e2e-vsphere-ovn |
| ci/prow/e2e-openstack-sdn | b485074 | link | false | /test e2e-openstack-sdn |
| ci/prow/e2e-hypershift-ovn | b485074 | link | false | /test e2e-hypershift-ovn |
| ci/prow/e2e-aws-sdn-upgrade | b485074 | link | false | /test e2e-aws-sdn-upgrade |
| ci/prow/e2e-ovn-ipsec-step-registry | b485074 | link | false | /test e2e-ovn-ipsec-step-registry |
| ci/prow/e2e-aws-ovn-serial | b485074 | link | false | /test e2e-aws-ovn-serial |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@danwinship
Contributor Author

/retest-required

@openshift-merge-robot openshift-merge-robot merged commit f09940f into openshift:master Sep 29, 2022
@danwinship danwinship deleted the udp-gro-again branch October 11, 2022 14:34