[WIP] Nftables proxy #124272
Conversation
Please note that we're already in Test Freeze for the release branch. Fast forwards are scheduled to happen every 6 hours; the most recent run was: Thu Apr 11 13:45:22 UTC 2024.
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the appropriate triage label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/hold
let's start to get some data

/test
@aojea: The following commands are available to trigger optional jobs: […]

In response to this:

> /test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: aojea. The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files: […]

Approvers can indicate their approval by writing /approve in a comment.
/test pull-kubernetes-e2e-gce-100-performance

/test pull-kubernetes-e2e-gce-100-performance
This is much better, but iptables runs with a 10s min sync period; the last time we analyzed this, latency in iptables improved but CPU increased: #110268 (comment)
The sync rules in nftables [chart] vs. the sync rules in iptables [chart], from https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce-scalability/1778462542213943296 :) These are way better.
@aojea: The following test failed, say /retest to rerun all failed tests:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
(So that's the NetworkProgrammingLatency numbers from the […])
For comparison, from a recent run:

(which is to say, the job runs with […])
Honestly, running kube-proxy with min sync period 10s is just crazy. Like, if a pod crashes, you want to keep sending new connections to its former IP address for another 10 seconds? If we think kube-proxy uses too much CPU with a shorter sync period then we should fix that. (I think it's not clear how much of the CPU was kube-proxy itself vs kube-proxy plus iptables-restore?)
That's the one measuring packet latency for pod-to-service-IP-to-pod traffic? How many services does this job create? We'd probably need at least tens of thousands for the difference between iptables kube-proxy's rules and nftables kube-proxy's rules to become noticeable. (Note that for purposes of this metric, you don't even need a large cluster to test that; you can create 1 pod and 10,000 services that all point to that same pod.)
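The "1 pod plus 10,000 services" setup described above could be scripted along these lines. This is a hedged sketch, not part of the PR: the service names, the `single-backend` label, and the counts are all illustrative.

```shell
# Generate manifests for N Services that all select the same single backend
# pod, to stress the service-programming path without a large cluster.
# All names and labels here are hypothetical.
gen_services() {
  local n=$1 i
  for i in $(seq 1 "$n"); do
    printf 'apiVersion: v1\nkind: Service\nmetadata:\n  name: stress-svc-%d\nspec:\n  selector:\n    app: single-backend\n  ports:\n  - port: 80\n---\n' "$i"
  done
}

# For a real test, something like: gen_services 10000 | kubectl apply -f -
gen_services 3
```

One pod behind every service keeps the endpoint set trivial, so any latency difference comes from the per-service rule volume each proxy mode programs.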
So that's p95 of […]. The nftables proxy currently doesn't have the partial-syncing optimizations that iptables has (because we were waiting for the perf jobs before optimizing anything), so this is expected; every sync rewrites every rule, so nftables's […]
@@ -1140,7 +1140,7 @@ var defaultKubernetesFeatureGates = map[featuregate.Feature]featuregate.FeatureS
 	NewVolumeManagerReconstruction: {Default: true, PreRelease: featuregate.GA, LockToDefault: true}, // remove in 1.32
-	NFTablesProxyMode: {Default: false, PreRelease: featuregate.Alpha},
+	NFTablesProxyMode: {Default: true, PreRelease: featuregate.Beta},
need to update NFTablesProxyMode comment above with version for beta
if [[ "${KUBE_PROXY_MODE:-}" == "nftables" ]]; then
  params+=" --proxy-mode=nftables"
else
  params+=" --iptables-sync-period=1m --iptables-min-sync-period=10s --ipvs-sync-period=1m --ipvs-min-sync-period=10s"
FWIW, you could set the iptables- and ipvs-specific options regardless of KUBE_PROXY_MODE. But probably some of this should be merged with the existing [[ "${KUBE_PROXY_MODE:-}" == "ipvs" ]] check? And/or it should do --proxy-mode=${KUBE_PROXY_MODE}.
(Also, typo in the commit message: "configura")
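The consolidation suggested above could look roughly like this. A hedged sketch only: the flag and variable names come from the diff in this PR, and it assumes (per the comment above) that the iptables/ipvs sync-period options are safe to pass regardless of the active mode.

```shell
# Pass the configured mode straight through instead of branching per mode,
# and set the mode-specific sync options unconditionally (assumed harmless
# for other modes, per the review comment).
KUBE_PROXY_MODE="${KUBE_PROXY_MODE:-iptables}"
params=""
params+=" --proxy-mode=${KUBE_PROXY_MODE}"
params+=" --iptables-sync-period=1m --iptables-min-sync-period=10s"
params+=" --ipvs-sync-period=1m --ipvs-min-sync-period=10s"
```

This removes the special-case nftables branch entirely; adding a new proxy mode then needs no startup-script change.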
-# Optional: Change the kube-proxy implementation. Choices are [iptables, ipvs].
-KUBE_PROXY_MODE=${KUBE_PROXY_MODE:-iptables}
+# Optional: Change the kube-proxy implementation. Choices are [iptables, ipvs, nftables].
+KUBE_PROXY_MODE=${KUBE_PROXY_MODE:-nftables}
(should mark this commit [DO NOT MERGE] or something)
it has to be removed, yes
-o.logger.Info("Using iptables proxy")
-config.Mode = proxyconfigapi.ProxyModeIPTables
+o.logger.Info("Using nftables proxy")
+config.Mode = proxyconfigapi.ProxyModeNFTables
(likewise... or just merge this with the other DNM commit)
params+=" --proxy-mode=nftables"
# to avoid the problems caused of adding iptables rules to drop packets with invalid conntrack state
# kube-proxy has an option to set the tcpBeLiberal sysctl that solves the problem and is less disruptive
# https://github.com/kubernetes/kubernetes/issues/94861ss
Typo in the link (the trailing "ss"), but https://issues.k8s.io/122663#issuecomment-1885024015 is probably a better link anyway. In particular, we are intentionally requiring that nftables-kube-proxy-based CI set --conntrack-tcp-be-liberal, because we want to test that option.
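For reference, the knob being discussed can also be expressed in the kube-proxy configuration file rather than as a command-line flag. A minimal sketch, with the caveat that the `conntrack.tcpBeLiberal` field name should be verified against the config API of the kube-proxy version in use; the file path and surrounding fields are illustrative.

```shell
# Write a minimal KubeProxyConfiguration that enables the liberal conntrack
# TCP tracking discussed above, alongside the nftables mode under test.
cfg="$(mktemp)"
cat > "${cfg}" <<'EOF'
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: nftables
conntrack:
  tcpBeLiberal: true
EOF
```

kube-proxy would then be started with --config="${cfg}" instead of individual flags.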
/close
@danwinship: Closed this PR.

In response to this:

> /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/kind feature