Refactor iptables rules for NodePort and ExternalIP services #2002
/assign danielmellado |
bcc5af6 to fc33441
/retest |
Not sure if this is the best place to ask, but there are a bunch of functions that return `[]*net.IPNet` when they really return one IPv4 and one IPv6 address. Are there any plans to have multiple interfaces in an IP family? Should we be looking into just returning `(ipv4, ipv6 *net.IPNet)`? Should an issue be opened for this?
The convention in most places all throughout our code base is to return |
8827918 to 2cc7145
overall lgtm. Question about upgrade inline and wondering why dual stack conversion failed for shared gw mode
So the dual-stack conversion passed, the dual-stack tests executed right after didn't (I didn't understand how to check the output and didn't have time to really dig in)....but I don't understand how dual-stack can even work for node port and external IP services without my second commit in this patch, i.e: how did that ever pass? |
The failing dual-stack conversion job failed because the apiserver pod didn't come up, which is unrelated to this PR and to OVN ... I'll have to see if it is a real bug or just that the conversion script is too aggressive.
c22d9c9 to 9de9696
Last failed build is unrelated to this PR, I opened #2075 as a general tracker for flaking unit tests. |
/retest |
Fixes issue: ovn-org#1981

This patch simplifies ovnkube-node to remove the notion of node IP. This notion only existed so that the iptables rules could be configured correctly. By matching on `--dst-type LOCAL` instead, we no longer need it and can simplify our logic.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
While working on the previous commits in this patch, I realized getGatewayIPTRules wasn't "dual-stack correct". This fixes that. Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
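The commit message above does not say what exactly was wrong in `getGatewayIPTRules`, so the following is only a generic sketch of what "dual-stack correct" rule generation tends to involve: rules are grouped per IP family so that IPv4 targets go to iptables and IPv6 targets go to ip6tables. `familyRules` and `dnatRulesByFamily` are hypothetical names.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// familyRules groups rule arguments by the tool that must apply them.
type familyRules struct {
	V4 [][]string // to be applied with iptables
	V6 [][]string // to be applied with ip6tables
}

// dnatRulesByFamily is a hypothetical helper. It builds one DNAT rule per
// target IP and files it under the matching IP family.
func dnatRulesByFamily(targets []string, port int) (familyRules, error) {
	var out familyRules
	for _, t := range targets {
		ip := net.ParseIP(t)
		if ip == nil {
			return familyRules{}, fmt.Errorf("invalid IP %q", t)
		}
		rule := []string{"-j", "DNAT", "--to-destination", net.JoinHostPort(t, strconv.Itoa(port))}
		if ip.To4() != nil {
			out.V4 = append(out.V4, rule)
		} else {
			out.V6 = append(out.V6, rule)
		}
	}
	return out, nil
}

func main() {
	rules, err := dnatRulesByFamily([]string{"10.96.0.10", "fd00::10"}, 8080)
	if err != nil {
		panic(err)
	}
	fmt.Println("iptables rules: ", rules.V4)
	fmt.Println("ip6tables rules:", rules.V6)
}
```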
Our iptables rules for NodePort and ExternalIP services were setting up rules in filter FORWARD, which is the chain traversed by packets that have the node IP as neither source nor destination IP. This is not needed for NodePort and ExternalIP.

For NodePort services we currently DNAT directly to the gatewayIP/clusterIP when a packet hits a node, and this happens on every node, so setting up filter FORWARD rules is not warranted. In no case will node port packets just be forwarded to another node. For ExternalIP there is one case where routing external IP packets to another node does occur, but we use routing table rules to do this, which means the filter FORWARD rules are not needed in that case either.

The problem with having NodePort and ExternalIP iptables rules in filter FORWARD is that every packet on the cluster going from one node to another will be looked up in those chains. They will never match, but the lookup can be extremely costly on clusters with a lot of NodePort services defined, see: https://bugzilla.redhat.com/show_bug.cgi?id=1923157

Also, remove NAT PREROUTING in shared gateway mode, since we configure OVS flows on the shared gateway bridge to steer traffic into OVN, thus rendering the iptables rules superfluous.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>

Delete stale iptables jump rules for updates

On updates we need to remove the legacy jump rules for each mode. This will handle that. Unfortunately there's no generic way to do so depending on the services that we find, so we still need them hard-coded. For posterity's sake we should really always have them, since we don't release ovn-kubernetes and we never know when someone deploying ovn-kubernetes reaches N+2 and can remove these.

Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
9de9696 to 1ba1ce8
- What this PR does and why is it needed
This patch refactors the iptables rules setup for `NodePort` and `ExternalIP` specific services. Mainly it does the following three things:
- Removes the notion of node IP for `NodePort` type services, this fixes "In local gw mode, creating a nodeport service inhibit north-south traffic on the same port" #1981
- Removes the `filter FORWARD` setup for `NodePort` and `ExternalIP` services. That chain is used for packets which have the node IP as neither source nor destination IP, essentially just forwarding them to another node. However, for `NodePort` and `ExternalIP` services we just DNAT the packets to the gateway IP / cluster IP (depending on the mode we're running in). The problem with adding `NodePort` and `ExternalIP` services to filter FORWARD is that for every packet that traverses a node, a lookup will be performed in the underlying iptables implementation in the kernel, which, depending on the cluster, can have a severe performance impact on networking, see: https://bugzilla.redhat.com/show_bug.cgi?id=1923157

/assign @trozet
- Special notes for reviewers
- How to verify it
- Description for the changelog