Commit 2dc7869

ovn-northd: Address scale issues with DNAT flows.
When the commit [1] added Distributed NAT support in OVN, it didn't address the requirement of making East/West NAT traffic distributed. The E/W NAT traffic was still centralized. Later, a couple of patches [2] addressed this requirement. But the approach taken in [2] resulted in a lot of logical flows as the number of dnat_and_snat entries increases, as reported in @Reported-at.

This patch
  - reverts the approach taken in [2].
  - removes the flows which do the NAT redirect (REGBIT_NAT_REDIRECT) to the gateway chassis.
  - and to solve the E/W centralized NAT it does the following:
    * Since for each NAT entry we know the MAC binding to be used for the external_ip (either the external_mac if set, or the MAC of the distributed gateway router port), this patch adds flows in the S_ROUTER_IN_ARP_RESOLVE stage to set eth.dst to that MAC if the IP destination is external_ip.
    * The existing flows in S_ROUTER_OUT_EGR_LOOP now carry an additional match, is_chassis_resident('P'), where 'P' is the logical_port of the NAT entry if set, otherwise the chassis resident port of the distributed router port. With this additional match, the packet is looped back to apply the unSNAT/DNAT rules on the relevant chassis.

Suppose a logical port 'P' with IP 'A' has a dnat_and_snat entry with external_mac/logical_port set, and the packet's IP destination is one of the DNAT IPs. Then the packet will be sent out of the local chassis, since eth.dst is resolved in the S_ROUTER_IN_ARP_RESOLVE stage. If external_mac/logical_port is not set in the NAT entry, then the packet will be redirected to the gateway chassis.

With this patch, for the logical resources reported in @Reported-at, the number of logical flows comes down to around 45k from 650k.
[1] - ceacd9d("ovn: distributed NAT flows")
[2] - 551e3d9("OVN: fix DVR Floating IP support")
      8244c6b("OVN: do not distribute traffic for local FIP")

Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2020-January/049714.html
Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Tested-By: Daniel Alvarez Sanchez <dalvarez@redhat.com>
Acked-By: Daniel Alvarez Sanchez <dalvarez@redhat.com>
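
The two per-NAT-entry flows described in the commit message can be sketched in Python. This is only an illustrative model, not the ovn-northd implementation (which is written in C). The names `NatEntry`, `arp_resolve_flow`, `egress_loop_match` and the two counting helpers are hypothetical, as are the example port names and addresses. Counting one old-style flow per ordered pair of dnat_and_snat entries is also an assumption, based on the removed documentation's wording "for each ip source/destination couple".

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Flow = Tuple[int, str, str]  # (priority, match, actions)


@dataclass
class NatEntry:
    """Hypothetical stand-in for a row of the OVN Northbound NAT table."""
    nat_type: str                       # "snat", "dnat" or "dnat_and_snat"
    external_ip: str                    # IPv4 only in this sketch
    external_mac: Optional[str] = None
    logical_port: Optional[str] = None


def arp_resolve_flow(nat: NatEntry, gw_port: str, gw_mac: str) -> Flow:
    """One flow per NAT entry: resolve eth.dst for the external IP.

    Uses external_mac when it is set on a dnat_and_snat entry, otherwise
    the MAC of the distributed gateway router port.
    """
    use_nat_mac = nat.nat_type == "dnat_and_snat" and nat.external_mac
    mac = nat.external_mac if use_nat_mac else gw_mac
    match = f'outport == "{gw_port}" && reg0 == {nat.external_ip}'
    return (100, match, f"eth.dst = {mac}; next;")


def egress_loop_match(nat: NatEntry, gw_port: str, cr_port: str) -> str:
    """Egress-loopback match with the added is_chassis_resident() term.

    The resident port is the NAT entry's logical_port when set on a
    dnat_and_snat entry, otherwise the chassisredirect port of GW.
    """
    use_lport = nat.nat_type == "dnat_and_snat" and nat.logical_port
    resident = nat.logical_port if use_lport else cr_port
    return (f'ip4.dst == {nat.external_ip} && outport == "{gw_port}" '
            f'&& is_chassis_resident("{resident}")')


def pairwise_flow_count(n: int) -> int:
    """Assumed cost of the reverted approach: one flow per ordered pair."""
    return n * (n - 1)


def per_entry_flow_count(n: int) -> int:
    """Cost of the ARP-resolve flows in this sketch: one per NAT entry."""
    return n


if __name__ == "__main__":
    fip = NatEntry("dnat_and_snat", "172.16.0.10",
                   external_mac="f0:00:00:00:00:03",
                   logical_port="sw0-port1")
    snat = NatEntry("snat", "172.16.0.1")
    for nat in (fip, snat):
        print(arp_resolve_flow(nat, "lr0-public", "00:00:20:20:12:13"))
        print(egress_loop_match(nat, "lr0-public", "cr-lr0-public"))
    print(pairwise_flow_count(1000), per_entry_flow_count(1000))
```

In this model the entry with external_mac and logical_port set resolves to its own MAC and loops back on the chassis hosting its logical port, while the plain snat entry falls back to the gateway port's MAC and the chassisredirect port. The counting helpers only illustrate why per-entry flows scale better than per-pair flows; they do not reproduce the 650k and 45k figures quoted above.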
1 parent 17f8e25 commit 2dc7869

3 files changed: +99 -364 lines changed

northd/ovn-northd.8.xml

+57 -134
@@ -1587,6 +1587,24 @@ next;
 </p>
 
 <ul>
+<li>
+<p>
+For each NAT entry of a distributed logical router (with
+distributed gateway router port) of type <code>snat</code>,
+a priorirty-120 flow with the match <code>inport == <var>P</var>
+&amp;&amp; ip4.src == <var>A</var></code> advances the packet to
+the next pipeline, where <var>P</var> is the distributed logical
+router port and <var>A</var> is the <code>external_ip</code> set
+in the NAT entry. If <var>A</var> is an IPv6 address, then
+<code>ip6.src</code> is used for the match.
+</p>
+
+<p>
+The above flow is required to handle the routing of the East/west NAT
+traffic.
+</p>
+</li>
+
 <li>
 <p>
 L3 admission control: A priority-100 flow drops packets that match
@@ -2099,21 +2117,6 @@ icmp6 {
 <code>redirect-chassis</code>.
 </p>
 
-<p>
-For each configuration in the OVN Northbound database, that asks
-to change the source IP address of a packet from <var>A</var> to
-<var>B</var>, a priority-50 flow matches
-<code>ip &amp;&amp; ip4.dst == <var>B</var></code> or
-<code>ip &amp;&amp; ip6.dst == <var>B</var></code>
-with an action
-<code>REGBIT_NAT_REDIRECT = 1; next;</code>. This flow is for
-east/west traffic to a NAT destination IPv4/IPv6 address. By
-setting the <code>REGBIT_NAT_REDIRECT</code> flag, in the
-ingress table <code>Gateway Redirect</code> this will trigger a
-redirect to the instance of the gateway port on the
-<code>redirect-chassis</code>.
-</p>
-
 <p>
 A priority-0 logical flow with match <code>1</code> has actions
 <code>next;</code>.
@@ -2269,20 +2272,6 @@ icmp6 {
 <code>redirect-chassis</code>.
 </p>
 
-<p>
-For each configuration in the OVN Northbound database, that asks
-to change the destination IP address of a packet from <var>A</var> to
-<var>B</var>, a priority-50 flow matches <code>ip &amp;&amp;
-ip4.dst == <var>B</var></code> or <code>ip &amp;&amp;
-ip6.dst == <var>B</var></code> with an action
-<code>REGBIT_NAT_REDIRECT = 1; next;</code>. This flow is for
-east/west traffic to a NAT destination IPv4/IPv6 address. By
-setting the <code>REGBIT_NAT_REDIRECT</code> flag, in the
-ingress table <code>Gateway Redirect</code> this will trigger a
-redirect to the instance of the gateway port on the
-<code>redirect-chassis</code>.
-</p>
-
 <p>
 A priority-0 logical flow with match <code>1</code> has actions
 <code>next;</code>.
@@ -2416,54 +2405,6 @@ output;
 </p>
 </li>
 
-<li>
-<p>
-For distributed logical routers where one of the logical router
-ports specifies a <code>redirect-chassis</code>, a priority-400
-logical flow for each ip source/destination couple that matches the
-<code>dnat_and_snat</code> NAT rules configured. These flows will
-allow to properly forward traffic to the external connections if
-available and avoid sending it through the tunnel.
-Assuming the two following NAT rules have been configured:
-</p>
-
-<pre>
-external_ip{0,1} = <var>EIP{0,1}</var>;
-external_mac{0,1} = <var>MAC{0,1}</var>;
-logical_ip{0,1} = <var>LIP{0,1}</var>;
-</pre>
-
-<p>
-the following action will be applied:
-</p>
-
-<pre>
-eth.dst = <var>MAC0</var>;
-eth.src = <var>MAC1</var>;
-reg0 = ip4.dst; /* xxreg0 = ip6.dst; in the IPv6 case */
-reg1 = <var>EIP1</var>; /* xxreg1 in the IPv6 case */
-outport = <code>redirect-chassis-port</code>;
-<code>REGBIT_DISTRIBUTED_NAT = 1; next;</code>.
-</pre>
-
-<p>
-Morover a priority-400 logical flow is configured for each
-<code>dnat_and_snat</code> NAT rule configured in order to
-not send traffic for local FIP through the overlay tunnels
-but manage it in the local hypervisor
-</p>
-</li>
-
-<li>
-<p>
-For distributed logical routers where one of the logical router
-ports specifies a <code>redirect-chassis</code>, a priority-300
-logical flow with match <code>REGBIT_NAT_REDIRECT == 1</code> has
-actions <code>ip.ttl--; next;</code>. The <code>outport</code>
-will be set later in the Gateway Redirect table.
-</p>
-</li>
-
 <li>
 <p>
 IPv4 routing table. For each route to IPv4 network <var>N</var> with
@@ -2630,23 +2571,6 @@ outport = <var>P</var>;
 </p>
 </li>
 
-<li>
-<p>
-For distributed logical routers where one of the logical router
-ports specifies a <code>redirect-chassis</code>, a priority-400
-logical flow with match <code>REGBIT_DISTRIBUTED_NAT == 1</code>
-has action <code>next;</code>
-</p>
-<p>
-For distributed logical routers where one of the logical router
-ports specifies a <code>redirect-chassis</code>, a priority-200
-logical flow with match <code>REGBIT_NAT_REDIRECT == 1</code> has
-actions <code>eth.dst = <var>E</var>; next;</code>, where
-<var>E</var> is the ethernet address of the router's distributed
-gateway port.
-</p>
-</li>
-
 <li>
 <p>
 Static MAC bindings. MAC bindings can be known statically based on
@@ -2721,6 +2645,35 @@ outport = <var>P</var>;
 </p>
 </li>
 
+<li>
+<p>
+Static MAC bindings from NAT entries. MAC bindings can also be known
+for the entries in the <code>NAT</code> table. Below flows are
+programmed for distributed logical routers i.e with a distributed
+router port.
+</p>
+
+<p>
+For each row in the <code>NAT</code> table with IPv4 address
+<var>A</var> in the <ref column="external_ip"
+table="NAT" db="OVN_Northbound"/> column of
+<ref table="NAT" db="OVN_Northbound"/> table, a priority-100
+flow with the match <code>outport === <var>P</var> &amp;&amp;
+reg0 == <var>A</var></code> has actions <code>eth.dst = <var>E</var>;
+next;</code>, where <code>P</code> is the distributed logical router
+port, <var>E</var> is the Ethernet address if set in the
+<ref column="external_mac" table="NAT" db="OVN_Northbound"/> column
+of <ref table="NAT" db="OVN_Northbound"/> table for of type
+<code>dnat_and_snat</code>, otherwise the Ethernet address of the
+distributed logical router port.
+</p>
+
+<p>
+For IPv6 NAT entries, same flows are added, but using the register
+<code>xxreg0</code> for the match.
+</p>
+</li>
+
 <li>
 <p>
 Dynamic MAC bindings. These flows resolve MAC-to-IP bindings
@@ -2843,20 +2796,6 @@ icmp4 {
 </p>
 
 <ul>
-<li>
-A priority-300 logical flow with match
-<code>REGBIT_DISTRIBUTED_NAT == 1</code> has action
-<code>next;</code>
-</li>
-<li>
-A priority-200 logical flow with match
-<code>REGBIT_NAT_REDIRECT == 1</code> has actions
-<code>outport = <var>CR</var>; next;</code>, where <var>CR</var>
-is the <code>chassisredirect</code> port representing the instance
-of the logical router distributed gateway port on the
-<code>redirect-chassis</code>.
-</li>
-
 <li>
 A priority-150 logical flow with match
 <code>outport == <var>GW</var> &amp;&amp;
@@ -3148,19 +3087,6 @@ nd_ns {
 ports specifies a <code>redirect-chassis</code>.
 </p>
 
-<p>
-Earlier in the ingress pipeline, some east-west traffic was
-redirected to the <code>chassisredirect</code> port, based on
-flows in the <code>UNSNAT</code> and <code>DNAT</code> ingress
-tables setting the <code>REGBIT_NAT_REDIRECT</code> flag, which
-then triggered a match to a flow in the
-<code>Gateway Redirect</code> ingress table. The intention was
-not to actually send traffic out the distributed gateway port
-instance on the <code>redirect-chassis</code>. This traffic was
-sent to the distributed gateway port instance in order for DNAT
-and/or SNAT processing to be applied.
-</p>
-
 <p>
 While UNDNAT and SNAT processing have already occurred by this
 point, this traffic needs to be forced through egress loopback on
@@ -3176,23 +3102,20 @@ nd_ns {
 
 <ul>
 <li>
-<p>
-For each <code>dnat_and_snat</code> NAT rule couple in the
-OVN Northbound database on a distributed router,
-a priority-200 logical with match
-<code>ip4.dst == <var>external_ip0</var> &amp;&amp;
-ip4.src == <var>external_ip1</var></code>, has action
-<code>next;</code>
-</p>
-
 <p>
 For each NAT rule in the OVN Northbound database on a
 distributed router, a priority-100 logical flow with match
 <code>ip4.dst == <var>E</var> &amp;&amp;
-outport == <var>GW</var></code>, where <var>E</var> is the
-external IP address specified in the NAT rule, and <var>GW</var>
-is the logical router distributed gateway port, with the
-following actions:
+outport == <var>GW</var> &amp;&amp;
+is_chassis_resident(<var>P</var>)</code>, where <var>E</var> is the
+external IP address specified in the NAT rule, <var>GW</var>
+is the logical router distributed gateway port. For dnat_and_snat
+NAT rule, <var>P</var> is the logical port specified in the NAT rule.
+If <ref column="logical_port"
+table="NAT" db="OVN_Northbound"/> column of
+<ref table="NAT" db="OVN_Northbound"/> table is NOT set, then
+<var>P</var> is the <code>chassisredirect port</code> of
+<var>GW</var> with the following actions:
 </p>
 
 <pre>
