Skip to content

Commit

Permalink
northd: Make the use of common zone in NAT configurable
Browse files Browse the repository at this point in the history
There are essentially three problems with the current
combination of DGP + SNAT + LB:

1) The first packet is being SNATed in common zone due
to a problem with pinctrl not preserving ct_mark/ct_label.
The commit would create a SNAT entry within the same with DNAT
entry.

2) The unSNAT for reply always happened in common zone because of
the loopback check which would be triggered only when we loop
the packet through the LR. Now there are two possibilities how
the reply packet would be handled:

a) If the entry for SNAT in common zone did not time out yet, the
packet would be passed through unSNAT in common zone which would
be fine and continue on. However, the unDNAT wouldn't work due to
the limitation of CT not capable of doing unSNAT/unDNAT on the same
packet twice in the same zone. So the reply would arrive to
the original interface, but with wrong source address.

b) If the entry for SNAT timed out it would loop and do unSNAT correctly
in separate zone and then also unDNAT. This is not possible anymore with
a recent change 8c341b9 (northd: Drop packets destined to router owned NAT IP for DGP).
The reply would be dropped before looping after that change co the traffic
would never arrive to the original interface.

3) The unDNAT was happening only if the DGP was outport meaning
the reply traffic was routed out, but in the opposite case
the unDNAT was happening only because of the looping which made
outport=inport. That's why it worked before introduction of explicit drop.

In order to fix all those issues do two changes:

1) Include inport in the unDNAT match, so that we have both
routing directions covered e.g. (inport == "dgp_port" || outport == "dpg_port").

2) Always use the separate zone for SNAT and DNAT. As the common
zone was needed for HWOL make the common zone optional with
configuration option called "use_common_zone". This option is
by default "false" and can be specified per LR. Use of separate
zones also eliminates the need for the flag propagation
in "lr_out_chk_dnat_local" stage, removing the match on ct_mark/ct_label.

The "SNAT in separate zone from DNAT" system test is moved to run only
with kernel datapath. The reason is that this test doesn't work with
userspace datapath due to recirculation limit, currently set to 6 [0].

[0]https://github.com/openvswitch/ovs/blob/9d840923d32124fe427de76e8234c49d64e4bb77/lib/dpif-netdev.c#L102
Reported-at: https://bugzilla.redhat.com/2161281
Signed-off-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
  • Loading branch information
almusil authored and dceara committed Apr 19, 2023
1 parent bd2e4a4 commit b8c40e7
Show file tree
Hide file tree
Showing 7 changed files with 726 additions and 522 deletions.
626 changes: 375 additions & 251 deletions northd/northd.c

Large diffs are not rendered by default.

90 changes: 7 additions & 83 deletions northd/ovn-northd.8.xml
Original file line number Diff line number Diff line change
Expand Up @@ -3252,13 +3252,11 @@ icmp6 {
<p>
The first flow matches <code>ip &amp;&amp;
ip4.dst == <var>B</var> &amp;&amp; inport == <var>GW</var>
&amp;&amp; flags.loopback == 0</code> or
<code>ip &amp;&amp;
ip6.dst == <var>B</var> &amp;&amp; inport == <var>GW</var>
&amp;&amp; flags.loopback == 0</code>
</code> or <code>ip &amp;&amp; ip6.dst == <var>B</var> &amp;&amp;
inport == <var>GW</var></code>
where <var>GW</var> is the distributed gateway port
corresponding to the NAT rule (specified or inferred), with an
action <code>ct_snat_in_czone;</code> to unSNAT in the common
action <code>ct_snat;</code> to unSNAT in the common
zone. If the NAT rule is of type dnat_and_snat and has
<code>stateless=true</code> in the options, then the action
would be <code>next;</code>.
Expand All @@ -3271,32 +3269,6 @@ icmp6 {
<var>GW</var>.
</p>
</li>

<li>
<p>
The second flow matches <code>ip &amp;&amp;
ip4.dst == <var>B</var> &amp;&amp; inport == <var>GW</var>
&amp;&amp; flags.loopback == 1 &amp;&amp;
flags.use_snat_zone == 1</code> or
<code>ip &amp;&amp;
ip6.dst == <var>B</var> &amp;&amp; inport == <var>GW</var>
&amp;&amp; flags.loopback == 0 &amp;&amp;
flags.use_snat_zone == 1</code>
where <var>GW</var> is the distributed gateway port
corresponding to the NAT rule (specified or inferred), with an
action <code>ct_snat;</code> to unSNAT in the snat zone. If the
NAT rule is of type dnat_and_snat and has
<code>stateless=true</code> in the options, then the action
would be <code>ip4/6.dst=(<var>B</var>)</code>.
</p>

<p>
If the NAT entry is of type <code>snat</code>, then there is an
additional match <code>is_chassis_resident(<var>cr-GW</var>)
</code> where <var>cr-GW</var> is the chassis resident port of
<var>GW</var>.
</p>
</li>
</ul>

<p>
Expand Down Expand Up @@ -4610,46 +4582,12 @@ nd_ns {
</p>

<ul>
<li>
<p>
For each NAT rule in the OVN Northbound database on a
distributed router, a priority-50 logical flow with match
<code>ip4.dst == <var>E</var> &amp;&amp;
is_chassis_resident(<var>P</var>)</code>, where <var>E</var> is the
external IP address specified in the NAT rule, <var>GW</var>
is the logical router distributed gateway port. For dnat_and_snat
NAT rule, <var>P</var> is the logical port specified in the NAT rule.
If <ref column="logical_port"
table="NAT" db="OVN_Northbound"/> column of
<ref table="NAT" db="OVN_Northbound"/> table is NOT set, then
<var>P</var> is the <code>chassisredirect port</code> of
<var>GW</var> with the actions:
<code>REGBIT_DST_NAT_IP_LOCAL = 1; next; </code>
</p>
</li>

<li>
A priority-0 logical flow with match <code>1</code> has actions
<code>REGBIT_DST_NAT_IP_LOCAL = 0; next;</code>.
</li>
</ul>

<p>
This table also installs a priority-50 logical flow for each logical
router that has NATs configured on it. The flow has match
<code>ip &amp;&amp; ct_label.natted == 1</code> and action
<code>REGBIT_DST_NAT_IP_LOCAL = 1; next;</code>. This is intended
to ensure that traffic that was DNATted locally will use a separate
conntrack zone for SNAT if SNAT is required later in the egress
pipeline. Note that this flow checks the value of
<code>ct_label.natted</code>, which is set in the ingress pipeline.
This means that ovn-northd assumes that this value is carried over
from the ingress pipeline to the egress pipeline and is not altered
or cleared. If conntrack label values are ever changed to be cleared
between the ingress and egress pipelines, then the match conditions
of this flow will be updated accordingly.
</p>

<h3>Egress Table 1: UNDNAT</h3>

<p>
Expand Down Expand Up @@ -4691,7 +4629,7 @@ nd_ns {
gateway chassis that matches
<code>ip &amp;&amp; ip4.src == <var>B</var> &amp;&amp;
outport == <var>GW</var></code>, where <var>GW</var> is the logical
router gateway port with an action <code>ct_dnat_in_czone;</code>.
router gateway port with an action <code>ct_dnat;</code>.
If the backend IPv4 address <var>B</var> is also configured with
L4 port <var>PORT</var> of protocol <var>P</var>, then the
match also includes <code>P.src</code> == <var>PORT</var>. These
Expand All @@ -4713,15 +4651,15 @@ nd_ns {
matches <code>ip &amp;&amp; ip4.src == <var>B</var>
&amp;&amp; outport == <var>GW</var></code>, where <var>GW</var>
is the logical router gateway port, with an action
<code>ct_dnat_in_czone;</code>. If the NAT rule is of type
<code>ct_dnat;</code>. If the NAT rule is of type
dnat_and_snat and has <code>stateless=true</code> in the
options, then the action would be <code>next;</code>.
</p>

<p>
If the NAT rule cannot be handled in a distributed manner, then
the priority-100 flow above is only programmed on the
gateway chassis with the action <code>ct_dnat_in_czone</code>.
gateway chassis with the action <code>ct_dnat</code>.
</p>

<p>
Expand Down Expand Up @@ -4897,24 +4835,11 @@ nd_ns {
and match <code>ip &amp;&amp; ip4.src == <var>A</var> &amp;&amp;
outport == <var>GW</var></code>, where <var>GW</var> is the
logical router gateway port, with an action
<code>ct_snat_in_czone(<var>B</var>);</code> to SNATed in the
<code>ct_snat(<var>B</var>);</code> to SNATed in the
common zone. If the NAT rule is of type dnat_and_snat and has
<code>stateless=true</code> in the options, then the action
would be <code>ip4/6.src=(<var>B</var>)</code>.
</li>

<li>
The second flow is added with the calculated priority
<code><var>P</var> + 1 </code> and match
<code>ip &amp;&amp; ip4.src == <var>A</var> &amp;&amp;
outport == <var>GW</var> &amp;&amp;
REGBIT_DST_NAT_IP_LOCAL == 0</code>, where <var>GW</var> is the
logical router gateway port, with an action
<code>ct_snat(<var>B</var>);</code> to SNAT in the snat zone.
If the NAT rule is of type dnat_and_snat and has
<code>stateless=true</code> in the options, then the action would
be <code>ip4/6.src=(<var>B</var>)</code>.
</li>
</ul>

<p>
Expand Down Expand Up @@ -5010,7 +4935,6 @@ clone {
outport = "";
flags = 0;
flags.loopback = 1;
flags.use_snat_zone = REGBIT_DST_NAT_IP_LOCAL;
reg0 = 0;
reg1 = 0;
...
Expand Down
9 changes: 9 additions & 0 deletions ovn-nb.xml
Original file line number Diff line number Diff line change
Expand Up @@ -314,6 +314,15 @@
</p>
</column>

<column name="options" key="use_common_zone" type='{"type": "boolean"}'>
Default value is <code>false</code>. If set to <code>true</code>
the SNAT and DNAT happens in common zone, instead of happening in
separate zones, depending on the configuration. However, this option
breaks traffic when there is configuration of DGP + LB + SNAT on
this LR. The value <code>true</code> should be used only in case
of HWOL compatibility with GDP.
</column>

<group title="Options for configuring interconnection route advertisement">
<p>
These options control how routes are advertised between OVN
Expand Down

0 comments on commit b8c40e7

Please sign in to comment.