Skip to content

Commit

Permalink
northd: Use separate SNAT for already-DNATted traffic.
Browse files Browse the repository at this point in the history
Commit 4deac45 introduced functionality
in ovn-northd to use a single zone for SNAT and DNAT when possible. It
accounts for certain situations, such as hairpinned traffic, where we
still need a separate SNAT and DNAT zone.

However, one situation it did not account for was when traffic traverses
a logical router and is DNATted as a result of a load balancer, then
when the traffic egresses the router, it needs to be SNATted. In this
situation, we try to use the same CT zone for the SNAT as for the load
balancer DNAT, which does not work.

This commit fixes the issue by setting the DNAT_LOCAL bit in the initial
stage of the egress pipeline if the packet was dnatted during the
ingress pipeline. This ensures that when the SNAT stage is reached, a
separate CT zone is used for SNAT.

Signed-off-by: Mark Michelson <mmichels@redhat.com>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
  • Loading branch information
putnopvut committed Oct 3, 2022
1 parent 33e3be7 commit aa8c6d8
Show file tree
Hide file tree
Showing 4 changed files with 163 additions and 3 deletions.
26 changes: 24 additions & 2 deletions northd/northd.c
Original file line number Diff line number Diff line change
Expand Up @@ -13429,7 +13429,8 @@ static void
build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows,
const struct hmap *ports, struct ds *match,
struct ds *actions,
const struct shash *meter_groups)
const struct shash *meter_groups,
bool ct_lb_mark)
{
if (!od->nbr) {
return;
Expand Down Expand Up @@ -13633,6 +13634,26 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od, struct hmap *lflows,
}
}

if (od->nbr->n_nat) {
ds_clear(match);
const char *ct_natted = ct_lb_mark ?
"ct_mark.natted" :
"ct_label.natted";
ds_put_format(match, "ip && %s == 1", ct_natted);
/* This flow is unique since it is in the egress pipeline but checks
* the value of ct_label.natted, which would have been set in the
* ingress pipeline. If a change is ever introduced that clears or
* otherwise invalidates the ct_label between the ingress and egress
* pipelines, then an alternative will need to be devised.
*/
ds_clear(actions);
ds_put_cstr(actions, REGBIT_DST_NAT_IP_LOCAL" = 1; next;");
ovn_lflow_add_with_hint(lflows, od, S_ROUTER_OUT_CHECK_DNAT_LOCAL,
50, ds_cstr(match), ds_cstr(actions),
&od->nbr->header_);

}

/* Handle force SNAT options set in the gateway router. */
if (od->is_gw_router) {
if (dnat_force_snat_ip) {
Expand Down Expand Up @@ -13732,7 +13753,8 @@ build_lswitch_and_lrouter_iterate_by_od(struct ovn_datapath *od,
build_misc_local_traffic_drop_flows_for_lrouter(od, lsi->lflows);
build_lrouter_arp_nd_for_datapath(od, lsi->lflows, lsi->meter_groups);
build_lrouter_nat_defrag_and_lb(od, lsi->lflows, lsi->ports, &lsi->match,
&lsi->actions, lsi->meter_groups);
&lsi->actions, lsi->meter_groups,
lsi->features->ct_no_masked_label);
}

/* Helper function to combine all lflow generation which is iterated by port.
Expand Down
16 changes: 16 additions & 0 deletions northd/ovn-northd.8.xml
Original file line number Diff line number Diff line change
Expand Up @@ -4314,6 +4314,22 @@ nd_ns {
</li>
</ul>

<p>
This table also installs a priority-50 logical flow for each logical
router that has NATs configured on it. The flow has match
<code>ip &amp;&amp; ct_label.natted == 1</code> and action
<code>REGBIT_DST_NAT_IP_LOCAL = 1; next;</code>. This is intended
to ensure that traffic that was DNATted locally will use a separate
conntrack zone for SNAT if SNAT is required later in the egress
pipeline. Note that this flow checks the value of
<code>ct_label.natted</code>, which is set in the ingress pipeline.
This means that ovn-northd assumes that this value is carried over
from the ingress pipeline to the egress pipeline and is not altered
or cleared. If conntrack label values are ever changed to be cleared
between the ingress and egress pipelines, then the match conditions
of this flow will be updated accordingly.
</p>

<h3>Egress Table 1: UNDNAT</h3>

<p>
Expand Down
7 changes: 6 additions & 1 deletion tests/ovn-northd.at
Original file line number Diff line number Diff line change
Expand Up @@ -5000,6 +5000,7 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl

AT_CHECK([grep "lr_out_chk_dnat_local" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
table=? (lr_out_chk_dnat_local), priority=0 , match=(1), action=(reg9[[4]] = 0; next;)
table=? (lr_out_chk_dnat_local), priority=50 , match=(ip && ct_mark.natted == 1), action=(reg9[[4]] = 1; next;)
table=? (lr_out_chk_dnat_local), priority=50 , match=(ip && ip4.dst == 172.168.0.10 && is_chassis_resident("cr-lr0-public")), action=(reg9[[4]] = 1; next;)
table=? (lr_out_chk_dnat_local), priority=50 , match=(ip && ip4.dst == 172.168.0.20 && is_chassis_resident("cr-lr0-public")), action=(reg9[[4]] = 1; next;)
table=? (lr_out_chk_dnat_local), priority=50 , match=(ip && ip4.dst == 172.168.0.30 && is_chassis_resident("cr-lr0-public")), action=(reg9[[4]] = 1; next;)
Expand Down Expand Up @@ -5074,6 +5075,7 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl

AT_CHECK([grep "lr_out_chk_dnat_local" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
table=? (lr_out_chk_dnat_local), priority=0 , match=(1), action=(reg9[[4]] = 0; next;)
table=? (lr_out_chk_dnat_local), priority=50 , match=(ip && ct_mark.natted == 1), action=(reg9[[4]] = 1; next;)
table=? (lr_out_chk_dnat_local), priority=50 , match=(ip && ip4.dst == 172.168.0.10 && is_chassis_resident("cr-lr0-public")), action=(reg9[[4]] = 1; next;)
table=? (lr_out_chk_dnat_local), priority=50 , match=(ip && ip4.dst == 172.168.0.20 && is_chassis_resident("cr-lr0-public")), action=(reg9[[4]] = 1; next;)
table=? (lr_out_chk_dnat_local), priority=50 , match=(ip && ip4.dst == 172.168.0.30 && is_chassis_resident("cr-lr0-public")), action=(reg9[[4]] = 1; next;)
Expand Down Expand Up @@ -5142,6 +5144,7 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl

AT_CHECK([grep "lr_out_chk_dnat_local" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
table=? (lr_out_chk_dnat_local), priority=0 , match=(1), action=(reg9[[4]] = 0; next;)
table=? (lr_out_chk_dnat_local), priority=50 , match=(ip && ct_mark.natted == 1), action=(reg9[[4]] = 1; next;)
])

AT_CHECK([grep "lr_out_undnat" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
Expand Down Expand Up @@ -5202,6 +5205,7 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl

AT_CHECK([grep "lr_out_chk_dnat_local" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
table=? (lr_out_chk_dnat_local), priority=0 , match=(1), action=(reg9[[4]] = 0; next;)
table=? (lr_out_chk_dnat_local), priority=50 , match=(ip && ct_mark.natted == 1), action=(reg9[[4]] = 1; next;)
])

AT_CHECK([grep "lr_out_undnat" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
Expand Down Expand Up @@ -5267,6 +5271,7 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl

AT_CHECK([grep "lr_out_chk_dnat_local" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
table=? (lr_out_chk_dnat_local), priority=0 , match=(1), action=(reg9[[4]] = 0; next;)
table=? (lr_out_chk_dnat_local), priority=50 , match=(ip && ct_mark.natted == 1), action=(reg9[[4]] = 1; next;)
])

AT_CHECK([grep "lr_out_undnat" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
Expand Down Expand Up @@ -5345,6 +5350,7 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | sort], [0], [dnl

AT_CHECK([grep "lr_out_chk_dnat_local" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
table=? (lr_out_chk_dnat_local), priority=0 , match=(1), action=(reg9[[4]] = 0; next;)
table=? (lr_out_chk_dnat_local), priority=50 , match=(ip && ct_mark.natted == 1), action=(reg9[[4]] = 1; next;)
])

AT_CHECK([grep "lr_out_undnat" lr0flows | sed 's/table=./table=?/' | sort], [0], [dnl
Expand Down Expand Up @@ -6029,7 +6035,6 @@ AT_CHECK([grep -e "(lr_in_ip_routing ).*outport" lr0flows | sed 's/table=../ta
])

AT_CLEANUP
])

OVN_FOR_EACH_NORTHD([
AT_SETUP([check exclude-lb-vips-from-garp option])
Expand Down
117 changes: 117 additions & 0 deletions tests/system-ovn.at
Original file line number Diff line number Diff line change
Expand Up @@ -8494,3 +8494,120 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d

AT_CLEANUP
])

OVN_FOR_EACH_NORTHD([
AT_SETUP([SNAT in separate zone from DNAT])

CHECK_CONNTRACK()
CHECK_CONNTRACK_NAT()
ovn_start
OVS_TRAFFIC_VSWITCHD_START()
ADD_BR([br-int])

# Set external-ids in br-int needed for ovn-controller
ovs-vsctl \
-- set Open_vSwitch . external-ids:system-id=hv1 \
-- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
-- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
-- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
-- set bridge br-int fail-mode=secure other-config:disable-in-band=true

# The goal of this test is to ensure that when traffic is first DNATted
# (by way of a load balancer), and then SNATted, the SNAT happens in a
# separate conntrack zone from the DNAT.

start_daemon ovn-controller

check ovn-nbctl ls-add public

check ovn-nbctl lr-add r1
check ovn-nbctl lrp-add r1 r1_public 00:de:ad:ff:00:01 172.16.0.1/16
check ovn-nbctl lrp-add r1 r1_s1 00:de:ad:fe:00:01 173.0.1.1/24
check ovn-nbctl lrp-set-gateway-chassis r1_public hv1

check ovn-nbctl lb-add r1_lb 30.0.0.1 172.16.0.102
check ovn-nbctl lr-lb-add r1 r1_lb

check ovn-nbctl ls-add s1
check ovn-nbctl lsp-add s1 s1_r1
check ovn-nbctl lsp-set-type s1_r1 router
check ovn-nbctl lsp-set-addresses s1_r1 router
check ovn-nbctl lsp-set-options s1_r1 router-port=r1_s1

check ovn-nbctl lsp-add s1 vm1
check ovn-nbctl lsp-set-addresses vm1 "00:de:ad:01:00:01 173.0.1.2"

check ovn-nbctl lsp-add public public_r1
check ovn-nbctl lsp-set-type public_r1 router
check ovn-nbctl lsp-set-addresses public_r1 router
check ovn-nbctl lsp-set-options public_r1 router-port=r1_public nat-addresses=router

check ovn-nbctl lr-add r2
check ovn-nbctl lrp-add r2 r2_public 00:de:ad:ff:00:02 172.16.0.2/16
check ovn-nbctl lrp-add r2 r2_s2 00:de:ad:fe:00:02 173.0.2.1/24
check ovn-nbctl lr-nat-add r2 dnat_and_snat 172.16.0.102 173.0.2.2
check ovn-nbctl lrp-set-gateway-chassis r2_public hv1

check ovn-nbctl ls-add s2
check ovn-nbctl lsp-add s2 s2_r2
check ovn-nbctl lsp-set-type s2_r2 router
check ovn-nbctl lsp-set-addresses s2_r2 router
check ovn-nbctl lsp-set-options s2_r2 router-port=r2_s2

check ovn-nbctl lsp-add s2 vm2
check ovn-nbctl lsp-set-addresses vm2 "00:de:ad:01:00:02 173.0.2.2"

check ovn-nbctl lsp-add public public_r2
check ovn-nbctl lsp-set-type public_r2 router
check ovn-nbctl lsp-set-addresses public_r2 router
check ovn-nbctl lsp-set-options public_r2 router-port=r2_public nat-addresses=router

ADD_NAMESPACES(vm1)
ADD_VETH(vm1, vm1, br-int, "173.0.1.2/24", "00:de:ad:01:00:01", \
"173.0.1.1")
ADD_NAMESPACES(vm2)
ADD_VETH(vm2, vm2, br-int, "173.0.2.2/24", "00:de:ad:01:00:02", \
"173.0.2.1")

check ovn-nbctl lr-nat-add r1 dnat_and_snat 172.16.0.101 173.0.1.2 vm1 00:00:00:01:02:03
check ovn-nbctl --wait=hv sync

# Next, make sure that a ping works as expected
NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 30.0.0.1 | FORMAT_PING], \
[0], [dnl
3 packets transmitted, 3 received, 0% packet loss, time 0ms
])

# Finally, make sure that conntrack shows two separate zones being used for
# DNAT and SNAT
AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | \
sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
icmp,orig=(src=173.0.1.2,dst=30.0.0.1,id=<cleared>,type=8,code=0),reply=(src=172.16.0.102,dst=173.0.1.2,id=<cleared>,type=0,code=0),zone=<cleared>,mark=2
])

# The final two entries appear identical here. That is because FORMAT_CT
# scrubs the zone numbers. In actuality, the zone numbers are different,
# which is why there are two entries.
AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(172.16.0.102) | \
sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
icmp,orig=(src=172.16.0.101,dst=172.16.0.102,id=<cleared>,type=8,code=0),reply=(src=173.0.2.2,dst=172.16.0.101,id=<cleared>,type=0,code=0),zone=<cleared>
icmp,orig=(src=173.0.1.2,dst=172.16.0.102,id=<cleared>,type=8,code=0),reply=(src=172.16.0.102,dst=172.16.0.101,id=<cleared>,type=0,code=0),zone=<cleared>
icmp,orig=(src=173.0.1.2,dst=172.16.0.102,id=<cleared>,type=8,code=0),reply=(src=172.16.0.102,dst=172.16.0.101,id=<cleared>,type=0,code=0),zone=<cleared>
])

OVS_APP_EXIT_AND_WAIT([ovn-controller])

as ovn-sb
OVS_APP_EXIT_AND_WAIT([ovsdb-server])

as ovn-nb
OVS_APP_EXIT_AND_WAIT([ovsdb-server])

as northd
OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])

as
OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
/connection dropped.*/d"])
AT_CLEANUP
])

0 comments on commit aa8c6d8

Please sign in to comment.