northd: move hairpin stages before acl_after_lb
In the current codebase the ct_commit {} action clears the ct_state
metadata of the incoming packet. This behaviour introduces an issue if
we need to check the connection tracking state in subsequent pipeline
stages, e.g. for hairpin traffic:

table=14(ls_in_pre_hairpin  ), priority=100  , match=(ip && ct.trk), action=(reg0[6] = chk_lb_hairpin(); reg0[12] = chk_lb_hairpin_reply(); next;)

Fix the issue by moving the PRE_HAIRPIN, NAT_HAIRPIN, and HAIRPIN
stages before the ACL_AFTER_LB and STATEFUL ones.
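The ordering problem can be sketched with a toy model (plain Python, not OVN code; the stage functions and packet fields here are illustrative only):

```python
# Toy model of why stage ordering matters: ct_commit clears the
# packet's conntrack state metadata, so any stage that matches on
# ct.trk must run before the STATEFUL stage commits the connection.

def stateful(pkt):
    # Models the ct_commit {} side effect: ct_state metadata is cleared.
    pkt["ct_trk"] = False
    return pkt

def pre_hairpin(pkt):
    # Models table ls_in_pre_hairpin, match=(ip && ct.trk).
    if pkt["ct_trk"]:
        pkt["hairpin_checked"] = True
    return pkt

def run(pipeline, pkt):
    for stage in pipeline:
        pkt = stage(pkt)
    return pkt

pkt = {"ct_trk": True, "hairpin_checked": False}
old = run([stateful, pre_hairpin], dict(pkt))  # old order: check skipped
new = run([pre_hairpin, stateful], dict(pkt))  # new order: check fires
print(old["hairpin_checked"], new["hairpin_checked"])  # False True
```

With the old ordering the hairpin check never sees ct.trk on committed packets, which is exactly the symptom in the linked bug.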

Suggested-by: Han Zhou <hzhou@ovn.org>
Suggested-by: Dumitru Ceara <dceara@redhat.com>
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2103086
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
(cherry picked from commit af4dfa7)
LorenzoBianconi authored and dceara committed Jan 18, 2023
1 parent 9481d05 commit 254724d
Showing 4 changed files with 112 additions and 92 deletions.
10 changes: 5 additions & 5 deletions northd/northd.c
@@ -113,11 +113,11 @@ enum ovn_stage {
PIPELINE_STAGE(SWITCH, IN, QOS_MARK, 10, "ls_in_qos_mark") \
PIPELINE_STAGE(SWITCH, IN, QOS_METER, 11, "ls_in_qos_meter") \
PIPELINE_STAGE(SWITCH, IN, LB, 12, "ls_in_lb") \
PIPELINE_STAGE(SWITCH, IN, ACL_AFTER_LB, 13, "ls_in_acl_after_lb") \
PIPELINE_STAGE(SWITCH, IN, STATEFUL, 14, "ls_in_stateful") \
PIPELINE_STAGE(SWITCH, IN, PRE_HAIRPIN, 15, "ls_in_pre_hairpin") \
PIPELINE_STAGE(SWITCH, IN, NAT_HAIRPIN, 16, "ls_in_nat_hairpin") \
PIPELINE_STAGE(SWITCH, IN, HAIRPIN, 17, "ls_in_hairpin") \
PIPELINE_STAGE(SWITCH, IN, PRE_HAIRPIN, 13, "ls_in_pre_hairpin") \
PIPELINE_STAGE(SWITCH, IN, NAT_HAIRPIN, 14, "ls_in_nat_hairpin") \
PIPELINE_STAGE(SWITCH, IN, HAIRPIN, 15, "ls_in_hairpin") \
PIPELINE_STAGE(SWITCH, IN, ACL_AFTER_LB, 16, "ls_in_acl_after_lb") \
PIPELINE_STAGE(SWITCH, IN, STATEFUL, 17, "ls_in_stateful") \
PIPELINE_STAGE(SWITCH, IN, ARP_ND_RSP, 18, "ls_in_arp_rsp") \
PIPELINE_STAGE(SWITCH, IN, DHCP_OPTIONS, 19, "ls_in_dhcp_options") \
PIPELINE_STAGE(SWITCH, IN, DHCP_RESPONSE, 20, "ls_in_dhcp_response") \
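As a sanity check on the renumbering, the reordered PIPELINE_STAGE entries above can be modeled as a simple index mapping (a sketch, not the actual C macro expansion):

```python
# Toy mapping of the reordered switch-ingress stages to table numbers;
# mirrors the PIPELINE_STAGE entries shown in the hunk above.
stages = [
    "ls_in_qos_mark",      # table 10
    "ls_in_qos_meter",     # table 11
    "ls_in_lb",            # table 12
    "ls_in_pre_hairpin",   # table 13 (was 15)
    "ls_in_nat_hairpin",   # table 14 (was 16)
    "ls_in_hairpin",       # table 15 (was 17)
    "ls_in_acl_after_lb",  # table 16 (was 13)
    "ls_in_stateful",      # table 17 (was 14)
    "ls_in_arp_rsp",       # table 18
]
table = {name: 10 + i for i, name in enumerate(stages)}
print(table["ls_in_pre_hairpin"], table["ls_in_stateful"])  # 13 17
```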
136 changes: 68 additions & 68 deletions northd/ovn-northd.8.xml
@@ -944,7 +944,73 @@
</li>
</ul>

<h3>Ingress table 13: <code>from-lport</code> ACLs after LB</h3>
<h3>Ingress Table 13: Pre-Hairpin</h3>
<ul>
<li>
If the logical switch has load balancer(s) configured, then a
priority-100 flow is added with the match
<code>ip &amp;&amp; ct.trk</code> to check if the
packet needs to be hairpinned (if after load balancing the destination
IP matches the source IP) or not by executing the actions
<code>reg0[6] = chk_lb_hairpin();</code> and
<code>reg0[12] = chk_lb_hairpin_reply();</code> and advances the packet
to the next table.
</li>

<li>
A priority-0 flow that simply moves traffic to the next table.
</li>
</ul>
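The check this flow performs can be sketched as follows (illustrative Python, not the actual chk_lb_hairpin implementation):

```python
# Sketch of the pre-hairpin test: after load balancing, a connection is
# hairpinned when the post-DNAT destination IP equals the source IP,
# i.e. a backend is reaching itself through the VIP.
def chk_lb_hairpin(src_ip, dst_ip_after_lb):
    return src_ip == dst_ip_after_lb

print(chk_lb_hairpin("42.42.42.1", "42.42.42.1"))  # True
print(chk_lb_hairpin("42.42.42.1", "42.42.42.2"))  # False
```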

<h3>Ingress Table 14: Nat-Hairpin</h3>
<ul>
<li>
If the logical switch has load balancer(s) configured, then a
priority-100 flow is added with the match
<code>ip &amp;&amp; ct.new &amp;&amp; ct.trk &amp;&amp;
reg0[6] == 1</code> which hairpins the traffic by
NATting source IP to the load balancer VIP by executing the action
<code>ct_snat_to_vip</code> and advances the packet to the next table.
</li>

<li>
If the logical switch has load balancer(s) configured, then a
priority-100 flow is added with the match
<code>ip &amp;&amp; ct.est &amp;&amp; ct.trk &amp;&amp;
reg0[6] == 1</code> which hairpins the traffic by
NATting source IP to the load balancer VIP by executing the action
<code>ct_snat</code> and advances the packet to the next table.
</li>

<li>
If the logical switch has load balancer(s) configured, then a
priority-90 flow is added with the match
<code>ip &amp;&amp; reg0[12] == 1</code> which matches on the replies
of hairpinned traffic (i.e., destination IP is VIP,
source IP is the backend IP and source L4 port is backend port for L4
load balancers) and executes <code>ct_snat</code> and advances the
packet to the next table.
</li>

<li>
A priority-0 flow that simply moves traffic to the next table.
</li>
</ul>
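The three flows above amount to a small decision table, sketched here in Python (the action names mirror the OVN actions; the function itself is hypothetical):

```python
# Sketch of the Nat-Hairpin decision: new hairpinned connections are
# SNATted to the VIP (ct_snat_to_vip), established ones reuse the
# existing conntrack entry (ct_snat), everything else falls through.
def nat_hairpin_action(ct_new, ct_est, hairpin_flag):
    if hairpin_flag and ct_new:
        return "ct_snat_to_vip"  # priority-100, ct.new && reg0[6] == 1
    if hairpin_flag and ct_est:
        return "ct_snat"         # priority-100, ct.est && reg0[6] == 1
    return "next"                # priority-0 default

print(nat_hairpin_action(True, False, True))   # ct_snat_to_vip
print(nat_hairpin_action(False, True, True))   # ct_snat
print(nat_hairpin_action(True, False, False))  # next
```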

<h3>Ingress Table 15: Hairpin</h3>
<ul>
<li>
A priority-1 flow that hairpins traffic matched by non-default
flows in the Pre-Hairpin table. Hairpinning is done at L2, Ethernet
addresses are swapped and the packets are looped back on the input
port.
</li>
<li>
A priority-0 flow that simply moves traffic to the next table.
</li>
</ul>
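The L2 hairpin action described above can be sketched like this (a toy packet model, not OVN's implementation):

```python
# Sketch of the hairpin action: swap the Ethernet addresses and send
# the packet back out the port it arrived on.
def hairpin(pkt):
    pkt = dict(pkt)  # work on a copy
    pkt["eth_src"], pkt["eth_dst"] = pkt["eth_dst"], pkt["eth_src"]
    pkt["out_port"] = pkt["in_port"]
    return pkt

p = hairpin({"eth_src": "00:00:00:00:00:01",
             "eth_dst": "00:00:00:00:00:02",
             "in_port": "lsp"})
print(p["eth_src"], p["out_port"])  # 00:00:00:00:00:02 lsp
```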

<h3>Ingress table 16: <code>from-lport</code> ACLs after LB</h3>

<p>
Logical flows in this table closely reproduce those in the
@@ -1006,7 +1072,7 @@
</li>
</ul>

<h3>Ingress Table 14: Stateful</h3>
<h3>Ingress Table 17: Stateful</h3>

<ul>
<li>
@@ -1029,72 +1095,6 @@
</li>
</ul>

<h3>Ingress Table 15: Pre-Hairpin</h3>
<ul>
<li>
If the logical switch has load balancer(s) configured, then a
priority-100 flow is added with the match
<code>ip &amp;&amp; ct.trk</code> to check if the
packet needs to be hairpinned (if after load balancing the destination
IP matches the source IP) or not by executing the actions
<code>reg0[6] = chk_lb_hairpin();</code> and
<code>reg0[12] = chk_lb_hairpin_reply();</code> and advances the packet
to the next table.
</li>

<li>
A priority-0 flow that simply moves traffic to the next table.
</li>
</ul>

<h3>Ingress Table 16: Nat-Hairpin</h3>
<ul>
<li>
If the logical switch has load balancer(s) configured, then a
priority-100 flow is added with the match
<code>ip &amp;&amp; ct.new &amp;&amp; ct.trk &amp;&amp;
reg0[6] == 1</code> which hairpins the traffic by
NATting source IP to the load balancer VIP by executing the action
<code>ct_snat_to_vip</code> and advances the packet to the next table.
</li>

<li>
If the logical switch has load balancer(s) configured, then a
priority-100 flow is added with the match
<code>ip &amp;&amp; ct.est &amp;&amp; ct.trk &amp;&amp;
reg0[6] == 1</code> which hairpins the traffic by
NATting source IP to the load balancer VIP by executing the action
<code>ct_snat</code> and advances the packet to the next table.
</li>

<li>
If the logical switch has load balancer(s) configured, then a
priority-90 flow is added with the match
<code>ip &amp;&amp; reg0[12] == 1</code> which matches on the replies
of hairpinned traffic (i.e., destination IP is VIP,
source IP is the backend IP and source L4 port is backend port for L4
load balancers) and executes <code>ct_snat</code> and advances the
packet to the next table.
</li>

<li>
A priority-0 flow that simply moves traffic to the next table.
</li>
</ul>

<h3>Ingress Table 17: Hairpin</h3>
<ul>
<li>
A priority-1 flow that hairpins traffic matched by non-default
flows in the Pre-Hairpin table. Hairpinning is done at L2, Ethernet
addresses are swapped and the packets are looped back on the input
port.
</li>
<li>
A priority-0 flow that simply moves traffic to the next table.
</li>
</ul>

<h3>Ingress Table 18: ARP/ND responder</h3>

<p>
34 changes: 17 additions & 17 deletions tests/ovn-northd.at
@@ -2270,7 +2270,7 @@ check ovn-nbctl --wait=sb \
-- ls-lb-add ls lb

AT_CHECK([ovn-sbctl lflow-list ls | grep -e ls_in_acl_hint -e ls_out_acl_hint -e ls_in_acl -e ls_out_acl | sort], [0], [dnl
table=13(ls_in_acl_after_lb ), priority=0 , match=(1), action=(next;)
table=16(ls_in_acl_after_lb ), priority=0 , match=(1), action=(next;)
table=3 (ls_out_acl_hint ), priority=0 , match=(1), action=(next;)
table=3 (ls_out_acl_hint ), priority=1 , match=(ct.est && ct_mark.blocked == 0), action=(reg0[[10]] = 1; next;)
table=3 (ls_out_acl_hint ), priority=2 , match=(ct.est && ct_mark.blocked == 1), action=(reg0[[9]] = 1; next;)
@@ -2311,7 +2311,7 @@ ovn-nbctl --wait=sb clear logical_switch ls acls
ovn-nbctl --wait=sb clear logical_switch ls load_balancer

AT_CHECK([ovn-sbctl lflow-list ls | grep -e ls_in_acl_hint -e ls_out_acl_hint -e ls_in_acl -e ls_out_acl | sort], [0], [dnl
table=13(ls_in_acl_after_lb ), priority=0 , match=(1), action=(next;)
table=16(ls_in_acl_after_lb ), priority=0 , match=(1), action=(next;)
table=3 (ls_out_acl_hint ), priority=65535, match=(1), action=(next;)
table=4 (ls_out_acl ), priority=65535, match=(1), action=(next;)
table=8 (ls_in_acl_hint ), priority=65535, match=(1), action=(next;)
@@ -3952,9 +3952,9 @@ check_stateful_flows() {
])

AT_CHECK([grep "ls_in_stateful" sw0flows | sort], [0], [dnl
table=14(ls_in_stateful ), priority=0 , match=(1), action=(next;)
table=14(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 0), action=(ct_commit { ct_mark.blocked = 0; }; next;)
table=14(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_label.label = reg3; }; next;)
table=17(ls_in_stateful ), priority=0 , match=(1), action=(next;)
table=17(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 0), action=(ct_commit { ct_mark.blocked = 0; }; next;)
table=17(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_label.label = reg3; }; next;)
])

AT_CHECK([grep "ls_out_pre_lb" sw0flows | sort], [0], [dnl
@@ -4016,9 +4016,9 @@ AT_CHECK([grep "ls_in_lb" sw0flows | sort], [0], [dnl
])

AT_CHECK([grep "ls_in_stateful" sw0flows | sort], [0], [dnl
table=14(ls_in_stateful ), priority=0 , match=(1), action=(next;)
table=14(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 0), action=(ct_commit { ct_mark.blocked = 0; }; next;)
table=14(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_label.label = reg3; }; next;)
table=17(ls_in_stateful ), priority=0 , match=(1), action=(next;)
table=17(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 0), action=(ct_commit { ct_mark.blocked = 0; }; next;)
table=17(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_label.label = reg3; }; next;)
])

AT_CHECK([grep "ls_out_pre_lb" sw0flows | sort], [0], [dnl
@@ -4062,9 +4062,9 @@ AT_CHECK([grep -w "ls_in_acl" sw0flows | grep 2002 | sort], [0], [dnl
table=9 (ls_in_acl ), priority=2002 , match=(reg0[[8]] == 1 && (tcp)), action=(reg0[[1]] = 1; reg0[[13]] = 1; reg3 = 1234; next;)
])
AT_CHECK([grep "ls_in_stateful" sw0flows | sort], [0], [dnl
table=14(ls_in_stateful ), priority=0 , match=(1), action=(next;)
table=14(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 0), action=(ct_commit { ct_mark.blocked = 0; }; next;)
table=14(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_label.label = reg3; }; next;)
table=17(ls_in_stateful ), priority=0 , match=(1), action=(next;)
table=17(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 0), action=(ct_commit { ct_mark.blocked = 0; }; next;)
table=17(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_label.label = reg3; }; next;)
])

AT_CHECK([grep -w "ls_out_acl" sw0flows | grep 2002 | sort], [0], [dnl
@@ -4091,9 +4091,9 @@ AT_CHECK([grep -w "ls_in_acl" sw0flows | grep 2002 | sort], [0], [dnl
table=9 (ls_in_acl ), priority=2002 , match=(reg0[[8]] == 1 && (udp)), action=(next;)
])
AT_CHECK([grep "ls_in_stateful" sw0flows | sort], [0], [dnl
table=14(ls_in_stateful ), priority=0 , match=(1), action=(next;)
table=14(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 0), action=(ct_commit { ct_mark.blocked = 0; }; next;)
table=14(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_label.label = reg3; }; next;)
table=17(ls_in_stateful ), priority=0 , match=(1), action=(next;)
table=17(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 0), action=(ct_commit { ct_mark.blocked = 0; }; next;)
table=17(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_label.label = reg3; }; next;)
])

AT_CHECK([grep -w "ls_out_acl" sw0flows | grep 2002 | sort], [0], [dnl
@@ -4120,9 +4120,9 @@ AT_CHECK([grep -w "ls_in_acl" sw0flows | grep 2002 | sort], [0], [dnl
table=9 (ls_in_acl ), priority=2002 , match=(reg0[[8]] == 1 && (udp)), action=(next;)
])
AT_CHECK([grep "ls_in_stateful" sw0flows | sort], [0], [dnl
table=14(ls_in_stateful ), priority=0 , match=(1), action=(next;)
table=14(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 0), action=(ct_commit { ct_mark.blocked = 0; }; next;)
table=14(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_label.label = reg3; }; next;)
table=17(ls_in_stateful ), priority=0 , match=(1), action=(next;)
table=17(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 0), action=(ct_commit { ct_mark.blocked = 0; }; next;)
table=17(ls_in_stateful ), priority=100 , match=(reg0[[1]] == 1 && reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_label.label = reg3; }; next;)
])

AT_CHECK([grep -w "ls_out_acl" sw0flows | grep 2002 | sort], [0], [dnl
24 changes: 22 additions & 2 deletions tests/system-ovn.at
@@ -4661,7 +4661,7 @@ ADD_VETH(lsp, lsp, br-int, "42.42.42.1/24", "00:00:00:00:00:01", \
ovn-nbctl --wait=hv -t 3 sync

# Start IPv4 TCP server on lsp.
NS_CHECK_EXEC([lsp], [timeout 2s nc -k -l 42.42.42.1 4041 &], [0])
NETNS_DAEMONIZE([lsp], [nc -l -k 42.42.42.1 4041], [lsp0.pid])

# Check that IPv4 TCP hairpin connection succeeds on both VIPs.
NS_CHECK_EXEC([lsp], [nc 88.88.88.88 8080 -z], [0], [ignore], [ignore])
@@ -4685,6 +4685,16 @@ OVS_WAIT_UNTIL([
test "${total_pkts}" = "3"
])

ovn-nbctl pg-add pg0 lsp
ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1004 "ip4 && ip4.dst == 10.0.0.2" drop
ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1002 "ip4 && tcp" allow-related
ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1002 "ip4 && udp" allow
ovn-nbctl --wait=hv sync

## Check that IPv4 TCP hairpin connection succeeds on both VIPs.
NS_CHECK_EXEC([lsp], [nc 88.88.88.88 8080 -z], [0], [ignore], [ignore])
NS_CHECK_EXEC([lsp], [nc 88.88.88.89 8080 -z], [0], [ignore], [ignore])

OVS_APP_EXIT_AND_WAIT([ovn-controller])

as ovn-sb
@@ -4751,7 +4761,7 @@ OVS_WAIT_UNTIL([test "$(ip netns exec lsp ip a | grep 4200::1 | grep tentative)"
ovn-nbctl --wait=hv -t 3 sync

# Start IPv6 TCP server on lsp.
NS_CHECK_EXEC([lsp], [timeout 2s nc -k -l 4200::1 4041 &], [0])
NETNS_DAEMONIZE([lsp], [nc -l -k 4200::1 4041], [lsp0.pid])

# Check that IPv6 TCP hairpin connection succeeds on both VIPs.
NS_CHECK_EXEC([lsp], [nc 8800::0088 8080 -z], [0], [ignore], [ignore])
@@ -4775,6 +4785,16 @@ OVS_WAIT_UNTIL([
test "${total_pkts}" = "3"
])

ovn-nbctl pg-add pg0 lsp
ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1002 "ip6 && tcp" allow-related
ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1002 "ip6 && udp" allow
ovn-nbctl --apply-after-lb acl-add pg0 from-lport 1000 "ip6" drop
ovn-nbctl --wait=hv sync

# Check that IPv6 TCP hairpin connection succeeds on both VIPs.
NS_CHECK_EXEC([lsp], [nc 8800::0088 8080 -z], [0], [ignore], [ignore])
NS_CHECK_EXEC([lsp], [nc 8800::0089 8080 -z], [0], [ignore], [ignore])

OVS_APP_EXIT_AND_WAIT([ovn-controller])

as ovn-sb
