Skip to content

Commit bd86266

Browse files
david-marchandigsilya
authored andcommitted
ofproto-dpif-upcall: Pause revalidators when purging.
This issue has been observed when running traffic tests with a dpdk enabled userspace datapath (though those tests are added in a separate series). However, the described issue also affects the kernel datapath which is why this patch is sent separately. A main thread executing the 'revalidator/purge' command could race with revalidator threads that can be dumping/sweeping the purged flows at the same time. This race can be reproduced (with dpif debug logs) by running the conntrack - ICMP related unit tests with the userspace datapath: 2023-10-09T14:11:55.242Z|00177|unixctl|DBG|received request revalidator/purge[], id=0 2023-10-09T14:11:55.242Z|00044|dpif(revalidator17)|DBG|netdev@ovs-netdev: flow_dump ufid:68ff6817-fb3b-4b30-8412-9cf175318294 <empty>, packets:0, bytes:0, used:never 2023-10-09T14:11:55.242Z|00178|dpif|DBG|netdev@ovs-netdev: flow_del ufid:07046e91-30a6-4862-9048-1a76b5a88a5b recirc_id(0),dp_hash(0),skb_priority(0),in_port(2),skb_mark(0), ct_state(0),ct_zone(0),ct_mark(0),ct_label(0), packet_type(ns=0,id=0), eth(src=a6:0a:bf:e2:f3:f2,dst=62:23:0f:f6:2c:75), eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=17,tos=0, ttl=64,frag=no),udp(src=37380,dst=10000), packets:0, bytes:0, used:never ... 2023-10-09T14:11:55.242Z|00049|dpif(revalidator17)|WARN|netdev@ovs-netdev: failed to flow_get (No such file or directory) ufid:07046e91-30a6-4862-9048-1a76b5a88a5b <empty>, packets:0, bytes:0, used:never 2023-10-09T14:11:55.242Z|00050|ofproto_dpif_upcall(revalidator17)|WARN| Failed to acquire udpif_key corresponding to unexpected flow (No such file or directory): ufid:07046e91-30a6-4862-9048-1a76b5a88a5b ... 2023-10-09T14:11:55.242Z|00183|unixctl|DBG|replying with success, id=0: "" To avoid this race, a first part of the fix is to pause (if not already paused) the revalidators while the main thread is purging the datapath flows. Then a second issue is observed by running the same unit test with the kernel datapath. Its dpif implementation dumps flows via a netlink request (see dpif_flow_dump_create(), dpif_netlink_flow_dump_create(), nl_dump_start(), nl_sock_send__()) in the leader revalidator thread, before pausing revalidators: 2023-10-09T14:44:28.742Z|00122|unixctl|DBG|received request revalidator/purge[], id=0 ... 2023-10-09T14:44:28.742Z|00125|dpif|DBG|system@ovs-system: flow_del ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 recirc_id(0),dp_hash(0), skb_priority(0),in_port(2),skb_mark(0),ct_state(0),ct_zone(0), ct_mark(0),ct_label(0),eth(src=a6:0a:bf:e2:f3:f2, dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.1.1.1, tip=10.1.1.2,op=1,sha=a6:0a:bf:e2:f3:f2,tha=00:00:00:00:00:00), packets:0, bytes:0, used:never ... 2023-10-09T14:44:28.742Z|00129|unixctl|DBG|replying with success, id=0: "" ... 2023-10-09T14:44:28.742Z|00006|dpif(revalidator21)|DBG|system@ovs-system: flow_dump ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>, packets:0, bytes:0, used:never ... 2023-10-09T14:44:28.742Z|00012|dpif(revalidator21)|WARN|system@ovs-system: failed to flow_get (No such file or directory) ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>, packets:0, bytes:0, used:never 2023-10-09T14:44:28.742Z|00013|ofproto_dpif_upcall(revalidator21)|WARN| Failed to acquire udpif_key corresponding to unexpected flow (No such file or directory): ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 To avoid evaluating already deleted flows, the second part of the fix is to ensure that dumping from the leader revalidator thread is done out of any pause request. As a result of this patch, the unit test "offloads - delete ufid mapping if device not exist - offloads enabled" does not need to waive the random warning logs when purging dp flows. Fixes: 98bb428 ("tests: Add command to purge revalidators of flows.") Acked-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Simon Horman <horms@ovn.org> Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
1 parent d581473 commit bd86266

2 files changed

Lines changed: 15 additions & 4 deletions

File tree

ofproto/ofproto-dpif-upcall.c

Lines changed: 15 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -990,7 +990,7 @@ udpif_revalidator(void *arg)
990990
udpif->reval_exit = latch_is_set(&udpif->exit_latch);
991991

992992
start_time = time_msec();
993-
if (!udpif->reval_exit) {
993+
if (!udpif->reval_exit && !udpif->pause) {
994994
bool terse_dump;
995995

996996
terse_dump = udpif_use_ufid(udpif);
@@ -1000,10 +1000,15 @@ udpif_revalidator(void *arg)
10001000
}
10011001
}
10021002

1003-
/* Wait for the leader to start the flow dump. */
1003+
/* Wait for the leader to reach this point. */
10041004
ovs_barrier_block(&udpif->reval_barrier);
10051005
if (udpif->pause) {
10061006
revalidator_pause(revalidator);
1007+
if (!udpif->reval_exit) {
1008+
/* The main thread resumed all validators, but the leader
1009+
* didn't start the dump, go to next iteration. */
1010+
continue;
1011+
}
10071012
}
10081013

10091014
if (udpif->reval_exit) {
@@ -3217,11 +3222,19 @@ upcall_unixctl_purge(struct unixctl_conn *conn, int argc OVS_UNUSED,
32173222
struct udpif *udpif;
32183223

32193224
LIST_FOR_EACH (udpif, list_node, &all_udpifs) {
3225+
bool wake_up = false;
32203226
int n;
32213227

3228+
if (!latch_is_set(&udpif->pause_latch)) {
3229+
udpif_pause_revalidators(udpif);
3230+
wake_up = true;
3231+
}
32223232
for (n = 0; n < udpif->n_revalidators; n++) {
32233233
revalidator_purge(&udpif->revalidators[n]);
32243234
}
3235+
if (wake_up) {
3236+
udpif_resume_revalidators(udpif);
3237+
}
32253238
}
32263239
unixctl_command_reply(conn, "");
32273240
}

tests/system-offloads-traffic.at

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -799,8 +799,6 @@ AT_CHECK([test $(ovs-appctl dpctl/dump-flows | grep -c "eth_type(0x0800)") -eq 0
799799

800800
OVS_TRAFFIC_VSWITCHD_STOP(["/could not open network device ovs-p0/d
801801
/on nonexistent port/d
802-
/failed to flow_get/d
803-
/Failed to acquire udpif_key/d
804802
/No such device/d
805803
/failed to offload flow/d
806804
"])

0 commit comments

Comments
 (0)