Skip to content

Commit

Permalink
ofctrl: Fix the assert seen when flood removing flows.
Browse files Browse the repository at this point in the history
In one of the scaled deployments, ovn-controller is asserting with the
below stack trace

****************
(gdb) bt
   #0  0x00007fa6aed4970f in raise () from /lib64/libc.so.6
   ovn-org#1  0x00007fa6aed33b25 in abort () from /lib64/libc.so.6
   ovn-org#2  0x000055d46594a714 in ovs_abort_valist (err_no=err_no@entry=0, format=format@entry=0x55d465a2a190 "%s: assertion %s failed in %s()", args=args@entry=0x7ffd24307ba0) at lib/util.c:419
   ovn-org#3  0x000055d465952504 in vlog_abort_valist (module_=<optimized out>, message=0x55d465a2a190 "%s: assertion %s failed in %s()", args=args@entry=0x7ffd24307ba0) at lib/vlog.c:1249
   ovn-org#4  0x000055d4659525aa in vlog_abort (module=module@entry=0x55d465ce6880 <this_module>, message=message@entry=0x55d465a2a190 "%s: assertion %s failed in %s()") at lib/vlog.c:1263
   ovn-org#5  0x000055d46594a42b in ovs_assert_failure (where=where@entry=0x55d465a05529 "controller/ofctrl.c:1198", function=function@entry=0x55d465a05ca0 <__func__.33798> "flood_remove_flows_for_sb_uuid",
    condition=condition@entry=0x55d465a05a80 "ovs_list_is_empty(&f->list_node)") at lib/util.c:86
   ovn-org#6  0x000055d465877aa2 in flood_remove_flows_for_sb_uuid (flow_table=flow_table@entry=0x55d46688d080, sb_uuid=sb_uuid@entry=0x55d53dbec538, flood_remove_nodes=flood_remove_nodes@entry=0x7ffd24307ed0) at controller/ofctrl.c:1205
   ovn-org#7  0x000055d465877c2e in flood_remove_flows_for_sb_uuid (flow_table=flow_table@entry=0x55d46688d080, sb_uuid=sb_uuid@entry=0x55d546553898, flood_remove_nodes=flood_remove_nodes@entry=0x7ffd24307ed0) at controller/ofctrl.c:1230
   ovn-org#8  0x000055d465877c2e in flood_remove_flows_for_sb_uuid (flow_table=flow_table@entry=0x55d46688d080, sb_uuid=sb_uuid@entry=0x55d55eafebf0, flood_remove_nodes=flood_remove_nodes@entry=0x7ffd24307ed0) at controller/ofctrl.c:1230
   ovn-org#9  0x000055d465877dc2 in ofctrl_flood_remove_flows (flow_table=0x55d46688d080, flood_remove_nodes=flood_remove_nodes@entry=0x7ffd24307ed0) at controller/ofctrl.c:1250
   ovn-org#10 0x000055d465872b24 in lflow_handle_changed_ref (ref_type=ref_type@entry=REF_TYPE_PORTGROUP, ref_name=ref_name@entry=0x55d49375a050 "5564_pg_6415729e_58ec_4b8b_bc99_2ceef5c44bac", l_ctx_in=l_ctx_in@entry=0x7ffd24307fd0,
    l_ctx_out=l_ctx_out@entry=0x7ffd24307f90, changed=changed@entry=0x7ffd24307f8f) at controller/lflow.c:612
   ovn-org#11 0x000055d46588f2f8 in _flow_output_resource_ref_handler (node=<optimized out>, data=<optimized out>, ref_type=REF_TYPE_PORTGROUP) at controller/ovn-controller.c:2181
   ovn-org#12 0x000055d4658a8163 in engine_compute (recompute_allowed=<optimized out>, node=<optimized out>) at lib/inc-proc-eng.c:306
   ovn-org#13 engine_run_node (recompute_allowed=true, node=0x7ffd2430d4b0) at lib/inc-proc-eng.c:352
   ovn-org#14 engine_run (recompute_allowed=recompute_allowed@entry=true) at lib/inc-proc-eng.c:377
   ovn-org#15 0x000055d46586613d in main (argc=<optimized out>, argv=<optimized out>) at controller/ovn-controller.c:2794

***************

This assertion is seen when a port group gets updated and it is referenced by many
logical flows (with conj actions).  The function
ofctrl_flood_remove_flows(), calls flood_remove_flows_for_sb_uuid() for
each sb uuid in the hmap - flood_remove_nodes using HMAP_FOR_EACH (flood_remove_nodes).
flood_remove_flows_for_sb_uuid() also takes the hmap
'flood_remove_nodes' as an argument and it inserts few items into it
when it has to call itself recursively.  When an item is inserted, its possible that
the hmap may get expanded.  And if this happens, the HMAP_FOR_EACH ()
skips few entries causing some of the desired flows not getting cleared.

Later when ofctrl_add_or_append_flow() is called, there would be
multiple  'struct sb_flow_ref' references for the same desired flow.
And this causes the above assertion later when the same port group gets
updated.

This patch fixes this issue by cloning the hmap 'flood_remove_nodes' and
using it to iterate the flood remove nodes.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1928012
CC: Han Zhou <hzhou@ovn.org>
CC: Dumitru Ceara <dceara@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
Signed-off-by: 0-day Robot <robot@bytheb.org>
  • Loading branch information
numansiddique authored and ovsrobot committed Feb 16, 2021
1 parent e778855 commit c69e93a
Showing 1 changed file with 18 additions and 0 deletions.
18 changes: 18 additions & 0 deletions controller/ofctrl.c
Original file line number Diff line number Diff line change
Expand Up @@ -1247,10 +1247,28 @@ ofctrl_flood_remove_flows(struct ovn_desired_flow_table *flow_table,
struct hmap *flood_remove_nodes)
{
struct ofctrl_flood_remove_node *ofrn;

/* flood_remove_flows_for_sb_uuid() modifies the hmap by inserting
* few entries when it calls itself recursively. Clone the
* hmap 'flood_remove_nodes' before calling. Inserting an item to
* hmap may expand it. */
struct hmap flood_remove_uuids = HMAP_INITIALIZER(&flood_remove_uuids);
HMAP_FOR_EACH (ofrn, hmap_node, flood_remove_nodes) {
ofctrl_flood_remove_add_node(&flood_remove_uuids, &ofrn->sb_uuid);
}

/* Iterate using the cloned hmap - 'flood_remove_uuids', but pass
* the hmap 'flood_remove_nodes' provided by the caller.
* flood_remove_flows_for_sb_uuid() will delete the other lflows
* referenced by the sb_uuid, which needs to be re-added later
* if those sb_uuids were not deleted. The caller
* (in lflow.c) will add re-add lflows which are not deleted. */
HMAP_FOR_EACH_POP (ofrn, hmap_node, &flood_remove_uuids) {
flood_remove_flows_for_sb_uuid(flow_table, &ofrn->sb_uuid,
flood_remove_nodes);
free(ofrn);
}
hmap_destroy(&flood_remove_uuids);

/* remove any related group and meter info */
HMAP_FOR_EACH (ofrn, hmap_node, flood_remove_nodes) {
Expand Down

0 comments on commit c69e93a

Please sign in to comment.