Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
ofctrl: Fix the assert seen when flood removing flows.
In one of the scaled deployments, ovn-controller is asserting with the below stack trace **************** (gdb) bt #0 0x00007fa6aed4970f in raise () from /lib64/libc.so.6 ovn-org#1 0x00007fa6aed33b25 in abort () from /lib64/libc.so.6 ovn-org#2 0x000055d46594a714 in ovs_abort_valist (err_no=err_no@entry=0, format=format@entry=0x55d465a2a190 "%s: assertion %s failed in %s()", args=args@entry=0x7ffd24307ba0) at lib/util.c:419 ovn-org#3 0x000055d465952504 in vlog_abort_valist (module_=<optimized out>, message=0x55d465a2a190 "%s: assertion %s failed in %s()", args=args@entry=0x7ffd24307ba0) at lib/vlog.c:1249 ovn-org#4 0x000055d4659525aa in vlog_abort (module=module@entry=0x55d465ce6880 <this_module>, message=message@entry=0x55d465a2a190 "%s: assertion %s failed in %s()") at lib/vlog.c:1263 ovn-org#5 0x000055d46594a42b in ovs_assert_failure (where=where@entry=0x55d465a05529 "controller/ofctrl.c:1198", function=function@entry=0x55d465a05ca0 <__func__.33798> "flood_remove_flows_for_sb_uuid", condition=condition@entry=0x55d465a05a80 "ovs_list_is_empty(&f->list_node)") at lib/util.c:86 ovn-org#6 0x000055d465877aa2 in flood_remove_flows_for_sb_uuid (flow_table=flow_table@entry=0x55d46688d080, sb_uuid=sb_uuid@entry=0x55d53dbec538, flood_remove_nodes=flood_remove_nodes@entry=0x7ffd24307ed0) at controller/ofctrl.c:1205 ovn-org#7 0x000055d465877c2e in flood_remove_flows_for_sb_uuid (flow_table=flow_table@entry=0x55d46688d080, sb_uuid=sb_uuid@entry=0x55d546553898, flood_remove_nodes=flood_remove_nodes@entry=0x7ffd24307ed0) at controller/ofctrl.c:1230 ovn-org#8 0x000055d465877c2e in flood_remove_flows_for_sb_uuid (flow_table=flow_table@entry=0x55d46688d080, sb_uuid=sb_uuid@entry=0x55d55eafebf0, flood_remove_nodes=flood_remove_nodes@entry=0x7ffd24307ed0) at controller/ofctrl.c:1230 ovn-org#9 0x000055d465877dc2 in ofctrl_flood_remove_flows (flow_table=0x55d46688d080, flood_remove_nodes=flood_remove_nodes@entry=0x7ffd24307ed0) at controller/ofctrl.c:1250 ovn-org#10 0x000055d465872b24 in lflow_handle_changed_ref (ref_type=ref_type@entry=REF_TYPE_PORTGROUP, ref_name=ref_name@entry=0x55d49375a050 "5564_pg_6415729e_58ec_4b8b_bc99_2ceef5c44bac", l_ctx_in=l_ctx_in@entry=0x7ffd24307fd0, l_ctx_out=l_ctx_out@entry=0x7ffd24307f90, changed=changed@entry=0x7ffd24307f8f) at controller/lflow.c:612 ovn-org#11 0x000055d46588f2f8 in _flow_output_resource_ref_handler (node=<optimized out>, data=<optimized out>, ref_type=REF_TYPE_PORTGROUP) at controller/ovn-controller.c:2181 ovn-org#12 0x000055d4658a8163 in engine_compute (recompute_allowed=<optimized out>, node=<optimized out>) at lib/inc-proc-eng.c:306 ovn-org#13 engine_run_node (recompute_allowed=true, node=0x7ffd2430d4b0) at lib/inc-proc-eng.c:352 ovn-org#14 engine_run (recompute_allowed=recompute_allowed@entry=true) at lib/inc-proc-eng.c:377 ovn-org#15 0x000055d46586613d in main (argc=<optimized out>, argv=<optimized out>) at controller/ovn-controller.c:2794 *************** This assertion is seen when a port group gets updated and it is referenced by many logical flows (with conj actions). The function ofctrl_flood_remove_flows(), calls flood_remove_flows_for_sb_uuid() for each sb uuid in the hmap - flood_remove_nodes using HMAP_FOR_EACH (flood_remove_nodes). flood_remove_flows_for_sb_uuid() also takes the hmap 'flood_remove_nodes' as an argument and it inserts few items into it when it has to call itself recursively. When an item is inserted, its possible that the hmap may get expanded. And if this happens, the HMAP_FOR_EACH () skips few entries causing some of the desired flows not getting cleared. Later when ofctrl_add_or_append_flow() is called, there would be multiple 'struct sb_flow_ref' references for the same desired flow. And this causes the above assertion later when the same port group gets updated. This patch fixes this issue by cloning the hmap 'flood_remove_nodes' and using it to iterate the flood remove nodes. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1928012 CC: Han Zhou <hzhou@ovn.org> CC: Dumitru Ceara <dceara@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org> Signed-off-by: 0-day Robot <robot@bytheb.org>
- Loading branch information