Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
ofctrl: Fix the assert seen when flood removing flows with conj actions.
In one of the scaled deployments, ovn-controller is asserting with the below stack trace *** (gdb) bt 0 raise () from /lib64/libc.so.6 1 abort () from /lib64/libc.so.6 2 ovs_abort_valist ("%s: assertion %s failed in %s()") at lib/util.c:419 3 vlog_abort_valist ("%s: assertion %s failed in %s()") at lib/vlog.c:1249 4 vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263 5 ovs_assert_failure (where="controller/ofctrl.c:1216", function="flood_remove_flows_for_sb_uuid", condition="ovs_list_is_empty(&f->list_node)") at lib/util.c:86 6 flood_remove_flows_for_sb_uuid (sb_uuid=...134, flood_remove_nodes=...ef0) at controller/ofctrl.c:1216 9 ofctrl_flood_remove_flows (flood_remove_nodes=...ef0) at controller/ofctrl.c:1280 10 lflow_handle_changed_ref (ref_type=REF_TYPE_PORTGROUP, ref_name= "5564_pg_14...aac") at controller/lflow.c:522 11 _flow_output_resource_ref_handler (ref_type=REF_TYPE_PORTGROUP) at controller/ovn-controller.c:2220 12 engine_compute () at lib/inc-proc-eng.c:306 13 engine_run_node (recompute_allowed=true) at lib/inc-proc-eng.c:352 14 engine_run (recompute_allowed=true) at lib/inc-proc-eng.c:377 15 main () at controller/ovn-controller.c:2887 *** The reason for this assert is different from the assertion bug fixed in the commit 858d1dd("ofctrl: Fix the assert seen when flood removing flows.") The above assertion is seen if lflow.c calls ofctrl_add_or_append_flow() twice for the same flow uuid 'S'. When this function is called for the first time, a new desired flow 'f' is created and 'f->references' will have [(S, f)]. When it is called for the 2nd time, the list 'f->references' is updated as [(S, f), (S,f)]. Later when flood_remove_flows_for_sb_uuid() is called for 'S', the above assertion is seen. This scenarion can happen when ovn-controller receives an update message from SB ovsdb-server in which a new logical flow 'F' belonging to a datapath 'D' is added which would result in conjunction flows. If this datapath 'D' is added to 'local_datapaths' in the same loop, then logical flow 'F' is first processed in lflow_add_flows_for_datapath() and the second time in lflow_handle_changed_flows(). This issue can be fixed in 2 ways 1. Flood remove all the tracked flows in lflow_handle_changed_flows() first and then process the modified or newly added logical flows. 2. In the function ofctrl_add_or_append_flow(), check if there is already a reference for the flow uuid 'S' in the existing desired flows and only append the actions if it doesn't exist. This patch has taken the approach (1) to ensure correctness of flows. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1929978 Signed-off-by: Numan Siddique <numans@ovn.org> Acked-by: Dumitru Ceara <dceara@redhat.com>
- Loading branch information