Cache the modules affected by xlate_actions(). #2
Conversation
This function checks for a rule in the classifier:

* If the rule exists, reset its modified time.
* If an equivalent rule exists, reset that rule's modified time.
* If no rule exists, re-install the rule and reset its modified time.
* Finally, return the rule that was modified.

This function will be used to ensure that hard timeouts for learnt rules are refreshed if traffic consistently hits a rule with a learn action in it. The first user will be the next commit.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
---
v1: Ensure rule->modified is updated correctly.
RFC: First post.
This patch adds a new object called 'struct xlate_cache' which can be set in 'struct xlate_in' and passed to xlate_actions() to cache the modules affected by this flow translation. Subsequently, the caller can pass the xcache to xlate_push_stats() to credit stats and perform side effects at a lower cost than a full flow translation.

These changes currently target long-lived flows, decreasing the average dump duration for such flows by 50-80%. This allows more flows to be supported in the datapath at a given time. Applying these changes to short-lived flows is left for a later commit.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
---
v1: Add caching for fin_timeout action.
    Expire netflows on xlate_cache_clear().
    Account to bonds using a copy of 'flow' rather than hash.
    Always build XC_NORMAL entry (previously only if may_learn is true).
    Rename xlate_from_cache() -> xlate_push_stats().
    Add may_learn parameter to xlate_push_stats().
    Tidy up xlate_actions__() mirror/netflow code.
    Fold in style fixups.
RFC: First post.
Previously we would revalidate all flows if the "need_revalidate" flag was raised. This patch modifies the logic to delete low-throughput flows rather than revalidate them. High-throughput flows are unaffected by this change. This patch identifies the flows based on the mean time between packets since the last dump.

This change is primarily targeted at situations where:
* Flow dump duration is high (~1 second).
* Revalidation is triggered (e.g., by bridge reconfiguration or learning).

After the need_revalidate flag is set, the next time a new flow dump session starts, revalidators will begin revalidating the flows. This full revalidation is more expensive, which significantly increases the flow dump duration. At the end of this dump session, the datapath flow management algorithms kick in for the next dump:

* If flow dump duration becomes too long, the flow limit is decreased.
* The number of flows in the datapath then exceeds the flow_limit.
* As the flow_limit is exceeded, max_idle is temporarily set to 100ms.
* Revalidators delete all flows that haven't seen traffic recently.

The effect of this is that many low-throughput flows are deleted after revalidation, even if they are valid. The revalidation is unnecessary for flows that would be deleted anyway, so this patch skips the revalidation step for those flows.

Note that this patch will only perform this optimization if the flow has already been dumped at least once, and only if the time since the last dump is sufficiently long. This gives the flow a chance to become high-throughput.

Signed-off-by: Joe Stringer <joestringer@nicira.com>
---
v1: Determine "high-throughput" by packets rather than bytes.
    Calculate the mean time between packets for comparison, rather than
    comparing the number of packets since the last dump.
RFC: First post.
Thanks for the review, Ethan. Will you take care of merging this? This version conflicts with "Remove the flow dumper", so depending on which is merged, I will need to rebase the other and repost.
Could you submit a new pull request with all the Acked-bys and everything, Ethan?

On Thu, Apr 17, 2014 at 3:24 PM, Joe Stringer <notifications@github.com> wrote:
Sure.
There are two problematic situations.

A deadlock can happen when is_percpu is false because it can get interrupted while holding the spinlock. Then it executes ovs_flow_stats_update() in softirq context, which tries to get the same lock.

The second situation is that when is_percpu is true, the code correctly disables BH but only for the local CPU, so the following can happen when locking the remote CPU without disabling BH:

    CPU#0                                 CPU#1

    ovs_flow_stats_get()
     stats_read()
    +->spin_lock remote CPU#1             ovs_flow_stats_get()
    |   <interrupted>                      stats_read()
    |   ...                               +--> spin_lock remote CPU#0
    |                                     |     <interrupted>
    |   ovs_flow_stats_update()           |     ...
    |    spin_lock local CPU#0 <----------+     ovs_flow_stats_update()
    +------------------------------------------  spin_lock local CPU#1

This patch disables BH for both cases, fixing the deadlocks.

=================================
[ INFO: inconsistent lock state ]
3.14.0-rc8-00007-g632b06a #1 Tainted: G I
---------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/0/0 [HC0[0]:SC1[5]:HE1:SE0] takes:
(&(&cpu_stats->lock)->rlock){+.?...}, at: [<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch]
{SOFTIRQ-ON-W} state was registered at:
[<ffffffff810f973f>] __lock_acquire+0x68f/0x1c40
[<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0
[<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80
[<ffffffffa05dd9e4>] ovs_flow_stats_get+0xc4/0x1e0 [openvswitch]
[<ffffffffa05da855>] ovs_flow_cmd_fill_info+0x185/0x360 [openvswitch]
[<ffffffffa05daf05>] ovs_flow_cmd_build_info.constprop.27+0x55/0x90 [openvswitch]
[<ffffffffa05db41d>] ovs_flow_cmd_new_or_set+0x4dd/0x570 [openvswitch]
[<ffffffff816c245d>] genl_family_rcv_msg+0x1cd/0x3f0
[<ffffffff816c270e>] genl_rcv_msg+0x8e/0xd0
[<ffffffff816c0239>] netlink_rcv_skb+0xa9/0xc0
[<ffffffff816c0798>] genl_rcv+0x28/0x40
[<ffffffff816bf830>] netlink_unicast+0x100/0x1e0
[<ffffffff816bfc57>] netlink_sendmsg+0x347/0x770
[<ffffffff81668e9c>] sock_sendmsg+0x9c/0xe0
[<ffffffff816692d9>] ___sys_sendmsg+0x3a9/0x3c0
[<ffffffff8166a911>] __sys_sendmsg+0x51/0x90
[<ffffffff8166a962>] SyS_sendmsg+0x12/0x20
[<ffffffff817e3ce9>] system_call_fastpath+0x16/0x1b
irq event stamp: 1740726
hardirqs last enabled at (1740726): [<ffffffff8175d5e0>] ip6_finish_output2+0x4f0/0x840
hardirqs last disabled at (1740725): [<ffffffff8175d59b>] ip6_finish_output2+0x4ab/0x840
softirqs last enabled at (1740674): [<ffffffff8109be12>] _local_bh_enable+0x22/0x50
softirqs last disabled at (1740675): [<ffffffff8109db05>] irq_exit+0xc5/0xd0

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&(&cpu_stats->lock)->rlock);
<Interrupt>
lock(&(&cpu_stats->lock)->rlock);

*** DEADLOCK ***

5 locks held by swapper/0/0:
#0: (((&ifa->dad_timer))){+.-...}, at: [<ffffffff810a7155>] call_timer_fn+0x5/0x320
#1: (rcu_read_lock){.+.+..}, at: [<ffffffff81788a55>] mld_sendpack+0x5/0x4a0
#2: (rcu_read_lock_bh){.+....}, at: [<ffffffff8175d149>] ip6_finish_output2+0x59/0x840
#3: (rcu_read_lock_bh){.+....}, at: [<ffffffff8168ba75>] __dev_queue_xmit+0x5/0x9b0
#4: (rcu_read_lock){.+.+..}, at: [<ffffffffa05e41b5>] internal_dev_xmit+0x5/0x110 [openvswitch]

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G I 3.14.0-rc8-00007-g632b06a #1
Hardware name: /DX58SO, BIOS SOX5810J.86A.5599.2012.0529.2218 05/29/2012
0000000000000000 0fcf20709903df0c ffff88042d603808 ffffffff817cfe3c
ffffffff81c134c0 ffff88042d603858 ffffffff817cb6da 0000000000000005
ffffffff00000001 ffff880400000000 0000000000000006 ffffffff81c134c0
Call Trace:
<IRQ> [<ffffffff817cfe3c>] dump_stack+0x4d/0x66
[<ffffffff817cb6da>] print_usage_bug+0x1f4/0x205
[<ffffffff810f7f10>] ? check_usage_backwards+0x180/0x180
[<ffffffff810f8963>] mark_lock+0x223/0x2b0
[<ffffffff810f96d3>] __lock_acquire+0x623/0x1c40
[<ffffffff810f5707>] ? __lock_is_held+0x57/0x80
[<ffffffffa05e26c6>] ? masked_flow_lookup+0x236/0x250 [openvswitch]
[<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0
[<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch]
[<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80
[<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch]
[<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch]
[<ffffffffa05dcc64>] ovs_dp_process_received_packet+0x84/0x120 [openvswitch]
[<ffffffff810f93f7>] ? __lock_acquire+0x347/0x1c40
[<ffffffffa05e3bea>] ovs_vport_receive+0x2a/0x30 [openvswitch]
[<ffffffffa05e4218>] internal_dev_xmit+0x68/0x110 [openvswitch]
[<ffffffffa05e41b5>] ? internal_dev_xmit+0x5/0x110 [openvswitch]
[<ffffffff8168b4a6>] dev_hard_start_xmit+0x2e6/0x8b0
[<ffffffff8168be87>] __dev_queue_xmit+0x417/0x9b0
[<ffffffff8168ba75>] ? __dev_queue_xmit+0x5/0x9b0
[<ffffffff8175d5e0>] ? ip6_finish_output2+0x4f0/0x840
[<ffffffff8168c430>] dev_queue_xmit+0x10/0x20
[<ffffffff8175d641>] ip6_finish_output2+0x551/0x840
[<ffffffff8176128a>] ? ip6_finish_output+0x9a/0x220
[<ffffffff8176128a>] ip6_finish_output+0x9a/0x220
[<ffffffff8176145f>] ip6_output+0x4f/0x1f0
[<ffffffff81788c29>] mld_sendpack+0x1d9/0x4a0
[<ffffffff817895b8>] mld_send_initial_cr.part.32+0x88/0xa0
[<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
[<ffffffff8178e301>] ipv6_mc_dad_complete+0x31/0x50
[<ffffffff817690d7>] addrconf_dad_completed+0x147/0x220
[<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
[<ffffffff8176934f>] addrconf_dad_timer+0x19f/0x1c0
[<ffffffff810a71e9>] call_timer_fn+0x99/0x320
[<ffffffff810a7155>] ? call_timer_fn+0x5/0x320
[<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
[<ffffffff810a76c4>] run_timer_softirq+0x254/0x3b0
[<ffffffff8109d47d>] __do_softirq+0x12d/0x480

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jesse Gross <jesse@nicira.com>
…butes

This patch fixes the following kernel crash that could happen if the geneve vport was not added yet, but a revalidator thread attempted to dump flows.

To reproduce:
1. switch tunnel type between geneve and gre in a loop; and
2. run ping.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
IP: [<ffffffffa0385470>] ovs_nla_put_flow+0x3d0/0x7c0 [openvswitch]
PGD 3b32b067 PUD 3b2ef067 PMD 0
Oops: 0000 [#2] SMP
...
CPU: 0 PID: 6450 Comm: revalidator2 Tainted: GF D O 3.13.0-24-generic #46-Ubuntu
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2012
task: ffff88003b4aafe0 ti: ffff88003d314000 task.ti: ffff88003d314000
RIP: 0010:[<ffffffffa0385470>] [<ffffffffa0385470>] ovs_nla_put_flow+0x3d0/0x7c0 [openvswitch]
RSP: 0018:ffff88003d315a10 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88003a9a9960 RCX: 0000000000000000
RDX: 0000000000000002 RSI: ffffffffffffffc8 RDI: ffff88003babcb80
RBP: ffff88003d315a68 R08: 0000000000000000 R09: 0000000000000004
R10: ffff880039c23034 R11: 0000000000000008 R12: ffff88003a861600
R13: ffff88003a9a9960 R14: ffff88003babcb80 R15: ffff88003a861600
FS: 00007ff0f5d94700(0000) GS:ffff88003f600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000048 CR3: 000000003b55b000 CR4: 00000000000007f0
Stack:
ffffffff81385093 0000000000000000 0000000000000000 0000000000000000
ffff880000000000 ffff88003d315a58 ffff880039c23014 ffff88003a9a97a0
ffff88003babcb80 ffff880039c23018 ffff88003a861600 ffff88003d315ad0
Call Trace:
[<ffffffff81385093>] ? __nla_reserve+0x43/0x50
[<ffffffffa037e683>] ovs_flow_cmd_fill_info+0x93/0x2b0 [openvswitch]
[<ffffffffa0387159>] ? ovs_flow_tbl_dump_next+0x49/0xc0 [openvswitch]
[<ffffffffa037e920>] ovs_flow_cmd_dump+0x80/0xd0 [openvswitch]
[<ffffffff81645004>] netlink_dump+0x84/0x240
[<ffffffff816458eb>] __netlink_dump_start+0x1ab/0x220
[<ffffffff816498d7>] genl_family_rcv_msg+0x337/0x370
[<ffffffffa037e8a0>] ? ovs_flow_cmd_fill_info+0x2b0/0x2b0 [openvswitch]
[<ffffffff811a2778>] ? __kmalloc_node_track_caller+0x58/0x1e0
[<ffffffff81649910>] ? genl_family_rcv_msg+0x370/0x370
[<ffffffff816499a1>] genl_rcv_msg+0x91/0xd0
[<ffffffff81647a29>] netlink_rcv_skb+0xa9/0xc0
[<ffffffff81647f28>] genl_rcv+0x28/0x40
[<ffffffff81647055>] netlink_unicast+0xd5/0x1b0
[<ffffffff8164742f>] netlink_sendmsg+0x2ff/0x740
[<ffffffff816024eb>] sock_sendmsg+0x8b/0xc0
[<ffffffff811bbaa1>] ? __sb_end_write+0x31/0x60
[<ffffffff811d42bf>] ? touch_atime+0x10f/0x140
[<ffffffff811c2471>] ? pipe_read+0x371/0x400
[<ffffffff81602691>] SYSC_sendto+0x121/0x1c0
[<ffffffff8109dd84>] ? vtime_account_user+0x54/0x60
[<ffffffff81020d35>] ? syscall_trace_enter+0x145/0x250
[<ffffffff8160319e>] SyS_sendto+0xe/0x10
[<ffffffff8172663f>] tracesys+0xe1/0xe6

Signed-off-by: Ansis Atteka <aatteka@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Otherwise creating the first dpif-netdev bridge fails because there are no handlers:

Program terminated with signal 8, Arithmetic exception.
#0 0x080971e9 in dp_execute_cb (aux_=aux_@entry=0xffcfaa54, packet=packet@entry=0xffcfac54, md=md@entry=0xffcfac84, a=a@entry=0x8f58930, may_steal=false) at ../lib/dpif-netdev.c:2154
#1 0x080b5adb in odp_execute_actions__ (dp=dp@entry=0xffcfaa54, packet=packet@entry=0xffcfac54, steal=steal@entry=false, md=md@entry=0xffcfac84, actions=actions@entry=0x8f58930, actions_len=actions_len@entry=20, dp_execute_action=dp_execute_action@entry=0x8097040 <dp_execute_cb>, more_actions=more_actions@entry=false) at ../lib/odp-execute.c:218
#2 0x080b5def in odp_execute_actions (dp=dp@entry=0xffcfaa54, packet=packet@entry=0xffcfac54, steal=steal@entry=false, md=md@entry=0xffcfac84, actions=0x8f58930, actions_len=20, dp_execute_action=dp_execute_action@entry=0x8097040 <dp_execute_cb>) at ../lib/odp-execute.c:285
#3 0x08095098 in dp_netdev_execute_actions (actions_len=<optimized out>, actions=<optimized out>, md=0xffcfac84, may_steal=false, packet=0xffcfac54, key=0xffcfaa5c, dp=<optimized out>) at ../lib/dpif-netdev.c:2227
#4 dpif_netdev_execute (dpif=0x8f59598, execute=0xffcfac78) at ../lib/dpif-netdev.c:1551
#5 0x0809a56c in dpif_execute (dpif=0x8f59598, execute=execute@entry=0xffcfac78) at ../lib/dpif.c:1227
#6 0x08071071 in check_variable_length_userdata (backer=<optimized out>) at ../ofproto/ofproto-dpif.c:1040
#7 open_dpif_backer (backerp=0x8f5834c, type=<optimized out>) at ../ofproto/ofproto-dpif.c:921
#8 construct (ofproto_=0x8f581c0) at ../ofproto/ofproto-dpif.c:1120
#9 0x080675e0 in ofproto_create (datapath_name=0x8f57310 "br0", datapath_type=<optimized out>, ofprotop=ofprotop@entry=0x8f576c8) at ../ofproto/ofproto.c:564
#10 0x080529aa in bridge_reconfigure (ovs_cfg=ovs_cfg@entry=0x8f596d8) at ../vswitchd/bridge.c:572
#11 0x08055e33 in bridge_run () at ../vswitchd/bridge.c:2339
#12 0x0804cdbd in main (argc=9, argv=0xffcfb554) at ../vswitchd/ovs-vswitchd.c:116

This bug was introduced by commit 9bcefb9 (ofproto-dpif: fix an ovs crash when dpif_recv_set returns error).

CC: Andy Zhou <azhou@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
packet execute is setting egress_tun_info in skb->cb rather than packet->cb. skb is the netlink msg skb. This causes corruption of the netlink skb state stored in skb->cb (NETLINK_CB), which results in the following deadlock in netlink code.

=============================================
[ INFO: possible recursive locking detected ]
3.2.62 #2
---------------------------------------------
handler55/22851 is trying to acquire lock:
(genl_mutex){+.+.+.}, at: [<ffffffff81471ad7>] genl_lock+0x17/0x20
but task is already holding lock:
(genl_mutex){+.+.+.}, at: [<ffffffff81471ad7>] genl_lock+0x17/0x20

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(genl_mutex);
lock(genl_mutex);

*** DEADLOCK ***

May be due to missing lock nesting notation

1 lock held by handler55/22851:
#0: (genl_mutex){+.+.+.}, at: [<ffffffff81471ad7>] genl_lock+0x17/0x20

stack backtrace:
Pid: 22851, comm: handler55 Tainted: G O 3.2.62 #2
Call Trace:
[<ffffffff81097bb2>] print_deadlock_bug+0xf2/0x100
[<ffffffff81099b99>] validate_chain+0x579/0x860
[<ffffffff8109a17c>] __lock_acquire+0x2fc/0x4f0
[<ffffffff8109aab0>] lock_acquire+0xa0/0x180
[<ffffffff81519070>] __mutex_lock_common+0x60/0x420
[<ffffffff8151959a>] mutex_lock_nested+0x4a/0x60
[<ffffffff81471ad7>] genl_lock+0x17/0x20
[<ffffffff81471af6>] genl_rcv+0x16/0x40
[<ffffffff8146ff72>] netlink_unicast+0x2f2/0x310
[<ffffffff81470159>] netlink_ack+0x109/0x1f0
[<ffffffff8147030b>] netlink_rcv_skb+0xcb/0xd0
[<ffffffff81471b05>] genl_rcv+0x25/0x40
[<ffffffff8146ff72>] netlink_unicast+0x2f2/0x310
[<ffffffff8147134c>] netlink_sendmsg+0x28c/0x3d0
[<ffffffff8143375f>] sock_sendmsg+0xef/0x120
[<ffffffff81435766>] ___sys_sendmsg+0x416/0x430
[<ffffffff81435949>] __sys_sendmsg+0x49/0x90
[<ffffffff814359a9>] sys_sendmsg+0x19/0x20
[<ffffffff8152432b>] system_call_fastpath+0x16/0x1b

Reported-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
The primary goals of netdev-windows.c are:

1) To query the 'network device' information of a vport, such as MTU.
2) To monitor changes to the 'network device' information, such as link status.

In this change, we implement only #1. #2 can also be implemented, but it does not seem to be required for the purposes of implementing 'ovs-dpctl.exe show'.

Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Ankur Sharma <ankursharma@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Modify Cflags in pkg-config files
Introduce a new logical port type called "localnet". A logical port with this type also has an option called "network_name". A "localnet" logical port represents a connection to a network that is locally accessible from each chassis running ovn-controller. ovn-controller will use the ovn-bridge-mappings configuration to figure out which patch port on br-int should be used for this port.

OpenStack Neutron has an API extension called "provider networks" which allows an administrator to specify that it would like ports directly attached to some pre-existing network in their environment. There was a previous thread where we got into the details of this here:

http://openvswitch.org/pipermail/dev/2015-June/056765.html

The case where this would be used is an environment that isn't actually interested in virtual networks and just wants all of their compute resources connected up to externally managed networks. Even in this environment, OVN still has a lot of value to add. OVN implements port security and ACLs for all ports connected to these networks. OVN also provides the configuration interface and control plane to manage this across many hypervisors.

As a specific example, consider an environment with two hypervisors (A and B) with two VMs on each hypervisor (A1, A2, B1, B2). Now imagine that the desired setup from an OpenStack perspective is to have all of these VMs attached to the same provider network, which is a physical network we'll refer to as "physnet1".

The first step here is to configure each hypervisor with bridge mappings that tell ovn-controller that a local bridge called "br-eth1" is used to reach the network called "physnet1". We can simulate the initial setup of this environment in ovs-sandbox with the following commands:

# Setup the local hypervisor (A)
ovs-vsctl add-br br-eth1
ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth1

# Create a fake remote hypervisor (B)
ovn-sbctl chassis-add fakechassis geneve 127.0.0.1

To get the behavior we want, we model every Neutron port connected to a Neutron provider network as an OVN logical switch with 2 ports. The first port is a normal logical port to be used by the VM. The second logical port is a special port with its type set to "localnet".

You could imagine an alternative configuration where there are many OVN logical ports with a single OVN "localnet" logical port on the same OVN logical switch. This setup provides something different, where the logical ports would communicate with each other in logical space via tunnels between hypervisors. For Neutron's use case, we want all ports communicating via an existing network without the use of an overlay.

To simulate the creation of the OVN logical switches and OVN logical ports for A1, A2, B1, and B2, you can run the following commands:

# Create 4 OVN logical switches.  Each logical switch has 2 ports,
# port1 for a VM and physnet1 for the existing network we are
# connecting to.
for n in 1 2 3 4; do
    ovn-nbctl lswitch-add provnet1-$n
    ovn-nbctl lport-add provnet1-$n provnet1-$n-port1
    ovn-nbctl lport-set-macs provnet1-$n-port1 00:00:00:00:00:0$n
    ovn-nbctl lport-set-port-security provnet1-$n-port1 00:00:00:00:00:0$n
    ovn-nbctl lport-add provnet1-$n provnet1-$n-physnet1
    ovn-nbctl lport-set-macs provnet1-$n-physnet1 unknown
    ovn-nbctl lport-set-type provnet1-$n-physnet1 localnet
    ovn-nbctl lport-set-options provnet1-$n-physnet1 network_name=physnet1
done

# Bind lport1 (A1) and lport2 (A2) to the local hypervisor.
ovs-vsctl add-port br-int lport1 -- set Interface lport1 external_ids:iface-id=provnet1-1-port1
ovs-vsctl add-port br-int lport2 -- set Interface lport2 external_ids:iface-id=provnet1-2-port1

# Bind the other 2 ports to the fake remote hypervisor.
ovn-sbctl lport-bind provnet1-3-port1 fakechassis
ovn-sbctl lport-bind provnet1-4-port1 fakechassis

After running these commands, we have the following logical configuration:

$ ovn-nbctl show
    lswitch 035645fc-b2ff-4e26-b953-69addba80a9a (provnet1-4)
        lport provnet1-4-physnet1
            macs: unknown
        lport provnet1-4-port1
            macs: 00:00:00:00:00:04
    lswitch 66212a85-b3b6-4688-bcf6-8062941a2d96 (provnet1-2)
        lport provnet1-2-physnet1
            macs: unknown
        lport provnet1-2-port1
            macs: 00:00:00:00:00:02
    lswitch fc5b1141-0216-4fa7-86f3-461811c1fc9b (provnet1-3)
        lport provnet1-3-physnet1
            macs: unknown
        lport provnet1-3-port1
            macs: 00:00:00:00:00:03
    lswitch 9b1d2636-e654-4d43-84e8-a921af611b33 (provnet1-1)
        lport provnet1-1-physnet1
            macs: unknown
        lport provnet1-1-port1
            macs: 00:00:00:00:00:01

We can also look at OVN_Southbound to see that 2 logical ports are bound to each hypervisor:

$ ovn-sbctl show
Chassis "56b18105-5706-46ef-80c4-ff20979ab068"
    Encap geneve
        ip: "127.0.0.1"
    Port_Binding "provnet1-1-port1"
    Port_Binding "provnet1-2-port1"
Chassis fakechassis
    Encap geneve
        ip: "127.0.0.1"
    Port_Binding "provnet1-3-port1"
    Port_Binding "provnet1-4-port1"

Now we can generate several packets to test how a packet would be processed on hypervisor A. The OpenFlow port numbers in this demo are:

1 - patch port to br-eth1 (physnet1)
2 - tunnel to fakechassis
3 - lport1 (A1)
4 - lport2 (A2)

Packet test #1: A1 to A2 - This will be output to ofport 1. Despite both VMs being local to this hypervisor, all packets between the VMs go through physnet1. In practice, this will get optimized at br-eth1.

ovs-appctl ofproto/trace br-int \
    in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

Packet test #2: physnet1 to A2 - Consider this a continuation of test #1, where the packet now arrives from physnet1, so every local port that physnet1 is attached to will be considered. The end result should be that the only output is to ofport 4 (A2).

ovs-appctl ofproto/trace br-int \
    in_port=1,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

Packet test #3: A1 to B1 - This will be output to ofport 1, as physnet1 is to be used to reach any other port. When it arrives at hypervisor B, processing would look just like test #2.

ovs-appctl ofproto/trace br-int \
    in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:03 -generate

Packet test #4: A1 broadcast - Again, the packet will only be sent to physnet1.

ovs-appctl ofproto/trace br-int \
    in_port=3,dl_src=00:00:00:00:00:01,dl_dst=ff:ff:ff:ff:ff:ff -generate

Packet test #5: B1 broadcast arriving at hypervisor A - This is somewhat a continuation of test #4. When a broadcast packet arrives from physnet1 on hypervisor A, we should see it output to both A1 and A2 (ofports 3 and 4).

ovs-appctl ofproto/trace br-int \
    in_port=1,dl_src=00:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff -generate

Signed-off-by: Russell Bryant <rbryant@redhat.com>
Introduce a new logical port type called "localnet". A logical port with this type also has an option called "network_name". A "localnet" logical port represents a connection to a network that is locally accessible from each chassis running ovn-controller. ovn-controller will use the ovn-bridge-mappings configuration to figure out which patch port on br-int should be used for this port. OpenStack Neutron has an API extension called "provider networks" which allows an administrator to specify that it would like ports directly attached to some pre-existing network in their environment. There was a previous thread where we got into the details of this here: http://openvswitch.org/pipermail/dev/2015-June/056765.html The case where this would be used is an environment that isn't actually interested in virtual networks and just wants all of their compute resources connected up to externally managed networks. Even in this environment, OVN still has a lot of value to add. OVN implements port security and ACLs for all ports connected to these networks. OVN also provides the configuration interface and control plane to manage this across many hypervisors. As a specific example, consider an environment with two hypvervisors (A and B) with two VMs on each hypervisor (A1, A2, B1, B2). Now imagine that the desired setup from an OpenStack perspective is to have all of these VMs attached to the same provider network, which is a physical network we'll refer to as "physnet1". The first step here is to configure each hypervisor with bridge mappings that tell ovn-controller that a local bridge called "br-eth1" is used to reach the network called "physnet1". We can simulate the inital setup of this environment in ovs-sandbox with the following commands: # Setup the local hypervisor (A) ovs-vsctl add-br br-eth1 ovs-vsctl set open . 
external-ids:ovn-bridge-mappings=physnet1:br-eth1 # Create a fake remote hypervisor (B) ovn-sbctl chassis-add fakechassis geneve 127.0.0.1 To get the behavior we want, we model every Neutron port connected to a Neutron provider network as an OVN logical switch with 2 ports. The first port is a normal logical port to be used by the VM. The second logical port is a special port with its type set to "localnet". You could imagine an alternative configuration where there are many OVN logical ports with a single OVN "localnet" logical port on the same OVN logical switch. This setup provides something different, where the logical ports would communicate with eath other in logical space via tunnnels between hypervisors. For Neutron's use case, we want all ports communicating via an existing network without the use of an overlay. To simulate the creation of the OVN logical switches and OVN logical ports for A1, A2, B1, and B2, you can run the following commands: # Create 4 OVN logical switches. Each logical switch has 2 ports, # port1 for a VM and physnet1 for the existing network we are # connecting to. for n in 1 2 3 4; do ovn-nbctl lswitch-add provnet1-$n ovn-nbctl lport-add provnet1-$n provnet1-$n-port1 ovn-nbctl lport-set-macs provnet1-$n-port1 00:00:00:00:00:0$n ovn-nbctl lport-set-port-security provnet1-$n-port1 00:00:00:00:00:0$n ovn-nbctl lport-add provnet1-$n provnet1-$n-physnet1 ovn-nbctl lport-set-macs provnet1-$n-physnet1 unknown ovn-nbctl lport-set-type provnet1-$n-physnet1 localnet ovn-nbctl lport-set-options provnet1-$n-physnet1 network_name=physnet1 done # Bind lport1 (A1) and lport2 (A2) to the local hypervisor. ovs-vsctl add-port br-int lport1 -- set Interface lport1 external_ids:iface-id=provnet1-1-port1 ovs-vsctl add-port br-int lport2 -- set Interface lport2 external_ids:iface-id=provnet1-2-port1 # Bind the other 2 ports to the fake remote hypervisor. 
ovn-sbctl lport-bind provnet1-3-port1 fakechassis ovn-sbctl lport-bind provnet1-4-port1 fakechassis After running these commands, we have the following logical configuration: $ ovn-nbctl show lswitch 035645fc-b2ff-4e26-b953-69addba80a9a (provnet1-4) lport provnet1-4-physnet1 macs: unknown lport provnet1-4-port1 macs: 00:00:00:00:00:04 lswitch 66212a85-b3b6-4688-bcf6-8062941a2d96 (provnet1-2) lport provnet1-2-physnet1 macs: unknown lport provnet1-2-port1 macs: 00:00:00:00:00:02 lswitch fc5b1141-0216-4fa7-86f3-461811c1fc9b (provnet1-3) lport provnet1-3-physnet1 macs: unknown lport provnet1-3-port1 macs: 00:00:00:00:00:03 lswitch 9b1d2636-e654-4d43-84e8-a921af611b33 (provnet1-1) lport provnet1-1-physnet1 macs: unknown lport provnet1-1-port1 macs: 00:00:00:00:00:01 We can also look at OVN_Southbound to see that 2 logical ports are bound to each hypervisor: $ ovn-sbctl show Chassis "56b18105-5706-46ef-80c4-ff20979ab068" Encap geneve ip: "127.0.0.1" Port_Binding "provnet1-1-port1" Port_Binding "provnet1-2-port1" Chassis fakechassis Encap geneve ip: "127.0.0.1" Port_Binding "provnet1-3-port1" Port_Binding "provnet1-4-port1" Now we can generate several packets to test how a packet would be processed on hypervisor A. The OpenFlow port numbers in this demo are: 1 - patch port to br-eth1 (physnet1) 2 - tunnel to fakechassis 3 - lport1 (A1) 4 - lport2 (A2) Packet test #1: A1 to A2 - This will be output to ofport 1. Despite both VMs being local to this hypervisor, all packets betwen the VMs go through physnet1. In practice, this will get optimized at br-eth1. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate Packet test #2: physnet1 to A2 - Consider this a continuation of test is attached to will be considered. The end result should be that the only output is to ofport 4 (A2). 
ovs-appctl ofproto/trace br-int \ in_port=1,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate Packet test #3: A1 to B1 - This will be output to ofport 1, as physnet1 is to be used to reach any other port. When it arrives at hypervisor B, processing would look just like test #2. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:03 -generate Packet test #4: A1 broadcast. - Again, the packet will only be sent to physnet1. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=ff:ff:ff:ff:ff:ff -generate Packet test #5: B1 broadcast arriving at hypervisor A. This is somewhat a continuation of test #4. When a broadcast packet arrives from physnet1 on hypervisor A, we should see it output to both A1 and A2 (ofports 3 and 4). ovs-appctl ofproto/trace br-int \ in_port=1,dl_src=00:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff -generate Signed-off-by: Russell Bryant <rbryant@redhat.com>
Introduce a new logical port type called "localnet". A logical port with this type also has an option called "network_name". A "localnet" logical port represents a connection to a network that is locally accessible from each chassis running ovn-controller. ovn-controller will use the ovn-bridge-mappings configuration to figure out which patch port on br-int should be used for this port.

OpenStack Neutron has an API extension called "provider networks" which allows an administrator to specify that it would like ports directly attached to some pre-existing network in their environment. There was a previous thread where we got into the details of this here:

http://openvswitch.org/pipermail/dev/2015-June/056765.html

The case where this would be used is an environment that isn't actually interested in virtual networks and just wants all of its compute resources connected up to externally managed networks. Even in this environment, OVN still has a lot of value to add. OVN implements port security and ACLs for all ports connected to these networks. OVN also provides the configuration interface and control plane to manage this across many hypervisors.

As a specific example, consider an environment with two hypervisors (A and B) with two VMs on each hypervisor (A1, A2, B1, B2). Now imagine that the desired setup from an OpenStack perspective is to have all of these VMs attached to the same provider network, which is a physical network we'll refer to as "physnet1".

The first step here is to configure each hypervisor with bridge mappings that tell ovn-controller that a local bridge called "br-eth1" is used to reach the network called "physnet1". We can simulate the initial setup of this environment in ovs-sandbox with the following commands:

  # Set up the local hypervisor (A).
  ovs-vsctl add-br br-eth1
  ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth1

  # Create a fake remote hypervisor (B).
  ovn-sbctl chassis-add fakechassis geneve 127.0.0.1

To get the behavior we want, we model every Neutron port connected to a Neutron provider network as an OVN logical switch with 2 ports. The first port is a normal logical port to be used by the VM. The second logical port is a special port with its type set to "localnet".

You could imagine an alternative configuration where there are many OVN logical ports with a single OVN "localnet" logical port on the same OVN logical switch. That setup provides something different, where the logical ports would communicate with each other in logical space via tunnels between hypervisors. For Neutron's use case, we want all ports communicating via an existing network without the use of an overlay.

To simulate the creation of the OVN logical switches and OVN logical ports for A1, A2, B1, and B2, you can run the following commands:

  # Create 4 OVN logical switches. Each logical switch has 2 ports:
  # port1 for a VM and physnet1 for the existing network we are
  # connecting to.
  for n in 1 2 3 4; do
      ovn-nbctl lswitch-add provnet1-$n
      ovn-nbctl lport-add provnet1-$n provnet1-$n-port1
      ovn-nbctl lport-set-macs provnet1-$n-port1 00:00:00:00:00:0$n
      ovn-nbctl lport-set-port-security provnet1-$n-port1 00:00:00:00:00:0$n
      ovn-nbctl lport-add provnet1-$n provnet1-$n-physnet1
      ovn-nbctl lport-set-macs provnet1-$n-physnet1 unknown
      ovn-nbctl lport-set-type provnet1-$n-physnet1 localnet
      ovn-nbctl lport-set-options provnet1-$n-physnet1 network_name=physnet1
  done

  # Bind lport1 (A1) and lport2 (A2) to the local hypervisor.
  ovs-vsctl add-port br-int lport1 -- set Interface lport1 \
      external_ids:iface-id=provnet1-1-port1
  ovs-vsctl add-port br-int lport2 -- set Interface lport2 \
      external_ids:iface-id=provnet1-2-port1

  # Bind the other 2 ports to the fake remote hypervisor.
  ovn-sbctl lport-bind provnet1-3-port1 fakechassis
  ovn-sbctl lport-bind provnet1-4-port1 fakechassis

After running these commands, we have the following logical configuration:

  $ ovn-nbctl show
  lswitch 035645fc-b2ff-4e26-b953-69addba80a9a (provnet1-4)
      lport provnet1-4-physnet1
          macs: unknown
      lport provnet1-4-port1
          macs: 00:00:00:00:00:04
  lswitch 66212a85-b3b6-4688-bcf6-8062941a2d96 (provnet1-2)
      lport provnet1-2-physnet1
          macs: unknown
      lport provnet1-2-port1
          macs: 00:00:00:00:00:02
  lswitch fc5b1141-0216-4fa7-86f3-461811c1fc9b (provnet1-3)
      lport provnet1-3-physnet1
          macs: unknown
      lport provnet1-3-port1
          macs: 00:00:00:00:00:03
  lswitch 9b1d2636-e654-4d43-84e8-a921af611b33 (provnet1-1)
      lport provnet1-1-physnet1
          macs: unknown
      lport provnet1-1-port1
          macs: 00:00:00:00:00:01

We can also look at OVN_Southbound to see that 2 logical ports are bound to each hypervisor:

  $ ovn-sbctl show
  Chassis "56b18105-5706-46ef-80c4-ff20979ab068"
      Encap geneve
          ip: "127.0.0.1"
      Port_Binding "provnet1-1-port1"
      Port_Binding "provnet1-2-port1"
  Chassis fakechassis
      Encap geneve
          ip: "127.0.0.1"
      Port_Binding "provnet1-3-port1"
      Port_Binding "provnet1-4-port1"

Now we can generate several packets to test how a packet would be processed on hypervisor A. The OpenFlow port numbers in this demo are:

  1 - patch port to br-eth1 (physnet1)
  2 - tunnel to fakechassis
  3 - lport1 (A1)
  4 - lport2 (A2)

Packet test #1: A1 to A2. This will be output to ofport 1. Despite both VMs being local to this hypervisor, all packets between the VMs go through physnet1. In practice, this will get optimized at br-eth1.

  ovs-appctl ofproto/trace br-int \
      in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

Packet test #2: physnet1 to A2. Consider this a continuation of test #1: when the packet comes back in from physnet1, only the logical switch that the localnet port it arrives on is attached to will be considered. The end result should be that the only output is to ofport 4 (A2).
  ovs-appctl ofproto/trace br-int \
      in_port=1,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

Packet test #3: A1 to B1. This will be output to ofport 1, as physnet1 is to be used to reach any other port. When it arrives at hypervisor B, processing would look just like test #2.

  ovs-appctl ofproto/trace br-int \
      in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:03 -generate

Packet test #4: A1 broadcast. Again, the packet will only be sent to physnet1.

  ovs-appctl ofproto/trace br-int \
      in_port=3,dl_src=00:00:00:00:00:01,dl_dst=ff:ff:ff:ff:ff:ff -generate

Packet test #5: B1 broadcast arriving at hypervisor A. This is somewhat a continuation of test #4. When a broadcast packet arrives from physnet1 on hypervisor A, we should see it output to both A1 and A2 (ofports 3 and 4).

  ovs-appctl ofproto/trace br-int \
      in_port=1,dl_src=00:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff -generate

Signed-off-by: Russell Bryant <rbryant@redhat.com>
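As an aside, the ovn-bridge-mappings value configured earlier is a comma-separated list of network:bridge pairs, so one chassis can reach several physical networks. The following POSIX sh sketch illustrates the kind of lookup ovn-controller performs on that string; the helper name and the second mapping are invented for the example, and this is not ovn-controller code.

```shell
# Resolve a network name to its local bridge from an
# ovn-bridge-mappings style string "net:bridge,net:bridge".
mappings="physnet1:br-eth1,physnet2:br-eth2"

bridge_for_network() {
    net=$1
    old_ifs=$IFS
    IFS=','
    # Split the mappings string on commas and match on the network name.
    for pair in $mappings; do
        case $pair in
            "$net":*)
                IFS=$old_ifs
                echo "${pair#*:}"
                return 0
                ;;
        esac
    done
    IFS=$old_ifs
    return 1
}

bridge_for_network physnet1   # prints "br-eth1"
```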
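When checking the expected results of these traces, the final "Datapath actions:" line of the ofproto/trace output shows where the packet is actually sent. A small helper to pull that line out of a trace (the function name is ours, not an OVS tool) might look like:

```shell
# Extract the "Datapath actions:" line from ofproto/trace output on
# stdin, so each test's result can be checked mechanically.
final_output_ports() {
    sed -n 's/^Datapath actions: //p'
}

# Against a live sandbox you might run (illustrative, not executed here):
#   ovs-appctl ofproto/trace br-int \
#       in_port=1,dl_src=00:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff -generate \
#       | final_output_ports
# For test #5, the printed actions should include ports 3 and 4.
```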
ovn-sbctl lport-bind provnet1-3-port1 fakechassis ovn-sbctl lport-bind provnet1-4-port1 fakechassis After running these commands, we have the following logical configuration: $ ovn-nbctl show lswitch 035645fc-b2ff-4e26-b953-69addba80a9a (provnet1-4) lport provnet1-4-physnet1 macs: unknown lport provnet1-4-port1 macs: 00:00:00:00:00:04 lswitch 66212a85-b3b6-4688-bcf6-8062941a2d96 (provnet1-2) lport provnet1-2-physnet1 macs: unknown lport provnet1-2-port1 macs: 00:00:00:00:00:02 lswitch fc5b1141-0216-4fa7-86f3-461811c1fc9b (provnet1-3) lport provnet1-3-physnet1 macs: unknown lport provnet1-3-port1 macs: 00:00:00:00:00:03 lswitch 9b1d2636-e654-4d43-84e8-a921af611b33 (provnet1-1) lport provnet1-1-physnet1 macs: unknown lport provnet1-1-port1 macs: 00:00:00:00:00:01 We can also look at OVN_Southbound to see that 2 logical ports are bound to each hypervisor: $ ovn-sbctl show Chassis "56b18105-5706-46ef-80c4-ff20979ab068" Encap geneve ip: "127.0.0.1" Port_Binding "provnet1-1-port1" Port_Binding "provnet1-2-port1" Chassis fakechassis Encap geneve ip: "127.0.0.1" Port_Binding "provnet1-3-port1" Port_Binding "provnet1-4-port1" Now we can generate several packets to test how a packet would be processed on hypervisor A. The OpenFlow port numbers in this demo are: 1 - patch port to br-eth1 (physnet1) 2 - tunnel to fakechassis 3 - lport1 (A1) 4 - lport2 (A2) Packet test #1: A1 to A2 - This will be output to ofport 1. Despite both VMs being local to this hypervisor, all packets betwen the VMs go through physnet1. In practice, this will get optimized at br-eth1. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate Packet test #2: physnet1 to A2 - Consider this a continuation of test is attached to will be considered. The end result should be that the only output is to ofport 4 (A2). 
ovs-appctl ofproto/trace br-int \ in_port=1,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate Packet test #3: A1 to B1 - This will be output to ofport 1, as physnet1 is to be used to reach any other port. When it arrives at hypervisor B, processing would look just like test #2. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:03 -generate Packet test #4: A1 broadcast. - Again, the packet will only be sent to physnet1. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=ff:ff:ff:ff:ff:ff -generate Packet test #5: B1 broadcast arriving at hypervisor A. This is somewhat a continuation of test #4. When a broadcast packet arrives from physnet1 on hypervisor A, we should see it output to both A1 and A2 (ofports 3 and 4). ovs-appctl ofproto/trace br-int \ in_port=1,dl_src=00:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff -generate Signed-off-by: Russell Bryant <rbryant@redhat.com>
Introduce a new logical port type called "localnet". A logical port with this type also has an option called "network_name". A "localnet" logical port represents a connection to a network that is locally accessible from each chassis running ovn-controller. ovn-controller will use the ovn-bridge-mappings configuration to figure out which patch port on br-int should be used for this port. OpenStack Neutron has an API extension called "provider networks" which allows an administrator to specify that it would like ports directly attached to some pre-existing network in their environment. There was a previous thread where we got into the details of this here: http://openvswitch.org/pipermail/dev/2015-June/056765.html The case where this would be used is an environment that isn't actually interested in virtual networks and just wants all of their compute resources connected up to externally managed networks. Even in this environment, OVN still has a lot of value to add. OVN implements port security and ACLs for all ports connected to these networks. OVN also provides the configuration interface and control plane to manage this across many hypervisors. As a specific example, consider an environment with two hypvervisors (A and B) with two VMs on each hypervisor (A1, A2, B1, B2). Now imagine that the desired setup from an OpenStack perspective is to have all of these VMs attached to the same provider network, which is a physical network we'll refer to as "physnet1". The first step here is to configure each hypervisor with bridge mappings that tell ovn-controller that a local bridge called "br-eth1" is used to reach the network called "physnet1". We can simulate the inital setup of this environment in ovs-sandbox with the following commands: # Setup the local hypervisor (A) ovs-vsctl add-br br-eth1 ovs-vsctl set open . 
external-ids:ovn-bridge-mappings=physnet1:br-eth1 # Create a fake remote hypervisor (B) ovn-sbctl chassis-add fakechassis geneve 127.0.0.1 To get the behavior we want, we model every Neutron port connected to a Neutron provider network as an OVN logical switch with 2 ports. The first port is a normal logical port to be used by the VM. The second logical port is a special port with its type set to "localnet". To simulate the creation of the OVN logical switches and OVN logical ports for A1, A2, B1, and B2, you can run the following commands: # Create 4 OVN logical switches. Each logical switch has 2 ports, # port1 for a VM and physnet1 for the existing network we are # connecting to. for n in 1 2 3 4; do ovn-nbctl lswitch-add provnet1-$n ovn-nbctl lport-add provnet1-$n provnet1-$n-port1 ovn-nbctl lport-set-macs provnet1-$n-port1 00:00:00:00:00:0$n ovn-nbctl lport-set-port-security provnet1-$n-port1 00:00:00:00:00:0$n ovn-nbctl lport-add provnet1-$n provnet1-$n-physnet1 ovn-nbctl lport-set-macs provnet1-$n-physnet1 unknown ovn-nbctl lport-set-type provnet1-$n-physnet1 localnet ovn-nbctl lport-set-options provnet1-$n-physnet1 network_name=physnet1 done # Bind lport1 (A1) and lport2 (A2) to the local hypervisor. ovs-vsctl add-port br-int lport1 -- set Interface lport1 external_ids:iface-id=provnet1-1-port1 ovs-vsctl add-port br-int lport2 -- set Interface lport2 external_ids:iface-id=provnet1-2-port1 # Bind the other 2 ports to the fake remote hypervisor. 
ovn-sbctl lport-bind provnet1-3-port1 fakechassis ovn-sbctl lport-bind provnet1-4-port1 fakechassis After running these commands, we have the following logical configuration: $ ovn-nbctl show lswitch 035645fc-b2ff-4e26-b953-69addba80a9a (provnet1-4) lport provnet1-4-physnet1 macs: unknown lport provnet1-4-port1 macs: 00:00:00:00:00:04 lswitch 66212a85-b3b6-4688-bcf6-8062941a2d96 (provnet1-2) lport provnet1-2-physnet1 macs: unknown lport provnet1-2-port1 macs: 00:00:00:00:00:02 lswitch fc5b1141-0216-4fa7-86f3-461811c1fc9b (provnet1-3) lport provnet1-3-physnet1 macs: unknown lport provnet1-3-port1 macs: 00:00:00:00:00:03 lswitch 9b1d2636-e654-4d43-84e8-a921af611b33 (provnet1-1) lport provnet1-1-physnet1 macs: unknown lport provnet1-1-port1 macs: 00:00:00:00:00:01 We can also look at OVN_Southbound to see that 2 logical ports are bound to each hypervisor: $ ovn-sbctl show Chassis "56b18105-5706-46ef-80c4-ff20979ab068" Encap geneve ip: "127.0.0.1" Port_Binding "provnet1-1-port1" Port_Binding "provnet1-2-port1" Chassis fakechassis Encap geneve ip: "127.0.0.1" Port_Binding "provnet1-3-port1" Port_Binding "provnet1-4-port1" Now we can generate several packets to test how a packet would be processed on hypervisor A. The OpenFlow port numbers in this demo are: 1 - patch port to br-eth1 (physnet1) 2 - tunnel to fakechassis 3 - lport1 (A1) 4 - lport2 (A2) Packet test #1: A1 to A2 - This will be output to ofport 1. Despite both VMs being local to this hypervisor, all packets betwen the VMs go through physnet1. In practice, this will get optimized at br-eth1. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate Packet test #2: physnet1 to A2 - Consider this a continuation of test is attached to will be considered. The end result should be that the only output is to ofport 4 (A2). 
ovs-appctl ofproto/trace br-int \ in_port=1,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate Packet test #3: A1 to B1 - This will be output to ofport 1, as physnet1 is to be used to reach any other port. When it arrives at hypervisor B, processing would look just like test #2. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:03 -generate Packet test #4: A1 broadcast. - Again, the packet will only be sent to physnet1. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=ff:ff:ff:ff:ff:ff -generate Packet test #5: B1 broadcast arriving at hypervisor A. This is somewhat a continuation of test #4. When a broadcast packet arrives from physnet1 on hypervisor A, we should see it output to both A1 and A2 (ofports 3 and 4). ovs-appctl ofproto/trace br-int \ in_port=1,dl_src=00:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff -generate Signed-off-by: Russell Bryant <rbryant@redhat.com>
Introduce a new logical port type called "localnet". A logical port with this type also has an option called "network_name". A "localnet" logical port represents a connection to a network that is locally accessible from each chassis running ovn-controller. ovn-controller will use the ovn-bridge-mappings configuration to figure out which patch port on br-int should be used for this port. OpenStack Neutron has an API extension called "provider networks" which allows an administrator to specify that it would like ports directly attached to some pre-existing network in their environment. There was a previous thread where we got into the details of this here: http://openvswitch.org/pipermail/dev/2015-June/056765.html The case where this would be used is an environment that isn't actually interested in virtual networks and just wants all of their compute resources connected up to externally managed networks. Even in this environment, OVN still has a lot of value to add. OVN implements port security and ACLs for all ports connected to these networks. OVN also provides the configuration interface and control plane to manage this across many hypervisors. As a specific example, consider an environment with two hypvervisors (A and B) with two VMs on each hypervisor (A1, A2, B1, B2). Now imagine that the desired setup from an OpenStack perspective is to have all of these VMs attached to the same provider network, which is a physical network we'll refer to as "physnet1". The first step here is to configure each hypervisor with bridge mappings that tell ovn-controller that a local bridge called "br-eth1" is used to reach the network called "physnet1". We can simulate the inital setup of this environment in ovs-sandbox with the following commands: # Setup the local hypervisor (A) ovs-vsctl add-br br-eth1 ovs-vsctl set open . 
external-ids:ovn-bridge-mappings=physnet1:br-eth1 # Create a fake remote hypervisor (B) ovn-sbctl chassis-add fakechassis geneve 127.0.0.1 To get the behavior we want, we model every Neutron port connected to a Neutron provider network as an OVN logical switch with 2 ports. The first port is a normal logical port to be used by the VM. The second logical port is a special port with its type set to "localnet". To simulate the creation of the OVN logical switches and OVN logical ports for A1, A2, B1, and B2, you can run the following commands: # Create 4 OVN logical switches. Each logical switch has 2 ports, # port1 for a VM and physnet1 for the existing network we are # connecting to. for n in 1 2 3 4; do ovn-nbctl lswitch-add provnet1-$n ovn-nbctl lport-add provnet1-$n provnet1-$n-port1 ovn-nbctl lport-set-macs provnet1-$n-port1 00:00:00:00:00:0$n ovn-nbctl lport-set-port-security provnet1-$n-port1 00:00:00:00:00:0$n ovn-nbctl lport-add provnet1-$n provnet1-$n-physnet1 ovn-nbctl lport-set-macs provnet1-$n-physnet1 unknown ovn-nbctl lport-set-type provnet1-$n-physnet1 localnet ovn-nbctl lport-set-options provnet1-$n-physnet1 network_name=physnet1 done # Bind lport1 (A1) and lport2 (A2) to the local hypervisor. ovs-vsctl add-port br-int lport1 -- set Interface lport1 external_ids:iface-id=provnet1-1-port1 ovs-vsctl add-port br-int lport2 -- set Interface lport2 external_ids:iface-id=provnet1-2-port1 # Bind the other 2 ports to the fake remote hypervisor. 
ovn-sbctl lport-bind provnet1-3-port1 fakechassis ovn-sbctl lport-bind provnet1-4-port1 fakechassis After running these commands, we have the following logical configuration: $ ovn-nbctl show lswitch 035645fc-b2ff-4e26-b953-69addba80a9a (provnet1-4) lport provnet1-4-physnet1 macs: unknown lport provnet1-4-port1 macs: 00:00:00:00:00:04 lswitch 66212a85-b3b6-4688-bcf6-8062941a2d96 (provnet1-2) lport provnet1-2-physnet1 macs: unknown lport provnet1-2-port1 macs: 00:00:00:00:00:02 lswitch fc5b1141-0216-4fa7-86f3-461811c1fc9b (provnet1-3) lport provnet1-3-physnet1 macs: unknown lport provnet1-3-port1 macs: 00:00:00:00:00:03 lswitch 9b1d2636-e654-4d43-84e8-a921af611b33 (provnet1-1) lport provnet1-1-physnet1 macs: unknown lport provnet1-1-port1 macs: 00:00:00:00:00:01 We can also look at OVN_Southbound to see that 2 logical ports are bound to each hypervisor: $ ovn-sbctl show Chassis "56b18105-5706-46ef-80c4-ff20979ab068" Encap geneve ip: "127.0.0.1" Port_Binding "provnet1-1-port1" Port_Binding "provnet1-2-port1" Chassis fakechassis Encap geneve ip: "127.0.0.1" Port_Binding "provnet1-3-port1" Port_Binding "provnet1-4-port1" Now we can generate several packets to test how a packet would be processed on hypervisor A. The OpenFlow port numbers in this demo are: 1 - patch port to br-eth1 (physnet1) 2 - tunnel to fakechassis 3 - lport1 (A1) 4 - lport2 (A2) Packet test #1: A1 to A2 - This will be output to ofport 1. Despite both VMs being local to this hypervisor, all packets betwen the VMs go through physnet1. In practice, this will get optimized at br-eth1. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate Packet test #2: physnet1 to A2 - Consider this a continuation of test is attached to will be considered. The end result should be that the only output is to ofport 4 (A2). 
ovs-appctl ofproto/trace br-int \ in_port=1,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate Packet test #3: A1 to B1 - This will be output to ofport 1, as physnet1 is to be used to reach any other port. When it arrives at hypervisor B, processing would look just like test #2. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:03 -generate Packet test #4: A1 broadcast. - Again, the packet will only be sent to physnet1. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=ff:ff:ff:ff:ff:ff -generate Packet test #5: B1 broadcast arriving at hypervisor A. This is somewhat a continuation of test #4. When a broadcast packet arrives from physnet1 on hypervisor A, we should see it output to both A1 and A2 (ofports 3 and 4). ovs-appctl ofproto/trace br-int \ in_port=1,dl_src=00:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff -generate Signed-off-by: Russell Bryant <rbryant@redhat.com>
Introduce a new logical port type called "localnet". A logical port with this type also has an option called "network_name". A "localnet" logical port represents a connection to a network that is locally accessible from each chassis running ovn-controller. ovn-controller will use the ovn-bridge-mappings configuration to figure out which patch port on br-int should be used for this port. OpenStack Neutron has an API extension called "provider networks" which allows an administrator to specify that it would like ports directly attached to some pre-existing network in their environment. There was a previous thread where we got into the details of this here: http://openvswitch.org/pipermail/dev/2015-June/056765.html The case where this would be used is an environment that isn't actually interested in virtual networks and just wants all of their compute resources connected up to externally managed networks. Even in this environment, OVN still has a lot of value to add. OVN implements port security and ACLs for all ports connected to these networks. OVN also provides the configuration interface and control plane to manage this across many hypervisors. As a specific example, consider an environment with two hypvervisors (A and B) with two VMs on each hypervisor (A1, A2, B1, B2). Now imagine that the desired setup from an OpenStack perspective is to have all of these VMs attached to the same provider network, which is a physical network we'll refer to as "physnet1". The first step here is to configure each hypervisor with bridge mappings that tell ovn-controller that a local bridge called "br-eth1" is used to reach the network called "physnet1". We can simulate the inital setup of this environment in ovs-sandbox with the following commands: # Setup the local hypervisor (A) ovs-vsctl add-br br-eth1 ovs-vsctl set open . 
external-ids:ovn-bridge-mappings=physnet1:br-eth1 # Create a fake remote hypervisor (B) ovn-sbctl chassis-add fakechassis geneve 127.0.0.1 To get the behavior we want, we model every Neutron port connected to a Neutron provider network as an OVN logical switch with 2 ports. The first port is a normal logical port to be used by the VM. The second logical port is a special port with its type set to "localnet". To simulate the creation of the OVN logical switches and OVN logical ports for A1, A2, B1, and B2, you can run the following commands: # Create 4 OVN logical switches. Each logical switch has 2 ports, # port1 for a VM and physnet1 for the existing network we are # connecting to. for n in 1 2 3 4; do ovn-nbctl lswitch-add provnet1-$n ovn-nbctl lport-add provnet1-$n provnet1-$n-port1 ovn-nbctl lport-set-macs provnet1-$n-port1 00:00:00:00:00:0$n ovn-nbctl lport-set-port-security provnet1-$n-port1 00:00:00:00:00:0$n ovn-nbctl lport-add provnet1-$n provnet1-$n-physnet1 ovn-nbctl lport-set-macs provnet1-$n-physnet1 unknown ovn-nbctl lport-set-type provnet1-$n-physnet1 localnet ovn-nbctl lport-set-options provnet1-$n-physnet1 network_name=physnet1 done # Bind lport1 (A1) and lport2 (A2) to the local hypervisor. ovs-vsctl add-port br-int lport1 -- set Interface lport1 external_ids:iface-id=provnet1-1-port1 ovs-vsctl add-port br-int lport2 -- set Interface lport2 external_ids:iface-id=provnet1-2-port1 # Bind the other 2 ports to the fake remote hypervisor. 
ovn-sbctl lport-bind provnet1-3-port1 fakechassis ovn-sbctl lport-bind provnet1-4-port1 fakechassis After running these commands, we have the following logical configuration: $ ovn-nbctl show lswitch 035645fc-b2ff-4e26-b953-69addba80a9a (provnet1-4) lport provnet1-4-physnet1 macs: unknown lport provnet1-4-port1 macs: 00:00:00:00:00:04 lswitch 66212a85-b3b6-4688-bcf6-8062941a2d96 (provnet1-2) lport provnet1-2-physnet1 macs: unknown lport provnet1-2-port1 macs: 00:00:00:00:00:02 lswitch fc5b1141-0216-4fa7-86f3-461811c1fc9b (provnet1-3) lport provnet1-3-physnet1 macs: unknown lport provnet1-3-port1 macs: 00:00:00:00:00:03 lswitch 9b1d2636-e654-4d43-84e8-a921af611b33 (provnet1-1) lport provnet1-1-physnet1 macs: unknown lport provnet1-1-port1 macs: 00:00:00:00:00:01 We can also look at OVN_Southbound to see that 2 logical ports are bound to each hypervisor: $ ovn-sbctl show Chassis "56b18105-5706-46ef-80c4-ff20979ab068" Encap geneve ip: "127.0.0.1" Port_Binding "provnet1-1-port1" Port_Binding "provnet1-2-port1" Chassis fakechassis Encap geneve ip: "127.0.0.1" Port_Binding "provnet1-3-port1" Port_Binding "provnet1-4-port1" Now we can generate several packets to test how a packet would be processed on hypervisor A. The OpenFlow port numbers in this demo are: 1 - patch port to br-eth1 (physnet1) 2 - tunnel to fakechassis 3 - lport1 (A1) 4 - lport2 (A2) Packet test #1: A1 to A2 - This will be output to ofport 1. Despite both VMs being local to this hypervisor, all packets betwen the VMs go through physnet1. In practice, this will get optimized at br-eth1. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate Packet test #2: physnet1 to A2 - Consider this a continuation of test is attached to will be considered. The end result should be that the only output is to ofport 4 (A2). 
ovs-appctl ofproto/trace br-int \ in_port=1,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate Packet test #3: A1 to B1 - This will be output to ofport 1, as physnet1 is to be used to reach any other port. When it arrives at hypervisor B, processing would look just like test #2. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:03 -generate Packet test #4: A1 broadcast. - Again, the packet will only be sent to physnet1. ovs-appctl ofproto/trace br-int \ in_port=3,dl_src=00:00:00:00:00:01,dl_dst=ff:ff:ff:ff:ff:ff -generate Packet test #5: B1 broadcast arriving at hypervisor A. This is somewhat a continuation of test #4. When a broadcast packet arrives from physnet1 on hypervisor A, we should see it output to both A1 and A2 (ofports 3 and 4). ovs-appctl ofproto/trace br-int \ in_port=1,dl_src=00:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff -generate Signed-off-by: Russell Bryant <rbryant@redhat.com>
Introduce a new logical port type called "localnet". A logical port with this type also has an option called "network_name". A "localnet" logical port represents a connection to a network that is locally accessible from each chassis running ovn-controller. ovn-controller will use the ovn-bridge-mappings configuration to figure out which patch port on br-int should be used for this port.

OpenStack Neutron has an API extension called "provider networks" which allows an administrator to specify that it would like ports directly attached to some pre-existing network in their environment. There was a previous thread where we got into the details of this here:

    http://openvswitch.org/pipermail/dev/2015-June/056765.html

The case where this would be used is an environment that isn't actually interested in virtual networks and just wants all of its compute resources connected up to externally managed networks. Even in this environment, OVN still has a lot of value to add. OVN implements port security and ACLs for all ports connected to these networks. OVN also provides the configuration interface and control plane to manage this across many hypervisors.

As a specific example, consider an environment with two hypervisors (A and B) with two VMs on each hypervisor (A1, A2, B1, B2). Now imagine that the desired setup from an OpenStack perspective is to have all of these VMs attached to the same provider network, which is a physical network we'll refer to as "physnet1".

The first step here is to configure each hypervisor with bridge mappings that tell ovn-controller that a local bridge called "br-eth1" is used to reach the network called "physnet1". We can simulate the initial setup of this environment in ovs-sandbox with the following commands:

    # Set up the local hypervisor (A).
    ovs-vsctl add-br br-eth1
    ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth1

    # Create a fake remote hypervisor (B).
    ovn-sbctl chassis-add fakechassis geneve 127.0.0.1

To get the behavior we want, we model every Neutron port connected to a Neutron provider network as an OVN logical switch with 2 ports. The first port is a normal logical port to be used by the VM. The second logical port is a special port with its type set to "localnet". To simulate the creation of the OVN logical switches and OVN logical ports for A1, A2, B1, and B2, you can run the following commands:

    # Create 4 OVN logical switches.  Each logical switch has 2 ports,
    # port1 for a VM and physnet1 for the existing network we are
    # connecting to.
    for n in 1 2 3 4; do
        ovn-nbctl lswitch-add provnet1-$n
        ovn-nbctl lport-add provnet1-$n provnet1-$n-port1
        ovn-nbctl lport-set-macs provnet1-$n-port1 00:00:00:00:00:0$n
        ovn-nbctl lport-set-port-security provnet1-$n-port1 00:00:00:00:00:0$n
        ovn-nbctl lport-add provnet1-$n provnet1-$n-physnet1
        ovn-nbctl lport-set-macs provnet1-$n-physnet1 unknown
        ovn-nbctl lport-set-type provnet1-$n-physnet1 localnet
        ovn-nbctl lport-set-options provnet1-$n-physnet1 network_name=physnet1
    done

    # Bind lport1 (A1) and lport2 (A2) to the local hypervisor.
    ovs-vsctl add-port br-int lport1 -- set Interface lport1 external_ids:iface-id=provnet1-1-port1
    ovs-vsctl add-port br-int lport2 -- set Interface lport2 external_ids:iface-id=provnet1-2-port1

    # Bind the other 2 ports to the fake remote hypervisor.
    ovn-sbctl lport-bind provnet1-3-port1 fakechassis
    ovn-sbctl lport-bind provnet1-4-port1 fakechassis

After running these commands, we have the following logical configuration:

    $ ovn-nbctl show
    lswitch 035645fc-b2ff-4e26-b953-69addba80a9a (provnet1-4)
        lport provnet1-4-physnet1
            macs: unknown
        lport provnet1-4-port1
            macs: 00:00:00:00:00:04
    lswitch 66212a85-b3b6-4688-bcf6-8062941a2d96 (provnet1-2)
        lport provnet1-2-physnet1
            macs: unknown
        lport provnet1-2-port1
            macs: 00:00:00:00:00:02
    lswitch fc5b1141-0216-4fa7-86f3-461811c1fc9b (provnet1-3)
        lport provnet1-3-physnet1
            macs: unknown
        lport provnet1-3-port1
            macs: 00:00:00:00:00:03
    lswitch 9b1d2636-e654-4d43-84e8-a921af611b33 (provnet1-1)
        lport provnet1-1-physnet1
            macs: unknown
        lport provnet1-1-port1
            macs: 00:00:00:00:00:01

We can also look at OVN_Southbound to see that 2 logical ports are bound to each hypervisor:

    $ ovn-sbctl show
    Chassis "56b18105-5706-46ef-80c4-ff20979ab068"
        Encap geneve
            ip: "127.0.0.1"
        Port_Binding "provnet1-1-port1"
        Port_Binding "provnet1-2-port1"
    Chassis fakechassis
        Encap geneve
            ip: "127.0.0.1"
        Port_Binding "provnet1-3-port1"
        Port_Binding "provnet1-4-port1"

Now we can generate several packets to test how a packet would be processed on hypervisor A. The OpenFlow port numbers in this demo are:

    1 - patch port to br-eth1 (physnet1)
    2 - tunnel to fakechassis
    3 - lport1 (A1)
    4 - lport2 (A2)

Packet test #1: A1 to A2 - This will be output to ofport 1. Despite both VMs being local to this hypervisor, all packets between the VMs go through physnet1. In practice, this will get optimized at br-eth1.

    ovs-appctl ofproto/trace br-int \
        in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

Packet test #2: physnet1 to A2 - Consider this a continuation of test #1: the packet now arrives from physnet1, and only the logical switch that the destination port is attached to will be considered. The end result should be that the only output is to ofport 4 (A2).

    ovs-appctl ofproto/trace br-int \
        in_port=1,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

Packet test #3: A1 to B1 - This will be output to ofport 1, as physnet1 is to be used to reach any other port. When it arrives at hypervisor B, processing would look just like test #2.

    ovs-appctl ofproto/trace br-int \
        in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:03 -generate

Packet test #4: A1 broadcast - Again, the packet will only be sent to physnet1.

    ovs-appctl ofproto/trace br-int \
        in_port=3,dl_src=00:00:00:00:00:01,dl_dst=ff:ff:ff:ff:ff:ff -generate

Packet test #5: B1 broadcast arriving at hypervisor A - This is somewhat a continuation of test #4. When a broadcast packet arrives from physnet1 on hypervisor A, we should see it output to both A1 and A2 (ofports 3 and 4).

    ovs-appctl ofproto/trace br-int \
        in_port=1,dl_src=00:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff -generate

Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
This reverts commit 337bebe, which caused a crash in test 1048 "ofproto-dpif - Flow IPFIX sanity check" (now test 1051) with the following backtrace:

    #0 hmap_first_with_hash (hmap=<optimized out>, hmap=<optimized out>, hash=<optimized out>) at ../lib/hmap.h:328
    #1 smap_find__ (smap=0x94, key=key@entry=0x817f7ab "virtual_obs_id", key_len=14, hash=2537071222) at ../lib/smap.c:366
    #2 0x0812b9d7 in smap_get_node (smap=0x9738a276, key=0x817f7ab "virtual_obs_id") at ../lib/smap.c:198
    #3 0x0812ba30 in smap_get (smap=0x94, key=0x817f7ab "virtual_obs_id") at ../lib/smap.c:189
    #4 0x08055a60 in bridge_configure_ipfix (br=<optimized out>) at ../vswitchd/bridge.c:1237
    #5 bridge_reconfigure (ovs_cfg=0x94) at ../vswitchd/bridge.c:666
    #6 0x080568d3 in bridge_run () at ../vswitchd/bridge.c:2972
    #7 0x0804c9dd in main (argc=10, argv=0xffd8b934) at ../vswitchd/ovs-vswitchd.c:112

Signed-off-by: Ben Pfaff <blp@ovn.org>
UB Sanitizer report:

lib/netdev-dummy.c:197:15: runtime error: member access within misaligned address 0x00000217a7f0 for type 'struct dummy_packet_stream', which requires 64 byte alignment
    #0 dummy_packet_stream_init lib/netdev-dummy.c:197
    #1 dummy_packet_stream_create lib/netdev-dummy.c:208
    #2 dummy_packet_conn_set_config lib/netdev-dummy.c:436
    [...]

Signed-off-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>
UB Sanitizer report:

lib/dp-packet.h:587:22: runtime error: member access within misaligned address 0x000001ecde10 for type 'struct dp_packet', which requires 64 byte alignment
    #0 in dp_packet_set_base lib/dp-packet.h:587
    #1 in dp_packet_use__ lib/dp-packet.c:46
    #2 in dp_packet_use lib/dp-packet.c:60
    #3 in dp_packet_init lib/dp-packet.c:126
    #4 in dp_packet_new lib/dp-packet.c:150
    [...]

Signed-off-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>
Direct leak of 36 byte(s) in 1 object(s) allocated from:
    #0 0x527d90 in __interceptor_realloc.part.0 asan_malloc_linux.cpp.o
    #1 0xc5f9fc in xrealloc__ /workspace/ovs/lib/util.c:147:9
    #2 0xc5f9fc in xrealloc /workspace/ovs/lib/util.c:179:12
    #3 0x86845d in ds_reserve /workspace/ovs/lib/dynamic-string.c:63:22
    #4 0x86954a in ds_put_format_valist /workspace/ovs/lib/dynamic-string.c:164:9
    #5 0x869202 in ds_put_format /workspace/ovs/lib/dynamic-string.c:142:5
    #6 0x7dc664 in ct_dpif_parse_tuple /workspace/ovs/lib/ct-dpif.c
    #7 0xebb089 in dpctl_flush_conntrack /workspace/ovs/lib/dpctl.c:1717:17
    #8 0xeb4eb2 in dpctl_unixctl_handler /workspace/ovs/lib/dpctl.c:3035:17
    #9 0xc5d4f8 in process_command /workspace/ovs/lib/unixctl.c:310:13
    #10 0xc5d4f8 in run_connection /workspace/ovs/lib/unixctl.c:344:17
    #11 0xc5d4f8 in unixctl_server_run /workspace/ovs/lib/unixctl.c:395:21
    #12 0x5a643f in main /workspace/ovs/vswitchd/ovs-vswitchd.c:130:9

Signed-off-by: Ales Musil <amusil@redhat.com>
UB Sanitizer report:

lib/netdev-offload-tc.c:1276:19: runtime error: load of misaligned address 0x7f74e801976c for type 'union ovs_u128', which requires 8 byte alignment
    #0 in netdev_tc_flow_dump_next lib/netdev-offload-tc.c:1276
    #1 in netdev_flow_dump_next lib/netdev-offload.c:303
    #2 in dpif_netlink_flow_dump_next lib/dpif-netlink.c:1921
    [...]

Signed-off-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>
The OpenFlow15 Packet-Out message contains the whole match instead of
the in_port. The match is never assigned but is used in oxm_put_match.
The coredump gdb backtrace is:

#0  memcpy_from_metadata (dst=dst@entry=0x7ffcfac2f060, src=src@entry=0x7ffcfac30880, loc=loc@entry=0x10) at lib/tun-metadata.c:467
#1  0x00000000004506e8 in metadata_loc_from_match_read (match=0x7ffcfac30598, is_masked=<synthetic pointer>, mask=0x7ffcfac30838, idx=0, map=0x0) at lib/tun-metadata.c:865
#2  metadata_loc_from_match_read (is_masked=<synthetic pointer>, mask=0x7ffcfac30838, idx=0, match=0x7ffcfac30598, map=0x0) at lib/tun-metadata.c:854
#3  tun_metadata_to_nx_match (b=b@entry=0x892260, oxm=oxm@entry=OFP15_VERSION, match=match@entry=0x7ffcfac30598) at lib/tun-metadata.c:888
#4  0x000000000047c1f8 in nx_put_raw (b=b@entry=0x892260, oxm=oxm@entry=OFP15_VERSION, match=match@entry=0x7ffcfac30598, cookie=<optimized out>, cookie@entry=0, cookie_mask=<optimized out>, cookie_mask@entry=0) at lib/nx-match.c:1186
#5  0x000000000047d693 in oxm_put_match (b=b@entry=0x892260, match=match@entry=0x7ffcfac30598, version=version@entry=OFP15_VERSION) at lib/nx-match.c:1343
#6  0x000000000043194e in ofputil_encode_packet_out (po=po@entry=0x7ffcfac30580, protocol=<optimized out>) at lib/ofp-packet.c:1226
#7  0x000000000040a4fe in process_packet_in (sw=sw@entry=0x891d70, oh=<optimized out>) at lib/learning-switch.c:619
#8  0x000000000040acdc in lswitch_process_packet (msg=0x892210, sw=0x891d70) at lib/learning-switch.c:374
#9  lswitch_run (sw=0x891d70) at lib/learning-switch.c:324
#10 0x0000000000406f26 in main (argc=<optimized out>, argv=<optimized out>) at utilities/ovs-testcontroller.c:180

Fix that by setting the packet-out match instead of in_port.

Fixes: 577bfa9 ("ofp-util: Add OpenFlow 1.5 packet-out support")
Signed-off-by: Faicker Mo <faicker.mo@ucloud.cn>
Signed-off-by: 0-day Robot <robot@bytheb.org>
The OpenFlow15 Packet-Out message contains the match instead of the
in_port. The flow.tunnel.metadata.tab field is not initialized but is
used in the loop of tun_metadata_to_nx_match. The coredump gdb
backtrace is:

#0  memcpy_from_metadata (dst=dst@entry=0x7ffcfac2f060, src=src@entry=0x7ffcfac30880, loc=loc@entry=0x10) at lib/tun-metadata.c:467
#1  0x00000000004506e8 in metadata_loc_from_match_read (match=0x7ffcfac30598, is_masked=<synthetic pointer>, mask=0x7ffcfac30838, idx=0, map=0x0) at lib/tun-metadata.c:865
#2  metadata_loc_from_match_read (is_masked=<synthetic pointer>, mask=0x7ffcfac30838, idx=0, match=0x7ffcfac30598, map=0x0) at lib/tun-metadata.c:854
#3  tun_metadata_to_nx_match (b=b@entry=0x892260, oxm=oxm@entry=OFP15_VERSION, match=match@entry=0x7ffcfac30598) at lib/tun-metadata.c:888
#4  0x000000000047c1f8 in nx_put_raw (b=b@entry=0x892260, oxm=oxm@entry=OFP15_VERSION, match=match@entry=0x7ffcfac30598, cookie=<optimized out>, cookie@entry=0, cookie_mask=<optimized out>, cookie_mask@entry=0) at lib/nx-match.c:1186
#5  0x000000000047d693 in oxm_put_match (b=b@entry=0x892260, match=match@entry=0x7ffcfac30598, version=version@entry=OFP15_VERSION) at lib/nx-match.c:1343
#6  0x000000000043194e in ofputil_encode_packet_out (po=po@entry=0x7ffcfac30580, protocol=<optimized out>) at lib/ofp-packet.c:1226
#7  0x000000000040a4fe in process_packet_in (sw=sw@entry=0x891d70, oh=<optimized out>) at lib/learning-switch.c:619
#8  0x000000000040acdc in lswitch_process_packet (msg=0x892210, sw=0x891d70) at lib/learning-switch.c:374
#9  lswitch_run (sw=0x891d70) at lib/learning-switch.c:324
#10 0x0000000000406f26 in main (argc=<optimized out>, argv=<optimized out>) at utilities/ovs-testcontroller.c:180

Fix that by zeroing out the po variable.

Fixes: 577bfa9 ("ofp-util: Add OpenFlow 1.5 packet-out support")
Signed-off-by: Faicker Mo <faicker.mo@ucloud.cn>
Signed-off-by: 0-day Robot <robot@bytheb.org>
The OpenFlow15 Packet-Out message contains the match instead of the
in_port. The flow.tunnel.metadata.tab field is not initialized but is
used in the loop of tun_metadata_to_nx_match. The coredump gdb
backtrace is:

#0  memcpy_from_metadata (dst=dst@entry=0x7ffcfac2f060, src=src@entry=0x7ffcfac30880, loc=loc@entry=0x10) at lib/tun-metadata.c:467
#1  0x00000000004506e8 in metadata_loc_from_match_read (match=0x7ffcfac30598, is_masked=<synthetic pointer>, mask=0x7ffcfac30838, idx=0, map=0x0) at lib/tun-metadata.c:865
#2  metadata_loc_from_match_read (is_masked=<synthetic pointer>, mask=0x7ffcfac30838, idx=0, match=0x7ffcfac30598, map=0x0) at lib/tun-metadata.c:854
#3  tun_metadata_to_nx_match (b=b@entry=0x892260, oxm=oxm@entry=OFP15_VERSION, match=match@entry=0x7ffcfac30598) at lib/tun-metadata.c:888
#4  0x000000000047c1f8 in nx_put_raw (b=b@entry=0x892260, oxm=oxm@entry=OFP15_VERSION, match=match@entry=0x7ffcfac30598, cookie=<optimized out>, cookie@entry=0, cookie_mask=<optimized out>, cookie_mask@entry=0) at lib/nx-match.c:1186
#5  0x000000000047d693 in oxm_put_match (b=b@entry=0x892260, match=match@entry=0x7ffcfac30598, version=version@entry=OFP15_VERSION) at lib/nx-match.c:1343
#6  0x000000000043194e in ofputil_encode_packet_out (po=po@entry=0x7ffcfac30580, protocol=<optimized out>) at lib/ofp-packet.c:1226
#7  0x000000000040a4fe in process_packet_in (sw=sw@entry=0x891d70, oh=<optimized out>) at lib/learning-switch.c:619
#8  0x000000000040acdc in lswitch_process_packet (msg=0x892210, sw=0x891d70) at lib/learning-switch.c:374
#9  lswitch_run (sw=0x891d70) at lib/learning-switch.c:324
#10 0x0000000000406f26 in main (argc=<optimized out>, argv=<optimized out>) at utilities/ovs-testcontroller.c:180

Fix that by initializing the flow metadata.

Fixes: 35eb632 ("ofp-util: Add flow metadata to ofputil_packet_out")
Signed-off-by: Faicker Mo <faicker.mo@ucloud.cn>
Signed-off-by: 0-day Robot <robot@bytheb.org>
Reported by Address Sanitizer.

Indirect leak of 1024 byte(s) in 1 object(s) allocated from:
    #0 0x7fe09d3bfe70 in __interceptor_malloc (/usr/lib64/libasan.so.4+0xe0e70)
    #1 0x8759be in xmalloc__ lib/util.c:140
    #2 0x875a9a in xmalloc lib/util.c:175
    #3 0x7ba0d2 in ofpbuf_init lib/ofpbuf.c:141
    #4 0x7ba1d6 in ofpbuf_new lib/ofpbuf.c:169
    #5 0x9057f9 in nl_sock_transact lib/netlink-socket.c:1113
    #6 0x907a7e in nl_transact lib/netlink-socket.c:1817
    #7 0x8b5abe in dpif_netlink_dp_transact lib/dpif-netlink.c:5007
    #8 0x89a6b5 in dpif_netlink_open lib/dpif-netlink.c:398
    #9 0x5de16f in do_open lib/dpif.c:348
    #10 0x5de69a in dpif_open lib/dpif.c:393
    #11 0x5de71f in dpif_create_and_open lib/dpif.c:419
    #12 0x47b918 in open_dpif_backer ofproto/ofproto-dpif.c:764
    #13 0x483e4a in construct ofproto/ofproto-dpif.c:1658
    #14 0x441644 in ofproto_create ofproto/ofproto.c:556
    #15 0x40ba5a in bridge_reconfigure vswitchd/bridge.c:885
    #16 0x41f1a9 in bridge_run vswitchd/bridge.c:3313
    #17 0x42d4fb in main vswitchd/ovs-vswitchd.c:132
    #18 0x7fe09cc03c86 in __libc_start_main (/usr/lib64/libc.so.6+0x25c86)

Fixes: b841e3c ("dpif-netlink: Fix feature negotiation for older kernels.")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>
Currently, bundle->cvlans and xbundle->cvlans point to the same memory
location. This can cause issues if the main thread modifies
bundle->cvlans and frees it while the revalidator thread is still
accessing xbundle->cvlans, resulting in use-after-free errors.

AddressSanitizer: heap-use-after-free on address 0x615000007b08 at pc 0x0000004ede1e bp 0x7f3120ee0310 sp 0x7f3120ee0300
READ of size 8 at 0x615000007b08 thread T25 (revalidator25)
    #0 0x4ede1d in bitmap_is_set lib/bitmap.h:91
    #1 0x4fcb26 in xbundle_allows_cvlan ofproto/ofproto-dpif-xlate.c:2028
    #2 0x4fe279 in input_vid_is_valid ofproto/ofproto-dpif-xlate.c:2294
    #3 0x502abf in xlate_normal ofproto/ofproto-dpif-xlate.c:3051
    #4 0x5164dc in xlate_output_action ofproto/ofproto-dpif-xlate.c:5361
    #5 0x522576 in do_xlate_actions ofproto/ofproto-dpif-xlate.c:7047
    #6 0x52a751 in xlate_actions ofproto/ofproto-dpif-xlate.c:8061
    #7 0x4e2b66 in xlate_key ofproto/ofproto-dpif-upcall.c:2212
    #8 0x4e2e13 in xlate_ukey ofproto/ofproto-dpif-upcall.c:2227
    #9 0x4e345d in revalidate_ukey__ ofproto/ofproto-dpif-upcall.c:2276
    #10 0x4e3f85 in revalidate_ukey ofproto/ofproto-dpif-upcall.c:2395
    #11 0x4e7ac5 in revalidate ofproto/ofproto-dpif-upcall.c:2858
    #12 0x4d9ed3 in udpif_revalidator ofproto/ofproto-dpif-upcall.c:1010
    #13 0x7cd92e in ovsthread_wrapper lib/ovs-thread.c:423
    #14 0x7f312ff01f3a (/usr/lib64/libpthread.so.0+0x8f3a)
    #15 0x7f312fc8f51f in clone (/usr/lib64/libc.so.6+0xf851f)

0x615000007b08 is located 8 bytes inside of 512-byte region [0x615000007b00,0x615000007d00)
freed by thread T0 here:
    #0 0x7f3130378ad8 in free (/usr/lib64/libasan.so.4+0xe0ad8)
    #1 0x49044e in bundle_set ofproto/ofproto-dpif.c:3431
    #2 0x444f92 in ofproto_bundle_register ofproto/ofproto.c:1455
    #3 0x40e6c9 in port_configure vswitchd/bridge.c:1300
    #4 0x40bcfd in bridge_reconfigure vswitchd/bridge.c:921
    #5 0x41f1a9 in bridge_run vswitchd/bridge.c:3313
    #6 0x42d4fb in main vswitchd/ovs-vswitchd.c:132
    #7 0x7f312fbbcc86 in __libc_start_main (/usr/lib64/libc.so.6+0x25c86)

previously allocated by thread T0 here:
    #0 0x7f3130378e70 in __interceptor_malloc (/usr/lib64/libasan.so.4+0xe0e70)
    #1 0x8757fe in xmalloc__ lib/util.c:140
    #2 0x8758da in xmalloc lib/util.c:175
    #3 0x875927 in xmemdup lib/util.c:188
    #4 0x475f63 in bitmap_clone lib/bitmap.h:79
    #5 0x47797c in vlan_bitmap_clone lib/vlan-bitmap.h:40
    #6 0x49048d in bundle_set ofproto/ofproto-dpif.c:3433
    #7 0x444f92 in ofproto_bundle_register ofproto/ofproto.c:1455
    #8 0x40e6c9 in port_configure vswitchd/bridge.c:1300
    #9 0x40bcfd in bridge_reconfigure vswitchd/bridge.c:921
    #10 0x41f1a9 in bridge_run vswitchd/bridge.c:3313
    #11 0x42d4fb in main vswitchd/ovs-vswitchd.c:132
    #12 0x7f312fbbcc86 in __libc_start_main (/usr/lib64/libc.so.6+0x25c86)

Fixes: fed8962 ("Add new port VLAN mode "dot1q-tunnel"")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>
In the specific call to dpif_netlink_dp_transact() (line 398) in
dpif_netlink_open(), the 'dp' content is not being used in the branch
when no error is returned (starting line 430). Furthermore, the 'dp'
and 'buf' variables are overwritten later in this same branch when a
new netlink request is sent (line 437), which results in a memory leak.

Reported by Address Sanitizer.

Indirect leak of 1024 byte(s) in 1 object(s) allocated from:
    #0 0x7fe09d3bfe70 in __interceptor_malloc (/usr/lib64/libasan.so.4+0xe0e70)
    #1 0x8759be in xmalloc__ lib/util.c:140
    #2 0x875a9a in xmalloc lib/util.c:175
    #3 0x7ba0d2 in ofpbuf_init lib/ofpbuf.c:141
    #4 0x7ba1d6 in ofpbuf_new lib/ofpbuf.c:169
    #5 0x9057f9 in nl_sock_transact lib/netlink-socket.c:1113
    #6 0x907a7e in nl_transact lib/netlink-socket.c:1817
    #7 0x8b5abe in dpif_netlink_dp_transact lib/dpif-netlink.c:5007
    #8 0x89a6b5 in dpif_netlink_open lib/dpif-netlink.c:398
    #9 0x5de16f in do_open lib/dpif.c:348
    #10 0x5de69a in dpif_open lib/dpif.c:393
    #11 0x5de71f in dpif_create_and_open lib/dpif.c:419
    #12 0x47b918 in open_dpif_backer ofproto/ofproto-dpif.c:764
    #13 0x483e4a in construct ofproto/ofproto-dpif.c:1658
    #14 0x441644 in ofproto_create ofproto/ofproto.c:556
    #15 0x40ba5a in bridge_reconfigure vswitchd/bridge.c:885
    #16 0x41f1a9 in bridge_run vswitchd/bridge.c:3313
    #17 0x42d4fb in main vswitchd/ovs-vswitchd.c:132
    #18 0x7fe09cc03c86 in __libc_start_main (/usr/lib64/libc.so.6+0x25c86)

Fixes: b841e3c ("dpif-netlink: Fix feature negotiation for older kernels.")
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>
Found by AddressSanitizer when running OVN tests:

Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x498fb2 in calloc (/ic/ovn-ic+0x498fb2)
    #1 0x5f681e in xcalloc__ ovs/lib/util.c:121:31
    #2 0x5f681e in xzalloc__ ovs/lib/util.c:131:12
    #3 0x5f681e in xzalloc ovs/lib/util.c:165:12
    #4 0x5e3697 in ovsdb_idl_txn_add_map_op ovs/lib/ovsdb-idl.c:4057:29
    #5 0x4d3f25 in update_isb_pb_external_ids ic/ovn-ic.c:576:5
    #6 0x4cc4cc in create_isb_pb ic/ovn-ic.c:716:5
    #7 0x4cc4cc in port_binding_run ic/ovn-ic.c:803:21
    #8 0x4cc4cc in ovn_db_run ic/ovn-ic.c:1700:5
    #9 0x4c9c1c in main ic/ovn-ic.c:1984:17
    #10 0x7f9ad9f4a0b2 in __libc_start_main

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
OVN unit tests highlight this:

ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1344 byte(s) in 1 object(s) allocated from:
    #0 0x4db0b7 in calloc (/root/master-ovn/ovs/ovsdb/ovsdb-server+0x4db0b7)
    #1 0x5c2162 in xcalloc__ /root/master-ovn/ovs/lib/util.c:124:31
    #2 0x5c221c in xcalloc /root/master-ovn/ovs/lib/util.c:161:12
    #3 0x54afbc in ovsdb_condition_diff /root/master-ovn/ovs/ovsdb/condition.c:527:21
    #4 0x529da6 in ovsdb_monitor_table_condition_update /root/master-ovn/ovs/ovsdb/monitor.c:824:5
    #5 0x524fa4 in ovsdb_jsonrpc_parse_monitor_cond_change_request /root/master-ovn/ovs/ovsdb/jsonrpc-server.c:1557:13
    #6 0x5235c3 in ovsdb_jsonrpc_monitor_cond_change /root/master-ovn/ovs/ovsdb/jsonrpc-server.c:1624:25
    #7 0x5217f2 in ovsdb_jsonrpc_session_got_request /root/master-ovn/ovs/ovsdb/jsonrpc-server.c:1034:17
    #8 0x520ee6 in ovsdb_jsonrpc_session_run /root/master-ovn/ovs/ovsdb/jsonrpc-server.c:572:17
    #9 0x51ffbe in ovsdb_jsonrpc_session_run_all /root/master-ovn/ovs/ovsdb/jsonrpc-server.c:602:21
    #10 0x51fbcf in ovsdb_jsonrpc_server_run /root/master-ovn/ovs/ovsdb/jsonrpc-server.c:417:9
    #11 0x517550 in main_loop /root/master-ovn/ovs/ovsdb/ovsdb-server.c:224:9
    #12 0x512e80 in main /root/master-ovn/ovs/ovsdb/ovsdb-server.c:507:5
    #13 0x7f9ecf675b74 in __libc_start_main (/lib64/libc.so.6+0x27b74)

Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>
This avoids misaligned accesses as flagged by UBSan when running CoPP
system tests:

controller/pinctrl.c:7129:28: runtime error: member access within misaligned address 0x61b0000627be for type 'const struct bfd_msg', which requires 4 byte alignment
0x61b0000627be: note: pointer points here
 00 20 d3 d4 20 60 05 18 a8 99 e7 7b 92 1d 36 73 00 0f 42 40 00 0f 42 40 00 00 00 00 00 00 00 03
 ^
    #0 0x621b8f in pinctrl_check_bfd_msg /root/ovn/controller/pinctrl.c:7129:28
    #1 0x621b8f in pinctrl_handle_bfd_msg /root/ovn/controller/pinctrl.c:7183:10
    #2 0x621b8f in process_packet_in /root/ovn/controller/pinctrl.c:3320:9
    #3 0x621b8f in pinctrl_recv /root/ovn/controller/pinctrl.c:3386:9
    #4 0x621b8f in pinctrl_handler /root/ovn/controller/pinctrl.c:3481:17
    #5 0xa2d89a in ovsthread_wrapper /root/ovn/ovs/lib/ovs-thread.c:422:12
    #6 0x7fcb598081ce in start_thread (/lib64/libpthread.so.0+0x81ce)
    #7 0x7fcb58439dd2 in clone (/lib64/libc.so.6+0x39dd2)

Fixes: 1172035 ("controller: introduce BFD tx path in ovn-controller.")
Reported-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Nothing is freed on the paths that call ctl_fatal(), which is mostly
fine because the program is about to shut down anyway; however, one of
the leaks was caught by Address Sanitizer. Fix most of the leaks that
happen before the call to ctl_fatal().

Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x568d70 in __interceptor_realloc.part.0 asan_malloc_linux.cpp.o
    #1 0xa530a5 in xrealloc__ /workspace/ovn/ovs/lib/util.c:147:9
    #2 0xa530a5 in xrealloc /workspace/ovn/ovs/lib/util.c:179:12
    #3 0xa530a5 in x2nrealloc /workspace/ovn/ovs/lib/util.c:239:12
    #4 0xa2ee57 in svec_expand /workspace/ovn/ovs/lib/svec.c:92:23
    #5 0xa2ef6e in svec_terminate /workspace/ovn/ovs/lib/svec.c:116:5
    #6 0x82c117 in ovs_cmdl_env_parse_all /workspace/ovn/ovs/lib/command-line.c:98:5
    #7 0x5ad70d in ovn_dbctl_main /workspace/ovn/utilities/ovn-dbctl.c:132:20
    #8 0x5b58c7 in main /workspace/ovn/utilities/ovn-nbctl.c:7943:12

Indirect leak of 10 byte(s) in 1 object(s) allocated from:
    #0 0x569c07 in malloc (/workspace/ovn/utilities/ovn-nbctl+0x569c07) (BuildId: 3a287416f70de43f52382f0336980c876f655f90)
    #1 0xa52d4c in xmalloc__ /workspace/ovn/ovs/lib/util.c:137:15
    #2 0xa52d4c in xmalloc /workspace/ovn/ovs/lib/util.c:172:12
    #3 0xa52d4c in xmemdup0 /workspace/ovn/ovs/lib/util.c:193:15
    #4 0xa2e580 in svec_add /workspace/ovn/ovs/lib/svec.c:71:27
    #5 0x82c041 in ovs_cmdl_env_parse_all /workspace/ovn/ovs/lib/command-line.c:91:5
    #6 0x5ad70d in ovn_dbctl_main /workspace/ovn/utilities/ovn-dbctl.c:132:20
    #7 0x5b58c7 in main /workspace/ovn/utilities/ovn-nbctl.c:7943:12

Indirect leak of 8 byte(s) in 2 object(s) allocated from:
    #0 0x569c07 in malloc (/workspace/ovn/utilities/ovn-nbctl+0x569c07)
    #1 0xa52d4c in xmalloc__ /workspace/ovn/ovs/lib/util.c:137:15
    #2 0xa52d4c in xmalloc /workspace/ovn/ovs/lib/util.c:172:12
    #3 0xa52d4c in xmemdup0 /workspace/ovn/ovs/lib/util.c:193:15
    #4 0xa2e580 in svec_add /workspace/ovn/ovs/lib/svec.c:71:27
    #5 0x82c0b6 in ovs_cmdl_env_parse_all /workspace/ovn/ovs/lib/command-line.c:96:9
    #6 0x5ad70d in ovn_dbctl_main /workspace/ovn/utilities/ovn-dbctl.c:132:20
    #7 0x5b58c7 in main /workspace/ovn/utilities/ovn-nbctl.c:7943:12

Signed-off-by: Ales Musil <amusil@redhat.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
…eletion. When multiple LSP deletions are handled in incremental processing, and an LSP that can't be incrementally processed is hit after some LSP deletions have already been processed, the handler falls back to recompute without destroying the ovn_port objects it has already handled, resulting in memory leaks. See the example below, which is detected by the new test case added by this patch when running with address sanitizer.

=======================
Indirect leak of 936 byte(s) in 3 object(s) allocated from:
    #0 0x55bce7 in calloc (/home/hanzhou/src/ovn/_build_as/northd/ovn-northd+0x55bce7)
    #1 0x773f4e in xcalloc__ /home/hanzhou/src/ovs/_build/../lib/util.c:121:31
    #2 0x773f4e in xzalloc__ /home/hanzhou/src/ovs/_build/../lib/util.c:131:12
    #3 0x773f4e in xzalloc /home/hanzhou/src/ovs/_build/../lib/util.c:165:12
    #4 0x60106c in en_northd_run /home/hanzhou/src/ovn/_build_as/../northd/en-northd.c:137:5
    #5 0x61c6a8 in engine_recompute /home/hanzhou/src/ovn/_build_as/../lib/inc-proc-eng.c:415:5
    #6 0x61bee0 in engine_compute /home/hanzhou/src/ovn/_build_as/../lib/inc-proc-eng.c:454:17
    #7 0x61bee0 in engine_run_node /home/hanzhou/src/ovn/_build_as/../lib/inc-proc-eng.c:503:14
    #8 0x61bee0 in engine_run /home/hanzhou/src/ovn/_build_as/../lib/inc-proc-eng.c:528:9
    #9 0x605e23 in inc_proc_northd_run /home/hanzhou/src/ovn/_build_as/../northd/inc-proc-northd.c:317:9
    #10 0x5fe43b in main /home/hanzhou/src/ovn/_build_as/../northd/ovn-northd.c:934:33
    #11 0x7f473933c1a1 in __libc_start_main (/lib64/libc.so.6+0x281a1)

Indirect leak of 204 byte(s) in 3 object(s) allocated from:
    #0 0x55bea8 in realloc (/home/hanzhou/src/ovn/_build_as/northd/ovn-northd+0x55bea8)
    #1 0x773c7d in xrealloc__ /home/hanzhou/src/ovs/_build/../lib/util.c:147:9
    #2 0x773c7d in xrealloc /home/hanzhou/src/ovs/_build/../lib/util.c:179:12
    #3 0x614bd4 in extract_addresses /home/hanzhou/src/ovn/_build_as/../lib/ovn-util.c:228:12
    #4 0x614bd4 in extract_lsp_addresses /home/hanzhou/src/ovn/_build_as/../lib/ovn-util.c:243:20
    #5 0x5c8d90 in parse_lsp_addrs /home/hanzhou/src/ovn/_build_as/../northd/northd.c:2468:21
    #6 0x5b2ebf in join_logical_ports /home/hanzhou/src/ovn/_build_as/../northd/northd.c:2594:13
    #7 0x5b2ebf in build_ports /home/hanzhou/src/ovn/_build_as/../northd/northd.c:4711:5
    #8 0x5b2ebf in ovnnb_db_run /home/hanzhou/src/ovn/_build_as/../northd/northd.c:17376:5
    #9 0x60106c in en_northd_run /home/hanzhou/src/ovn/_build_as/../northd/en-northd.c:137:5
    #10 0x61c6a8 in engine_recompute /home/hanzhou/src/ovn/_build_as/../lib/inc-proc-eng.c:415:5
    #11 0x61bee0 in engine_compute /home/hanzhou/src/ovn/_build_as/../lib/inc-proc-eng.c:454:17
    #12 0x61bee0 in engine_run_node /home/hanzhou/src/ovn/_build_as/../lib/inc-proc-eng.c:503:14
    #13 0x61bee0 in engine_run /home/hanzhou/src/ovn/_build_as/../lib/inc-proc-eng.c:528:9
    #14 0x605e23 in inc_proc_northd_run /home/hanzhou/src/ovn/_build_as/../northd/inc-proc-northd.c:317:9
    #15 0x5fe43b in main /home/hanzhou/src/ovn/_build_as/../northd/ovn-northd.c:934:33
    #16 0x7f473933c1a1 in __libc_start_main (/lib64/libc.so.6+0x281a1)
...

Fixes: b337750 ("northd: Incremental processing of VIF changes in 'northd' node.")
Reported-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Han Zhou <hzhou@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>
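The leak pattern described above can be sketched in a few lines of self-contained C. All names here are illustrative stand-ins, not the actual northd code: the point is only that an incremental handler which bails out to a full recompute must first destroy whatever state it has already built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static int live_objects;            /* instrumentation for the test below */

struct port_state { int id; };      /* stand-in for an ovn_port-like object */

static struct port_state *
handle_one(int id)
{
    struct port_state *ps = calloc(1, sizeof *ps);
    ps->id = id;
    live_objects++;
    return ps;
}

static void
destroy_one(struct port_state *ps)
{
    free(ps);
    live_objects--;
}

/* Process 'n' deletions; 'bad' marks one that cannot be handled
 * incrementally (-1 for none).  Returns false to request a full
 * recompute -- but only after destroying everything already handled,
 * which is exactly the cleanup the buggy handler skipped. */
static bool
handle_deletions(int n, int bad)
{
    struct port_state *done[16];
    int n_done = 0;

    for (int i = 0; i < n && i < 16; i++) {
        if (i == bad) {
            while (n_done > 0) {
                destroy_one(done[--n_done]);
            }
            return false;           /* fall back to recompute, leak-free */
        }
        done[n_done++] = handle_one(i);
    }
    while (n_done > 0) {
        destroy_one(done[--n_done]);
    }
    return true;
}
```

In the toy model the successfully-handled objects are also freed at the end so the accounting balances; in the real handler they would live on in the port table, which is why only the early-exit path needed the extra cleanup.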
It's specified in RFC 8415. This also avoids having to free/realloc the pfd->uuid.data memory. That part was not correct anyway and was flagged by ASAN as a memleak:

Direct leak of 42 byte(s) in 3 object(s) allocated from:
    #0 0x55e5b6354c9e in malloc (/workspace/ovn-tmp/controller/ovn-controller+0x2edc9e) (BuildId: f963f8c756bd5a2207a9b3c922d4362e46bb3162)
    #1 0x55e5b671878d in xmalloc__ /workspace/ovn-tmp/ovs/lib/util.c:140:15
    #2 0x55e5b671878d in xmalloc /workspace/ovn-tmp/ovs/lib/util.c:175:12
    #3 0x55e5b642cebc in pinctrl_parse_dhcpv6_reply /workspace/ovn-tmp/controller/pinctrl.c:997:20
    #4 0x55e5b642cebc in pinctrl_handle_dhcp6_server /workspace/ovn-tmp/controller/pinctrl.c:1040:9
    #5 0x55e5b642cebc in process_packet_in /workspace/ovn-tmp/controller/pinctrl.c:3210:9
    #6 0x55e5b642cebc in pinctrl_recv /workspace/ovn-tmp/controller/pinctrl.c:3290:9
    #7 0x55e5b642cebc in pinctrl_handler /workspace/ovn-tmp/controller/pinctrl.c:3385:17
    #8 0x55e5b66ef664 in ovsthread_wrapper /workspace/ovn-tmp/ovs/lib/ovs-thread.c:423:12
    #9 0x7faa30194b42 (/lib/x86_64-linux-gnu/libc.so.6+0x94b42) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)

Fixes: faa44a0 ("controller: IPv6 Prefix-Delegation: introduce RENEW/REBIND msg support")
Signed-off-by: Ales Musil <amusil@redhat.com>
Co-authored-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
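As a rough, self-contained illustration of the shape of such a fix (the struct and function names are assumptions, not the actual pinctrl code): storing a bounded-length identifier in a fixed-size array removes the free/realloc cycle, and the leak, entirely. RFC 8415 bounds a DUID at 130 octets (128 octets plus the 2-octet type code), so a fixed buffer suffices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DUID_MAX 130    /* RFC 8415: 128-octet DUID + 2-octet type code */

/* Hypothetical stand-in for a prefix-delegation state struct. */
struct pfd_stub {
    uint8_t duid[DUID_MAX];     /* fixed storage: nothing to malloc/free */
    size_t duid_len;
};

/* Copy a received DUID into the fixed buffer, rejecting oversized input
 * instead of reallocating. */
static bool
pfd_set_duid(struct pfd_stub *pfd, const uint8_t *data, size_t len)
{
    if (len > DUID_MAX) {
        return false;
    }
    memcpy(pfd->duid, data, len);
    pfd->duid_len = len;
    return true;
}
```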
The new_buffer data pointer is NULL when the size of the cloned buffer is 0. This is fine as there is no need to allocate space. However, the cloned buffer header/msg might be the same pointer as data, which causes undefined behavior by adding 0 to a NULL pointer. Check that the data buffer is not NULL before attempting to apply the header/msg offset. This was caught by OVN system test:

lib/ofpbuf.c:203:56: runtime error: applying zero offset to null pointer
    #0 0xa012fc in ofpbuf_clone_with_headroom /workspace/ovn/ovs/lib/ofpbuf.c:203:56
    #1 0x635fd4 in put_remote_port_redirect_overlay /workspace/ovn/controller/physical.c:397:40
    #2 0x635fd4 in consider_port_binding /workspace/ovn/controller/physical.c:1951:9
    #3 0x62e046 in physical_run /workspace/ovn/controller/physical.c:2447:9
    #4 0x601d98 in en_pflow_output_run /workspace/ovn/controller/ovn-controller.c:4690:5
    #5 0x707769 in engine_recompute /workspace/ovn/lib/inc-proc-eng.c:415:5
    #6 0x7060eb in engine_compute /workspace/ovn/lib/inc-proc-eng.c:454:17
    #7 0x7060eb in engine_run_node /workspace/ovn/lib/inc-proc-eng.c:503:14
    #8 0x7060eb in engine_run /workspace/ovn/lib/inc-proc-eng.c:528:9
    #9 0x5f9f26 in main /workspace/ovn/controller/ovn-controller.c

Signed-off-by: Ales Musil <amusil@redhat.com>
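A minimal sketch of the guard just described, using simplified stand-in types (the field names follow struct ofpbuf, but this is not the real implementation): pointer arithmetic on NULL is undefined in C even when the offset is zero, so the offset is applied only when data is non-NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct ofpbuf. */
struct buf {
    void *data;     /* start of payload; NULL when size == 0 */
    size_t size;
    void *header;   /* points into data, or NULL */
};

/* Recompute 'header' in a cloned buffer.  Even "data + 0" is undefined
 * behavior when data is NULL, so guard the arithmetic. */
static void
set_cloned_header(struct buf *clone, ptrdiff_t header_offset)
{
    if (clone->data) {
        clone->header = (char *) clone->data + header_offset;
    } else {
        clone->header = NULL;
    }
}
```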
The hash was passed to memcpy as uint32_t *; however, the hash bytes might be unaligned at that point. Cast it to uint8_t * instead, which has only a single-byte alignment requirement.

lib/hash.c:46:22: runtime error: load of misaligned address 0x507000000065 for type 'const uint32_t *' (aka 'const unsigned int *'), which requires 4 byte alignment
0x507000000065: note: pointer points here
73 62 2e 73 6f 63 6b 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
    #0 0x6191cb in hash_bytes /workspace/ovn/ovs/lib/hash.c:46:9
    #1 0x69d064 in hash_string /workspace/ovn/ovs/lib/hash.h:404:12
    #2 0x69d064 in hash_name /workspace/ovn/ovs/lib/shash.c:29:12
    #3 0x69d064 in shash_find /workspace/ovn/ovs/lib/shash.c:237:49
    #4 0x69dada in shash_find_data /workspace/ovn/ovs/lib/shash.c:251:31
    #5 0x507987 in add_remote /workspace/ovn/ovs/ovsdb/ovsdb-server.c:1382:15
    #6 0x507987 in parse_options /workspace/ovn/ovs/ovsdb/ovsdb-server.c:2659:13
    #7 0x507987 in main /workspace/ovn/ovs/ovsdb/ovsdb-server.c:751:5
    #8 0x7f47e3997087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: b098f1c75a76548bb230d8f551eae07a2aeccf06)
    #9 0x7f47e399714a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: b098f1c75a76548bb230d8f551eae07a2aeccf06)
    #10 0x42de64 in _start (/workspace/ovn/ovs/ovsdb/ovsdb-server+0x42de64) (BuildId: 6c3f4e311556b29f84c9c4a5d6df5114dc08a12e)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior lib/hash.c:46:22

Signed-off-by: Ales Musil <amusil@redhat.com>
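The underlying technique, reading a possibly-misaligned word through a byte pointer and memcpy, can be shown self-contained. This is a hypothetical helper, not the actual hash_bytes() code: a const uint32_t * may legally only hold 4-byte-aligned addresses, while const uint8_t * carries no alignment requirement, so the compiler must emit a safe unaligned load.

```c
#include <stdint.h>
#include <string.h>

/* Load a 32-bit word from an address with no alignment guarantee.
 * memcpy through a byte pointer is the portable, UB-free idiom;
 * compilers lower it to a single unaligned load where supported. */
static uint32_t
load_unaligned_u32(const uint8_t *p)
{
    uint32_t x;
    memcpy(&x, p, sizeof x);
    return x;
}
```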
Summary
This series implements a cache for xlate_actions() so that full flow translation does not need to be performed for long-lived flows; instead, references are kept to the modules affected by the translation, and the cache can later be used to attribute stats and perform translation's other side effects. Revalidator threads construct the cache the first time a flow is dumped, which is why this only improves the situation for long-lived flows.
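A toy model of the idea, with illustrative names rather than the real OVS API: translation records one cache entry per module it touches, and a later stats push simply replays those entries, which costs far less than re-running translation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Entry types loosely mirroring the patch's XC_* naming; the type tag is
 * recorded for illustration, and the "module" is modeled as a counter. */
enum entry_type { XC_RULE, XC_NETFLOW, XC_NORMAL };

struct xc_entry {
    enum entry_type type;
    uint64_t *counter;          /* module stat to credit on replay */
};

struct xlate_cache {
    struct xc_entry entries[8];
    size_t n;
};

/* Called during full translation, once per affected module. */
static void
xlate_cache_add(struct xlate_cache *xc, enum entry_type type,
                uint64_t *counter)
{
    assert(xc->n < 8);
    xc->entries[xc->n].type = type;
    xc->entries[xc->n].counter = counter;
    xc->n++;
}

/* Cheap replacement for re-translation: credit each cached module. */
static void
xlate_push_stats(struct xlate_cache *xc, uint64_t n_packets)
{
    for (size_t i = 0; i < xc->n; i++) {
        *xc->entries[i].counter += n_packets;
    }
}
```

In the real series the entries also carry enough state for side effects such as refreshing learn-action timeouts, not just counters; this sketch keeps only the stats-attribution half.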
The "modules" which logic is implemented for are:
Results
These changes improve only long-lived flows, and the greatest benefit is gained when there is little or no revalidation. The time spent per flow to process stats is reduced by ~80% under this patchset, as compared with master. Under test conditions with occasional revalidation, this translates into the ability to support on average double the number of flows in the datapath. Flow setup rate is slightly impacted by these changes, dropping by around 5-10%.
I plan to look into shifting ukey creation into handler threads, further reducing the workload for revalidator threads. This will allow these savings to also be applied to short-lived flows. That work is left for a future patchset.
Changelog
RFC -> V1