Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rfc pr3: make OVN test case passed #4

Merged
merged 2 commits into from
Jul 11, 2018
Merged

Rfc pr3: make OVN test case passed #4

merged 2 commits into from
Jul 11, 2018

Conversation

williamtu
Copy link
Owner

with some fixex:

  1: datapath - basic BPF commands                   ok
  2: datapath - ping between two ports               ok
  3: datapath - http between two ports               ok
  4: datapath - ping between two ports on vlan       ok
  5: datapath - ping between two ports on cvlan      ok
  6: datapath - ping6 between two ports              ok
  7: datapath - ping6 between two ports on vlan      ok
  8: datapath - ping6 between two ports on cvlan     ok
  9: datapath - ping over bond                       skipped (system-bpf-traffic.at:210)
 10: datapath - ping over vxlan tunnel               ok
 11: datapath - ping over vxlan6 tunnel              ok
 12: datapath - ping over gre tunnel                 ok
 13: datapath - ping over geneve tunnel              ok
 14: datapath - ping over geneve6 tunnel             ok
 15: datapath - clone action                         FAILED (system-bpf-traffic.at:445)
 16: datapath - mpls actions                         FAILED (system-bpf-traffic.at:484)
 19: ovn -- 1 LR connects 2 LSes                     ok

OVS could uses either SET or SET_MASKED action to set a
packet's fields.  Before the patch, we only implement set
field in SET_MASKED action.  A field is_set_tunnel is added
to distinguish whether under SET action, it is set tunnel,
or set fields.

Signed-off-by: William Tu <u9012063@gmail.com>
Not a bug, just warnings
lib/dpif-bpf-odp.c:593:31: warning: incorrect type in argument 1 (different base types)
lib/dpif-bpf-odp.c:593:31:    expected unsigned short [unsigned] [usertype] x

Signed-off-by: William Tu <u9012063@gmail.com>
@williamtu williamtu requested a review from yifsun July 11, 2018 22:49
@yifsun yifsun merged commit 534870f into rfc Jul 11, 2018
@williamtu williamtu deleted the rfc-pr3 branch August 8, 2018 16:30
williamtu pushed a commit that referenced this pull request Nov 21, 2018
There is a coredump when I add and delete bridges. When the rcu thread call
ofproto_destroy__, the main thread may call ofproto_create. But the
ofproto_destroy__ fun doesn't have the ofproto_mutex when access the
all_ofprotos.

 #0  0x00007f824aa0d197 in raise () from /usr/lib64/libc.so.6
 #1  0x00007f824aa0e888 in abort () from /usr/lib64/libc.so.6
 #2  0x0000000000658249 in PAT_abort ()
 #3  0x000000000065538d in patchIllInsHandler ()
 #4  <signal handler called>
 #5  0x0000000000478a5b in hmap_remove (node=0x3320150, hmap=0x95fc40 <all_ofprotos>) at include/openvswitch/hmap.h:287
 #6  ofproto_destroy__ (ofproto=0x3320150) at ofproto/ofproto.c:1642
 #7  0x0000000000535e46 in ovsrcu_call_postponed () at lib/ovs_rcu.c:323
 #8  0x0000000000536014 in ovsrcu_postpone_thread (arg=<optimized out>) at lib/ovs_rcu.c:338
 #9  0x0000000000538488 in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs_thread.c:682
 #10 0x00007f824c130dc5 in start_thread () from /usr/lib64/libpthread.so.0
 #11 0x00007f824aacf7bd in clone () from /usr/lib64/libc.so.6

Signed-off-by: Cheng Liu <liucheng11@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
YiHungWei pushed a commit to YiHungWei/ovs-ebpf that referenced this pull request Mar 13, 2019
Problem Description:
The ovs-vswitchd is crashing while invoking flow-mod with upsupported
action(Tested with ovs2.10.1)

Steps to recreate:
Step 1) Create a flow
ovs-ofctl add-flow switch1
priority=228,dl_type=0x0800,dl_vlan="600",in_port=25,actions=output:ALL
This step is successful.

Step 2) Invoke flow-mod with incorrect contents.
ovs-ofctl mod-flows switch1
priority=228,dl_type=0x0800,dl_vlan="600",in_port=25,actions=output:ALL,mod_vlan_vid:50,mod_vlan_pcp=6,mod_nw_tos=16

In the above example, the ofproto provider I have, will return error for
rule_construct as set_fields come after Output.

However the OVS is ignoring the error (The return value of add_flow_init
is ignored in modify_flow_init_strict) and eventually the ovs-vswitched
crashes.

Crash backtrace:
-----------------------

Thread 1 "ovs-vswitchd" received signal SIGSEGV, Segmentation fault.

 0x00007f6a06e785fb in modify_flows_start__ (
   ofproto=ofproto@entry=0x55b289cecc28, ofm=ofm@entry=0x7ffdf7d57b70)
     at ofproto/ofproto.c:5402
 5402    in ofproto/ofproto.c
 (gdb) bt
 #0  0x00007f6a06e785fb in modify_flows_start__ (
   ofproto=ofproto@entry=0x55b289cecc28, ofm=ofm@entry=0x7ffdf7d57b70)
    at ofproto/ofproto.c:5402
 williamtu#1  0x00007f6a06e790db in modify_flows_start_loose (ofm=0x7ffdf7d57b70,
    ofproto=0x55b289cecc28) at ofproto/ofproto.c:5443
 williamtu#2  ofproto_flow_mod_start (ofproto=ofproto@entry=0x55b289cecc28,
     ofm=ofm@entry=0x7ffdf7d57b70) at ofproto/ofproto.c:7672
 williamtu#3  0x00007f6a06e79164 in handle_flow_mod__ (
    ofproto=ofproto@entry=0x55b289cecc28, fm=fm@entry=0x7ffdf7d57d20,
    req=req@entry=0x7ffdf7d57cd0) at ofproto/ofproto.c:5858
 williamtu#4  0x00007f6a06e792c2 in handle_flow_mod (ofconn=ofconn@entry
 =0x55b289d528c0,
oh=oh@entry=0x55b289d5a410) at ofproto/ofproto.c:5835
 williamtu#5  0x00007f6a06e7a173 in handle_openflow__ (msg=0x55b289d351d0,
    ofconn=0x55b289d528c0) at ofproto/ofproto.c:8127
 williamtu#6  handle_openflow (ofconn=0x55b289d528c0, ofp_msg=0x55b289d351d0)
  at ofproto/ofproto.c:8296
 williamtu#7  0x00007f6a06e6a013 in ofconn_run (
  handle_openflow=0x7f6a06e796f0 <handle_openflow>,
 ofconn=0x55b289d528c0)
 at ofproto/connmgr.c:1446
 williamtu#8  connmgr_run (mgr=0x55b289d14fe0,
 handle_openflow=handle_openflow@entry=0x7f6a06e796f0
handle_openflow>)
at ofproto/connmgr.c:365

With this fix, OVS does not crash.

Signed-off-by: Parameswaran Krishnamurthy <parkrish@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
williamtu pushed a commit that referenced this pull request Jun 28, 2019
Running ovsdb-server with empty private-key and non-empty certificate
(or otherwise) causes crash:

 # ovsdb-tool create ./etc/openvswitch/conf.db ./vswitch.ovsschema
 # ovsdb-server --remote=punix:./db.sock \
                --remote=db:Open_vSwitch,Open_vSwitch,manager_options \
                --private-key=db:Open_vSwitch,SSL,private_key \
                --certificate=db:Open_vSwitch,SSL,certificate \
                --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert

 # ovs-vsctl --no-wait init
 # ovs-vsctl --no-wait set-ssl pkey.key cert.cert ca.cert
 # ovs-vsctl --no-wait set SSL . private_key='""'
 # ovs-vsctl --no-wait set SSL . certificate='cert.new'

 ==25513==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000
 ==25513==The signal is caused by a READ memory access.
 ==25513==Hint: address points to the zero page.
    #0 0x7ff7582aa0a9 in __GI___strlen_sse2
    #1 0x7ff759bdde81  (/lib64/libasan.so.5+0xace81)
    #2 0x7ff759479932  (/lib64/libcrypto.so.1.1+0xb3932)
    #3 0x7ff759473c5a in BIO_ctrl (/lib64/libcrypto.so.1.1+0xadc5a)
    #4 0x7ff7598decc1 in SSL_CTX_use_certificate_file (/lib64/libssl.so.1.1+0x40cc1)
    #5 0x4dbaa7 in stream_ssl_set_certificate_file__ lib/stream-ssl.c:1170
    #6 0x4dca2e in stream_ssl_set_key_and_cert lib/stream-ssl.c:1216
    #7 0x4146b2 in reconfigure_ssl ovsdb/ovsdb-server.c:1254
    #8 0x409c83 in main ovsdb/ovsdb-server.c:368
    #9 0x7ff758233812 in __libc_start_main
    #10 0x40f6bd in _start (ovsdb-server+0x40f6bd)

 AddressSanitizer can not provide additional info.
 SUMMARY: AddressSanitizer: SEGV (/lib64/libc.so.6+0x9a0a9) in __GI___strlen_sse2
 ==25513==ABORTING

Another way to reproduce is to use non-initialized DB entry for
private-key and a file for certificate in ovsdb-server cmdline.

The root cause is that stream_ssl_set_key_and_cert() triggers
configuration for both key and cert if any of them is valid, keeping
it possible for one of them to be NULL.

Fixes: 6f1e91b ("stream-ssl: Make changing keys and certificate at runtime reliable.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
williamtu pushed a commit that referenced this pull request Jul 31, 2019
The following, run inside the OVS sandbox, caused OVS to abort when
Address Sanitizer was used:

    ovs-vsctl add-br br-int
    ovs-ofctl add-flow br-int "table=0,cookie=0x1234,priority=10000,icmp,actions=drop"
    ovs-ofctl --strict del-flows br-int "table=0,cookie=0x1234/-1,priority=10000"

Sample report from Address Sanitizer:

==3029==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000043260 at pc 0x7f6b09c2459b bp 0x7ffcb67e7540 sp 0x7ffcb67e6cf0
READ of size 40 at 0x603000043260 thread T0
    #0 0x7f6b09c2459a  (/lib/x86_64-linux-gnu/libasan.so.5+0xb859a)
    #1 0x565110a748a5 in minimask_equal ../lib/flow.c:3510
    #2 0x565110a9ea41 in minimatch_equal ../lib/match.c:1821
    #3 0x56511091e864 in collect_rules_strict ../ofproto/ofproto.c:4516
    #4 0x56511093d526 in delete_flow_start_strict ../ofproto/ofproto.c:5959
    #5 0x56511093d526 in ofproto_flow_mod_start ../ofproto/ofproto.c:7949
    #6 0x56511093d77b in handle_flow_mod__ ../ofproto/ofproto.c:6122
    #7 0x56511093db71 in handle_flow_mod ../ofproto/ofproto.c:6099
    #8 0x5651109407f6 in handle_single_part_openflow ../ofproto/ofproto.c:8406
    #9 0x5651109407f6 in handle_openflow ../ofproto/ofproto.c:8587
    #10 0x5651109e40da in ofconn_run ../ofproto/connmgr.c:1318
    #11 0x5651109e40da in connmgr_run ../ofproto/connmgr.c:355
    #12 0x56511092b129 in ofproto_run ../ofproto/ofproto.c:1826
    #13 0x5651108f23cd in bridge_run__ ../vswitchd/bridge.c:2965
    #14 0x565110904887 in bridge_run ../vswitchd/bridge.c:3023
    #15 0x5651108e659c in main ../vswitchd/ovs-vswitchd.c:127
    #16 0x7f6b093b709a in __libc_start_main ../csu/libc-start.c:308
    #17 0x5651108e9009 in _start (/home/blp/nicira/ovs/_build/vswitchd/ovs-vswitchd+0x11d009)

This fixes the problem, which although largely theoretical could crop up
with odd implementations of memcmp(), perhaps ones optimized in various
"clever" ways.  All in all, it seems best to avoid the theoretical problem.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
PlumeTomi pushed a commit to PlumeTomi/ovs-ebpf that referenced this pull request Oct 8, 2020
Running ovsdb-server with empty private-key and non-empty certificate
(or otherwise) causes crash:

 # ovsdb-tool create ./etc/openvswitch/conf.db ./vswitch.ovsschema
 # ovsdb-server --remote=punix:./db.sock \
                --remote=db:Open_vSwitch,Open_vSwitch,manager_options \
                --private-key=db:Open_vSwitch,SSL,private_key \
                --certificate=db:Open_vSwitch,SSL,certificate \
                --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert

 # ovs-vsctl --no-wait init
 # ovs-vsctl --no-wait set-ssl pkey.key cert.cert ca.cert
 # ovs-vsctl --no-wait set SSL . private_key='""'
 # ovs-vsctl --no-wait set SSL . certificate='cert.new'

 ==25513==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000
 ==25513==The signal is caused by a READ memory access.
 ==25513==Hint: address points to the zero page.
    #0 0x7ff7582aa0a9 in __GI___strlen_sse2
    williamtu#1 0x7ff759bdde81  (/lib64/libasan.so.5+0xace81)
    williamtu#2 0x7ff759479932  (/lib64/libcrypto.so.1.1+0xb3932)
    williamtu#3 0x7ff759473c5a in BIO_ctrl (/lib64/libcrypto.so.1.1+0xadc5a)
    williamtu#4 0x7ff7598decc1 in SSL_CTX_use_certificate_file (/lib64/libssl.so.1.1+0x40cc1)
    williamtu#5 0x4dbaa7 in stream_ssl_set_certificate_file__ lib/stream-ssl.c:1170
    williamtu#6 0x4dca2e in stream_ssl_set_key_and_cert lib/stream-ssl.c:1216
    williamtu#7 0x4146b2 in reconfigure_ssl ovsdb/ovsdb-server.c:1254
    williamtu#8 0x409c83 in main ovsdb/ovsdb-server.c:368
    williamtu#9 0x7ff758233812 in __libc_start_main
    williamtu#10 0x40f6bd in _start (ovsdb-server+0x40f6bd)

 AddressSanitizer can not provide additional info.
 SUMMARY: AddressSanitizer: SEGV (/lib64/libc.so.6+0x9a0a9) in __GI___strlen_sse2
 ==25513==ABORTING

Another way to reproduce is to use non-initialized DB entry for
private-key and a file for certificate in ovsdb-server cmdline.

The root cause is that stream_ssl_set_key_and_cert() triggers
configuration for both key and cert if any of them is valid, keeping
it possible for one of them to be NULL.

Fixes: 6f1e91b ("stream-ssl: Make changing keys and certificate at runtime reliable.")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
PlumeTomi pushed a commit to PlumeTomi/ovs-ebpf that referenced this pull request Oct 8, 2020
Should use flow->actions not &flow->actions.

here is ASAN report:
=================================================================
==57189==ERROR: AddressSanitizer: stack-buffer-overflow on address 0xffff428fa0e8 at pc 0xffff7f61a520 bp 0xffff428f9420 sp 0xffff428f9498 READ of size 196 at 0xffff428fa0e8 thread T150 (revalidator22)
    #0 0xffff7f61a51f in __interceptor_memcpy (/lib64/libasan.so.4+0xa251f)
    williamtu#1 0xaaaad26a3b2b in ofpbuf_put lib/ofpbuf.c:426
    williamtu#2 0xaaaad26a30cb in ofpbuf_clone_data_with_headroom lib/ofpbuf.c:248
    williamtu#3 0xaaaad26a2e77 in ofpbuf_clone_with_headroom lib/ofpbuf.c:218
    williamtu#4 0xaaaad26a2dc3 in ofpbuf_clone lib/ofpbuf.c:208
    williamtu#5 0xaaaad23e3993 in ukey_set_actions ofproto/ofproto-dpif-upcall.c:1640
    williamtu#6 0xaaaad23e3f03 in ukey_create__ ofproto/ofproto-dpif-upcall.c:1696
    williamtu#7 0xaaaad23e553f in ukey_create_from_dpif_flow ofproto/ofproto-dpif-upcall.c:1806
    williamtu#8 0xaaaad23e65fb in ukey_acquire ofproto/ofproto-dpif-upcall.c:1984
    williamtu#9 0xaaaad23eb583 in revalidate ofproto/ofproto-dpif-upcall.c:2625
    williamtu#10 0xaaaad23dee5f in udpif_revalidator ofproto/ofproto-dpif-upcall.c:1076
    williamtu#11 0xaaaad26b84ef in ovsthread_wrapper lib/ovs-thread.c:708
    williamtu#12 0xffff7e74a8bb in start_thread (/lib64/libpthread.so.0+0x78bb)
    williamtu#13 0xffff7e0665cb in thread_start (/lib64/libc.so.6+0xd55cb)

Address 0xffff428fa0e8 is located in stack of thread T150 (revalidator22) at offset 328 in frame
    #0 0xaaaad23e4cab in ukey_create_from_dpif_flow ofproto/ofproto-dpif-upcall.c:1762

  This frame has 4 object(s):
    [32, 96) 'actions'
    [128, 192) 'buf'
    [224, 328) 'full_flow'
    [384, 2432) 'stub' <== Memory access at offset 328 partially underflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported) Thread T150 (revalidator22) created by T0 here:
    #0 0xffff7f5b0f7f in __interceptor_pthread_create (/lib64/libasan.so.4+0x38f7f)
    williamtu#1 0xaaaad26b891f in ovs_thread_create lib/ovs-thread.c:792
    williamtu#2 0xaaaad23dc62f in udpif_start_threads ofproto/ofproto-dpif-upcall.c:639
    williamtu#3 0xaaaad23daf87 in ofproto_set_flow_table ofproto/ofproto-dpif-upcall.c:446
    williamtu#4 0xaaaad230ff7f in dpdk_evs_cfg_set vswitchd/bridge.c:1134
    williamtu#5 0xaaaad2310097 in bridge_reconfigure vswitchd/bridge.c:1148
    williamtu#6 0xaaaad23279d7 in bridge_run vswitchd/bridge.c:3944
    williamtu#7 0xaaaad23365a3 in main vswitchd/ovs-vswitchd.c:240
    williamtu#8 0xffff7dfb1adf in __libc_start_main (/lib64/libc.so.6+0x20adf)
    williamtu#9 0xaaaad230a3d3  (/usr/sbin/ovs-vswitchd-2.7.0-1.1.RC5.001.asan+0x26f3d3)

SUMMARY: AddressSanitizer: stack-buffer-overflow (/lib64/libasan.so.4+0xa251f) in __interceptor_memcpy Shadow bytes around the buggy address:
  0x200fe851f3c0: 00 00 00 00 f1 f1 f1 f1 f8 f2 f2 f2 00 00 00 00
  0x200fe851f3d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f3e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f3f0: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00
  0x200fe851f400: f2 f2 f2 f2 f8 f8 f8 f8 f8 f8 f8 f8 f2 f2 f2 f2
=>0x200fe851f410: 00 00 00 00 00 00 00 00 00 00 00 00 00[f2]f2 f2
  0x200fe851f420: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==57189==ABORTING

Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Linhaifeng <haifeng.lin@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
PlumeTomi pushed a commit to PlumeTomi/ovs-ebpf that referenced this pull request Oct 8, 2020
…LE_IPV*.

Both ovs_key_ct_tuple_ipv* structures contains padding at the end
that mast be cleared before passing attributes to kernel:

 Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
    at 0x566A607: sendmsg (sendmsg.c:28)
    by 0xFC95CE: nl_sock_transact_multiple__ (netlink-socket.c:858)
    by 0xFC8580: nl_sock_transact_multiple (netlink-socket.c:1079)
    by 0xFC83FF: nl_transact_multiple (netlink-socket.c:1839)
    by 0xFA8648: dpif_netlink_operate__ (dpif-netlink.c:1926)
    by 0xFA789F: dpif_netlink_operate_chunks (dpif-netlink.c:2219)
    by 0xFA25CB: dpif_netlink_operate (dpif-netlink.c:2278)
    by 0xE5BB4C: dpif_operate (dpif.c:1377)
    by 0xE5B7F6: dpif_flow_put (dpif.c:1048)
    by 0xE5B49A: dpif_probe_feature (dpif.c:965)
    by 0xDD6BF5: check_ct_orig_tuple (ofproto-dpif.c:1557)
    by 0xDD41EC: check_support (ofproto-dpif.c:1590)
    by 0xDD3BF3: open_dpif_backer (ofproto-dpif.c:818)
    by 0xDC8467: construct (ofproto-dpif.c:1605)
    by 0xDAD6BB: ofproto_create (ofproto.c:549)
    by 0xD96A19: bridge_reconfigure (bridge.c:877)
    by 0xD9625D: bridge_run (bridge.c:3324)
    by 0xDA5829: main (ovs-vswitchd.c:127)
  Address 0x1ffefe36a5 is on thread 1's stack
  in frame williamtu#4, created by dpif_netlink_operate__ (dpif-netlink.c:1839)
  Uninitialised value was created by a stack allocation
    at 0xEB87D0: odp_flow_key_from_flow__ (odp-util.c:5996)

Fixes: daf4d3c ("odp: Support conntrack orig tuple key.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
PlumeTomi pushed a commit to PlumeTomi/ovs-ebpf that referenced this pull request Aug 8, 2021
Should use flow->actions not &flow->actions.

here is ASAN report:
=================================================================
==57189==ERROR: AddressSanitizer: stack-buffer-overflow on address 0xffff428fa0e8 at pc 0xffff7f61a520 bp 0xffff428f9420 sp 0xffff428f9498 READ of size 196 at 0xffff428fa0e8 thread T150 (revalidator22)
    #0 0xffff7f61a51f in __interceptor_memcpy (/lib64/libasan.so.4+0xa251f)
    williamtu#1 0xaaaad26a3b2b in ofpbuf_put lib/ofpbuf.c:426
    williamtu#2 0xaaaad26a30cb in ofpbuf_clone_data_with_headroom lib/ofpbuf.c:248
    williamtu#3 0xaaaad26a2e77 in ofpbuf_clone_with_headroom lib/ofpbuf.c:218
    williamtu#4 0xaaaad26a2dc3 in ofpbuf_clone lib/ofpbuf.c:208
    williamtu#5 0xaaaad23e3993 in ukey_set_actions ofproto/ofproto-dpif-upcall.c:1640
    williamtu#6 0xaaaad23e3f03 in ukey_create__ ofproto/ofproto-dpif-upcall.c:1696
    williamtu#7 0xaaaad23e553f in ukey_create_from_dpif_flow ofproto/ofproto-dpif-upcall.c:1806
    williamtu#8 0xaaaad23e65fb in ukey_acquire ofproto/ofproto-dpif-upcall.c:1984
    williamtu#9 0xaaaad23eb583 in revalidate ofproto/ofproto-dpif-upcall.c:2625
    williamtu#10 0xaaaad23dee5f in udpif_revalidator ofproto/ofproto-dpif-upcall.c:1076
    williamtu#11 0xaaaad26b84ef in ovsthread_wrapper lib/ovs-thread.c:708
    williamtu#12 0xffff7e74a8bb in start_thread (/lib64/libpthread.so.0+0x78bb)
    williamtu#13 0xffff7e0665cb in thread_start (/lib64/libc.so.6+0xd55cb)

Address 0xffff428fa0e8 is located in stack of thread T150 (revalidator22) at offset 328 in frame
    #0 0xaaaad23e4cab in ukey_create_from_dpif_flow ofproto/ofproto-dpif-upcall.c:1762

  This frame has 4 object(s):
    [32, 96) 'actions'
    [128, 192) 'buf'
    [224, 328) 'full_flow'
    [384, 2432) 'stub' <== Memory access at offset 328 partially underflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported) Thread T150 (revalidator22) created by T0 here:
    #0 0xffff7f5b0f7f in __interceptor_pthread_create (/lib64/libasan.so.4+0x38f7f)
    williamtu#1 0xaaaad26b891f in ovs_thread_create lib/ovs-thread.c:792
    williamtu#2 0xaaaad23dc62f in udpif_start_threads ofproto/ofproto-dpif-upcall.c:639
    williamtu#3 0xaaaad23daf87 in ofproto_set_flow_table ofproto/ofproto-dpif-upcall.c:446
    williamtu#4 0xaaaad230ff7f in dpdk_evs_cfg_set vswitchd/bridge.c:1134
    williamtu#5 0xaaaad2310097 in bridge_reconfigure vswitchd/bridge.c:1148
    williamtu#6 0xaaaad23279d7 in bridge_run vswitchd/bridge.c:3944
    williamtu#7 0xaaaad23365a3 in main vswitchd/ovs-vswitchd.c:240
    williamtu#8 0xffff7dfb1adf in __libc_start_main (/lib64/libc.so.6+0x20adf)
    williamtu#9 0xaaaad230a3d3  (/usr/sbin/ovs-vswitchd-2.7.0-1.1.RC5.001.asan+0x26f3d3)

SUMMARY: AddressSanitizer: stack-buffer-overflow (/lib64/libasan.so.4+0xa251f) in __interceptor_memcpy Shadow bytes around the buggy address:
  0x200fe851f3c0: 00 00 00 00 f1 f1 f1 f1 f8 f2 f2 f2 00 00 00 00
  0x200fe851f3d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f3e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f3f0: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00
  0x200fe851f400: f2 f2 f2 f2 f8 f8 f8 f8 f8 f8 f8 f8 f2 f2 f2 f2
=>0x200fe851f410: 00 00 00 00 00 00 00 00 00 00 00 00 00[f2]f2 f2
  0x200fe851f420: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200fe851f460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==57189==ABORTING

Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Linhaifeng <haifeng.lin@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
PlumeTomi pushed a commit to PlumeTomi/ovs-ebpf that referenced this pull request Aug 8, 2021
…LE_IPV*.

Both ovs_key_ct_tuple_ipv* structures contains padding at the end
that mast be cleared before passing attributes to kernel:

 Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
    at 0x566A607: sendmsg (sendmsg.c:28)
    by 0xFC95CE: nl_sock_transact_multiple__ (netlink-socket.c:858)
    by 0xFC8580: nl_sock_transact_multiple (netlink-socket.c:1079)
    by 0xFC83FF: nl_transact_multiple (netlink-socket.c:1839)
    by 0xFA8648: dpif_netlink_operate__ (dpif-netlink.c:1926)
    by 0xFA789F: dpif_netlink_operate_chunks (dpif-netlink.c:2219)
    by 0xFA25CB: dpif_netlink_operate (dpif-netlink.c:2278)
    by 0xE5BB4C: dpif_operate (dpif.c:1377)
    by 0xE5B7F6: dpif_flow_put (dpif.c:1048)
    by 0xE5B49A: dpif_probe_feature (dpif.c:965)
    by 0xDD6BF5: check_ct_orig_tuple (ofproto-dpif.c:1557)
    by 0xDD41EC: check_support (ofproto-dpif.c:1590)
    by 0xDD3BF3: open_dpif_backer (ofproto-dpif.c:818)
    by 0xDC8467: construct (ofproto-dpif.c:1605)
    by 0xDAD6BB: ofproto_create (ofproto.c:549)
    by 0xD96A19: bridge_reconfigure (bridge.c:877)
    by 0xD9625D: bridge_run (bridge.c:3324)
    by 0xDA5829: main (ovs-vswitchd.c:127)
  Address 0x1ffefe36a5 is on thread 1's stack
  in frame williamtu#4, created by dpif_netlink_operate__ (dpif-netlink.c:1839)
  Uninitialised value was created by a stack allocation
    at 0xEB87D0: odp_flow_key_from_flow__ (odp-util.c:5996)

Fixes: daf4d3c ("odp: Support conntrack orig tuple key.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
PlumeTomi pushed a commit to PlumeTomi/ovs-ebpf that referenced this pull request Aug 8, 2021
OVS uses memcmp to compare actions of existing and new flows, but
'struct ofp_ed_prop_nsh_md_type' and corresponding ofpact structure has
3 bytes of padding that never initialized and passed around within OF
data structures and messages.

  Uninitialized bytes in MemcmpInterceptorCommon
    at offset 21 inside [0x7090000003f8, 136)
  WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4a184e in bcmp (vswitchd/ovs-vswitchd+0x4a184e)
    williamtu#1 0x896c8a in ofpacts_equal lib/ofp-actions.c:9121:31
    williamtu#2 0x564403 in replace_rule_finish ofproto/ofproto.c:5650:37
    williamtu#3 0x563462 in add_flow_finish ofproto/ofproto.c:5218:13
    williamtu#4 0x54a1ff in ofproto_flow_mod_finish ofproto/ofproto.c:8091:17
    williamtu#5 0x5433b2 in handle_flow_mod__ ofproto/ofproto.c:6216:17
    williamtu#6 0x56a2fc in handle_flow_mod ofproto/ofproto.c:6190:17
    williamtu#7 0x565bda in handle_single_part_openflow ofproto/ofproto.c:8504:16
    williamtu#8 0x540b25 in handle_openflow ofproto/ofproto.c:8685:21
    williamtu#9 0x6697fd in ofconn_run ofproto/connmgr.c:1329:13
    williamtu#10 0x668e6e in connmgr_run ofproto/connmgr.c:356:9
    williamtu#11 0x53f1bc in ofproto_run ofproto/ofproto.c:1890:5
    williamtu#12 0x4ead0c in bridge_run__ vswitchd/bridge.c:3250:9
    williamtu#13 0x4e9bc8 in bridge_run vswitchd/bridge.c:3309:5
    williamtu#14 0x51c072 in main vswitchd/ovs-vswitchd.c:127:9
    williamtu#15 0x7f23a99011a2 in __libc_start_main (/lib64/libc.so.6)
    williamtu#16 0x46b92d in _start (vswitchd/ovs-vswitchd+0x46b92d)

  Uninitialized value was stored to memory at
    #0 0x4745aa in __msan_memcpy.part.0 (vswitchd/ovs-vswitchd)
    williamtu#1 0x54529f in rule_actions_create ofproto/ofproto.c:3134:5
    williamtu#2 0x54915e in ofproto_rule_create ofproto/ofproto.c:5284:11
    williamtu#3 0x55d419 in add_flow_init ofproto/ofproto.c:5123:17
    williamtu#4 0x54841f in ofproto_flow_mod_init ofproto/ofproto.c:7987:17
    williamtu#5 0x543250 in handle_flow_mod__ ofproto/ofproto.c:6206:13
    williamtu#6 0x56a2fc in handle_flow_mod ofproto/ofproto.c:6190:17
    williamtu#7 0x565bda in handle_single_part_openflow ofproto/ofproto.c:8504:16
    williamtu#8 0x540b25 in handle_openflow ofproto/ofproto.c:8685:21
    williamtu#9 0x6697fd in ofconn_run ofproto/connmgr.c:1329:13
    williamtu#10 0x668e6e in connmgr_run ofproto/connmgr.c:356:9
    williamtu#11 0x53f1bc in ofproto_run ofproto/ofproto.c:1890:5
    williamtu#12 0x4ead0c in bridge_run__ vswitchd/bridge.c:3250:9
    williamtu#13 0x4e9bc8 in bridge_run vswitchd/bridge.c:3309:5
    williamtu#14 0x51c072 in main vswitchd/ovs-vswitchd.c:127:9
    williamtu#15 0x7f23a99011a2 in __libc_start_main (/lib64/libc.so.6)

  Uninitialized value was created by an allocation of 'ofpacts_stub'
  in the stack frame of function 'handle_flow_mod'
    #0 0x569e80 in handle_flow_mod ofproto/ofproto.c:6170

This could cause issues with flow modifications or other operations.

To reproduce, some NSH tests could be run under valgrind or clang
MemorySantizer. Ex. "nsh - md1 encap over a veth link" test.

Fix that by clearing padding bytes while encoding and decoding.
OVS will still accept OF messages with non-zero padding from
controllers.

New tests added to tests/ofp-actions.at.

Fixes: 1fc11c5 ("Generic encap and decap support for NSH")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>
PlumeTomi pushed a commit to PlumeTomi/ovs-ebpf that referenced this pull request Aug 8, 2021
If datapath flow doesn't have one of the fields of gtpu metadata, e.g.
'tunnel(gtpu())', uninitialized stack memory will be used instead.

 ==3485429==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x853a1b in format_u8x lib/odp-util.c:3474:13
    williamtu#1 0x86ee9c in format_odp_tun_gtpu_opt lib/odp-util.c:3713:5
    williamtu#2 0x86a099 in format_odp_tun_attr lib/odp-util.c:3973:13
    williamtu#3 0x83afe6 in format_odp_key_attr__ lib/odp-util.c:4179:9
    williamtu#4 0x838afb in odp_flow_format lib/odp-util.c:4563:17
    williamtu#5 0x738422 in log_flow_message lib/dpif.c:1750:5
    williamtu#6 0x738e2f in log_flow_put_message lib/dpif.c:1784:9
    williamtu#7 0x7371a4 in dpif_operate lib/dpif.c:1377:21
    williamtu#8 0x7363ef in dpif_flow_put lib/dpif.c:1035:5
    williamtu#9 0xc7aab7 in dpctl_put_flow lib/dpctl.c:1171:13
    williamtu#10 0xc65a4f in dpctl_unixctl_handler lib/dpctl.c:2701:17
    williamtu#11 0xaaad04 in process_command lib/unixctl.c:308:13
    williamtu#12 0xaa87f7 in run_connection lib/unixctl.c:342:17
    williamtu#13 0xaa842e in unixctl_server_run lib/unixctl.c:393:21
    williamtu#14 0x51c09c in main vswitchd/ovs-vswitchd.c:128:9
    williamtu#15 0x7f88344391a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)
    williamtu#16 0x46b92d in _start (vswitchd/ovs-vswitchd+0x46b92d)

  Uninitialized value was stored to memory at
    #0 0x87da17 in scan_gtpu_metadata lib/odp-util.c:5221:27
    williamtu#1 0x874588 in parse_odp_key_mask_attr__ lib/odp-util.c:5862:9
    williamtu#2 0x83ee14 in parse_odp_key_mask_attr lib/odp-util.c:5808:18
    williamtu#3 0x83e8b5 in odp_flow_from_string lib/odp-util.c:6065:18
    williamtu#4 0xc7a4f3 in dpctl_put_flow lib/dpctl.c:1145:13
    williamtu#5 0xc65a4f in dpctl_unixctl_handler lib/dpctl.c:2701:17
    williamtu#6 0xaaad04 in process_command lib/unixctl.c:308:13
    williamtu#7 0xaa87f7 in run_connection lib/unixctl.c:342:17
    williamtu#8 0xaa842e in unixctl_server_run lib/unixctl.c:393:21
    williamtu#9 0x51c09c in main vswitchd/ovs-vswitchd.c:128:9
    williamtu#10 0x7f88344391a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)

  Uninitialized value was created by an allocation of 'msgtype_ma' in the
  stack frame of function 'scan_gtpu_metadata'
    #0 0x87d440 in scan_gtpu_metadata lib/odp-util.c:5187

Fix that by initializing fields to all zeroes by default.

Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=21426
Fixes: 3c6d05a ("userspace: Add GTP-U support.")
Acked-by: Yi Yang <yangyi01@inspur.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
PlumeTomi pushed a commit to PlumeTomi/ovs-ebpf that referenced this pull request Aug 8, 2021
Length of nested attributes must be checked before storing to the
header.  If current length exceeds the maximum value parsing should
fail, otherwise the length value will be truncated leading to
corrupted netlink message and out-of-bound memory accesses:

  ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6310002cc838
         at pc 0x000000575470 bp 0x7ffc6c322d60 sp 0x7ffc6c322d58
  READ of size 1 at 0x6310002cc838 thread T0
  SCARINESS: 12 (1-byte-read-heap-buffer-overflow)
    #0 0x57546f in format_generic_odp_key lib/odp-util.c:2738:39
    williamtu#1 0x559e70 in check_attr_len lib/odp-util.c:3572:13
    williamtu#2 0x56581a in format_odp_key_attr lib/odp-util.c:4392:9
    williamtu#3 0x5563b9 in format_odp_action lib/odp-util.c:1192:9
    williamtu#4 0x555d75 in format_odp_actions lib/odp-util.c:1279:13
    ...

Fix that by checking the length of nested netlink attributes before
updating 'nla_len' inside the header.  Additionally introduced
assertion inside nl_msg_end_nested() to catch this kind of issues
before actual overflow happened.

Credit to OSS-Fuzz.

Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=20003
Fixes: 65da723 ("odp-util: Format tunnel attributes directly from netlink.")
Acked-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
PlumeTomi pushed a commit to PlumeTomi/ovs-ebpf that referenced this pull request Aug 8, 2021
When idl removes orphan rows, those rows are inserted into the
'track_list'.  This allows iterators such as *_FOR_EACH_TRACKED () to
return orphan rows that never had any data to the IDL user.  In this
case, it is difficult for the user to understand whether it is a row
with no data (there was no "insert" / "modify" for this row) or it is
a row with zero data (columns were cleared by DB transaction).

The main problem with this condition is that rows without data will
have NULL pointers instead of references that should be there according
to the database schema.  For example, ovn-controller might crash:

 ERROR: AddressSanitizer: SEGV on unknown address 0x000000000100
       (pc 0x00000055e9b2 bp 0x7ffef6180880 sp 0x7ffef6180860 T0)
 The signal is caused by a READ memory access.
 Hint: address points to the zero page.
    #0 0x55e9b1 in handle_deleted_lport /controller/binding.c
    williamtu#1 0x55e903 in handle_deleted_vif_lport /controller/binding.c:2072:5
    williamtu#2 0x55e059 in binding_handle_port_binding_changes /controller/binding.c:2155:23
    williamtu#3 0x5a6395 in runtime_data_sb_port_binding_handler /controller/ovn-controller.c:1454:10
    williamtu#4 0x5e15b3 in engine_compute /lib/inc-proc-eng.c:306:18
    williamtu#5 0x5e0faf in engine_run_node /lib/inc-proc-eng.c:352:14
    williamtu#6 0x5e0e04 in engine_run /lib/inc-proc-eng.c:377:9
    williamtu#7 0x5a03de in main /controller/ovn-controller.c
    williamtu#8 0x7f4fd9c991a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)
    williamtu#9 0x483f0d in _start (/controller/ovn-controller+0x483f0d)

It doesn't make much sense to return non-real rows to the user, so it's
best to exclude them from iteration.

Test included.  Without the fix, provided test will print empty orphan
rows that was never received by idl as tracked changes.

Fixes: 932104f ("ovsdb-idl: Add support for change tracking.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>
PlumeTomi pushed a commit to PlumeTomi/ovs-ebpf that referenced this pull request Aug 8, 2021
While decoding RAW_ENCAP action, decode_ed_prop() might re-allocate
ofpbuf if there is no enough space left.  However, function
'decode_NXAST_RAW_ENCAP' continues to use old pointer to 'encap'
structure leading to write-after-free and incorrect decoding.

  ==3549105==ERROR: AddressSanitizer: heap-use-after-free on address
  0x60600000011a at pc 0x0000005f6cc6 bp 0x7ffc3a2d4410 sp 0x7ffc3a2d4408
  WRITE of size 2 at 0x60600000011a thread T0
    #0 0x5f6cc5 in decode_NXAST_RAW_ENCAP lib/ofp-actions.c:4461:20
    williamtu#1 0x5f0551 in ofpact_decode ./lib/ofp-actions.inc2:4777:16
    williamtu#2 0x5ed17c in ofpacts_decode lib/ofp-actions.c:7752:21
    williamtu#3 0x5eba9a in ofpacts_pull_openflow_actions__ lib/ofp-actions.c:7791:13
    williamtu#4 0x5eb9fc in ofpacts_pull_openflow_actions lib/ofp-actions.c:7835:12
    williamtu#5 0x64bb8b in ofputil_decode_packet_out lib/ofp-packet.c:1113:17
    williamtu#6 0x65b6f4 in ofp_print_packet_out lib/ofp-print.c:148:13
    williamtu#7 0x659e3f in ofp_to_string__ lib/ofp-print.c:1029:16
    williamtu#8 0x659b24 in ofp_to_string lib/ofp-print.c:1244:21
    williamtu#9 0x65a28c in ofp_print lib/ofp-print.c:1288:28
    williamtu#10 0x540d11 in ofctl_ofp_parse utilities/ovs-ofctl.c:2814:9
    williamtu#11 0x564228 in ovs_cmdl_run_command__ lib/command-line.c:247:17
    williamtu#12 0x56408a in ovs_cmdl_run_command lib/command-line.c:278:5
    williamtu#13 0x5391ae in main utilities/ovs-ofctl.c:179:9
    williamtu#14 0x7f6911ce9081 in __libc_start_main (/lib64/libc.so.6+0x27081)
    williamtu#15 0x461fed in _start (utilities/ovs-ofctl+0x461fed)

Fix that by getting a new pointer before using.

Credit to OSS-Fuzz.

Fuzzer regression test will fail only with AddressSanitizer enabled.

Reported-at: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=27851
Fixes: f839892 ("OF support and translation of generic encap and decap")
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
PlumeTomi pushed a commit to PlumeTomi/ovs-ebpf that referenced this pull request Aug 8, 2021
ovsdb_cs_send_transaction() returns the pointer to the same
'request_id' object that is used internally.  This leads to
situation where transaction in idl and CS module has the
same 'request_id' object.  However, CS module is able to
destroy this transaction id at any time, e.g. if connection
state chnaged, but idl transaction might be still around at
this moment and application might still use it.

Found by running 'make check-ovsdb-cluster' with AddressSanitizer:

  ==79922==ERROR: AddressSanitizer: heap-use-after-free on address
  0x604000167a98 at pc 0x000000626acf bp 0x7ffcdb38a4c0 sp 0x7ffcdb38a4b8
  READ of size 8 at 0x604000167a98 thread T0
    #0 0x626ace in json_destroy lib/json.c:354:18
    williamtu#1 0x56d1ab in ovsdb_idl_txn_destroy lib/ovsdb-idl.c:2528:5
    williamtu#2 0x53a908 in do_vsctl utilities/ovs-vsctl.c:3008:5
    williamtu#3 0x539251 in main utilities/ovs-vsctl.c:203:17
    williamtu#4 0x7f7f7e376081 in __libc_start_main (/lib64/libc.so.6+0x27081)
    williamtu#5 0x461fed in _start (utilities/ovs-vsctl+0x461fed)

  0x604000167a98 is located 8 bytes inside of 40-byte
                    region [0x604000167a90,0x604000167ab8)
  freed by thread T0 here:
    #0 0x503ac7 in free (utilities/ovs-vsctl+0x503ac7)
    williamtu#1 0x626aae in json_destroy lib/json.c:378:9
    williamtu#2 0x6adfa2 in ovsdb_cs_run lib/ovsdb-cs.c:625:13
    williamtu#3 0x567731 in ovsdb_idl_run lib/ovsdb-idl.c:394:5
    williamtu#4 0x56fed1 in ovsdb_idl_txn_commit_block lib/ovsdb-idl.c:3187:9
    williamtu#5 0x53a4df in do_vsctl utilities/ovs-vsctl.c:2898:14
    williamtu#6 0x539251 in main utilities/ovs-vsctl.c:203:17
    williamtu#7 0x7f7f7e376081 in __libc_start_main

  previously allocated by thread T0 here:
    #0 0x503dcf in malloc (utilities/ovs-vsctl+0x503dcf)
    williamtu#1 0x594656 in xmalloc lib/util.c:138:15
    williamtu#2 0x626431 in json_create lib/json.c:1451:25
    williamtu#3 0x626972 in json_integer_create lib/json.c:263:25
    williamtu#4 0x62da0f in jsonrpc_create_id lib/jsonrpc.c:563:12
    williamtu#5 0x62d9a8 in jsonrpc_create_request lib/jsonrpc.c:570:23
    williamtu#6 0x6af3a6 in ovsdb_cs_send_transaction lib/ovsdb-cs.c:1357:35
    williamtu#7 0x56e3d5 in ovsdb_idl_txn_commit lib/ovsdb-idl.c:3147:27
    williamtu#8 0x56fea9 in ovsdb_idl_txn_commit_block lib/ovsdb-idl.c:3186:22
    williamtu#9 0x53a4df in do_vsctl utilities/ovs-vsctl.c:2898:14
    williamtu#10 0x539251 in main utilities/ovs-vsctl.c:203:17
    williamtu#11 0x7f7f7e376081 in __libc_start_main

Fixes: 1c337c4 ("ovsdb-idl: Break into two layers.")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
PlumeTomi pushed a commit to PlumeTomi/ovs-ebpf that referenced this pull request Aug 8, 2021
…nected.

The symptom of this issue is that OVS bridge looses its IP address on
restart.

Simple reproducer:
 0. start ovsdb-server and ovs-vswitchd
 1. ovs-vsctl add-br br0
 2. ifconfig br0 10.0.0.1 up
 3. ovs-appctl -t ovs-vswitchd exit
 4. start ovs-vswitchd back.

After step williamtu#3 ovs-vswitchd is down, but br0 interface exists and
has configured IP address.  After step williamtu#4 there is no IP address
on the port br0.

What happened:
1. ovsdb-cs connects to the database via ovsdb-idl and requests
   database lock.
   --> get_schema for _Server database
   --> lock request

2. ovsdb-cs receives schema for the _Server database.  And sends
   monitor request.
   <-- schema for _Server
   --> monitor_cond for _Server

3. ovsdb-cs receives lock reply.
   <-- locked
   At this point ovsdb-cs generates OVSDB_CS_EVENT_TYPE_LOCKED
   event and passes it to ovsdb-idl.  ovsdb-idl increases change_seqno.

4. ovsdb_idl_has_ever_connected() is 'true' now, because change_seqno
   is not zero.

5. ovs-vswitchd decides that it has connection with database and
   all the initial data, therefore initiates configuration of bridges.
   bridge_run():ovsdb_idl_has_ever_connected() --> true

6. Since monitor request for the Open_vSwitch database is not even
   sent yet, the database is empty.  This leads to removal of all the
   ports and all other resources.

7. When data finally received, ovs-vswitchd re-creates bridges and
   ports, but IP addresses can not be restored.

While splitting out ovsdb-cs from ovsdb-idl one part of the logic
was lost.  Particularly, before the split, ovsdb-idl updated
change_seqno only in MONITORING state.

Restoring the logic by updating the change_seqno only if may send
transaction, i.e. lock is ours and ovsdb-cs is in the MONITORING
state.  This matches with the main purpose of increasing change_seqno
at this point, i.e. to force the client to re-try the transaction.
With this change ovsdb_idl_has_ever_connected() remains 'false'
until the first monitor reply with the actual data received.

This issue was reported several times during the last couple of weeks.

Reported-at: https://bugzilla.redhat.com/1968445
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-June/383512.html
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2021-June/051222.html
Fixes: 1c337c4 ("ovsdb-idl: Break into two layers.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants