Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pbrd: add pbr rule and table range commands to cli #4

Merged
merged 1 commit into from
Feb 5, 2018

Conversation

dslicenc
Copy link

@dslicenc dslicenc commented Feb 5, 2018

Signed-off-by: Don Slice dslice@cumulusnetworks.com

Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
@donaldsharp donaldsharp merged commit 0ac921a into donaldsharp:PBRD Feb 5, 2018
@dslicenc dslicenc deleted the PBRD branch February 5, 2018 16:28
donaldsharp pushed a commit that referenced this pull request Apr 23, 2018
* commit '4709b4faa43907ed9fcaf5920e56b6664f0523cf':
  bgpd: peer hash expands until we are out of memory
donaldsharp pushed a commit that referenced this pull request Apr 23, 2018
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>

Ticket: CM-17175

If you are doing multipath in a VRF and bounce one of the multipaths for
a prefix, bgp is not updating the zebra entry for that prefix with the
new multipaths. We start with:

cel-redxp-10# show bgp vrf RED  ipv4 unicast 6.0.0.16/32
BGP routing table entry for 6.0.0.16/32
Paths: (4 available, best #4, table RED)
  Advertised to non peer-group peers:
  spine-1(swp1) spine-2(swp2) spine-3(swp3) spine-4(swp4)
  104 65104 65002
    fe80::202:ff:fe00:2d from spine-4(swp4) (6.0.0.12)
    (fe80::202:ff:fe00:2d) (used)
      Origin incomplete, localpref 100, valid, external, multipath, bestpath-from-AS 104
      AddPath ID: RX 0, TX 21
      Last update: Tue Aug  1 18:28:33 2017

  102 65104 65002
    fe80::202:ff:fe00:25 from spine-2(swp2) (6.0.0.10)
    (fe80::202:ff:fe00:25) (used)
      Origin incomplete, localpref 100, valid, external, multipath, bestpath-from-AS 102
      AddPath ID: RX 0, TX 20
      Last update: Tue Aug  1 18:28:33 2017

  103 65104 65002
    fe80::202:ff:fe00:29 from spine-3(swp3) (6.0.0.11)
    (fe80::202:ff:fe00:29) (used)
      Origin incomplete, localpref 100, valid, external, multipath, bestpath-from-AS 103
      AddPath ID: RX 0, TX 17
      Last update: Tue Aug  1 18:28:33 2017

  101 65104 65002
    fe80::202:ff:fe00:21 from spine-1(swp1) (6.0.0.9)
    (fe80::202:ff:fe00:21) (used)
      Origin incomplete, localpref 100, valid, external, multipath, bestpath-from-AS 101, best
      AddPath ID: RX 0, TX 8
      Last update: Tue Aug  1 18:28:33 2017

cel-redxp-10#
cel-redxp-10# show ip route vrf RED 6.0.0.16/32
Routing entry for 6.0.0.16/32
  Known via "bgp", distance 20, metric 0, vrf RED, best
  Last update 00:00:25 ago
  * fe80::202:ff:fe00:21, via swp1
  * fe80::202:ff:fe00:25, via swp2
  * fe80::202:ff:fe00:29, via swp3
  * fe80::202:ff:fe00:2d, via swp4

cel-redxp-10#

And then on spine-1 we bounce all peers

spine-1# clear ip bgp *
spine-1#

On the leaf (cel-redxp-10) we remove the route from spine-1

cel-redxp-10# show ip route vrf RED 6.0.0.16/32
Routing entry for 6.0.0.16/32
  Known via "bgp", distance 20, metric 0, vrf RED, best
  Last update 00:00:01 ago
  * fe80::202:ff:fe00:25, via swp2
  * fe80::202:ff:fe00:29, via swp3
  * fe80::202:ff:fe00:2d, via swp4

cel-redxp-10#

So far so good. The problem is when the session to spine-1 comes back up
bgp will mark the flag from spine-1 as multipath but does not update
zebra. We end up in a state where BGP has 4 paths flags as multipath but
only 3 paths are in the RIB.
donaldsharp pushed a commit that referenced this pull request Jun 19, 2018
donaldsharp pushed a commit that referenced this pull request Feb 24, 2019
If path->net is NULL in the bgp_path_info_free() function, then
bgpd would crash in bgp_addpath_free_info_data() with the following
backtrace:

 (gdb) bt
 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
 #1  0x00007ff7b267a42a in __GI_abort () at abort.c:89
 #2  0x00007ff7b39c1ca0 in core_handler (signo=11, siginfo=0x7ffff66414f0, context=<optimized out>) at lib/sigevent.c:249
 #3  <signal handler called>
 #4  idalloc_free_to_pool (pool_ptr=pool_ptr@entry=0x0, id=3) at lib/id_alloc.c:368
 #5  0x0000560096246688 in bgp_addpath_free_info_data (d=d@entry=0x560098665468, nd=0x0) at bgpd/bgp_addpath.c:100
 #6  0x00005600961bb522 in bgp_path_info_free (path=0x560098665400) at bgpd/bgp_route.c:252
 #7  bgp_path_info_unlock (path=0x560098665400) at bgpd/bgp_route.c:276
 #8  0x00005600961bb719 in bgp_path_info_reap (rn=rn@entry=0x5600986b2110, pi=pi@entry=0x560098665400) at bgpd/bgp_route.c:320
 #9  0x00005600961bf4db in bgp_process_main_one (safi=SAFI_MPLS_VPN, afi=AFI_IP, rn=0x5600986b2110, bgp=0x560098587320) at bgpd/bgp_route.c:2476
 #10 bgp_process_wq (wq=<optimized out>, data=0x56009869b8f0) at bgpd/bgp_route.c:2503
 #11 0x00007ff7b39d5fcc in work_queue_run (thread=0x7ffff6641e10) at lib/workqueue.c:294
 #12 0x00007ff7b39ce3b1 in thread_call (thread=thread@entry=0x7ffff6641e10) at lib/thread.c:1606
 #13 0x00007ff7b39a3538 in frr_run (master=0x5600980795b0) at lib/libfrr.c:1011
 #14 0x000056009618a5a3 in main (argc=3, argv=0x7ffff6642078) at bgpd/bgp_main.c:481

Add a null-check protection to fix this problem.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
donaldsharp pushed a commit that referenced this pull request Feb 24, 2019
If path->net is NULL in the bgp_path_info_free() function, then
bgpd would crash in bgp_addpath_free_info_data() with the following
backtrace:

 (gdb) bt
 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
 #1  0x00007ff7b267a42a in __GI_abort () at abort.c:89
 #2  0x00007ff7b39c1ca0 in core_handler (signo=11, siginfo=0x7ffff66414f0, context=<optimized out>) at lib/sigevent.c:249
 #3  <signal handler called>
 #4  idalloc_free_to_pool (pool_ptr=pool_ptr@entry=0x0, id=3) at lib/id_alloc.c:368
 #5  0x0000560096246688 in bgp_addpath_free_info_data (d=d@entry=0x560098665468, nd=0x0) at bgpd/bgp_addpath.c:100
 #6  0x00005600961bb522 in bgp_path_info_free (path=0x560098665400) at bgpd/bgp_route.c:252
 #7  bgp_path_info_unlock (path=0x560098665400) at bgpd/bgp_route.c:276
 #8  0x00005600961bb719 in bgp_path_info_reap (rn=rn@entry=0x5600986b2110, pi=pi@entry=0x560098665400) at bgpd/bgp_route.c:320
 #9  0x00005600961bf4db in bgp_process_main_one (safi=SAFI_MPLS_VPN, afi=AFI_IP, rn=0x5600986b2110, bgp=0x560098587320) at bgpd/bgp_route.c:2476
 #10 bgp_process_wq (wq=<optimized out>, data=0x56009869b8f0) at bgpd/bgp_route.c:2503
 #11 0x00007ff7b39d5fcc in work_queue_run (thread=0x7ffff6641e10) at lib/workqueue.c:294
 #12 0x00007ff7b39ce3b1 in thread_call (thread=thread@entry=0x7ffff6641e10) at lib/thread.c:1606
 #13 0x00007ff7b39a3538 in frr_run (master=0x5600980795b0) at lib/libfrr.c:1011
 #14 0x000056009618a5a3 in main (argc=3, argv=0x7ffff6642078) at bgpd/bgp_main.c:481

Add a null-check protection to fix this problem.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
donaldsharp added a commit that referenced this pull request Oct 10, 2019
Our Address Sanitizer CI is finding this issue:
error	09-Oct-2019 19:28:33	r4: bgpd triggered an exception by AddressSanitizer
error	09-Oct-2019 19:28:33	ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffdd425b060 at pc 0x00000068575f bp 0x7ffdd4258550 sp 0x7ffdd4258540
error	09-Oct-2019 19:28:33	READ of size 1 at 0x7ffdd425b060 thread T0
error	09-Oct-2019 19:28:33	    #0 0x68575e in prefix_cmp lib/prefix.c:776
error	09-Oct-2019 19:28:33	    #1 0x5889f5 in rfapiItBiIndexSearch bgpd/rfapi/rfapi_import.c:2230
error	09-Oct-2019 19:28:33	    #2 0x5889f5 in rfapiBgpInfoFilteredImportVPN bgpd/rfapi/rfapi_import.c:3520
error	09-Oct-2019 19:28:33	    #3 0x58b909 in rfapiProcessWithdraw bgpd/rfapi/rfapi_import.c:4071
error	09-Oct-2019 19:28:33	    #4 0x4c459b in bgp_withdraw bgpd/bgp_route.c:3736
error	09-Oct-2019 19:28:33	    #5 0x484122 in bgp_nlri_parse_vpn bgpd/bgp_mplsvpn.c:237
error	09-Oct-2019 19:28:33	    #6 0x497f52 in bgp_nlri_parse bgpd/bgp_packet.c:315
error	09-Oct-2019 19:28:33	    #7 0x49d06d in bgp_update_receive bgpd/bgp_packet.c:1598
error	09-Oct-2019 19:28:33	    #8 0x49d06d in bgp_process_packet bgpd/bgp_packet.c:2274
error	09-Oct-2019 19:28:33	    #9 0x6b9f54 in thread_call lib/thread.c:1531
error	09-Oct-2019 19:28:33	    #10 0x657037 in frr_run lib/libfrr.c:1052
error	09-Oct-2019 19:28:33	    #11 0x42d268 in main bgpd/bgp_main.c:486
error	09-Oct-2019 19:28:33	    #12 0x7f806032482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
error	09-Oct-2019 19:28:33	    #13 0x42bcc8 in _start (/usr/lib/frr/bgpd+0x42bcc8)
error	09-Oct-2019 19:28:33
error	09-Oct-2019 19:28:33	Address 0x7ffdd425b060 is located in stack of thread T0 at offset 240 in frame
error	09-Oct-2019 19:28:33	    #0 0x483945 in bgp_nlri_parse_vpn bgpd/bgp_mplsvpn.c:103
error	09-Oct-2019 19:28:33
error	09-Oct-2019 19:28:33	  This frame has 5 object(s):
error	09-Oct-2019 19:28:33	    [32, 36) 'label'
error	09-Oct-2019 19:28:33	    [96, 108) 'rd_as'
error	09-Oct-2019 19:28:33	    [160, 172) 'rd_ip'
error	09-Oct-2019 19:28:33	    [224, 240) 'prd' <== Memory access at offset 240 overflows this variable
error	09-Oct-2019 19:28:33	    [288, 336) 'p'
error	09-Oct-2019 19:28:33	HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
error	09-Oct-2019 19:28:33	      (longjmp and C++ exceptions *are* supported)
error	09-Oct-2019 19:28:33	SUMMARY: AddressSanitizer: stack-buffer-overflow lib/prefix.c:776 prefix_cmp
error	09-Oct-2019 19:28:33	Shadow bytes around the buggy address:
error	09-Oct-2019 19:28:33	  0x10003a8435b0: 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00
error	09-Oct-2019 19:28:33	  0x10003a8435c0: 00 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 f3 f3
error	09-Oct-2019 19:28:33	  0x10003a8435d0: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
error	09-Oct-2019 19:28:33	  0x10003a8435e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
error	09-Oct-2019 19:28:33	  0x10003a8435f0: f1 f1 04 f4 f4 f4 f2 f2 f2 f2 00 04 f4 f4 f2 f2
error	09-Oct-2019 19:28:33	=>0x10003a843600: f2 f2 00 04 f4 f4 f2 f2 f2 f2 00 00[f4]f4 f2 f2
error	09-Oct-2019 19:28:33	  0x10003a843610: f2 f2 00 00 00 00 00 00 f4 f4 f3 f3 f3 f3 00 00
error	09-Oct-2019 19:28:33	  0x10003a843620: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
error	09-Oct-2019 19:28:33	  0x10003a843630: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 02 f4
error	09-Oct-2019 19:28:33	  0x10003a843640: f4 f4 f2 f2 f2 f2 04 f4 f4 f4 f2 f2 f2 f2 00 00
error	09-Oct-2019 19:28:33	  0x10003a843650: f4 f4 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2 00 00
error	09-Oct-2019 19:28:33	Shadow byte legend (one shadow byte represents 8 application bytes):
error	09-Oct-2019 19:28:33	  Addressable:           00
error	09-Oct-2019 19:28:33	  Partially addressable: 01 02 03 04 05 06 07
error	09-Oct-2019 19:28:33	  Heap left redzone:       fa
error	09-Oct-2019 19:28:33	  Heap right redzone:      fb
error	09-Oct-2019 19:28:33	  Freed heap region:       fd
error	09-Oct-2019 19:28:33	  Stack left redzone:      f1
error	09-Oct-2019 19:28:33	  Stack mid redzone:       f2
error	09-Oct-2019 19:28:33	  Stack right redzone:     f3
error	09-Oct-2019 19:28:33	  Stack partial redzone:   f4
error	09-Oct-2019 19:28:33	  Stack after return:      f5
error	09-Oct-2019 19:28:33	  Stack use after scope:   f8
error	09-Oct-2019 19:28:33	  Global redzone:          f9
error	09-Oct-2019 19:28:33	  Global init order:       f6
error	09-Oct-2019 19:28:33	  Poisoned by user:        f7
error	09-Oct-2019 19:28:33	  Container overflow:      fc
error	09-Oct-2019 19:28:33	  Array cookie:            ac
error	09-Oct-2019 19:28:33	  Intra object redzone:    bb
error	09-Oct-2019 19:28:33	  ASan internal:           fe
error	09-Oct-2019 19:28:36	r3: Daemon bgpd not running

This is the result of this code pattern in rfapi/rfapi_import.c:

prefix_cmp((struct prefix *)&bpi_result->extra->vnc.import.rd,
	   (struct prefix *)prd))

Effectively prd or vnc.import.rd are `struct prefix_rd` which
are being typecast to a `struct prefix`.  Not a big deal except commit
1315d74 modified the prefix_cmp
function to allow for a sorted prefix_cmp.  In prefix_cmp
we were looking at the offset and shift.  In the case
of vnc we were passing a prefix length of 64 which is the exact length of
the remaining data structure for struct prefix_rd.  So we calculated
a offset of 8 and a shift of 0.  The data structures for the prefix
portion happened to be equal to 64 bits of data. So we checked that
with the memcmp got a 0 and promptly read off the end of the data
structure for the numcmp.  The fix is if shift is 0 that means thei
the memcmp has checked everything and there is nothing to do.

Please note: We will still crash if we set the prefixlen > then
~312 bits currently( ie if the prefixlen specifies a bit length
longer than the prefix length ).  I do not think there is
anything to do here( nor am I sure how to correct this either )
as that we are going to have some severe problems when we muck
up the prefixlen.

Fixes: FRRouting#5025
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp added a commit that referenced this pull request Oct 15, 2019
Our Address Sanitizer CI is finding this issue:
error	09-Oct-2019 19:28:33	r4: bgpd triggered an exception by AddressSanitizer
error	09-Oct-2019 19:28:33	ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffdd425b060 at pc 0x00000068575f bp 0x7ffdd4258550 sp 0x7ffdd4258540
error	09-Oct-2019 19:28:33	READ of size 1 at 0x7ffdd425b060 thread T0
error	09-Oct-2019 19:28:33	    #0 0x68575e in prefix_cmp lib/prefix.c:776
error	09-Oct-2019 19:28:33	    #1 0x5889f5 in rfapiItBiIndexSearch bgpd/rfapi/rfapi_import.c:2230
error	09-Oct-2019 19:28:33	    #2 0x5889f5 in rfapiBgpInfoFilteredImportVPN bgpd/rfapi/rfapi_import.c:3520
error	09-Oct-2019 19:28:33	    #3 0x58b909 in rfapiProcessWithdraw bgpd/rfapi/rfapi_import.c:4071
error	09-Oct-2019 19:28:33	    #4 0x4c459b in bgp_withdraw bgpd/bgp_route.c:3736
error	09-Oct-2019 19:28:33	    #5 0x484122 in bgp_nlri_parse_vpn bgpd/bgp_mplsvpn.c:237
error	09-Oct-2019 19:28:33	    #6 0x497f52 in bgp_nlri_parse bgpd/bgp_packet.c:315
error	09-Oct-2019 19:28:33	    #7 0x49d06d in bgp_update_receive bgpd/bgp_packet.c:1598
error	09-Oct-2019 19:28:33	    #8 0x49d06d in bgp_process_packet bgpd/bgp_packet.c:2274
error	09-Oct-2019 19:28:33	    #9 0x6b9f54 in thread_call lib/thread.c:1531
error	09-Oct-2019 19:28:33	    #10 0x657037 in frr_run lib/libfrr.c:1052
error	09-Oct-2019 19:28:33	    #11 0x42d268 in main bgpd/bgp_main.c:486
error	09-Oct-2019 19:28:33	    #12 0x7f806032482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
error	09-Oct-2019 19:28:33	    #13 0x42bcc8 in _start (/usr/lib/frr/bgpd+0x42bcc8)
error	09-Oct-2019 19:28:33
error	09-Oct-2019 19:28:33	Address 0x7ffdd425b060 is located in stack of thread T0 at offset 240 in frame
error	09-Oct-2019 19:28:33	    #0 0x483945 in bgp_nlri_parse_vpn bgpd/bgp_mplsvpn.c:103
error	09-Oct-2019 19:28:33
error	09-Oct-2019 19:28:33	  This frame has 5 object(s):
error	09-Oct-2019 19:28:33	    [32, 36) 'label'
error	09-Oct-2019 19:28:33	    [96, 108) 'rd_as'
error	09-Oct-2019 19:28:33	    [160, 172) 'rd_ip'
error	09-Oct-2019 19:28:33	    [224, 240) 'prd' <== Memory access at offset 240 overflows this variable
error	09-Oct-2019 19:28:33	    [288, 336) 'p'
error	09-Oct-2019 19:28:33	HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
error	09-Oct-2019 19:28:33	      (longjmp and C++ exceptions *are* supported)
error	09-Oct-2019 19:28:33	SUMMARY: AddressSanitizer: stack-buffer-overflow lib/prefix.c:776 prefix_cmp
error	09-Oct-2019 19:28:33	Shadow bytes around the buggy address:
error	09-Oct-2019 19:28:33	  0x10003a8435b0: 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00
error	09-Oct-2019 19:28:33	  0x10003a8435c0: 00 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 f3 f3
error	09-Oct-2019 19:28:33	  0x10003a8435d0: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
error	09-Oct-2019 19:28:33	  0x10003a8435e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
error	09-Oct-2019 19:28:33	  0x10003a8435f0: f1 f1 04 f4 f4 f4 f2 f2 f2 f2 00 04 f4 f4 f2 f2
error	09-Oct-2019 19:28:33	=>0x10003a843600: f2 f2 00 04 f4 f4 f2 f2 f2 f2 00 00[f4]f4 f2 f2
error	09-Oct-2019 19:28:33	  0x10003a843610: f2 f2 00 00 00 00 00 00 f4 f4 f3 f3 f3 f3 00 00
error	09-Oct-2019 19:28:33	  0x10003a843620: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
error	09-Oct-2019 19:28:33	  0x10003a843630: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 02 f4
error	09-Oct-2019 19:28:33	  0x10003a843640: f4 f4 f2 f2 f2 f2 04 f4 f4 f4 f2 f2 f2 f2 00 00
error	09-Oct-2019 19:28:33	  0x10003a843650: f4 f4 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2 00 00
error	09-Oct-2019 19:28:33	Shadow byte legend (one shadow byte represents 8 application bytes):
error	09-Oct-2019 19:28:33	  Addressable:           00
error	09-Oct-2019 19:28:33	  Partially addressable: 01 02 03 04 05 06 07
error	09-Oct-2019 19:28:33	  Heap left redzone:       fa
error	09-Oct-2019 19:28:33	  Heap right redzone:      fb
error	09-Oct-2019 19:28:33	  Freed heap region:       fd
error	09-Oct-2019 19:28:33	  Stack left redzone:      f1
error	09-Oct-2019 19:28:33	  Stack mid redzone:       f2
error	09-Oct-2019 19:28:33	  Stack right redzone:     f3
error	09-Oct-2019 19:28:33	  Stack partial redzone:   f4
error	09-Oct-2019 19:28:33	  Stack after return:      f5
error	09-Oct-2019 19:28:33	  Stack use after scope:   f8
error	09-Oct-2019 19:28:33	  Global redzone:          f9
error	09-Oct-2019 19:28:33	  Global init order:       f6
error	09-Oct-2019 19:28:33	  Poisoned by user:        f7
error	09-Oct-2019 19:28:33	  Container overflow:      fc
error	09-Oct-2019 19:28:33	  Array cookie:            ac
error	09-Oct-2019 19:28:33	  Intra object redzone:    bb
error	09-Oct-2019 19:28:33	  ASan internal:           fe
error	09-Oct-2019 19:28:36	r3: Daemon bgpd not running

This is the result of this code pattern in rfapi/rfapi_import.c:

prefix_cmp((struct prefix *)&bpi_result->extra->vnc.import.rd,
	   (struct prefix *)prd))

Effectively prd or vnc.import.rd are `struct prefix_rd` which
are being typecast to a `struct prefix`.  Not a big deal except commit
1315d74 modified the prefix_cmp
function to allow for a sorted prefix_cmp.  In prefix_cmp
we were looking at the offset and shift.  In the case
of vnc we were passing a prefix length of 64 which is the exact length of
the remaining data structure for struct prefix_rd.  So we calculated
a offset of 8 and a shift of 0.  The data structures for the prefix
portion happened to be equal to 64 bits of data. So we checked that
with the memcmp got a 0 and promptly read off the end of the data
structure for the numcmp.  The fix is if shift is 0 that means thei
the memcmp has checked everything and there is nothing to do.

Please note: We will still crash if we set the prefixlen > then
~312 bits currently( ie if the prefixlen specifies a bit length
longer than the prefix length ).  I do not think there is
anything to do here( nor am I sure how to correct this either )
as that we are going to have some severe problems when we muck
up the prefixlen.

Fixes: FRRouting#5025
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp added a commit that referenced this pull request Nov 1, 2019
Running with --enable-address-sanitizer I am seeing this:

=================================================================
==19520==ERROR: AddressSanitizer: heap-use-after-free on address 0x6020003ef850 at pc 0x7fe9b8f7b57b bp 0x7fffbac6f9c0 sp 0x7fffbac6f170
READ of size 6 at 0x6020003ef850 thread T0
    #0 0x7fe9b8f7b57a  (/lib/x86_64-linux-gnu/libasan.so.5+0xb857a)
    #1 0x55e33d1071e5 in bgp_process_mac_rescan_table bgpd/bgp_mac.c:159
    #2 0x55e33d107c09 in bgp_mac_rescan_evpn_table bgpd/bgp_mac.c:252
    #3 0x55e33d107e39 in bgp_mac_rescan_all_evpn_tables bgpd/bgp_mac.c:266
    #4 0x55e33d108270 in bgp_mac_remove_ifp_internal bgpd/bgp_mac.c:291
    #5 0x55e33d108893 in bgp_mac_del_mac_entry bgpd/bgp_mac.c:351
    #6 0x55e33d21412d in bgp_ifp_down bgpd/bgp_zebra.c:257
    #7 0x7fe9b8cbf3be in if_down_via_zapi lib/if.c:198
    #8 0x7fe9b8db303a in zclient_interface_down lib/zclient.c:1549
    #9 0x7fe9b8db8a06 in zclient_read lib/zclient.c:2693
    #10 0x7fe9b8d7b95a in thread_call lib/thread.c:1599
    #11 0x7fe9b8cd824e in frr_run lib/libfrr.c:1024
    #12 0x55e33d09d463 in main bgpd/bgp_main.c:477
    #13 0x7fe9b879409a in __libc_start_main ../csu/libc-start.c:308
    #14 0x55e33d09c189 in _start (/usr/lib/frr/bgpd+0x168189)
0x6020003ef850 is located 0 bytes inside of 16-byte region [0x6020003ef850,0x6020003ef860)
freed by thread T0 here:
    #0 0x7fe9b8fabfb0 in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.5+0xe8fb0)
    #1 0x7fe9b8ce4ea9 in qfree lib/memory.c:129
    #2 0x55e33d10825c in bgp_mac_remove_ifp_internal bgpd/bgp_mac.c:289
    #3 0x55e33d108893 in bgp_mac_del_mac_entry bgpd/bgp_mac.c:351
    #4 0x55e33d21412d in bgp_ifp_down bgpd/bgp_zebra.c:257
    #5 0x7fe9b8cbf3be in if_down_via_zapi lib/if.c:198
    #6 0x7fe9b8db303a in zclient_interface_down lib/zclient.c:1549
    #7 0x7fe9b8db8a06 in zclient_read lib/zclient.c:2693
    #8 0x7fe9b8d7b95a in thread_call lib/thread.c:1599
    #9 0x7fe9b8cd824e in frr_run lib/libfrr.c:1024
    #10 0x55e33d09d463 in main bgpd/bgp_main.c:477
    #11 0x7fe9b879409a in __libc_start_main ../csu/libc-start.c:308
previously allocated by thread T0 here:
    #0 0x7fe9b8fac518 in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0xe9518)
    #1 0x7fe9b8ce4d93 in qcalloc lib/memory.c:110
    #2 0x55e33d106b29 in bgp_mac_hash_alloc bgpd/bgp_mac.c:96
    #3 0x7fe9b8cb8350 in hash_get lib/hash.c:149
    #4 0x55e33d10845b in bgp_mac_add_mac_entry bgpd/bgp_mac.c:303
    #5 0x55e33d226757 in bgp_ifp_create bgpd/bgp_zebra.c:2644
    #6 0x7fe9b8cbf1e6 in if_new_via_zapi lib/if.c:176
    #7 0x7fe9b8db2d3b in zclient_interface_add lib/zclient.c:1481
    #8 0x7fe9b8db87f8 in zclient_read lib/zclient.c:2659
    #9 0x7fe9b8d7b95a in thread_call lib/thread.c:1599
    #10 0x7fe9b8cd824e in frr_run lib/libfrr.c:1024
    #11 0x55e33d09d463 in main bgpd/bgp_main.c:477
    #12 0x7fe9b879409a in __libc_start_main ../csu/libc-start.c:308

Effectively we are passing to bgp_mac_remove_ifp_internal the macaddr
that is associated with the bsm data structure.  There exists a path
where the bsm is freed and then we immediately pass the macaddr into
bgp_mac_rescan_all_evpn_tables.  So just make a copy of the macaddr
data structure before we free the bsm

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp pushed a commit that referenced this pull request Nov 15, 2019
This code is called from the zebra main pthread during shutdown
but the thread event is scheduled via the zebra dplane pthread.

Hence, we should be using the `thread_cancel_async()` API to
cancel the thread event on a different pthread.

This is only ever hit in the rare case that we still have work left
to do on the update queue during shutdown.

Found via zebra crash:

```
(gdb) bt
\#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
\#1  0x00007f4e4d3f7535 in __GI_abort () at abort.c:79
\#2  0x00007f4e4d3f740f in __assert_fail_base (fmt=0x7f4e4d559ee0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x7f4e4d9071d0 "master->owner == pthread_self()",
    file=0x7f4e4d906cf8 "lib/thread.c", line=1185, function=<optimized out>) at assert.c:92
\#3  0x00007f4e4d405102 in __GI___assert_fail (assertion=assertion@entry=0x7f4e4d9071d0 "master->owner == pthread_self()", file=file@entry=0x7f4e4d906cf8 "lib/thread.c",
    line=line@entry=1185, function=function@entry=0x7f4e4d906b68 <__PRETTY_FUNCTION__.15817> "thread_cancel") at assert.c:101
\#4  0x00007f4e4d8d095a in thread_cancel (thread=0x55b40d01a640) at lib/thread.c:1185
\#5  0x000055b40c291845 in zebra_dplane_shutdown () at zebra/zebra_dplane.c:3274
\#6  0x000055b40c27ee13 in zebra_finalize (dummy=<optimized out>) at zebra/main.c:202
\#7  0x00007f4e4d8d1416 in thread_call (thread=thread@entry=0x7ffcbbc08870) at lib/thread.c:1599
\#8  0x00007f4e4d8a1ef8 in frr_run (master=0x55b40ce35510) at lib/libfrr.c:1024
\#9  0x000055b40c270916 in main (argc=8, argv=0x7ffcbbc08c78) at zebra/main.c:483
(gdb) down
\#4  0x00007f4e4d8d095a in thread_cancel (thread=0x55b40d01a640) at lib/thread.c:1185
1185		assert(master->owner == pthread_self());
(gdb)
```

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
donaldsharp added a commit that referenced this pull request Dec 18, 2019
Address Sanitizer is reporting this issue:

==26177==ERROR: AddressSanitizer: heap-use-after-free on address 0x6120000238d8 at pc 0x7f88f7c4fa93 bp 0x7fff9a641830 sp 0x7fff9a641820
READ of size 8 at 0x6120000238d8 thread T0
    #0 0x7f88f7c4fa92 in if_delete lib/if.c:290
    #1 0x42192e in ospf_vl_if_delete ospfd/ospf_interface.c:912
    #2 0x42192e in ospf_vl_delete ospfd/ospf_interface.c:990
    #3 0x4a6208 in no_ospf_area_vlink ospfd/ospf_vty.c:1227
    #4 0x7f88f7c1553d in cmd_execute_command_real lib/command.c:1073
    #5 0x7f88f7c19b1e in cmd_execute_command lib/command.c:1132
    #6 0x7f88f7c19e8e in cmd_execute lib/command.c:1288
    #7 0x7f88f7cd7523 in vty_command lib/vty.c:516
    #8 0x7f88f7cd79ff in vty_execute lib/vty.c:1285
    #9 0x7f88f7cde4f9 in vtysh_read lib/vty.c:2119
    #10 0x7f88f7ccb845 in thread_call lib/thread.c:1549
    #11 0x7f88f7c5d6a7 in frr_run lib/libfrr.c:1093
    #12 0x412976 in main ospfd/ospf_main.c:221
    #13 0x7f88f73b082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #14 0x413c78 in _start (/usr/local/master/sbin/ospfd+0x413c78)

Effectively we are in a shutdown phase and as part of shutdown we delete the
ospf interface pointer ( ifp->info ).  The interface deletion code
was modified in the past year to pass in the address of operator
to allow us to NULL out the holding pointer.  The catch here
is that we free the oi and then delete the interface passing
in the address of the oi->ifp pointer, causing a use after free.

Fixes: FRRouting#5555
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp pushed a commit that referenced this pull request Dec 27, 2019
We were not connecting the default zebra_ns to the default
ns->info at namespace initialization in zebra. Thus, when
we tried to use the `ns_walk_func()` it would ignore the
default zebra_ns since there is no pointer to it from the
ns struct.

Fix this by connecting them in `zebra_ns_init()` and,
if the default ns is not found, exit with failure
since this is not recoverable.

This was found during a crash where we fail to cancel the kernel_read
thread at termination (via the `ns_walk_func()`) and then we
get a netlink notification trying to use the zns struct that has
already been freed.

```
(gdb) bt
\#0  0x00007fc1134dc7bb in raise () from /lib/x86_64-linux-gnu/libc.so.6
\#1  0x00007fc1134c7535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
\#2  0x00007fc113996f8f in core_handler (signo=11, siginfo=0x7ffe5429d070, context=<optimized out>) at lib/sigevent.c:254
\#3  <signal handler called>
\#4  0x0000561880e15449 in if_lookup_by_index_per_ns (ns=0x0, ifindex=174) at zebra/interface.c:269
\#5  0x0000561880e1642c in if_up (ifp=ifp@entry=0x561883076c50) at zebra/interface.c:1043
\#6  0x0000561880e10723 in netlink_link_change (h=0x7ffe5429d8f0, ns_id=<optimized out>, startup=<optimized out>) at zebra/if_netlink.c:1384
\#7  0x0000561880e17e68 in netlink_parse_info (filter=filter@entry=0x561880e17680 <netlink_information_fetch>, nl=nl@entry=0x561882497238, zns=zns@entry=0x7ffe542a5940,
    count=count@entry=5, startup=startup@entry=0) at zebra/kernel_netlink.c:932
\#8  0x0000561880e186a5 in kernel_read (thread=<optimized out>) at zebra/kernel_netlink.c:406
\#9  0x00007fc1139a4416 in thread_call (thread=thread@entry=0x7ffe542a5b70) at lib/thread.c:1599
\#10 0x00007fc113974ef8 in frr_run (master=0x5618823c9510) at lib/libfrr.c:1024
\#11 0x0000561880e0b916 in main (argc=8, argv=0x7ffe542a5f78) at zebra/main.c:483
```

Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
donaldsharp added a commit that referenced this pull request Jan 3, 2020
=================================================================
==13611==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffe9e5c8694 at pc 0x0000004d18ac bp 0x7ffe9e5c8330 sp 0x7ffe9e5c7ae0
WRITE of size 17 at 0x7ffe9e5c8694 thread T0
    #0 0x4d18ab in __asan_memcpy (/usr/lib/frr/zebra+0x4d18ab)
    #1 0x7f16f04bd97f in stream_get2 /home/qlyoung/frr/lib/stream.c:277:2
    #2 0x6410ec in zebra_vxlan_remote_macip_del /home/qlyoung/frr/zebra/zebra_vxlan.c:7718:4
    #3 0x68fa98 in zserv_handle_commands /home/qlyoung/frr/zebra/zapi_msg.c:2611:3
    #4 0x556add in main /home/qlyoung/frr/zebra/main.c:309:2
    #5 0x7f16eea3bb96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
    #6 0x431249 in _start (/usr/lib/frr/zebra+0x431249)
donaldsharp added a commit that referenced this pull request Jan 3, 2020
=================================================================
==13611==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffe9e5c8694 at pc 0x0000004d18ac bp 0x7ffe9e5c8330 sp 0x7ffe9e5c7ae0
WRITE of size 17 at 0x7ffe9e5c8694 thread T0
    #0 0x4d18ab in __asan_memcpy (/usr/lib/frr/zebra+0x4d18ab)
    #1 0x7f16f04bd97f in stream_get2 /home/qlyoung/frr/lib/stream.c:277:2
    #2 0x6410ec in zebra_vxlan_remote_macip_del /home/qlyoung/frr/zebra/zebra_vxlan.c:7718:4
    #3 0x68fa98 in zserv_handle_commands /home/qlyoung/frr/zebra/zapi_msg.c:2611:3
    #4 0x556add in main /home/qlyoung/frr/zebra/main.c:309:2
    #5 0x7f16eea3bb96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
    #6 0x431249 in _start (/usr/lib/frr/zebra+0x431249)

This decode is the result of a buffer overflow because we are
not checking ipa_len.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp added a commit that referenced this pull request Jan 3, 2020
=================================================================
==3058==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010 (pc 0x7f5bf3ef7477 bp 0x7ffdfaa20d40 sp 0x7ffdfaa204c8 T0)
==3058==The signal is caused by a READ memory access.
==3058==Hint: address points to the zero page.
    #0 0x7f5bf3ef7476 in memcpy /build/glibc-OTsEL5/glibc-2.27/string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:134
    #1 0x4d158a in __asan_memcpy (/usr/lib/frr/zebra+0x4d158a)
    #2 0x7f5bf58da8ad in stream_put /home/qlyoung/frr/lib/stream.c:605:3
    #3 0x67d428 in zsend_ipset_entry_notify_owner /home/qlyoung/frr/zebra/zapi_msg.c:851:2
    #4 0x5c70b3 in zebra_pbr_add_ipset_entry /home/qlyoung/frr/zebra/zebra_pbr.c
    #5 0x68e1bb in zread_ipset_entry /home/qlyoung/frr/zebra/zapi_msg.c:2465:4
    #6 0x68f958 in zserv_handle_commands /home/qlyoung/frr/zebra/zapi_msg.c:2611:3
    #7 0x55666d in main /home/qlyoung/frr/zebra/main.c:309:2
    #8 0x7f5bf3e5db96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
    #9 0x4311d9 in _start (/usr/lib/frr/zebra+0x4311d9)

the ipset->backpointer was NULL as that the hash lookup failed to find
anything.  Prevent this crash from happening.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp added a commit that referenced this pull request Jan 7, 2020
=================================================================
==13611==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffe9e5c8694 at pc 0x0000004d18ac bp 0x7ffe9e5c8330 sp 0x7ffe9e5c7ae0
WRITE of size 17 at 0x7ffe9e5c8694 thread T0
    #0 0x4d18ab in __asan_memcpy (/usr/lib/frr/zebra+0x4d18ab)
    #1 0x7f16f04bd97f in stream_get2 /home/qlyoung/frr/lib/stream.c:277:2
    #2 0x6410ec in zebra_vxlan_remote_macip_del /home/qlyoung/frr/zebra/zebra_vxlan.c:7718:4
    #3 0x68fa98 in zserv_handle_commands /home/qlyoung/frr/zebra/zapi_msg.c:2611:3
    #4 0x556add in main /home/qlyoung/frr/zebra/main.c:309:2
    #5 0x7f16eea3bb96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
    #6 0x431249 in _start (/usr/lib/frr/zebra+0x431249)

This decode is the result of a buffer overflow because we are
not checking ipa_len.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp added a commit that referenced this pull request Jan 7, 2020
=================================================================
==3058==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010 (pc 0x7f5bf3ef7477 bp 0x7ffdfaa20d40 sp 0x7ffdfaa204c8 T0)
==3058==The signal is caused by a READ memory access.
==3058==Hint: address points to the zero page.
    #0 0x7f5bf3ef7476 in memcpy /build/glibc-OTsEL5/glibc-2.27/string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:134
    #1 0x4d158a in __asan_memcpy (/usr/lib/frr/zebra+0x4d158a)
    #2 0x7f5bf58da8ad in stream_put /home/qlyoung/frr/lib/stream.c:605:3
    #3 0x67d428 in zsend_ipset_entry_notify_owner /home/qlyoung/frr/zebra/zapi_msg.c:851:2
    #4 0x5c70b3 in zebra_pbr_add_ipset_entry /home/qlyoung/frr/zebra/zebra_pbr.c
    #5 0x68e1bb in zread_ipset_entry /home/qlyoung/frr/zebra/zapi_msg.c:2465:4
    #6 0x68f958 in zserv_handle_commands /home/qlyoung/frr/zebra/zapi_msg.c:2611:3
    #7 0x55666d in main /home/qlyoung/frr/zebra/main.c:309:2
    #8 0x7f5bf3e5db96 in __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
    #9 0x4311d9 in _start (/usr/lib/frr/zebra+0x4311d9)

the ipset->backpointer was NULL as that the hash lookup failed to find
anything.  Prevent this crash from happening.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp pushed a commit that referenced this pull request Jan 9, 2020
ERROR: LeakSanitizer: detected memory leaks

Direct leak of 16416 byte(s) in 1 object(s) allocated from:
    #0 0x7f08e3ca8602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
    #1 0x7f08e389c4b0 in qmalloc lib/memory.c:105
    #2 0x7f08e38e87b4 in stream_new lib/stream.c:106
    #3 0x481d7f in _zebra_ptm_bfd_client_deregister zebra/zebra_ptm.c:1348
    #4 0x4e7b84 in hook_call_zserv_client_close zebra/zserv.c:544
    #5 0x4e7b84 in zserv_client_free zebra/zserv.c:560
    #6 0x4e7b84 in zserv_close_client zebra/zserv.c:625
    #7 0x4e7fe0 in zserv_handle_client_fail zebra/zserv.c:638
    #8 0x7f08e3901995 in thread_call lib/thread.c:1549
    #9 0x7f08e38937d7 in frr_run lib/libfrr.c:1093
    #10 0x41686e in main zebra/main.c:470
    #11 0x7f08e2fe682f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

Direct leak of 16416 byte(s) in 1 object(s) allocated from:
    #0 0x7f08e3ca8602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
    #1 0x7f08e389c4b0 in qmalloc lib/memory.c:105
    #2 0x7f08e38e87b4 in stream_new lib/stream.c:106
    #3 0x481efe in _zebra_ptm_reroute zebra/zebra_ptm.c:1411
    #4 0x4f7dc9 in zserv_handle_commands zebra/zapi_msg.c:2642
    #5 0x4e6d32 in zserv_process_messages zebra/zserv.c:517
    #6 0x7f08e3901995 in thread_call lib/thread.c:1549
    #7 0x7f08e38937d7 in frr_run lib/libfrr.c:1093
    #8 0x41686e in main zebra/main.c:470
    #9 0x7f08e2fe682f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

SUMMARY: AddressSanitizer: 32832 byte(s) leaked in 2 allocation(s).

This commit fixes these two different leaks.

Fixes: FRRouting#5658
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp added a commit that referenced this pull request Jan 15, 2020
The only two safi's that are usable for zebra for installation
of routes into the rib are SAFI_UNICAST and SAFI_MULTICAST.
The acceptance of other safi's is causing a memory leak:

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x5332f2 in calloc (/usr/lib/frr/zebra+0x5332f2)
    #1 0x7f594adc29db in qcalloc /opt/build/frr/lib/memory.c:110:27
    #2 0x686849 in zebra_vrf_get_table_with_table_id /opt/build/frr/zebra/zebra_vrf.c:390:11
    #3 0x65a245 in rib_add_multipath /opt/build/frr/zebra/zebra_rib.c:2591:10
    #4 0x7211bc in zread_route_add /opt/build/frr/zebra/zapi_msg.c:1616:8
    #5 0x73063c in zserv_handle_commands /opt/build/frr/zebra/zapi_msg.c:2682:2
Collapse

Sequence of events:

Upon vrf creation there is a zvrf->table[afi][safi] data structure
that tables are auto created for.  These tables only create SAFI_UNICAST
and SAFI_MULTICAST tables.  Since these are the only safi types that
are zebra can actually work on.  zvrf data structures also have a
zvrf->otable data structure that tracks in a RB tree other tables
that are created ( say you have routes stuck in any random table
in the 32bit route table space in linux ).  This data structure is
only used if the lookup in zvrf->table[afi][safi] fails.

After creation if we pass a route down from an upper level protocol
that has non unicast or multicast safi *but* has the actual
tableid of the vrf we are in, the initial lookup will always
return NULL leaving us to look in the otable.  This will create
a data structure to track this data.

If after this event you pass in a second route with the same
afi/safi/table_id, the otable will be created and attempted
to be stored, but the RB_TREE_UNIQ data structure when it sees
this will return the original otable returned and the lookup function
zebra_vrf_get_table_with_table_id will just drop the second otable.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp added a commit that referenced this pull request Jan 15, 2020
The only two safi's that are usable for zebra for installation
of routes into the rib are SAFI_UNICAST and SAFI_MULTICAST.
The acceptance of other safi's is causing a memory leak:

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x5332f2 in calloc (/usr/lib/frr/zebra+0x5332f2)
    #1 0x7f594adc29db in qcalloc /opt/build/frr/lib/memory.c:110:27
    #2 0x686849 in zebra_vrf_get_table_with_table_id /opt/build/frr/zebra/zebra_vrf.c:390:11
    #3 0x65a245 in rib_add_multipath /opt/build/frr/zebra/zebra_rib.c:2591:10
    #4 0x7211bc in zread_route_add /opt/build/frr/zebra/zapi_msg.c:1616:8
    #5 0x73063c in zserv_handle_commands /opt/build/frr/zebra/zapi_msg.c:2682:2
Collapse

Sequence of events:

Upon vrf creation there is a zvrf->table[afi][safi] data structure
that tables are auto created for.  These tables only create SAFI_UNICAST
and SAFI_MULTICAST tables.  Since these are the only safi types that
are zebra can actually work on.  zvrf data structures also have a
zvrf->otable data structure that tracks in a RB tree other tables
that are created ( say you have routes stuck in any random table
in the 32bit route table space in linux ).  This data structure is
only used if the lookup in zvrf->table[afi][safi] fails.

After creation if we pass a route down from an upper level protocol
that has non unicast or multicast safi *but* has the actual
tableid of the vrf we are in, the initial lookup will always
return NULL leaving us to look in the otable.  This will create
a data structure to track this data.

If after this event you pass in a second route with the same
afi/safi/table_id, the otable will be created and attempted
to be stored, but the RB_TREE_UNIQ data structure when it sees
this will return the original otable returned and the lookup function
zebra_vrf_get_table_with_table_id will just drop the second otable.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp added a commit that referenced this pull request Jan 15, 2020
==25402==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x533302 in calloc (/usr/lib/frr/zebra+0x533302)
    #1 0x7fee84cdc80b in qcalloc /home/qlyoung/frr/lib/memory.c:110:27
    #2 0x5a3032 in create_label_chunk /home/qlyoung/frr/zebra/label_manager.c:188:3
    #3 0x5a3c2b in assign_label_chunk /home/qlyoung/frr/zebra/label_manager.c:354:8
    #4 0x5a2a38 in label_manager_get_chunk /home/qlyoung/frr/zebra/label_manager.c:424:9
    #5 0x5a1412 in hook_call_lm_get_chunk /home/qlyoung/frr/zebra/label_manager.c:60:1
    #6 0x5a1412 in lm_get_chunk_call /home/qlyoung/frr/zebra/label_manager.c:81:2
    #7 0x72a234 in zread_get_label_chunk /home/qlyoung/frr/zebra/zapi_msg.c:2026:2
    #8 0x72a234 in zread_label_manager_request /home/qlyoung/frr/zebra/zapi_msg.c:2073:4
    #9 0x73150c in zserv_handle_commands /home/qlyoung/frr/zebra/zapi_msg.c:2688:2

When creating label chunk that has a specified base, we eventually are
calling assign_specific_label_chunk. This function finds the appropriate
list node and deletes it from the lbl_mgr.lc_list but since
the function uses list_delete_node() the deletion function that is
specified for lbl_mgr.lc_list is not called thus dropping the memory.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp added a commit that referenced this pull request Jan 15, 2020
==25402==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x533302 in calloc (/usr/lib/frr/zebra+0x533302)
    #1 0x7fee84cdc80b in qcalloc /home/qlyoung/frr/lib/memory.c:110:27
    #2 0x5a3032 in create_label_chunk /home/qlyoung/frr/zebra/label_manager.c:188:3
    #3 0x5a3c2b in assign_label_chunk /home/qlyoung/frr/zebra/label_manager.c:354:8
    #4 0x5a2a38 in label_manager_get_chunk /home/qlyoung/frr/zebra/label_manager.c:424:9
    #5 0x5a1412 in hook_call_lm_get_chunk /home/qlyoung/frr/zebra/label_manager.c:60:1
    #6 0x5a1412 in lm_get_chunk_call /home/qlyoung/frr/zebra/label_manager.c:81:2
    #7 0x72a234 in zread_get_label_chunk /home/qlyoung/frr/zebra/zapi_msg.c:2026:2
    #8 0x72a234 in zread_label_manager_request /home/qlyoung/frr/zebra/zapi_msg.c:2073:4
    #9 0x73150c in zserv_handle_commands /home/qlyoung/frr/zebra/zapi_msg.c:2688:2

When creating label chunk that has a specified base, we eventually are
calling assign_specific_label_chunk. This function finds the appropriate
list node and deletes it from the lbl_mgr.lc_list but since
the function uses list_delete_node() the deletion function that is
specified for lbl_mgr.lc_list is not called thus dropping the memory.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp added a commit that referenced this pull request Jan 15, 2020
The only two safi's that are usable for zebra for installation
of routes into the rib are SAFI_UNICAST and SAFI_MULTICAST.
The acceptance of other safi's is causing a memory leak:

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x5332f2 in calloc (/usr/lib/frr/zebra+0x5332f2)
    #1 0x7f594adc29db in qcalloc /opt/build/frr/lib/memory.c:110:27
    #2 0x686849 in zebra_vrf_get_table_with_table_id /opt/build/frr/zebra/zebra_vrf.c:390:11
    #3 0x65a245 in rib_add_multipath /opt/build/frr/zebra/zebra_rib.c:2591:10
    #4 0x7211bc in zread_route_add /opt/build/frr/zebra/zapi_msg.c:1616:8
    #5 0x73063c in zserv_handle_commands /opt/build/frr/zebra/zapi_msg.c:2682:2
Collapse

Sequence of events:

Upon vrf creation there is a zvrf->table[afi][safi] data structure
that tables are auto created for.  These tables only create SAFI_UNICAST
and SAFI_MULTICAST tables.  Since these are the only safi types that
are zebra can actually work on.  zvrf data structures also have a
zvrf->otable data structure that tracks in a RB tree other tables
that are created ( say you have routes stuck in any random table
in the 32bit route table space in linux ).  This data structure is
only used if the lookup in zvrf->table[afi][safi] fails.

After creation if we pass a route down from an upper level protocol
that has non unicast or multicast safi *but* has the actual
tableid of the vrf we are in, the initial lookup will always
return NULL leaving us to look in the otable.  This will create
a data structure to track this data.

If after this event you pass in a second route with the same
afi/safi/table_id, the otable will be created and attempted
to be stored, but the RB_TREE_UNIQ data structure when it sees
this will return the original otable returned and the lookup function
zebra_vrf_get_table_with_table_id will just drop the second otable.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp added a commit that referenced this pull request Mar 8, 2020
Upper level clients ask for default routes of a particular family
This change ensures that they only receive the family that they
have asked for.

Discovered when testing in ospf `default-information originate`

=================================================================
==246306==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffffffa2e8 at pc 0x7ffff73c44e2 bp 0x7fffffffa090 sp 0x7fffffffa088
READ of size 16 at 0x7fffffffa2e8 thread T0
    #0 0x7ffff73c44e1 in prefix_copy lib/prefix.c:310
    #1 0x7ffff741c0aa in route_node_lookup lib/table.c:255
    #2 0x5555556cd263 in ospf_external_info_delete ospfd/ospf_asbr.c:178
    #3 0x5555556a47cc in ospf_zebra_read_route ospfd/ospf_zebra.c:852
    #4 0x7ffff746f5d8 in zclient_read lib/zclient.c:3028
    #5 0x7ffff742fc91 in thread_call lib/thread.c:1549
    #6 0x7ffff7374642 in frr_run lib/libfrr.c:1093
    #7 0x5555555bfaef in main ospfd/ospf_main.c:235
    #8 0x7ffff70a2bba in __libc_start_main ../csu/libc-start.c:308
    #9 0x5555555bf499 in _start (/usr/lib/frr/ospfd+0x6b499)

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
donaldsharp pushed a commit that referenced this pull request Mar 30, 2020
Implement tests to verify BGP link-bandwidth and weighted ECMP
functionality. These tests validate one of the primary use cases for
weighted ECMP (a.k.a. Unequal cost multipath) using BGP link-bandwidth:
https://tools.ietf.org/html/draft-mohanty-bess-ebgp-dmz

The included tests are:
Test #1: Test BGP link-bandwidth advertisement based on number of multipaths
Test #2: Test cumulative link-bandwidth propagation
Test #3: Test weighted ECMP - multipath with next hop weights
Test #4: Test weighted ECMP rebalancing upon change (link flap)
Test #5: Test weighted ECMP for a second anycast IP
Test #6: Test paths with and without link-bandwidth - receiver should resort to regular ECMP
Test #7: Test different options for processing link-bandwidth on the receiver

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
donaldsharp pushed a commit that referenced this pull request Apr 8, 2020
Implement tests to verify BGP link-bandwidth and weighted ECMP
functionality. These tests validate one of the primary use cases for
weighted ECMP (a.k.a. Unequal cost multipath) using BGP link-bandwidth:
https://tools.ietf.org/html/draft-mohanty-bess-ebgp-dmz

The included tests are:
Test #1: Test BGP link-bandwidth advertisement based on number of multipaths
Test #2: Test cumulative link-bandwidth propagation
Test #3: Test weighted ECMP - multipath with next hop weights
Test #4: Test weighted ECMP rebalancing upon change (link flap)
Test #5: Test weighted ECMP for a second anycast IP
Test #6: Test paths with and without link-bandwidth - receiver should resort to regular ECMP
Test #7: Test different options for processing link-bandwidth on the receiver

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
donaldsharp pushed a commit that referenced this pull request Jun 26, 2020
=================================================================
==24764==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d0000115c8 at pc 0x55cb9cfad312 bp 0x7fffa0552140 sp 0x7fffa0552138
READ of size 8 at 0x60d0000115c8 thread T0
    #0 0x55cb9cfad311 in zebra_evpn_remote_es_flush zebra/zebra_evpn_mh.c:2041
    #1 0x55cb9cfad311 in zebra_evpn_es_cleanup zebra/zebra_evpn_mh.c:2234
    #2 0x55cb9cf6ae78 in zebra_vrf_disable zebra/zebra_vrf.c:205
    #3 0x7fc8d478f114 in vrf_delete lib/vrf.c:229
    #4 0x7fc8d478f99a in vrf_terminate lib/vrf.c:541
    #5 0x55cb9ceba0af in sigint zebra/main.c:176
    #6 0x55cb9ceba0af in sigint zebra/main.c:130
    #7 0x7fc8d4765d20 in quagga_sigevent_process lib/sigevent.c:103
    #8 0x7fc8d4787e8c in thread_fetch lib/thread.c:1396
    #9 0x7fc8d4708782 in frr_run lib/libfrr.c:1092
    #10 0x55cb9ce931d8 in main zebra/main.c:488
    #11 0x7fc8d43ee09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    #12 0x55cb9ce94c09 in _start (/usr/lib/frr/zebra+0x8ac09)
=================================================================

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
donaldsharp pushed a commit that referenced this pull request Sep 25, 2020
This problem was reported by the sanitizer -
=================================================================
==24764==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d0000115c8 at pc 0x55cb9cfad312 bp 0x7fffa0552140 sp 0x7fffa0552138
READ of size 8 at 0x60d0000115c8 thread T0
    #0 0x55cb9cfad311 in zebra_evpn_remote_es_flush zebra/zebra_evpn_mh.c:2041
    #1 0x55cb9cfad311 in zebra_evpn_es_cleanup zebra/zebra_evpn_mh.c:2234
    #2 0x55cb9cf6ae78 in zebra_vrf_disable zebra/zebra_vrf.c:205
    #3 0x7fc8d478f114 in vrf_delete lib/vrf.c:229
    #4 0x7fc8d478f99a in vrf_terminate lib/vrf.c:541
    #5 0x55cb9ceba0af in sigint zebra/main.c:176
    #6 0x55cb9ceba0af in sigint zebra/main.c:130
    #7 0x7fc8d4765d20 in quagga_sigevent_process lib/sigevent.c:103
    #8 0x7fc8d4787e8c in thread_fetch lib/thread.c:1396
    #9 0x7fc8d4708782 in frr_run lib/libfrr.c:1092
    #10 0x55cb9ce931d8 in main zebra/main.c:488
    #11 0x7fc8d43ee09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    #12 0x55cb9ce94c09 in _start (/usr/lib/frr/zebra+0x8ac09)
=================================================================

Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
donaldsharp pushed a commit that referenced this pull request Dec 6, 2023
Implement proper memory cleanup for SRv6 functions and locator chunks to prevent potential memory leaks.
The list callback deletion functions have been set.

The ASan leak log for reference:

```
***********************************************************************************
Address Sanitizer Error detected in bgp_srv6l3vpn_to_bgp_vrf.test_bgp_srv6l3vpn_to_bgp_vrf/r2.asan.bgpd.4180

=================================================================
==4180==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 544 byte(s) in 2 object(s) allocated from:
    #0 0x7f8d176a0d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
    #1 0x7f8d1709f238 in qcalloc lib/memory.c:105
    #2 0x55d5dba6ee75 in sid_register bgpd/bgp_mplsvpn.c:591
    #3 0x55d5dba6ee75 in alloc_new_sid bgpd/bgp_mplsvpn.c:712
    #4 0x55d5dba6f3ce in ensure_vrf_tovpn_sid_per_af bgpd/bgp_mplsvpn.c:758
    #5 0x55d5dba6fb94 in ensure_vrf_tovpn_sid bgpd/bgp_mplsvpn.c:849
    #6 0x55d5dba7f975 in vpn_leak_postchange bgpd/bgp_mplsvpn.h:299
    #7 0x55d5dba7f975 in vpn_leak_postchange_all bgpd/bgp_mplsvpn.c:3704
    #8 0x55d5dbbb6c66 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3164
    #9 0x7f8d1716f08a in zclient_read lib/zclient.c:4459
    #10 0x7f8d1713f034 in event_call lib/event.c:1974
    #11 0x7f8d1708242b in frr_run lib/libfrr.c:1214
    #12 0x55d5db99d19d in main bgpd/bgp_main.c:510
    #13 0x7f8d160c5c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

Direct leak of 296 byte(s) in 1 object(s) allocated from:
    #0 0x7f8d176a0d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
    #1 0x7f8d1709f238 in qcalloc lib/memory.c:105
    #2 0x7f8d170b1d5f in srv6_locator_chunk_alloc lib/srv6.c:135
    #3 0x55d5dbbb6a19 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3144
    #4 0x7f8d1716f08a in zclient_read lib/zclient.c:4459
    #5 0x7f8d1713f034 in event_call lib/event.c:1974
    #6 0x7f8d1708242b in frr_run lib/libfrr.c:1214
    #7 0x55d5db99d19d in main bgpd/bgp_main.c:510
    #8 0x7f8d160c5c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
***********************************************************************************

```

Signed-off-by: Keelan Cannoo <keelan.cannoo@icloud.com>
donaldsharp pushed a commit that referenced this pull request Dec 11, 2023
Fix a crash because a use-after-free.

> =================================================================
> ==1249835==ERROR: AddressSanitizer: heap-use-after-free on address 0x604000074210 at pc 0x7fa1b42a652c bp 0x7ffc477a2aa0 sp 0x7ffc477a2a98
> READ of size 8 at 0x604000074210 thread T0
>     #0 0x7fa1b42a652b in list_delete_all_node git/frr/lib/linklist.c:299:20
>     #1 0x7fa1b42a683f in list_delete git/frr/lib/linklist.c:312:2
>     #2 0x5ee515 in dplane_ctx_free_internal git/frr/zebra/zebra_dplane.c:858:4
>     #3 0x5ee59c in dplane_ctx_free git/frr/zebra/zebra_dplane.c:884:2
>     #4 0x5ee544 in dplane_ctx_fini git/frr/zebra/zebra_dplane.c:905:2
>     #5 0x7045c0 in rib_process_dplane_results git/frr/zebra/zebra_rib.c:4928:4
>     #6 0x7fa1b4434fb2 in event_call git/frr/lib/event.c:1970:2
>     #7 0x7fa1b42a0ccf in frr_run git/frr/lib/libfrr.c:1213:3
>     #8 0x556808 in main git/frr/zebra/main.c:488:2
>     #9 0x7fa1b3d0bd09 in __libc_start_main csu/../csu/libc-start.c:308:16
>     #10 0x4453e9 in _start (/usr/lib/frr/zebra+0x4453e9)
>
> 0x604000074210 is located 0 bytes inside of 40-byte region [0x604000074210,0x604000074238)
> freed by thread T0 here:
>     #0 0x4bf1dd in free (/usr/lib/frr/zebra+0x4bf1dd)
>     #1 0x7fa1b42df0c0 in qfree git/frr/lib/memory.c:130:2
>     #2 0x7fa1b42a68ce in list_free_internal git/frr/lib/linklist.c:24:2
>     #3 0x7fa1b42a6870 in list_delete git/frr/lib/linklist.c:313:2
>     #4 0x5ee515 in dplane_ctx_free_internal git/frr/zebra/zebra_dplane.c:858:4
>     #5 0x5ee59c in dplane_ctx_free git/frr/zebra/zebra_dplane.c:884:2
>     #6 0x5ee544 in dplane_ctx_fini git/frr/zebra/zebra_dplane.c:905:2
>     #7 0x7045c0 in rib_process_dplane_results git/frr/zebra/zebra_rib.c:4928:4
>     #8 0x7fa1b4434fb2 in event_call git/frr/lib/event.c:1970:2
>     #9 0x7fa1b42a0ccf in frr_run git/frr/lib/libfrr.c:1213:3
>     #10 0x556808 in main git/frr/zebra/main.c:488:2
>     #11 0x7fa1b3d0bd09 in __libc_start_main csu/../csu/libc-start.c:308:16
>
> previously allocated by thread T0 here:
>     #0 0x4bf5d2 in calloc (/usr/lib/frr/zebra+0x4bf5d2)
>     #1 0x7fa1b42dee18 in qcalloc git/frr/lib/memory.c:105:27
>     #2 0x7fa1b42a3784 in list_new git/frr/lib/linklist.c:18:9
>     #3 0x6d165f in pbr_iptable_alloc_intern git/frr/zebra/zebra_pbr.c:1015:29
>     #4 0x7fa1b426ad1f in hash_get git/frr/lib/hash.c:147:13
>     #5 0x6d15f2 in zebra_pbr_add_iptable git/frr/zebra/zebra_pbr.c:1030:13
>     #6 0x5db2a3 in zread_iptable git/frr/zebra/zapi_msg.c:3759:3
>     #7 0x5e365d in zserv_handle_commands git/frr/zebra/zapi_msg.c:4039:3
>     #8 0x7e09fc in zserv_process_messages git/frr/zebra/zserv.c:520:3
>     #9 0x7fa1b4434fb2 in event_call git/frr/lib/event.c:1970:2
>     #10 0x7fa1b42a0ccf in frr_run git/frr/lib/libfrr.c:1213:3
>     #11 0x556808 in main git/frr/zebra/main.c:488:2
>     #12 0x7fa1b3d0bd09 in __libc_start_main csu/../csu/libc-start.c:308:16

Fixes: 1cc3806 ("zebra: Actually free all memory associated ctx->u.iptable.interface_name_list")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 45140bb)
donaldsharp pushed a commit that referenced this pull request Jan 4, 2024
Fix the following heap-use-after-free

> ==82961==ERROR: AddressSanitizer: heap-use-after-free on address 0x6020001e4750 at pc 0x55a8cc7f63ac bp 0x7ffd6948e340 sp 0x7ffd6948e330
> READ of size 8 at 0x6020001e4750 thread T0
>     #0 0x55a8cc7f63ab in isis_route_node_cleanup isisd/isis_route.c:335
>     #1 0x7ff25ec617c1 in route_node_free lib/table.c:75
>     #2 0x7ff25ec619fc in route_table_free lib/table.c:111
>     #3 0x7ff25ec61661 in route_table_finish lib/table.c:46
>     #4 0x55a8cc800d83 in _isis_spftree_del isisd/isis_spf.c:397
>     #5 0x55a8cc800e45 in isis_spftree_clear isisd/isis_spf.c:414
>     #6 0x55a8cc80bd9a in isis_run_spf isisd/isis_spf.c:2020
>     #7 0x55a8cc80c370 in isis_run_spf_with_protection isisd/isis_spf.c:2076
>     #8 0x55a8cc80cf52 in isis_run_spf_cb isisd/isis_spf.c:2165
>     #9 0x7ff25ec7c4dc in event_call lib/event.c:1970
>     #10 0x7ff25eb64423 in frr_run lib/libfrr.c:1213
>     #11 0x55a8cc7799da in main isisd/isis_main.c:318
>     #12 0x7ff25e623d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #13 0x7ff25e623e3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #14 0x55a8cc778e44 in _start (/usr/lib/frr/isisd+0x109e44)
>
> 0x6020001e4750 is located 0 bytes inside of 16-byte region [0x6020001e4750,0x6020001e4760)
> freed by thread T0 here:
>     #0 0x7ff25f000537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
>     #1 0x7ff25eb9012e in qfree lib/memory.c:130
>     #2 0x55a8cc7f6485 in isis_route_table_info_free isisd/isis_route.c:351
>     #3 0x55a8cc800cf4 in _isis_spftree_del isisd/isis_spf.c:395
>     #4 0x55a8cc800e45 in isis_spftree_clear isisd/isis_spf.c:414
>     #5 0x55a8cc80bd9a in isis_run_spf isisd/isis_spf.c:2020
>     #6 0x55a8cc80c370 in isis_run_spf_with_protection isisd/isis_spf.c:2076
>     #7 0x55a8cc80cf52 in isis_run_spf_cb isisd/isis_spf.c:2165
>     #8 0x7ff25ec7c4dc in event_call lib/event.c:1970
>     #9 0x7ff25eb64423 in frr_run lib/libfrr.c:1213
>     #10 0x55a8cc7799da in main isisd/isis_main.c:318
>     #11 0x7ff25e623d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> previously allocated by thread T0 here:
>     #0 0x7ff25f000a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7ff25eb8ffdc in qcalloc lib/memory.c:105
>     #2 0x55a8cc7f63eb in isis_route_table_info_alloc isisd/isis_route.c:343
>     #3 0x55a8cc80052a in _isis_spftree_init isisd/isis_spf.c:334
>     #4 0x55a8cc800e51 in isis_spftree_clear isisd/isis_spf.c:415
>     #5 0x55a8cc80bd9a in isis_run_spf isisd/isis_spf.c:2020
>     #6 0x55a8cc80c370 in isis_run_spf_with_protection isisd/isis_spf.c:2076
>     #7 0x55a8cc80cf52 in isis_run_spf_cb isisd/isis_spf.c:2165
>     #8 0x7ff25ec7c4dc in event_call lib/event.c:1970
>     #9 0x7ff25eb64423 in frr_run lib/libfrr.c:1213
>     #10 0x55a8cc7799da in main isisd/isis_main.c:318
>     #11 0x7ff25e623d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Fixes: 7153c3c ("isisd: update struct isis_route_info has multiple sr info by algorithm")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
donaldsharp pushed a commit that referenced this pull request Jan 5, 2024
Fix the following heap-buffer-overflow:

> ==3901635==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020003a5940 at pc 0x56260067bb48 bp 0x7ffe8a4f3840 sp 0x7ffe8a4f3838
> READ of size 4 at 0x6020003a5940 thread T0
>     #0 0x56260067bb47 in ecommunity_fill_pbr_action bgpd/bgp_ecommunity.c:1587
>     #1 0x5626007a246e in bgp_pbr_build_and_validate_entry bgpd/bgp_pbr.c:939
>     #2 0x5626007b25e6 in bgp_pbr_update_entry bgpd/bgp_pbr.c:2933
>     #3 0x562600909d18 in bgp_zebra_announce bgpd/bgp_zebra.c:1351
>     #4 0x5626007d5efd in bgp_process_main_one bgpd/bgp_route.c:3528
>     #5 0x5626007d6b43 in bgp_process_wq bgpd/bgp_route.c:3641
>     #6 0x7f450f34c2cc in work_queue_run lib/workqueue.c:266
>     #7 0x7f450f327a27 in event_call lib/event.c:1970
>     #8 0x7f450f21a637 in frr_run lib/libfrr.c:1213
>     #9 0x56260062fc04 in main bgpd/bgp_main.c:540
>     #10 0x7f450ee2dd09 in __libc_start_main ../csu/libc-start.c:308
>     #11 0x56260062ca29 in _start (/usr/lib/frr/bgpd+0x2e3a29)
>
> 0x6020003a5940 is located 0 bytes to the right of 16-byte region [0x6020003a5930,0x6020003a5940)
> allocated by thread T0 here:
>     #0 0x7f450f6aa1f8 in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:164
>     #1 0x7f450f244f8a in qrealloc lib/memory.c:112
>     #2 0x562600673313 in ecommunity_add_val_internal bgpd/bgp_ecommunity.c:143
>     #3 0x5626006735bc in ecommunity_uniq_sort_internal bgpd/bgp_ecommunity.c:193
>     #4 0x5626006737e3 in ecommunity_parse_internal bgpd/bgp_ecommunity.c:228
>     #5 0x562600673890 in ecommunity_parse bgpd/bgp_ecommunity.c:236
>     #6 0x562600640469 in bgp_attr_ext_communities bgpd/bgp_attr.c:2674
>     #7 0x562600646eb3 in bgp_attr_parse bgpd/bgp_attr.c:3893
>     #8 0x562600791b7e in bgp_update_receive bgpd/bgp_packet.c:2141
>     #9 0x56260079ba6b in bgp_process_packet bgpd/bgp_packet.c:3406
>     #10 0x7f450f327a27 in event_call lib/event.c:1970
>     #11 0x7f450f21a637 in frr_run lib/libfrr.c:1213
>     #12 0x56260062fc04 in main bgpd/bgp_main.c:540
>     #13 0x7f450ee2dd09 in __libc_start_main ../csu/libc-start.c:308

Fixes: dacf6ec ("bgpd: utility routine to convert flowspec actions into pbr actions")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
donaldsharp pushed a commit that referenced this pull request Jan 15, 2024
Fix a crash when re-adding a rpki server:

> r2# sh run bgpd
> [...]
> rpki
>  rpki retry_interval 5
>  rpki cache 192.0.2.1 15432 preference 1
> exit
> [...]
> r2# conf t
> r2(config)# rpki
> r2(config-rpki)# no rpki cache 192.0.2.1 15432 preference 1
> r2(config-rpki)# do show rpki cache-connection
> Cannot find a connected group.
> r2(config-rpki)# rpki cache 192.0.2.1 15432 preference 1
> r2(config-rpki)# do show rpki cache-connection
> vtysh: error reading from bgpd: Resource temporarily unavailable (11)Warning: closing connection to bgpd because of an I/O error!

> #0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007f3fd2d16e57 in core_handler (signo=11, siginfo=0x7ffffd5931b0, context=0x7ffffd593080) at lib/sigevent.c:246
> #2  <signal handler called>
> #3  0x00007f3fd26926b4 in tommy_list_head (list=0x2e322e302e323931) at /home/lscalber/git/rtrlib/./third-party/tommyds/tommylist.h:125
> #4  0x00007f3fd2693812 in rtr_mgr_get_first_group (config=0x55fbf31d7f00) at /home/lscalber/git/rtrlib/rtrlib/rtr_mgr.c:409
> #5  0x00007f3fd2ebef59 in get_connected_group () at bgpd/bgp_rpki.c:718
> #6  0x00007f3fd2ec0b39 in show_rpki_cache_connection_magic (self=0x7f3fd2ec69c0 <show_rpki_cache_connection_cmd>, vty=0x55fbf31f9ef0, argc=3, argv=0x55fbf31f99d0, uj=0x0)
> #   at bgpd/bgp_rpki.c:1575
> #7  0x00007f3fd2ebd4da in show_rpki_cache_connection (self=0x7f3fd2ec69c0 <show_rpki_cache_connection_cmd>, vty=0x55fbf31f9ef0, argc=3, argv=0x55fbf31f99d0) at ./bgpd/bgp_rpki_clippy.c:648
> #8  0x00007f3fd2c8a142 in cmd_execute_command_real (vline=0x55fbf31f9990, vty=0x55fbf31f9ef0, cmd=0x0, up_level=0) at lib/command.c:978
> #9  0x00007f3fd2c8a25c in cmd_execute_command (vline=0x55fbf31e5260, vty=0x55fbf31f9ef0, cmd=0x0, vtysh=0) at lib/command.c:1028
> #10 0x00007f3fd2c8a7f1 in cmd_execute (vty=0x55fbf31f9ef0, cmd=0x55fbf3200680 "do show rpki cache-connection ", matched=0x0, vtysh=0) at lib/command.c:1203
> #11 0x00007f3fd2d36548 in vty_command (vty=0x55fbf31f9ef0, buf=0x55fbf3200680 "do show rpki cache-connection ") at lib/vty.c:594
> #12 0x00007f3fd2d382e1 in vty_execute (vty=0x55fbf31f9ef0) at lib/vty.c:1357
> #13 0x00007f3fd2d3a519 in vtysh_read (thread=0x7ffffd5963c0) at lib/vty.c:2365
> #14 0x00007f3fd2d2faf6 in event_call (thread=0x7ffffd5963c0) at lib/event.c:1974
> #15 0x00007f3fd2cc238e in frr_run (master=0x55fbf2a0cd60) at lib/libfrr.c:1214
> #16 0x000055fbf073de40 in main (argc=9, argv=0x7ffffd596618) at bgpd/bgp_main.c:510

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
donaldsharp added a commit that referenced this pull request Jan 24, 2024
Fix this:
***********************************************************************************
Address Sanitizer Error detected in zebra_opaque.test_zebra_opaque/r3.asan.zebra.11099

=================================================================
==11099==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 66 byte(s) in 1 object(s) allocated from:
    #0 0x7f527fc06b40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40)
    #1 0x7f527f5e852b in qmalloc lib/memory.c:100
    #2 0x56418d20832d in zread_route_add zebra/zapi_msg.c:2125
    #3 0x56418d215d08 in zserv_handle_commands zebra/zapi_msg.c:4011
    #4 0x56418d32ab5b in zserv_process_messages zebra/zserv.c:520
    #5 0x7f527f6938d3 in event_call lib/event.c:2003
    #6 0x7f527f5cb692 in frr_run lib/libfrr.c:1218
    #7 0x56418d1c3336 in main zebra/main.c:508
    #8 0x7f527e656c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

SUMMARY: AddressSanitizer: 66 byte(s) leaked in 1 allocation(s).
***********************************************************************************

Code inspection leads to some code paths where the opaque data was not
freed up.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
donaldsharp added a commit that referenced this pull request Jan 25, 2024
The loading_done event needs a event pointer to prevent
use after free's.  Testing found this:

    ERROR: AddressSanitizer: heap-use-after-free on address 0x613000035130 at pc 0x55ad42d54e5f bp 0x7ffff1e942a0 sp 0x7ffff1e94290
    READ of size 1 at 0x613000035130 thread T0
        #0 0x55ad42d54e5e in loading_done ospf6d/ospf6_neighbor.c:447
        #1 0x55ad42ed7be4 in event_call lib/event.c:1995
        #2 0x55ad42e1df75 in frr_run lib/libfrr.c:1213
        #3 0x55ad42cf332e in main ospf6d/ospf6_main.c:250
        #4 0x7f5798133c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
        #5 0x55ad42cf2b19 in _start (/usr/lib/frr/ospf6d+0x248b19)

    0x613000035130 is located 48 bytes inside of 384-byte region [0x613000035100,0x613000035280)
    freed by thread T0 here:
        #0 0x7f57998d77a8 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xde7a8)
        #1 0x55ad42e3b4b6 in qfree lib/memory.c:130
        #2 0x55ad42d5d049 in ospf6_neighbor_delete ospf6d/ospf6_neighbor.c:180
        #3 0x55ad42d1e1ea in interface_down ospf6d/ospf6_interface.c:930
        #4 0x55ad42ed7be4 in event_call lib/event.c:1995
        #5 0x55ad42ed84fe in _event_execute lib/event.c:2086
        #6 0x55ad42d26d7b in ospf6_interface_clear ospf6d/ospf6_interface.c:2847
        #7 0x55ad42d73f16 in ospf6_process_reset ospf6d/ospf6_top.c:755
        #8 0x55ad42d7e98c in clear_router_ospf6_magic ospf6d/ospf6_top.c:778
        #9 0x55ad42d7e98c in clear_router_ospf6 ospf6d/ospf6_top_clippy.c:42
        #10 0x55ad42dc2665 in cmd_execute_command_real lib/command.c:994
        #11 0x55ad42dc2b32 in cmd_execute_command lib/command.c:1053
        #12 0x55ad42dc2fa9 in cmd_execute lib/command.c:1221
        #13 0x55ad42ee3cd6 in vty_command lib/vty.c:591
        #14 0x55ad42ee4170 in vty_execute lib/vty.c:1354
        #15 0x55ad42eec94f in vtysh_read lib/vty.c:2362
        #16 0x55ad42ed7be4 in event_call lib/event.c:1995
        #17 0x55ad42e1df75 in frr_run lib/libfrr.c:1213
        #18 0x55ad42cf332e in main ospf6d/ospf6_main.c:250
        #19 0x7f5798133c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

    previously allocated by thread T0 here:
        #0 0x7f57998d7d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
        #1 0x55ad42e3ab22 in qcalloc lib/memory.c:105
        #2 0x55ad42d5c8ff in ospf6_neighbor_create ospf6d/ospf6_neighbor.c:119
        #3 0x55ad42d4c86a in ospf6_hello_recv ospf6d/ospf6_message.c:464
        #4 0x55ad42d4c86a in ospf6_read_helper ospf6d/ospf6_message.c:1884
        #5 0x55ad42d4c86a in ospf6_receive ospf6d/ospf6_message.c:1925
        #6 0x55ad42ed7be4 in event_call lib/event.c:1995
        #7 0x55ad42e1df75 in frr_run lib/libfrr.c:1213
        #8 0x55ad42cf332e in main ospf6d/ospf6_main.c:250
        #9 0x7f5798133c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

Add an actual event pointer and just track it appropriately.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
donaldsharp pushed a commit that referenced this pull request Feb 3, 2024
Fix the following crash when logging from rpki_create_socket():

> #0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007f6e21723798 in core_handler (signo=6, siginfo=0x7f6e1e502ef0, context=0x7f6e1e502dc0) at lib/sigevent.c:248
> #2  <signal handler called>
> #3  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> #4  0x00007f6e2144e537 in __GI_abort () at abort.c:79
> #5  0x00007f6e2176348e in _zlog_assert_failed (xref=0x7f6e2180c920 <_xref.16>, extra=0x0) at lib/zlog.c:670
> #6  0x00007f6e216b1eda in rcu_read_lock () at lib/frrcu.c:294
> #7  0x00007f6e21762da8 in vzlog_notls (xref=0x0, prio=2, fmt=0x7f6e217afe50 "%s:%d: %s(): assertion (%s) failed", ap=0x7f6e1e504248) at lib/zlog.c:425
> #8  0x00007f6e217632fb in vzlogx (xref=0x0, prio=2, fmt=0x7f6e217afe50 "%s:%d: %s(): assertion (%s) failed", ap=0x7f6e1e504248) at lib/zlog.c:627
> #9  0x00007f6e217621f5 in zlog (prio=2, fmt=0x7f6e217afe50 "%s:%d: %s(): assertion (%s) failed") at lib/zlog.h:73
> #10 0x00007f6e21763596 in _zlog_assert_failed (xref=0x7f6e2180c920 <_xref.16>, extra=0x0) at lib/zlog.c:687
> #11 0x00007f6e216b1eda in rcu_read_lock () at lib/frrcu.c:294
> #12 0x00007f6e21762da8 in vzlog_notls (xref=0x7f6e21a50040 <_xref.68>, prio=4, fmt=0x7f6e21a4999f "getaddrinfo: debug", ap=0x7f6e1e504878) at lib/zlog.c:425
> #13 0x00007f6e217632fb in vzlogx (xref=0x7f6e21a50040 <_xref.68>, prio=4, fmt=0x7f6e21a4999f "getaddrinfo: debug", ap=0x7f6e1e504878) at lib/zlog.c:627
> #14 0x00007f6e21a3f774 in zlog_ref (xref=0x7f6e21a50040 <_xref.68>, fmt=0x7f6e21a4999f "getaddrinfo: debug") at ./lib/zlog.h:84
> #15 0x00007f6e21a451b2 in rpki_create_socket (_cache=0x55729149cc30) at bgpd/bgp_rpki.c:1337
> #16 0x00007f6e2120e7b7 in tr_tcp_open (tr_socket=0x5572914d1520) at rtrlib/rtrlib/transport/tcp/tcp_transport.c:111
> #17 0x00007f6e2120e212 in tr_open (socket=0x5572914b5e00) at rtrlib/rtrlib/transport/transport.c:16
> #18 0x00007f6e2120faa2 in rtr_fsm_start (rtr_socket=0x557290e17180) at rtrlib/rtrlib/rtr/rtr.c:130
> #19 0x00007f6e218b7ea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
> #20 0x00007f6e21527a2f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

rpki_create_socket() is a hook function called from the rtrlib library.
The issue arises because rtrlib initiates its own separate pthread in which
it runs the hook, which does not establish an FRR RCU context. Consequently,
this leads to failures in the logging mechanism that relies on RCU.

Initialize a new FRR pthread context from the rtrlib pthread with a
valid RCU context to allow logging from the rpki_create_socket() and
dependent functions.

Link: FRRouting#15260
Fixes: a951752 ("bgpd: create cache server socket in vrf")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
donaldsharp pushed a commit that referenced this pull request Mar 1, 2024
router bgp 65001
 no bgp ebgp-requires-policy
 neighbor 192.168.1.2 remote-as external
 neighbor 192.168.1.2 timers 3 10
 address-family ipv4 unicast
  neighbor 192.168.1.2 route-map r2 in
 exit-address-family
!
ip prefix-list p1 seq 5 permit 172.16.255.31/32
!
route-map r2 permit 10
 match ip address prefix-list p1
 set as-path exclude 65003
route-map r2 permit 20
 set as-path exclude all
!

we make the following commands

bgp as-path access-list FIRST permit ^65
bgp as-path access-list SECOND permit 2
 route-map r2 permit 6
  set as-path exclude as-path-access-list SECOND

and then

no bgp as-path access-list SECOND permit 2
clear bgp *

we have the following crash in bgp

               Stack trace of thread 536083:
                #0  0x00007f87f8aacfe1 raise (libpthread.so.0 + 0x12fe1)
                #1  0x00007f87f8cf6870 core_handler (libfrr.so.0 +
		    0xf6870)
                #2  0x00007f87f8aad140 __restore_rt (libpthread.so.0 +
		    0x13140)
                #3  0x00007f87f89a5122 __GI___regexec (libc.so.6 +
		    0xdf122)
                #4  0x000055d7f198b4a7 aspath_filter_exclude_acl (bgpd +
		    0x2054a7)
                #5  0x000055d7f1902187 route_set_aspath_exclude (bgpd +
		    0x17c187)
                #6  0x00007f87f8ce54b0 route_map_apply_ext (libfrr.so.0
		    + 0xe54b0)
                #7  0x000055d7f18da925 bgp_input_modifier (bgpd +
		    0x154925)
                #8  0x000055d7f18e0647 bgp_update (bgpd + 0x15a647)
                #9  0x000055d7f18e4772 bgp_nlri_parse_ip (bgpd +
		    0x15e772)
                #10 0x000055d7f18c38ae bgp_nlri_parse (bgpd + 0x13d8ae)
                #11 0x000055d7f18c6b7a bgp_update_receive (bgpd +
		    0x140b7a)
                #12 0x000055d7f18c8ff3 bgp_process_packet (bgpd +
		    0x142ff3)
                #13 0x00007f87f8d0dce0 thread_call (libfrr.so.0 +
		    0x10dce0)
                #14 0x00007f87f8cacb28 frr_run (libfrr.so.0 + 0xacb28)
                #15 0x000055d7f18435da main (bgpd + 0xbd5da)
                #16 0x00007f87f88e9d0a __libc_start_main (libc.so.6 +
		    0x23d0a)
                #17 0x000055d7f18415fa _start (bgpd + 0xbb5fa)

analysis

crash is due to the fact that there were always a pointer from
as-path exclude to deleted as-path access list.

fix
we add a backpointer mechanism to manage the dependency beetween
as-path access-list  and aspath exclude.

Signed-off-by: Francois Dumontet <francois.dumontet@6wind.com>
donaldsharp pushed a commit that referenced this pull request Mar 6, 2024
After we call subgroup_announce_check(), we leave communities, large-communities
that are modified by route-maps uninterned, and here we have a memory leak.

```
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323:Direct leak of 80 byte(s) in 2 object(s) allocated from:
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #0 0x7f0858d90037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #1 0x7f08589b15b2 in qcalloc lib/memory.c:105
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #2 0x561f5c4e08d2 in lcommunity_new bgpd/bgp_lcommunity.c:28
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #3 0x561f5c4e11d9 in lcommunity_dup bgpd/bgp_lcommunity.c:141
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #4 0x561f5c5c3b8b in route_set_lcommunity bgpd/bgp_routemap.c:2491
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #5 0x7f0858a177a5 in route_map_apply_ext lib/routemap.c:2675
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #6 0x561f5c5696f9 in subgroup_announce_check bgpd/bgp_route.c:2352
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #7 0x561f5c5fb728 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:682
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #8 0x561f5c5fbd95 in subgroup_announce_route bgpd/bgp_updgrp_adv.c:765
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #9 0x561f5c5f6105 in peer_af_announce_route bgpd/bgp_updgrp.c:2187
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #10 0x561f5c5790be in bgp_announce_route_timer_expired bgpd/bgp_route.c:5032
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #11 0x7f0858a76e4e in thread_call lib/thread.c:1991
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #12 0x7f0858974c24 in frr_run lib/libfrr.c:1185
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #13 0x561f5c3e955d in main bgpd/bgp_main.c:505
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #14 0x7f08583a9d09 in __libc_start_main ../csu/libc-start.c:308
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323:Indirect leak of 144 byte(s) in 2 object(s) allocated from:
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #0 0x7f0858d8fe8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #1 0x7f08589b1579 in qmalloc lib/memory.c:100
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #2 0x561f5c4e1282 in lcommunity_dup bgpd/bgp_lcommunity.c:144
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #3 0x561f5c5c3b8b in route_set_lcommunity bgpd/bgp_routemap.c:2491
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #4 0x7f0858a177a5 in route_map_apply_ext lib/routemap.c:2675
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #5 0x561f5c5696f9 in subgroup_announce_check bgpd/bgp_route.c:2352
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #6 0x561f5c5fb728 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:682
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #7 0x561f5c5fbd95 in subgroup_announce_route bgpd/bgp_updgrp_adv.c:765
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #8 0x561f5c5f6105 in peer_af_announce_route bgpd/bgp_updgrp.c:2187
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #9 0x561f5c5790be in bgp_announce_route_timer_expired bgpd/bgp_route.c:5032
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #10 0x7f0858a76e4e in thread_call lib/thread.c:1991
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #11 0x7f0858974c24 in frr_run lib/libfrr.c:1185
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #12 0x561f5c3e955d in main bgpd/bgp_main.c:505
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #13 0x7f08583a9d09 in __libc_start_main ../csu/libc-start.c:308
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-SUMMARY: AddressSanitizer: 224 byte(s) leaked in 4 allocation(s).
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
donaldsharp pushed a commit that referenced this pull request Mar 7, 2024
After we call subgroup_announce_check(), we leave communities, large-communities
that are modified by route-maps uninterned, and here we have a memory leak.

```
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323:Direct leak of 80 byte(s) in 2 object(s) allocated from:
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #0 0x7f0858d90037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #1 0x7f08589b15b2 in qcalloc lib/memory.c:105
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #2 0x561f5c4e08d2 in lcommunity_new bgpd/bgp_lcommunity.c:28
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #3 0x561f5c4e11d9 in lcommunity_dup bgpd/bgp_lcommunity.c:141
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #4 0x561f5c5c3b8b in route_set_lcommunity bgpd/bgp_routemap.c:2491
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #5 0x7f0858a177a5 in route_map_apply_ext lib/routemap.c:2675
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #6 0x561f5c5696f9 in subgroup_announce_check bgpd/bgp_route.c:2352
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #7 0x561f5c5fb728 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:682
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #8 0x561f5c5fbd95 in subgroup_announce_route bgpd/bgp_updgrp_adv.c:765
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #9 0x561f5c5f6105 in peer_af_announce_route bgpd/bgp_updgrp.c:2187
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #10 0x561f5c5790be in bgp_announce_route_timer_expired bgpd/bgp_route.c:5032
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #11 0x7f0858a76e4e in thread_call lib/thread.c:1991
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #12 0x7f0858974c24 in frr_run lib/libfrr.c:1185
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #13 0x561f5c3e955d in main bgpd/bgp_main.c:505
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #14 0x7f08583a9d09 in __libc_start_main ../csu/libc-start.c:308
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323:Indirect leak of 144 byte(s) in 2 object(s) allocated from:
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #0 0x7f0858d8fe8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #1 0x7f08589b1579 in qmalloc lib/memory.c:100
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #2 0x561f5c4e1282 in lcommunity_dup bgpd/bgp_lcommunity.c:144
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #3 0x561f5c5c3b8b in route_set_lcommunity bgpd/bgp_routemap.c:2491
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #4 0x7f0858a177a5 in route_map_apply_ext lib/routemap.c:2675
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #5 0x561f5c5696f9 in subgroup_announce_check bgpd/bgp_route.c:2352
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #6 0x561f5c5fb728 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:682
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #7 0x561f5c5fbd95 in subgroup_announce_route bgpd/bgp_updgrp_adv.c:765
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #8 0x561f5c5f6105 in peer_af_announce_route bgpd/bgp_updgrp.c:2187
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #9 0x561f5c5790be in bgp_announce_route_timer_expired bgpd/bgp_route.c:5032
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #10 0x7f0858a76e4e in thread_call lib/thread.c:1991
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #11 0x7f0858974c24 in frr_run lib/libfrr.c:1185
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #12 0x561f5c3e955d in main bgpd/bgp_main.c:505
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-    #13 0x7f08583a9d09 in __libc_start_main ../csu/libc-start.c:308
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-SUMMARY: AddressSanitizer: 224 byte(s) leaked in 4 allocation(s).
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
donaldsharp pushed a commit that referenced this pull request Mar 20, 2024
`ng` was not properly freed, leading to a memory leak.
The commit calls `nexthop_group_delete` to free memory associated with `ng`.

The ASan leak log for reference:

```
***********************************************************************************
Address Sanitizer Error detected in isis_topo1.test_isis_topo1/r5.asan.zebra.24308

=================================================================
==24308==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7f4f47b43d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
    #1 0x7f4f4753c0a8 in qcalloc lib/memory.c:105
    #2 0x7f4f47559526 in nexthop_group_new lib/nexthop_group.c:270
    #3 0x562ded6a39d4 in zebra_add_import_table_entry zebra/redistribute.c:681
    #4 0x562ded787c35 in rib_link zebra/zebra_rib.c:3972
    #5 0x562ded787c35 in rib_addnode zebra/zebra_rib.c:3993
    #6 0x562ded787c35 in process_subq_early_route_add zebra/zebra_rib.c:2860
    #7 0x562ded787c35 in process_subq_early_route zebra/zebra_rib.c:3138
    #8 0x562ded787c35 in process_subq zebra/zebra_rib.c:3178
    #9 0x562ded787c35 in meta_queue_process zebra/zebra_rib.c:3228
    #10 0x7f4f475f7118 in work_queue_run lib/workqueue.c:266
    #11 0x7f4f475dc7f2 in event_call lib/event.c:1969
    #12 0x7f4f4751f347 in frr_run lib/libfrr.c:1213
    #13 0x562ded69e818 in main zebra/main.c:486
    #14 0x7f4f468ffc86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

Indirect leak of 152 byte(s) in 1 object(s) allocated from:
    #0 0x7f4f47b43d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
    #1 0x7f4f4753c0a8 in qcalloc lib/memory.c:105
    #2 0x7f4f475510ad in nexthop_new lib/nexthop.c:376
    #3 0x7f4f475539c5 in nexthop_dup lib/nexthop.c:914
    #4 0x7f4f4755b27a in copy_nexthops lib/nexthop_group.c:444
    #5 0x562ded6a3a1c in zebra_add_import_table_entry zebra/redistribute.c:682
    #6 0x562ded787c35 in rib_link zebra/zebra_rib.c:3972
    #7 0x562ded787c35 in rib_addnode zebra/zebra_rib.c:3993
    #8 0x562ded787c35 in process_subq_early_route_add zebra/zebra_rib.c:2860
    #9 0x562ded787c35 in process_subq_early_route zebra/zebra_rib.c:3138
    #10 0x562ded787c35 in process_subq zebra/zebra_rib.c:3178
    #11 0x562ded787c35 in meta_queue_process zebra/zebra_rib.c:3228
    #12 0x7f4f475f7118 in work_queue_run lib/workqueue.c:266
    #13 0x7f4f475dc7f2 in event_call lib/event.c:1969
    #14 0x7f4f4751f347 in frr_run lib/libfrr.c:1213
    #15 0x562ded69e818 in main zebra/main.c:486
    #16 0x7f4f468ffc86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

SUMMARY: AddressSanitizer: 184 byte(s) leaked in 2 allocation(s).
***********************************************************************************
```

Signed-off-by: Keelan Cannoo <keelan.cannoo@icloud.com>
donaldsharp pushed a commit that referenced this pull request Mar 20, 2024
This commit frees dynamically allocated memory associated
with `pbrms->nhgrp_name` and `pbrms->dst` which were causing memory leaks.

The ASan leak log for reference:

```
=================================================================
==107458==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x7f87d644ca37 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7f87d5feaa37 in qcalloc ../lib/memory.c:105
    #2 0x7f87d6054ffd in prefix_new ../lib/prefix.c:1180
    #3 0x55722f3c2885 in pbr_map_match_dst_magic ../pbrd/pbr_vty.c:302
    #4 0x55722f3b5c24 in pbr_map_match_dst pbrd/pbr_vty_clippy.c:228
    #5 0x7f87d5f32d61 in cmd_execute_command_real ../lib/command.c:993
    #6 0x7f87d5f330ee in cmd_execute_command ../lib/command.c:1052
    #7 0x7f87d5f33dc0 in cmd_execute ../lib/command.c:1218
    #8 0x7f87d60e4177 in vty_command ../lib/vty.c:591
    #9 0x7f87d60e905c in vty_execute ../lib/vty.c:1354
    #10 0x7f87d60ef45a in vtysh_read ../lib/vty.c:2362
    #11 0x7f87d60d42d4 in event_call ../lib/event.c:1979
    #12 0x7f87d5fbe828 in frr_run ../lib/libfrr.c:1213
    #13 0x55722f3ac795 in main ../pbrd/pbr_main.c:168
    #14 0x7f87d5b82d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Direct leak of 2 byte(s) in 1 object(s) allocated from:
    #0 0x7f87d63f39a7 in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:454
    #1 0x7f87d5feaafc in qstrdup ../lib/memory.c:117
    #2 0x55722f3da139 in pbr_nht_set_seq_nhg ../pbrd/pbr_nht.c:551
    #3 0x55722f3c693f in pbr_map_nexthop_group_magic ../pbrd/pbr_vty.c:1140
    #4 0x55722f3bdaae in pbr_map_nexthop_group pbrd/pbr_vty_clippy.c:1284
    #5 0x7f87d5f32d61 in cmd_execute_command_real ../lib/command.c:993
    #6 0x7f87d5f330ee in cmd_execute_command ../lib/command.c:1052
    #7 0x7f87d5f33dc0 in cmd_execute ../lib/command.c:1218
    #8 0x7f87d60e4177 in vty_command ../lib/vty.c:591
    #9 0x7f87d60e905c in vty_execute ../lib/vty.c:1354
    #10 0x7f87d60ef45a in vtysh_read ../lib/vty.c:2362
    #11 0x7f87d60d42d4 in event_call ../lib/event.c:1979
    #12 0x7f87d5fbe828 in frr_run ../lib/libfrr.c:1213
    #13 0x55722f3ac795 in main ../pbrd/pbr_main.c:168
    #14 0x7f87d5b82d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 58 byte(s) leaked in 2 allocation(s).
```
Ticket:#3186167
Signed-off-by: Keelan Cannoo <keelan.cannoo@icloud.com>
donaldsharp pushed a commit that referenced this pull request Mar 20, 2024
This commit frees dynamically allocated memory associated
with `pbrms->nhgrp_name` and `pbrms->dst` which were causing memory leaks.

The ASan leak log for reference:

```
=================================================================
==107458==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x7f87d644ca37 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x7f87d5feaa37 in qcalloc ../lib/memory.c:105
    #2 0x7f87d6054ffd in prefix_new ../lib/prefix.c:1180
    #3 0x55722f3c2885 in pbr_map_match_dst_magic ../pbrd/pbr_vty.c:302
    #4 0x55722f3b5c24 in pbr_map_match_dst pbrd/pbr_vty_clippy.c:228
    #5 0x7f87d5f32d61 in cmd_execute_command_real ../lib/command.c:993
    #6 0x7f87d5f330ee in cmd_execute_command ../lib/command.c:1052
    #7 0x7f87d5f33dc0 in cmd_execute ../lib/command.c:1218
    #8 0x7f87d60e4177 in vty_command ../lib/vty.c:591
    #9 0x7f87d60e905c in vty_execute ../lib/vty.c:1354
    #10 0x7f87d60ef45a in vtysh_read ../lib/vty.c:2362
    #11 0x7f87d60d42d4 in event_call ../lib/event.c:1979
    #12 0x7f87d5fbe828 in frr_run ../lib/libfrr.c:1213
    #13 0x55722f3ac795 in main ../pbrd/pbr_main.c:168
    #14 0x7f87d5b82d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Direct leak of 2 byte(s) in 1 object(s) allocated from:
    #0 0x7f87d63f39a7 in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:454
    #1 0x7f87d5feaafc in qstrdup ../lib/memory.c:117
    #2 0x55722f3da139 in pbr_nht_set_seq_nhg ../pbrd/pbr_nht.c:551
    #3 0x55722f3c693f in pbr_map_nexthop_group_magic ../pbrd/pbr_vty.c:1140
    #4 0x55722f3bdaae in pbr_map_nexthop_group pbrd/pbr_vty_clippy.c:1284
    #5 0x7f87d5f32d61 in cmd_execute_command_real ../lib/command.c:993
    #6 0x7f87d5f330ee in cmd_execute_command ../lib/command.c:1052
    #7 0x7f87d5f33dc0 in cmd_execute ../lib/command.c:1218
    #8 0x7f87d60e4177 in vty_command ../lib/vty.c:591
    #9 0x7f87d60e905c in vty_execute ../lib/vty.c:1354
    #10 0x7f87d60ef45a in vtysh_read ../lib/vty.c:2362
    #11 0x7f87d60d42d4 in event_call ../lib/event.c:1979
    #12 0x7f87d5fbe828 in frr_run ../lib/libfrr.c:1213
    #13 0x55722f3ac795 in main ../pbrd/pbr_main.c:168
    #14 0x7f87d5b82d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 58 byte(s) leaked in 2 allocation(s).
```
Ticket:#3186167
Signed-off-by: Keelan Cannoo <keelan.cannoo@icloud.com>
donaldsharp pushed a commit that referenced this pull request Apr 1, 2024
The asan memory leak has been detected:
> Direct leak of 16 byte(s) in 1 object(s) allocated from:
>     #0 0x7f9066dadd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
>     #1 0x7f9066779b5d in qcalloc lib/memory.c:105
>     #2 0x556d6ca527c2 in vpn_leak_zebra_vrf_sid_update_per_af bgpd/bgp_mplsvpn.c:389
>     #3 0x556d6ca530e1 in vpn_leak_zebra_vrf_sid_update bgpd/bgp_mplsvpn.c:451
>     #4 0x556d6ca64b3b in vpn_leak_postchange bgpd/bgp_mplsvpn.h:311
>     #5 0x556d6ca64b3b in vpn_leak_postchange_all bgpd/bgp_mplsvpn.c:3751
>     #6 0x556d6cb9f116 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3337
>     #7 0x7f906685a6b6 in zclient_read lib/zclient.c:4490
>     #8 0x7f9066826a32 in event_call lib/event.c:2011
>     #9 0x7f906675c444 in frr_run lib/libfrr.c:1217
>     #10 0x556d6c980d52 in main bgpd/bgp_main.c:545
>     #11 0x7f9065784c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

Fix this by freeing the previous memory chunk.

Fixes: b72c9e1 ("bgpd: cli for SRv6 SID alloc to redirect to vrf (step4)")
Fixes: 527588a ("bgpd: add support for per-VRF SRv6 SID")

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
donaldsharp pushed a commit that referenced this pull request May 9, 2024
Fix a couple of memory leaks spotted by Address Sanitizer:

```

=================================================================
==970960==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 592 byte(s) in 2 object(s) allocated from:
    #0 0xfeb98b28a4b4 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0xfeb98ae572f8 in qcalloc lib/memory.c:105
    #2 0xfeb98ae76138 in srv6_locator_chunk_alloc lib/srv6.c:138
    #3 0xb7f3c8508fa0 in ensure_vrf_tovpn_sid_per_vrf bgpd/bgp_mplsvpn.c:831
    #4 0xb7f3c8509494 in ensure_vrf_tovpn_sid bgpd/bgp_mplsvpn.c:866
    #5 0xb7f3c85028a8 in vpn_leak_postchange bgpd/bgp_mplsvpn.h:289
    #6 0xb7f3c851a7c0 in vpn_leak_postchange_all bgpd/bgp_mplsvpn.c:3769
    #7 0xb7f3c86f6ef0 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3378
    #8 0xfeb98afa6e14 in zclient_read lib/zclient.c:4608
    #9 0xfeb98af3d684 in event_call lib/event.c:2011
    #10 0xfeb98ae2788c in frr_run lib/libfrr.c:1217
    #11 0xb7f3c83cbf0c in main bgpd/bgp_main.c:545
    #12 0xfeb98a8973f8 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #13 0xfeb98a8974c8 in __libc_start_main_impl ../csu/libc-start.c:392
    #14 0xb7f3c83c832c in _start (/usr/lib/frr/bgpd+0x2d832c)

Direct leak of 32 byte(s) in 2 object(s) allocated from:
    #0 0xfeb98b28a4b4 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0xfeb98ae572f8 in qcalloc lib/memory.c:105
    #2 0xb7f3c8508fd8 in ensure_vrf_tovpn_sid_per_vrf bgpd/bgp_mplsvpn.c:832
    #3 0xb7f3c8509494 in ensure_vrf_tovpn_sid bgpd/bgp_mplsvpn.c:866
    #4 0xb7f3c85028a8 in vpn_leak_postchange bgpd/bgp_mplsvpn.h:289
    #5 0xb7f3c851a7c0 in vpn_leak_postchange_all bgpd/bgp_mplsvpn.c:3769
    #6 0xb7f3c86f6ef0 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3378
    #7 0xfeb98afa6e14 in zclient_read lib/zclient.c:4608
    #8 0xfeb98af3d684 in event_call lib/event.c:2011
    #9 0xfeb98ae2788c in frr_run lib/libfrr.c:1217
    #10 0xb7f3c83cbf0c in main bgpd/bgp_main.c:545
    #11 0xfeb98a8973f8 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #12 0xfeb98a8974c8 in __libc_start_main_impl ../csu/libc-start.c:392
    #13 0xb7f3c83c832c in _start (/usr/lib/frr/bgpd+0x2d832c)

Direct leak of 32 byte(s) in 2 object(s) allocated from:
    #0 0xfeb98b28a4b4 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0xfeb98ae572f8 in qcalloc lib/memory.c:105
    #2 0xb7f3c8506520 in vpn_leak_zebra_vrf_sid_update_per_vrf bgpd/bgp_mplsvpn.c:439
    #3 0xb7f3c85068d8 in vpn_leak_zebra_vrf_sid_update bgpd/bgp_mplsvpn.c:459
    #4 0xb7f3c86f6aec in bgp_ifp_create bgpd/bgp_zebra.c:3345
    #5 0xfeb98adfd3f8 in hook_call_if_real lib/if.c:48
    #6 0xfeb98adfe750 in if_new_via_zapi lib/if.c:181
    #7 0xfeb98af98084 in zclient_interface_add lib/zclient.c:2592
    #8 0xfeb98afa6d24 in zclient_read lib/zclient.c:4606
    #9 0xfeb98af3d684 in event_call lib/event.c:2011
    #10 0xfeb98ae2788c in frr_run lib/libfrr.c:1217
    #11 0xb7f3c83cbf0c in main bgpd/bgp_main.c:545
    #12 0xfeb98a8973f8 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #13 0xfeb98a8974c8 in __libc_start_main_impl ../csu/libc-start.c:392
    #14 0xb7f3c83c832c in _start (/usr/lib/frr/bgpd+0x2d832c)

SUMMARY: AddressSanitizer: 656 byte(s) leaked in 6 allocation(s).

```

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
donaldsharp pushed a commit that referenced this pull request May 24, 2024
> ==2334217==ERROR: AddressSanitizer: heap-use-after-free on address 0x61000001d0a0 at pc 0x563828c8de6f bp 0x7fffbdaee560 sp 0x7fffbdaee558
> READ of size 1 at 0x61000001d0a0 thread T0
>     #0 0x563828c8de6e in prefix_sid_cmp isisd/isis_spf.c:187
>     #1 0x7f84b8204f71 in hash_get lib/hash.c:142
>     #2 0x7f84b82055ec in hash_lookup lib/hash.c:184
>     #3 0x563828c8e185 in isis_spf_prefix_sid_lookup isisd/isis_spf.c:209
>     #4 0x563828c90642 in isis_spf_add2tent isisd/isis_spf.c:598
>     #5 0x563828c91cd0 in process_N isisd/isis_spf.c:824
>     #6 0x563828c93852 in isis_spf_process_lsp isisd/isis_spf.c:1041
>     #7 0x563828c98dde in isis_spf_loop isisd/isis_spf.c:1821
>     #8 0x563828c998de in isis_run_spf isisd/isis_spf.c:1983
>     #9 0x563828c99c7b in isis_run_spf_with_protection isisd/isis_spf.c:2009
>     #10 0x563828c9a60d in isis_run_spf_cb isisd/isis_spf.c:2090
>     #11 0x7f84b835c72d in event_call lib/event.c:2011
>     #12 0x7f84b8236d93 in frr_run lib/libfrr.c:1217
>     #13 0x563828c21918 in main isisd/isis_main.c:346
>     #14 0x7f84b7e4fd09 in __libc_start_main ../csu/libc-start.c:308
>     #15 0x563828c20df9 in _start (/usr/lib/frr/isisd+0xf5df9)
>
> 0x61000001d0a0 is located 96 bytes inside of 184-byte region [0x61000001d040,0x61000001d0f8)
> freed by thread T0 here:
>     #0 0x7f84b88a9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7f84b8263bae in qfree lib/memory.c:130
>     #2 0x563828c8e433 in isis_vertex_del isisd/isis_spf.c:249
>     #3 0x563828c91c95 in process_N isisd/isis_spf.c:811
>     #4 0x563828c93852 in isis_spf_process_lsp isisd/isis_spf.c:1041
>     #5 0x563828c98dde in isis_spf_loop isisd/isis_spf.c:1821
>     #6 0x563828c998de in isis_run_spf isisd/isis_spf.c:1983
>     #7 0x563828c99c7b in isis_run_spf_with_protection isisd/isis_spf.c:2009
>     #8 0x563828c9a60d in isis_run_spf_cb isisd/isis_spf.c:2090
>     #9 0x7f84b835c72d in event_call lib/event.c:2011
>     #10 0x7f84b8236d93 in frr_run lib/libfrr.c:1217
>     #11 0x563828c21918 in main isisd/isis_main.c:346
>     #12 0x7f84b7e4fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7f84b88aa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7f84b8263a6c in qcalloc lib/memory.c:105
>     #2 0x563828c8e262 in isis_vertex_new isisd/isis_spf.c:225
>     #3 0x563828c904db in isis_spf_add2tent isisd/isis_spf.c:588
>     #4 0x563828c91cd0 in process_N isisd/isis_spf.c:824
>     #5 0x563828c93852 in isis_spf_process_lsp isisd/isis_spf.c:1041
>     #6 0x563828c98dde in isis_spf_loop isisd/isis_spf.c:1821
>     #7 0x563828c998de in isis_run_spf isisd/isis_spf.c:1983
>     #8 0x563828c99c7b in isis_run_spf_with_protection isisd/isis_spf.c:2009
>     #9 0x563828c9a60d in isis_run_spf_cb isisd/isis_spf.c:2090
>     #10 0x7f84b835c72d in event_call lib/event.c:2011
>     #11 0x7f84b8236d93 in frr_run lib/libfrr.c:1217
>     #12 0x563828c21918 in main isisd/isis_main.c:346
>     #13 0x7f84b7e4fd09 in __libc_start_main ../csu/libc-start.c:308
>
> SUMMARY: AddressSanitizer: heap-use-after-free isisd/isis_spf.c:187 in prefix_sid_cmp
> Shadow bytes around the buggy address:
>   0x0c207fffb9c0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
>   0x0c207fffb9d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
>   0x0c207fffb9e0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
>   0x0c207fffb9f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
>   0x0c207fffba00: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
> =>0x0c207fffba10: fd fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd fa
>   0x0c207fffba20: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
>   0x0c207fffba30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
>   0x0c207fffba40: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
>   0x0c207fffba50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
>   0x0c207fffba60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
> Shadow byte legend (one shadow byte represents 8 application bytes):
>   Addressable:           00
>   Partially addressable: 01 02 03 04 05 06 07
>   Heap left redzone:       fa
>   Freed heap region:       fd
>   Stack left redzone:      f1
>   Stack mid redzone:       f2
>   Stack right redzone:     f3
>   Stack after return:      f5
>   Stack use after scope:   f8
>   Global redzone:          f9
>   Global init order:       f6
>   Poisoned by user:        f7
>   Container overflow:      fc
>   Array cookie:            ac
>   Intra object redzone:    bb
>   ASan internal:           fe
>   Left alloca redzone:     ca
>   Right alloca redzone:    cb
>   Shadow gap:              cc
> ==2334217==ABORTING

Fixes: 2f7cc7b ("isisd: detect Prefix-SID collisions and handle them appropriately")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
donaldsharp added a commit that referenced this pull request May 24, 2024
This value is being set and read at the same time according
to the thread sanitizer
WARNING: ThreadSanitizer: data race (pid=2914253)
  Read of size 2 at 0x7ba800011b10 by thread T2:
    #0 validate_header bgpd/bgp_io.c:601 (bgpd+0x60c5e0)
    #1 read_ibuf_work bgpd/bgp_io.c:177 (bgpd+0x608ffe)
    #2 bgp_process_reads bgpd/bgp_io.c:261 (bgpd+0x609880)
    #3 event_call lib/event.c:2011 (libfrr.so.0+0x59168d)
    #4 fpt_run lib/frr_pthread.c:369 (libfrr.so.0+0x35154e)
    #5 frr_pthread_inner lib/frr_pthread.c:178 (libfrr.so.0+0x34fef6)

  Previous write of size 2 at 0x7ba800011b10 by main thread:
    #0 bgp_open_option_parse bgpd/bgp_open.c:1469 (bgpd+0xb5006f)
    #1 bgp_open_receive bgpd/bgp_packet.c:2100 (bgpd+0x6b3f5c)
    #2 bgp_process_packet bgpd/bgp_packet.c:4019 (bgpd+0x6c9549)
    #3 event_call lib/event.c:2011 (libfrr.so.0+0x59168d)
    #4 frr_run lib/libfrr.c:1217 (libfrr.so.0+0x3b04a9)
    #5 main bgpd/bgp_main.c:548 (bgpd+0x49aa3d)

  Location is heap block of size 24328 at 0x7ba80000c000 allocated by main thread:
    #0 calloc ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:667 (libtsan.so.2+0x3fdd2)
    #1 qcalloc lib/memory.c:105 (libfrr.so.0+0x3f2784)
    #2 peer_new bgpd/bgpd.c:1517 (bgpd+0x955024)
    #3 peer_create bgpd/bgpd.c:1941 (bgpd+0x95c908)
    #4 peer_remote_as bgpd/bgpd.c:2211 (bgpd+0x9614a6)
    #5 peer_remote_as_vty bgpd/bgp_vty.c:4788 (bgpd+0x881239)
    #6 neighbor_remote_as bgpd/bgp_vty.c:4869 (bgpd+0x881a28)
    #7 cmd_execute_command_real lib/command.c:1002 (libfrr.so.0+0x2b53a2)
    #8 cmd_execute_command_strict lib/command.c:1111 (libfrr.so.0+0x2b5e0b)
    #9 command_config_read_one_line lib/command.c:1271 (libfrr.so.0+0x2b6972)
    #10 config_from_file lib/command.c:1324 (libfrr.so.0+0x2b7035)
    #11 vty_read_file lib/vty.c:2607 (libfrr.so.0+0x5c0d19)
    #12 vty_read_config lib/vty.c:2853 (libfrr.so.0+0x5c1f37)
    #13 frr_config_read_in lib/libfrr.c:981 (libfrr.so.0+0x3ae76a)
    #14 event_call lib/event.c:2011 (libfrr.so.0+0x59168d)
    #15 frr_run lib/libfrr.c:1217 (libfrr.so.0+0x3b04a9)
    #16 main bgpd/bgp_main.c:548 (bgpd+0x49aa3d)

  Thread T2 'bgpd_io' (tid=2914257, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1001 (libtsan.so.2+0x63a59)
    #1 frr_pthread_run lib/frr_pthread.c:197 (libfrr.so.0+0x3500da)
    #2 bgp_pthreads_run bgpd/bgpd.c:8490 (bgpd+0x9d7716)
    #3 main bgpd/bgp_main.c:547 (bgpd+0x49a9c8)

Fix this.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
donaldsharp pushed a commit that referenced this pull request Jun 26, 2024
Fix a crash when doing "show isis database detail json" in
isis_srv6_topo1 topotest.

> #0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007fad89524e2c in core_handler (signo=6, siginfo=0x7ffe86a4b8b0, context=0x7ffe86a4b780) at lib/sigevent.c:258
> #2  <signal handler called>
> #3  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> #4  0x00007fad8904e537 in __GI_abort () at abort.c:79
> #5  0x00007fad8904e40f in __assert_fail_base (fmt=0x7fad891c5688 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x7fad8a3e70e8 "json_object_get_type(jso) == json_type_object",
>     file=0x7fad8a3e7064 "./json_object.c", line=590, function=<optimized out>) at assert.c:92
> #6  0x00007fad8905d662 in __GI___assert_fail (assertion=0x7fad8a3e70e8 "json_object_get_type(jso) == json_type_object", file=0x7fad8a3e7064 "./json_object.c", line=590,
>     function=0x7fad8a3e7440 "json_object_object_add_ex") at assert.c:101
> #7  0x00007fad8a3dfe93 in json_object_object_add_ex () from /lib/x86_64-linux-gnu/libjson-c.so.5
> #8  0x000055708e3f8f7f in format_subsubtlv_srv6_sid_structure (sid_struct=0x602000172b70, buf=0x0, json=0x6040000a21d0, indent=6) at isisd/isis_tlvs.c:2880
> #9  0x000055708e3f9acb in isis_format_subsubtlvs (subsubtlvs=0x602000172b50, buf=0x0, json=0x6040000a21d0, indent=6) at isisd/isis_tlvs.c:3022
> #10 0x000055708e3eefb0 in format_item_ext_subtlvs (exts=0x614000047440, buf=0x0, json=0x6040000a2190, indent=2, mtid=2) at isisd/isis_tlvs.c:1313
> #11 0x000055708e3fd599 in format_item_extended_reach (mtid=2, i=0x60300015aed0, buf=0x0, json=0x6040000a1bd0, indent=0) at isisd/isis_tlvs.c:3763
> #12 0x000055708e40d46a in format_item (mtid=2, context=ISIS_CONTEXT_LSP, type=ISIS_TLV_MT_REACH, i=0x60300015aed0, buf=0x0, json=0x6040000a1bd0, indent=0) at isisd/isis_tlvs.c:6789
> #13 0x000055708e40d4fc in format_items_ (mtid=2, context=ISIS_CONTEXT_LSP, type=ISIS_TLV_MT_REACH, items=0x60600021d160, buf=0x0, json=0x6040000a1bd0, indent=0) at isisd/isis_tlvs.c:6804
> #14 0x000055708e40edbc in format_mt_items (context=ISIS_CONTEXT_LSP, type=ISIS_TLV_MT_REACH, m=0x6180000845d8, buf=0x0, json=0x6040000a1bd0, indent=0) at isisd/isis_tlvs.c:7147
> #15 0x000055708e4111e9 in format_tlvs (tlvs=0x618000084480, buf=0x0, json=0x6040000a1bd0, indent=0) at isisd/isis_tlvs.c:7572
> #16 0x000055708e4114ce in isis_format_tlvs (tlvs=0x618000084480, json=0x6040000a1bd0) at isisd/isis_tlvs.c:7613
> #17 0x000055708e36f167 in lsp_print_detail (lsp=0x612000058b40, vty=0x0, json=0x6040000a1bd0, dynhost=1 '\001', isis=0x60d00001f800) at isisd/isis_lsp.c:785
> #18 0x000055708e36f31f in lsp_print_all (vty=0x0, json=0x6040000a0490, head=0x61f000005488, detail=1 '\001', dynhost=1 '\001', isis=0x60d00001f800) at isisd/isis_lsp.c:820
> #19 0x000055708e4379fc in show_isis_database_lspdb_json (json=0x6040000a0450, area=0x61f000005480, level=0, lspdb=0x61f000005488, sysid_str=0x0, ui_level=1) at isisd/isisd.c:2683
> #20 0x000055708e437ef9 in show_isis_database_json (json=0x6040000a0310, sysid_str=0x0, ui_level=1, isis=0x60d00001f800) at isisd/isisd.c:2754
> #21 0x000055708e438357 in show_isis_database_common (vty=0x62e000060400, json=0x6040000a0310, sysid_str=0x0, ui_level=1, isis=0x60d00001f800) at isisd/isisd.c:2788
> #22 0x000055708e438591 in show_isis_database (vty=0x62e000060400, json=0x6040000a0310, sysid_str=0x0, ui_level=1, vrf_name=0x7fad89806300 <vrf_default_name> "default", all_vrf=false)
>     at isisd/isisd.c:2825
> #23 0x000055708e43891d in show_database (self=0x55708e5519c0 <show_database_cmd>, vty=0x62e000060400, argc=5, argv=0x6040000a02d0) at isisd/isisd.c:2855
> #24 0x00007fad893a9767 in cmd_execute_command_real (vline=0x60300015f220, vty=0x62e000060400, cmd=0x0, up_level=0) at lib/command.c:1002
> #25 0x00007fad893a9adc in cmd_execute_command (vline=0x60300015f220, vty=0x62e000060400, cmd=0x0, vtysh=0) at lib/command.c:1061
> #26 0x00007fad893aa728 in cmd_execute (vty=0x62e000060400, cmd=0x621000025900 "show isis database detail json ", matched=0x0, vtysh=0) at lib/command.c:1227

Note that prior to 2e670cd, there was no crash but only the last
"srv6-sid-structure" was displayed. A "srv6-sid-structure" should be
displayed for each "sid". This commit also fix this.

Was:

> "srv6-lan-endx-sid": [
>   {
>     "sid": "fc00:0:1:1::",
>     "weight": 0,
>     "algorithm": "SPF",
>     "neighbor-id": "0000.0000.0002"
>   },
>   {
>     "sid": "fc00:0:1:2::",
>     "weight": 0,
>     "algorithm": "SPF",
>     "neighbor-id": "0000.0000.0003"
>   }
> ],
> "srv6-sid-structure": {
>   "loc-block-len": 32,
>   "loc-node-len": 16,
>   "func-len": 16,
>   "arg-len": 0
> },

Now (srv6-sid-structure are identical but they are not always):

> "srv6-lan-endx-sid": [
>   {
>     "sid": "fc00:0:1:1::",
>     "algorithm": "SPF",
>     "neighbor-id": "0000.0000.0002",
>     "srv6-sid-structure": {
>       "loc-block-len": 32,
>       "loc-node-len": 16,
>       "func-len": 8,
>       "arg-len": 0
>     },
>   },
>   {
>     "sid": "fc00:0:1:2::",
>     "algorithm": "SPF",
>     "neighbor-id": "0000.0000.0003",
>     "srv6-sid-structure": {
>       "loc-block-len": 32,
>       "loc-node-len": 16,
>       "func-len": 16,
>       "arg-len": 0
>     },
>   }
> ],

Fixes: 2e670cd ("isisd: fix display of srv6 subsubtlvs")
Fixes: 648a158 ("isisd: Add SRv6 End.X SID to Sub-TLV format func")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
donaldsharp pushed a commit that referenced this pull request Jul 2, 2024
…links

When a neighbor connection is disconnected, it may trigger LSP re-generation as a timer task, but this process may be delayed. As a result, the list of neighbors in area->adjacency_list may be inconsistent with the neighbors in lsp->tlvs->oldstyle_reach/extended_reach. For example, the area->adjacency_list may lack certain neighbors even though they are present in the LSP. When computing SPF, the call to isis_spf_build_adj_list() generates the spftree->sadj_list, which reflects the real neighbors in the area->adjacency_list. However, in the case of LAN links, spftree->sadj_list may include additional pseudo neighbors.
The pre-loading of tents through the call to isis_spf_preload_tent involves two steps:
1. isis_spf_process_lsp() is called to generate real neighbor vertices based on the root LSP and pseudo LSP.
2. isis_spf_add_local() is called to add corresponding next hops to the vertex->Adj_N list for the real neighbor vertices.
In the case of LAN links, the absence of corresponding real neighbors in the spftree->sadj_list prevents the execution of the second step. Consequently, the vertex->Adj_N list for the real neighbor vertices lacks corresponding next hops. This leads to a null pointer access when isis_lfa_compute() is called to calculate LFA. 
As for P2P links, since there are no pseudo neighbors, only the second step is executed, which does not create real neighbor vertices and therefore does not encounter this issue.
The backtrace is as follows:
(gdb) bt
#0  0x00007fd065277fe1 in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007fd065398972 in core_handler (signo=11, siginfo=0x7ffc5c0636b0, context=0x7ffc5c063580) at ../lib/sigevent.c:261
#2  <signal handler called>
#3  0x00005564d82f8408 in isis_lfa_compute (area=0x5564d8b143f0, circuit=0x5564d8b21d10, spftree=0x5564d8b06bf0, resource=0x7ffc5c064410) at ../isisd/isis_lfa.c:2134
#4  0x00005564d82f8d78 in isis_spf_run_lfa (area=0x5564d8b143f0, spftree=0x5564d8b06bf0) at ../isisd/isis_lfa.c:2344
#5  0x00005564d8315964 in isis_run_spf_with_protection (area=0x5564d8b143f0, spftree=0x5564d8b06bf0) at ../isisd/isis_spf.c:1827
#6  0x00005564d8315c15 in isis_run_spf_cb (thread=0x7ffc5c064590) at ../isisd/isis_spf.c:1889
#7  0x00007fd0653b1f04 in thread_call (thread=0x7ffc5c064590) at ../lib/thread.c:1990
#8  0x00007fd06534a97b in frr_run (master=0x5564d88103c0) at ../lib/libfrr.c:1198
#9  0x00005564d82e7d5d in main (argc=5, argv=0x7ffc5c0647b8, envp=0x7ffc5c0647e8) at ../isisd/isis_main.c:273
(gdb) f 3
#3  0x00005564d82f8408 in isis_lfa_compute (area=0x5564d8b143f0, circuit=0x5564d8b21d10, spftree=0x5564d8b06bf0, resource=0x7ffc5c064410) at ../isisd/isis_lfa.c:2134
2134    ../isisd/isis_lfa.c: No such file or directory.
(gdb) p vadj_primary
$1 = (struct isis_vertex_adj *) 0x0
(gdb) p vertex->Adj_N->head
$2 = (struct listnode *) 0x0
(gdb) p (struct isis_vertex *)spftree->paths->l.list->head->next->next->next->next->data
$8 = (struct isis_vertex *) 0x5564d8b5b240
(gdb) p $8->type
$9 = VTYPE_NONPSEUDO_TE_IS
(gdb) p $8->N.id
$10 = "\000\000\000\000\000\002"
(gdb) p $8->Adj_N->count
$11 = 0
(gdb) p (struct isis_vertex *)spftree->paths->l.list->head->next->next->next->next->next->data
$12 = (struct isis_vertex *) 0x5564d8b73dd0
(gdb) p $12->type
$13 = VTYPE_NONPSEUDO_TE_IS
(gdb) p $12->N.id
$14 = "\000\000\000\000\000\003"
(gdb) p $12->Adj_N->count
$15 = 0
(gdb) p area->adjacency_list->count
$16 = 0
The backtrace provided above pertains to version 8.5.4, but it seems that the same issue exists in the code of the master branch as well.
The scenario where a vertex has no next hop is normal. For example, the "clear isis neighbor" command invokes isis_vertex_adj_del() to delete the next hop of a vertex. Upon reviewing all the instances where the vertex->Adj_N list is used, I found that only isis_lfa_compute() lacks a null check. Therefore, I believe that modifying this part will be sufficient. Additionally, the vertex->parents list for IP vertices is guaranteed not to be empty.
Test scenario:
Setting up LFA for LAN links and executing the "clear isis neighbor" command easily reproduces the issue.

Signed-off-by: zhou-run <zhou.run@h3c.com>
riw777 pushed a commit that referenced this pull request Jul 16, 2024
… the fragmented LSP

1. When the root IS regenerates an LSP, it calls lsp_build() -> lsp_clear_data() to free the TLV memory of the first fragment and all other fragments. If the number of fragments in the regenerated LSP decreases or if no fragmentation is needed, the extra LSP fragments are not immediately deleted. Instead, lsp_seqno_update() -> lsp_purge() is called to set the remaining time to zero and start aging, while also notifying other IS nodes to age these fragments. lsp_purge() usually does not reset lsp->hdr.seqno to zero because the LSP might recover during the aging process.
2. When other IS nodes receive an LSP, they always call process_lsp() -> isis_unpack_tlvs() to allocate TLV memory for the LSP. This does not differentiate whether the received LSP has a remaining lifetime of zero. Therefore, it is rare for an LSP of a non-root IS to have empty TLVs. Of course, if an LSP with a remaining time of zero and already corrupted is received, lsp_update() -> lsp_purge() will be called to free the TLV memory of the LSP, but this scenario is rare.
3. In LFA calculations, neighbors of the root IS are traversed, and each neighbor is taken as a new root to compute the neighbor SPT. During this process, the old root IS will serve as a neighbor of the new root IS, triggering a call to isis_spf_process_lsp() to parse the LSP of the old root IS and obtain its IP vertices and neighboring IS vertices. However, isis_spf_process_lsp() only checks whether the TLVs in the first fragment of the LSP exist, and does not check the TLVs in the fragmented LSP. If the TLV memory of the fragmented LSP of the old root IS has been freed, it can lead to a null pointer access, causing the current crash.

Additionally, for the base SPT, there are only two places where the LSP of the root IS is parsed:
1. When obtaining the UP neighbors of the root IS via spf_adj_list_parse_lsp().
2. When preloading the IP vertices of the root IS via isis_lsp_iterate_ip_reach().
Both of these checks ensure that frag->tlvs is not null, and they do not subsequently call isis_spf_process_lsp() to parse the root IS's LSP. It is very rare for non-root IS LSPs to have empty TLVs unless they are corrupted LSPs awaiting deletion. If it happens, a crash will occur.

The backtrace is as follows:
(gdb) bt
#0  0x00007f3097281fe1 in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f30973a2972 in core_handler (signo=11, siginfo=0x7ffce66c2870, context=0x7ffce66c2740) at ../lib/sigevent.c:261
#2  <signal handler called>
#3  0x000055dfa805512b in isis_spf_process_lsp (spftree=0x55dfa950eee0, lsp=0x55dfa94cb590, cost=10, depth=1, root_sysid=0x55dfa950ef6c "", parent=0x55dfa952fca0)
    at ../isisd/isis_spf.c:898
#4  0x000055dfa805743b in isis_spf_loop (spftree=0x55dfa950eee0, root_sysid=0x55dfa950ef6c "") at ../isisd/isis_spf.c:1688
#5  0x000055dfa805784f in isis_run_spf (spftree=0x55dfa950eee0) at ../isisd/isis_spf.c:1808
#6  0x000055dfa8037ff5 in isis_spf_run_neighbors (spftree=0x55dfa9474440) at ../isisd/isis_lfa.c:1259
#7  0x000055dfa803ac17 in isis_spf_run_lfa (area=0x55dfa9477510, spftree=0x55dfa9474440) at ../isisd/isis_lfa.c:2300
#8  0x000055dfa8057964 in isis_run_spf_with_protection (area=0x55dfa9477510, spftree=0x55dfa9474440) at ../isisd/isis_spf.c:1827
#9  0x000055dfa8057c15 in isis_run_spf_cb (thread=0x7ffce66c38e0) at ../isisd/isis_spf.c:1889
#10 0x00007f30973bbf04 in thread_call (thread=0x7ffce66c38e0) at ../lib/thread.c:1990
#11 0x00007f309735497b in frr_run (master=0x55dfa91733c0) at ../lib/libfrr.c:1198
#12 0x000055dfa8029d5d in main (argc=5, argv=0x7ffce66c3b08, envp=0x7ffce66c3b38) at ../isisd/isis_main.c:273
(gdb) f 3
#3  0x000055dfa805512b in isis_spf_process_lsp (spftree=0x55dfa950eee0, lsp=0x55dfa94cb590, cost=10, depth=1, root_sysid=0x55dfa950ef6c "", parent=0x55dfa952fca0)
    at ../isisd/isis_spf.c:898
898     ../isisd/isis_spf.c: No such file or directory.
(gdb) p te_neighs
$1 = (struct isis_item_list *) 0x120
(gdb) p lsp->tlvs
$2 = (struct isis_tlvs *) 0x0
(gdb) p lsp->hdr
$3 = {pdu_len = 27, rem_lifetime = 0, lsp_id = "\000\000\000\000\000\001\000\001", seqno = 4, checksum = 59918, lsp_bits = 1 '\001'}

The backtrace provided above pertains to version 8.5.4, but it seems that the same issue exists in the code of the master branch as well.

I have reviewed the process for calculating the SPT based on the LSP, and isis_spf_process_lsp() is the only function that does not check whether the TLVs in the fragments are empty. Therefore, I believe that modifying this function alone should be sufficient. If the TLVs of the current fragment are already empty, we do not need to continue processing subsequent fragments. This is consistent with the behavior where we do not process fragments if the TLVs of the first fragment are empty.
Of course, one could argue that lsp_purge() should still retain the TLV memory, freeing it and then reallocating it if needed. However, this is a debatable point because in some scenarios, it is permissible for the LSP to have empty TLVs. For example, after receiving an SNP (Sequence Number PDU) message, an empty LSP (with lsp->hdr.seqno = 0) might be created by calling lsp_new. If the corresponding LSP message is discarded due to domain or area authentication failure, the TLV memory wouldn't be allocated.

Test scenario:
In an LFA network, importing a sufficient number of static routes to cause LSP fragmentation, and then rolling back the imported static routes so that the LSP is no longer fragmented, can easily result in this issue.


Signed-off-by: zhou-run <zhou.run@h3c.com>
donaldsharp pushed a commit that referenced this pull request Jul 26, 2024
Fix the following crash when pim options are (un)configured on an
non-existent interface.

> r1(config)# int fgljdsf
> r1(config-if)# no ip pim unicast-bsm
> vtysh: error reading from pimd: Connection reset by peer (104)Warning: closing connection to pimd because of an I/O error!

> #0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007f70c8f32249 in core_handler (signo=11, siginfo=0x7fffff88e4f0, context=0x7fffff88e3c0) at lib/sigevent.c:258
> #2  <signal handler called>
> #3  0x0000556cfdd9b16d in lib_interface_pim_address_family_unicast_bsm_modify (args=0x7fffff88f130) at pimd/pim_nb_config.c:1910
> #4  0x00007f70c8efdcb5 in nb_callback_modify (context=0x556d00032b60, nb_node=0x556cffeeb9b0, event=NB_EV_APPLY, dnode=0x556d00031670, resource=0x556d00032b48, errmsg=0x7fffff88f710 "", errmsg_len=8192)
>     at lib/northbound.c:1538
> #5  0x00007f70c8efe949 in nb_callback_configuration (context=0x556d00032b60, event=NB_EV_APPLY, change=0x556d00032b10, errmsg=0x7fffff88f710 "", errmsg_len=8192) at lib/northbound.c:1888
> #6  0x00007f70c8efee82 in nb_transaction_process (event=NB_EV_APPLY, transaction=0x556d00032b60, errmsg=0x7fffff88f710 "", errmsg_len=8192) at lib/northbound.c:2016
> #7  0x00007f70c8efd658 in nb_candidate_commit_apply (transaction=0x556d00032b60, save_transaction=true, transaction_id=0x0, errmsg=0x7fffff88f710 "", errmsg_len=8192) at lib/northbound.c:1356
> #8  0x00007f70c8efd78e in nb_candidate_commit (context=..., candidate=0x556cffeb0e80, save_transaction=true, comment=0x0, transaction_id=0x0, errmsg=0x7fffff88f710 "", errmsg_len=8192) at lib/northbound.c:1389
> #9  0x00007f70c8f03e58 in nb_cli_classic_commit (vty=0x556d00025a80) at lib/northbound_cli.c:51
> #10 0x00007f70c8f043f8 in nb_cli_apply_changes_internal (vty=0x556d00025a80,
>     xpath_base=0x7fffff893bb0 "/frr-interface:lib/interface[name='fgljdsf']/frr-pim:pim/address-family[address-family='frr-routing:ipv4']", clear_pending=false) at lib/northbound_cli.c:178
> #11 0x00007f70c8f0475d in nb_cli_apply_changes (vty=0x556d00025a80, xpath_base_fmt=0x556cfdde9fe0 "./frr-pim:pim/address-family[address-family='%s']") at lib/northbound_cli.c:234
> #12 0x0000556cfdd8298f in pim_process_no_unicast_bsm_cmd (vty=0x556d00025a80) at pimd/pim_cmd_common.c:3493
> #13 0x0000556cfddcf782 in no_ip_pim_ucast_bsm (self=0x556cfde40b20 <no_ip_pim_ucast_bsm_cmd>, vty=0x556d00025a80, argc=4, argv=0x556d00031500) at pimd/pim_cmd.c:4950
> #14 0x00007f70c8e942f0 in cmd_execute_command_real (vline=0x556d00032070, vty=0x556d00025a80, cmd=0x0, up_level=0) at lib/command.c:1002
> #15 0x00007f70c8e94451 in cmd_execute_command (vline=0x556d00032070, vty=0x556d00025a80, cmd=0x0, vtysh=0) at lib/command.c:1061
> #16 0x00007f70c8e9499f in cmd_execute (vty=0x556d00025a80, cmd=0x556d00030320 "no ip pim unicast-bsm", matched=0x0, vtysh=0) at lib/command.c:1227
> #17 0x00007f70c8f51e44 in vty_command (vty=0x556d00025a80, buf=0x556d00030320 "no ip pim unicast-bsm") at lib/vty.c:616
> #18 0x00007f70c8f53bdd in vty_execute (vty=0x556d00025a80) at lib/vty.c:1379
> #19 0x00007f70c8f55d59 in vtysh_read (thread=0x7fffff896600) at lib/vty.c:2374
> #20 0x00007f70c8f4b209 in event_call (thread=0x7fffff896600) at lib/event.c:2011
> #21 0x00007f70c8ed109e in frr_run (master=0x556cffdb4ea0) at lib/libfrr.c:1217
> #22 0x0000556cfdddec12 in main (argc=2, argv=0x7fffff896828, envp=0x7fffff896840) at pimd/pim_main.c:165
> (gdb) f 3
> #3  0x0000556cfdd9b16d in lib_interface_pim_address_family_unicast_bsm_modify (args=0x7fffff88f130) at pimd/pim_nb_config.c:1910
> 1910			pim_ifp->ucast_bsm_accept =
> (gdb) list
> 1905		case NB_EV_ABORT:
> 1906			break;
> 1907		case NB_EV_APPLY:
> 1908			ifp = nb_running_get_entry(args->dnode, NULL, true);
> 1909			pim_ifp = ifp->info;
> 1910			pim_ifp->ucast_bsm_accept =
> 1911				yang_dnode_get_bool(args->dnode, NULL);
> 1912
> 1913			break;
> 1914		}
> (gdb) p pim_ifp
> $1 = (struct pim_interface *) 0x0

Fixes: 3bb513c ("lib: adapt to version 2 of libyang")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
donaldsharp pushed a commit that referenced this pull request Aug 6, 2024
It might cause this use-after-free:

```
==6523==ERROR: AddressSanitizer: heap-use-after-free on address 0x60300058d720 at pc 0x55f3ab62ab1f bp 0x7ffe5b95a0d0 sp 0x7ffe5b95a0c8
READ of size 8 at 0x60300058d720 thread T0
    #0 0x55f3ab62ab1e in bgp_gr_update_mode_of_all_peers bgpd/bgp_fsm.c:2729
    #1 0x55f3ab62ab1e in bgp_gr_update_all bgpd/bgp_fsm.c:2779
    #2 0x55f3ab73557e in bgp_inst_gr_config_vty bgpd/bgp_vty.c:3037
    #3 0x55f3ab74db69 in bgp_graceful_restart bgpd/bgp_vty.c:3130
    #4 0x7fc5539a9584 in cmd_execute_command_real lib/command.c:1002
    #5 0x7fc5539a98a3 in cmd_execute_command lib/command.c:1061
    #6 0x7fc5539a9dcf in cmd_execute lib/command.c:1227
    #7 0x7fc553ae493f in vty_command lib/vty.c:616
    #8 0x7fc553ae4e92 in vty_execute lib/vty.c:1379
    #9 0x7fc553aedd34 in vtysh_read lib/vty.c:2374
    #10 0x7fc553ad8a64 in event_call lib/event.c:1995
    #11 0x7fc553a0c429 in frr_run lib/libfrr.c:1232
    #12 0x55f3ab57b78d in main bgpd/bgp_main.c:555
    #13 0x7fc55342d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #14 0x7fc55342d304 in __libc_start_main_impl ../csu/libc-start.c:360
    #15 0x55f3ab5799a0 in _start (/usr/lib/frr/bgpd+0x2e19a0)

0x60300058d720 is located 16 bytes inside of 24-byte region [0x60300058d710,0x60300058d728)
freed by thread T0 here:
    #0 0x7fc553eb76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
    #1 0x7fc553a2b713 in qfree lib/memory.c:130
    #2 0x7fc553a0e50d in listnode_free lib/linklist.c:81
    #3 0x7fc553a0e50d in list_delete_node lib/linklist.c:379
    #4 0x55f3ab7ae353 in peer_delete bgpd/bgpd.c:2796
    #5 0x55f3ab7ae91f in bgp_session_reset bgpd/bgpd.c:141
    #6 0x55f3ab62ab17 in bgp_gr_update_mode_of_all_peers bgpd/bgp_fsm.c:2752
    #7 0x55f3ab62ab17 in bgp_gr_update_all bgpd/bgp_fsm.c:2779
    #8 0x55f3ab73557e in bgp_inst_gr_config_vty bgpd/bgp_vty.c:3037
    #9 0x55f3ab74db69 in bgp_graceful_restart bgpd/bgp_vty.c:3130
    #10 0x7fc5539a9584 in cmd_execute_command_real lib/command.c:1002
    #11 0x7fc5539a98a3 in cmd_execute_command lib/command.c:1061
    #12 0x7fc5539a9dcf in cmd_execute lib/command.c:1227
    #13 0x7fc553ae493f in vty_command lib/vty.c:616
    #14 0x7fc553ae4e92 in vty_execute lib/vty.c:1379
    #15 0x7fc553aedd34 in vtysh_read lib/vty.c:2374
    #16 0x7fc553ad8a64 in event_call lib/event.c:1995
    #17 0x7fc553a0c429 in frr_run lib/libfrr.c:1232
    #18 0x55f3ab57b78d in main bgpd/bgp_main.c:555
    #19 0x7fc55342d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

previously allocated by thread T0 here:
    #0 0x7fc553eb83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x7fc553a2ae20 in qcalloc lib/memory.c:105
    #2 0x7fc553a0d056 in listnode_new lib/linklist.c:71
    #3 0x7fc553a0d85b in listnode_add_sort lib/linklist.c:197
    #4 0x55f3ab7baec4 in peer_create bgpd/bgpd.c:1996
    #5 0x55f3ab65be8b in bgp_accept bgpd/bgp_network.c:604
    #6 0x7fc553ad8a64 in event_call lib/event.c:1995
    #7 0x7fc553a0c429 in frr_run lib/libfrr.c:1232
    #8 0x55f3ab57b78d in main bgpd/bgp_main.c:555
    #9 0x7fc55342d249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
donaldsharp added a commit that referenced this pull request Aug 6, 2024
This value is being set and read at the same time according
to the thread sanitizer
WARNING: ThreadSanitizer: data race (pid=2914253)
  Read of size 2 at 0x7ba800011b10 by thread T2:
    #0 validate_header bgpd/bgp_io.c:601 (bgpd+0x60c5e0)
    #1 read_ibuf_work bgpd/bgp_io.c:177 (bgpd+0x608ffe)
    #2 bgp_process_reads bgpd/bgp_io.c:261 (bgpd+0x609880)
    #3 event_call lib/event.c:2011 (libfrr.so.0+0x59168d)
    #4 fpt_run lib/frr_pthread.c:369 (libfrr.so.0+0x35154e)
    #5 frr_pthread_inner lib/frr_pthread.c:178 (libfrr.so.0+0x34fef6)

  Previous write of size 2 at 0x7ba800011b10 by main thread:
    #0 bgp_open_option_parse bgpd/bgp_open.c:1469 (bgpd+0xb5006f)
    #1 bgp_open_receive bgpd/bgp_packet.c:2100 (bgpd+0x6b3f5c)
    #2 bgp_process_packet bgpd/bgp_packet.c:4019 (bgpd+0x6c9549)
    #3 event_call lib/event.c:2011 (libfrr.so.0+0x59168d)
    #4 frr_run lib/libfrr.c:1217 (libfrr.so.0+0x3b04a9)
    #5 main bgpd/bgp_main.c:548 (bgpd+0x49aa3d)

  Location is heap block of size 24328 at 0x7ba80000c000 allocated by main thread:
    #0 calloc ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:667 (libtsan.so.2+0x3fdd2)
    #1 qcalloc lib/memory.c:105 (libfrr.so.0+0x3f2784)
    #2 peer_new bgpd/bgpd.c:1517 (bgpd+0x955024)
    #3 peer_create bgpd/bgpd.c:1941 (bgpd+0x95c908)
    #4 peer_remote_as bgpd/bgpd.c:2211 (bgpd+0x9614a6)
    #5 peer_remote_as_vty bgpd/bgp_vty.c:4788 (bgpd+0x881239)
    #6 neighbor_remote_as bgpd/bgp_vty.c:4869 (bgpd+0x881a28)
    #7 cmd_execute_command_real lib/command.c:1002 (libfrr.so.0+0x2b53a2)
    #8 cmd_execute_command_strict lib/command.c:1111 (libfrr.so.0+0x2b5e0b)
    #9 command_config_read_one_line lib/command.c:1271 (libfrr.so.0+0x2b6972)
    #10 config_from_file lib/command.c:1324 (libfrr.so.0+0x2b7035)
    #11 vty_read_file lib/vty.c:2607 (libfrr.so.0+0x5c0d19)
    #12 vty_read_config lib/vty.c:2853 (libfrr.so.0+0x5c1f37)
    #13 frr_config_read_in lib/libfrr.c:981 (libfrr.so.0+0x3ae76a)
    #14 event_call lib/event.c:2011 (libfrr.so.0+0x59168d)
    #15 frr_run lib/libfrr.c:1217 (libfrr.so.0+0x3b04a9)
    #16 main bgpd/bgp_main.c:548 (bgpd+0x49aa3d)

  Thread T2 'bgpd_io' (tid=2914257, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1001 (libtsan.so.2+0x63a59)
    #1 frr_pthread_run lib/frr_pthread.c:197 (libfrr.so.0+0x3500da)
    #2 bgp_pthreads_run bgpd/bgpd.c:8490 (bgpd+0x9d7716)
    #3 main bgpd/bgp_main.c:547 (bgpd+0x49a9c8)

Fix this.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
donaldsharp pushed a commit that referenced this pull request Aug 22, 2024
When 'no rpki' is requested and the rtrlib RPKI object was freed, bgpd
is crashing.

RPKI is configured in VRF red.

> ip l set red down
> ip l del red
> printf 'conf\n vrf red\n no rpki' | vtysh

> Core was generated by `/usr/bin/bgpd -A 127.0.0.1 -M snmp -M rpki -M bmp'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140411103615424) at ./nptl/pthread_kill.c:44
> 44	./nptl/pthread_kill.c: No such file or directory.
> [Current thread is 1 (Thread 0x7fb401f419c0 (LWP 190226))]
> (gdb) bt
> #0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140411103615424) at ./nptl/pthread_kill.c:44
> #1  __pthread_kill_internal (signo=11, threadid=140411103615424) at ./nptl/pthread_kill.c:78
> #2  __GI___pthread_kill (threadid=140411103615424, signo=signo@entry=11) at ./nptl/pthread_kill.c:89
> #3  0x00007fb4021ad476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
> #4  0x00007fb4025ce22b in core_handler (signo=11, siginfo=0x7fff831b2d70, context=0x7fff831b2c40) at lib/sigevent.c:248
> #5  <signal handler called>
> #6  rtr_mgr_remove_group (config=0x55fe8789f750, preference=11) at /build/make-pkg/output/source/DIST_RTRLIB/rtrlib/rtrlib/rtr_mgr.c:607
> #7  0x00007fb40145f518 in rpki_delete_all_cache_nodes (rpki_vrf=0x55fe8789f4f0) at bgpd/bgp_rpki.c:442
> #8  0x00007fb401463098 in no_rpki_magic (self=0x7fb40146bba0 <no_rpki_cmd>, vty=0x55fe877f5130, argc=2, argv=0x55fe877fccd0) at bgpd/bgp_rpki.c:1732
> #9  0x00007fb40145c09a in no_rpki (self=0x7fb40146bba0 <no_rpki_cmd>, vty=0x55fe877f5130, argc=2, argv=0x55fe877fccd0) at ./bgpd/bgp_rpki_clippy.c:37
> #10 0x00007fb402527abc in cmd_execute_command_real (vline=0x55fe877fd150, vty=0x55fe877f5130, cmd=0x0, up_level=0) at lib/command.c:984
> #11 0x00007fb402527c35 in cmd_execute_command (vline=0x55fe877fd150, vty=0x55fe877f5130, cmd=0x0, vtysh=0) at lib/command.c:1043
> #12 0x00007fb4025281e5 in cmd_execute (vty=0x55fe877f5130, cmd=0x55fe877fb8c0 "no rpki\n", matched=0x0, vtysh=0) at lib/command.c:1209
> #13 0x00007fb4025f0aed in vty_command (vty=0x55fe877f5130, buf=0x55fe877fb8c0 "no rpki\n") at lib/vty.c:615
> #14 0x00007fb4025f2a11 in vty_execute (vty=0x55fe877f5130) at lib/vty.c:1378
> #15 0x00007fb4025f513d in vtysh_read (thread=0x7fff831b5fa0) at lib/vty.c:2373
> #16 0x00007fb4025e9611 in event_call (thread=0x7fff831b5fa0) at lib/event.c:2011
> #17 0x00007fb402566976 in frr_run (master=0x55fe871a14a0) at lib/libfrr.c:1212
> #18 0x000055fe857829fa in main (argc=9, argv=0x7fff831b6218) at bgpd/bgp_main.c:549

Fixes: 8156765 ("bgpd: Add `no rpki` command")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
donaldsharp pushed a commit that referenced this pull request Aug 28, 2024
Fix crash when flex-algo is configured and mpls-te is disabled.

> interface eth0
>  ip router isis 1
> !
> router isis 1
>  flex-algo 129
>   dataplane sr-mpls
>   advertise-definition

> #0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140486233631168) at ./nptl/pthread_kill.c:44
> #1  __pthread_kill_internal (signo=11, threadid=140486233631168) at ./nptl/pthread_kill.c:78
> #2  __GI___pthread_kill (threadid=140486233631168, signo=signo@entry=11) at ./nptl/pthread_kill.c:89
> #3  0x00007fc5802e9476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
> #4  0x00007fc58076021f in core_handler (signo=11, siginfo=0x7ffd38d42470, context=0x7ffd38d42340) at lib/sigevent.c:248
> #5  <signal handler called>
> #6  0x000055c527f798c9 in isis_link_params_update_asla (circuit=0x55c52aaed3c0, ifp=0x55c52a1044e0) at isisd/isis_te.c:176
> #7  0x000055c527fb29da in isis_instance_flex_algo_create (args=0x7ffd38d43120) at isisd/isis_nb_config.c:2875
> #8  0x00007fc58072655b in nb_callback_create (context=0x55c52ab1d2f0, nb_node=0x55c529f72950, event=NB_EV_APPLY, dnode=0x55c52ab06230, resource=0x55c52ab189f8, errmsg=0x7ffd38d43750 "",
>     errmsg_len=8192) at lib/northbound.c:1262
> #9  0x00007fc580727625 in nb_callback_configuration (context=0x55c52ab1d2f0, event=NB_EV_APPLY, change=0x55c52ab189c0, errmsg=0x7ffd38d43750 "", errmsg_len=8192) at lib/northbound.c:1662
> #10 0x00007fc580727c39 in nb_transaction_process (event=NB_EV_APPLY, transaction=0x55c52ab1d2f0, errmsg=0x7ffd38d43750 "", errmsg_len=8192) at lib/northbound.c:1794
> #11 0x00007fc580725f77 in nb_candidate_commit_apply (transaction=0x55c52ab1d2f0, save_transaction=true, transaction_id=0x0, errmsg=0x7ffd38d43750 "", errmsg_len=8192)
>     at lib/northbound.c:1131
> #12 0x00007fc5807260d1 in nb_candidate_commit (context=..., candidate=0x55c529f0a730, save_transaction=true, comment=0x0, transaction_id=0x0, errmsg=0x7ffd38d43750 "", errmsg_len=8192)
>     at lib/northbound.c:1164
> #13 0x00007fc58072d220 in nb_cli_classic_commit (vty=0x55c52a0fc6b0) at lib/northbound_cli.c:51
> #14 0x00007fc58072d839 in nb_cli_apply_changes_internal (vty=0x55c52a0fc6b0,
>     xpath_base=0x7ffd38d477f0 "/frr-isisd:isis/instance[area-tag='1'][vrf='default']/flex-algos/flex-algo[flex-algo='129']", clear_pending=false) at lib/northbound_cli.c:178
> #15 0x00007fc58072dbcf in nb_cli_apply_changes (vty=0x55c52a0fc6b0, xpath_base_fmt=0x55c528014de0 "./flex-algos/flex-algo[flex-algo='%ld']") at lib/northbound_cli.c:234
> #16 0x000055c527fd3403 in flex_algo_magic (self=0x55c52804f1a0 <flex_algo_cmd>, vty=0x55c52a0fc6b0, argc=2, argv=0x55c52ab00ec0, algorithm=129, algorithm_str=0x55c52ab120d0 "129")
>     at isisd/isis_cli.c:3752
> #17 0x000055c527fc97cb in flex_algo (self=0x55c52804f1a0 <flex_algo_cmd>, vty=0x55c52a0fc6b0, argc=2, argv=0x55c52ab00ec0) at ./isisd/isis_cli_clippy.c:6445
> #18 0x00007fc5806b9abc in cmd_execute_command_real (vline=0x55c52aaf78f0, vty=0x55c52a0fc6b0, cmd=0x0, up_level=0) at lib/command.c:984
> #19 0x00007fc5806b9c35 in cmd_execute_command (vline=0x55c52aaf78f0, vty=0x55c52a0fc6b0, cmd=0x0, vtysh=0) at lib/command.c:1043
> #20 0x00007fc5806ba1e5 in cmd_execute (vty=0x55c52a0fc6b0, cmd=0x55c52aae6bd0 "flex-algo 129\n", matched=0x0, vtysh=0) at lib/command.c:1209
> #21 0x00007fc580782ae1 in vty_command (vty=0x55c52a0fc6b0, buf=0x55c52aae6bd0 "flex-algo 129\n") at lib/vty.c:615
> #22 0x00007fc580784a05 in vty_execute (vty=0x55c52a0fc6b0) at lib/vty.c:1378
> #23 0x00007fc580787131 in vtysh_read (thread=0x7ffd38d4ab10) at lib/vty.c:2373
> #24 0x00007fc58077b605 in event_call (thread=0x7ffd38d4ab10) at lib/event.c:2011
> #25 0x00007fc5806f8976 in frr_run (master=0x55c529df9b30) at lib/libfrr.c:1212
> #26 0x000055c527f301bc in main (argc=5, argv=0x7ffd38d4ad58, envp=0x7ffd38d4ad88) at isisd/isis_main.c:350
> (gdb) f 6
> #6  0x000055c527f798c9 in isis_link_params_update_asla (circuit=0x55c52aaed3c0, ifp=0x55c52a1044e0) at isisd/isis_te.c:176
> 176                     list_delete_all_node(ext->aslas);
> (gdb) p ext
> $1 = (struct isis_ext_subtlvs *) 0x0

Fixes: ae27101 ("isisd: fix building asla at first flex-algo config")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
donaldsharp added a commit that referenced this pull request Aug 30, 2024
Let's prevent some issues from happening from
reading bad data from a peer.

==1853291==ERROR: AddressSanitizer: stack-buffer-overflow on address 0xfa85eb113121 at pc 0xc6a2300ccce8 bp 0xffffebaa1c50 sp 0xffffebaa1c48
WRITE of size 1 at 0xfa85eb113121 thread T0
    #0 0xc6a2300ccce4 in unpack_item_ext_subtlv_asla /home/ubuntu/frr-public/frr_public_private-libfuzzer/isisd/isis_tlvs.c:1950:11
    #1 0xc6a2300cc5c0 in unpack_item_ext_subtlvs /home/ubuntu/frr-public/frr_public_private-libfuzzer/isisd/isis_tlvs.c:2555:8
    #2 0xc6a2300c3264 in unpack_item_extended_reach /home/ubuntu/frr-public/frr_public_private-libfuzzer/isisd/isis_tlvs.c:3890:7
    #3 0xc6a2300be3e4 in unpack_item /home/ubuntu/frr-public/frr_public_private-libfuzzer/isisd/isis_tlvs.c:7027:10
    #4 0xc6a2300bd140 in unpack_tlv_with_items /home/ubuntu/frr-public/frr_public_private-libfuzzer/isisd/isis_tlvs.c:7091:8
    #5 0xc6a2300fb268 in unpack_tlv /home/ubuntu/frr-public/frr_public_private-libfuzzer/isisd/isis_tlvs.c:8010:10
    #6 0xc6a2300ad508 in unpack_tlvs /home/ubuntu/frr-public/frr_public_private-libfuzzer/isisd/isis_tlvs.c:8032:8
    #7 0xc6a2300ad2d4 in isis_unpack_tlvs /home/ubuntu/frr-public/frr_public_private-libfuzzer/isisd/isis_tlvs.c:8063:7
    #8 0xc6a230016840 in process_lsp /home/ubuntu/frr-public/frr_public_private-libfuzzer/isisd/isis_pdu.c:969:6
    #9 0xc6a230012ff8 in isis_handle_pdu /home/ubuntu/frr-public/frr_public_private-libfuzzer/isisd/isis_pdu.c:1823:12
    #10 0xc6a22ffa7f0c in LLVMFuzzerTestOneInput /home/ubuntu/frr-public/frr_public_private-libfuzzer/isisd/isis_main.c:376:5

Let's just make sure that we have enough data to read.

Reported-by: Iggy Frankovic <iggyfran@amazon.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
donaldsharp pushed a commit that referenced this pull request Sep 11, 2024
The following causes a isisd crash.

> # cat config
> affinity-map green bit-position 0
> router isis 1
>  flex-algo 129
>   affinity exclude-any green
> # vtysh -f config

> #0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007f650cd32756 in core_handler (signo=6, siginfo=0x7ffc56f93070, context=0x7ffc56f92f40) at lib/sigevent.c:258
> #2  <signal handler called>
> #3  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> #4  0x00007f650c91c537 in __GI_abort () at abort.c:79
> #5  0x00007f650cd007c9 in nb_running_get_entry_worker (dnode=0x0, xpath=0x0, abort_if_not_found=true, rec_search=true) at lib/northbound.c:2531
> #6  0x00007f650cd007f9 in nb_running_get_entry (dnode=0x55d9ad406e00, xpath=0x0, abort_if_not_found=true) at lib/northbound.c:2537
> #7  0x000055d9ab302248 in isis_instance_flex_algo_affinity_set (args=0x7ffc56f947a0, type=2) at isisd/isis_nb_config.c:2998
> #8  0x000055d9ab3027c0 in isis_instance_flex_algo_affinity_exclude_any_create (args=0x7ffc56f947a0) at isisd/isis_nb_config.c:3155
> #9  0x00007f650ccfe284 in nb_callback_create (context=0x7ffc56f94d20, nb_node=0x55d9ad28b540, event=NB_EV_VALIDATE, dnode=0x55d9ad406e00, resource=0x0, errmsg=0x7ffc56f94de0 "",
>     errmsg_len=8192) at lib/northbound.c:1487
> #10 0x00007f650ccff067 in nb_callback_configuration (context=0x7ffc56f94d20, event=NB_EV_VALIDATE, change=0x55d9ad406d40, errmsg=0x7ffc56f94de0 "", errmsg_len=8192) at lib/northbound.c:1884
> #11 0x00007f650ccfda31 in nb_candidate_validate_code (context=0x7ffc56f94d20, candidate=0x55d9ad20d710, changes=0x7ffc56f94d38, errmsg=0x7ffc56f94de0 "", errmsg_len=8192)
>     at lib/northbound.c:1246
> #12 0x00007f650ccfdc67 in nb_candidate_commit_prepare (context=..., candidate=0x55d9ad20d710, comment=0x0, transaction=0x7ffc56f94da0, skip_validate=false, ignore_zero_change=false,
>     errmsg=0x7ffc56f94de0 "", errmsg_len=8192) at lib/northbound.c:1317
> #13 0x00007f650ccfdec4 in nb_candidate_commit (context=..., candidate=0x55d9ad20d710, save_transaction=true, comment=0x0, transaction_id=0x0, errmsg=0x7ffc56f94de0 "", errmsg_len=8192)
>     at lib/northbound.c:1381
> #14 0x00007f650cd045ba in nb_cli_classic_commit (vty=0x55d9ad3f7490) at lib/northbound_cli.c:57
> #15 0x00007f650cd04749 in nb_cli_pending_commit_check (vty=0x55d9ad3f7490) at lib/northbound_cli.c:96
> #16 0x00007f650cc94340 in cmd_execute_command_real (vline=0x55d9ad3eea10, vty=0x55d9ad3f7490, cmd=0x0, up_level=0) at lib/command.c:1000
> #17 0x00007f650cc94599 in cmd_execute_command (vline=0x55d9ad3eea10, vty=0x55d9ad3f7490, cmd=0x0, vtysh=0) at lib/command.c:1080
> #18 0x00007f650cc94a0c in cmd_execute (vty=0x55d9ad3f7490, cmd=0x55d9ad401d30 "XFRR_end_configuration", matched=0x0, vtysh=0) at lib/command.c:1228
> #19 0x00007f650cd523a4 in vty_command (vty=0x55d9ad3f7490, buf=0x55d9ad401d30 "XFRR_end_configuration") at lib/vty.c:625
> #20 0x00007f650cd5413d in vty_execute (vty=0x55d9ad3f7490) at lib/vty.c:1388
> #21 0x00007f650cd56353 in vtysh_read (thread=0x7ffc56f99370) at lib/vty.c:2400
> #22 0x00007f650cd4b6fd in event_call (thread=0x7ffc56f99370) at lib/event.c:1996
> #23 0x00007f650ccd1365 in frr_run (master=0x55d9ad103cf0) at lib/libfrr.c:1231
> #24 0x000055d9ab29036e in main (argc=2, argv=0x7ffc56f99598, envp=0x7ffc56f995b0) at isisd/isis_main.c:354

Configuring the same in vtysh configure interactive mode works properly.
When using "vtysh -f", the northbound compatible configuration is
committed together whereas, in interactive mode, it committed line by
line. In the first situation, in validation state nb_running_get_entry()
fails because the area not yet in running.

Do not use nb_running_get_entry() northbound validation state.

Fixes: 893882e ("isisd: add isis flex-algo configuration backend")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
donaldsharp pushed a commit that referenced this pull request Sep 17, 2024
Fix a crash when modifying a route-map with set as-path exclude without
as-path-access-list:

> router(config)# route-map routemaptest deny 1
> router(config-route-map)# set as-path exclude 33 34 35
> router(config-route-map)# set as-path exclude as-path-access-list test

> #0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007fb3959327de in core_handler (signo=11, siginfo=0x7ffd122da530, context=0x7ffd122da400) at lib/sigevent.c:258
> #2  <signal handler called>
> #3  0x000055ab2762a1bd in as_list_list_del (h=0x55ab27897680 <as_exclude_list_orphan>, item=0x55ab28204e20) at ./bgpd/bgp_aspath.h:77
> #4  0x000055ab2762d1a8 in as_exclude_remove_orphan (ase=0x55ab28204e20) at bgpd/bgp_aspath.c:1574
> #5  0x000055ab27550538 in route_aspath_exclude_free (rule=0x55ab28204e20) at bgpd/bgp_routemap.c:2366
> #6  0x00007fb39591f00c in route_map_rule_delete (list=0x55ab28203498, rule=0x55ab28204170) at lib/routemap.c:1357
> #7  0x00007fb39591f87c in route_map_add_set (index=0x55ab28203460, set_name=0x55ab276ad2aa "as-path exclude", set_arg=0x55ab281e4f70 "as-path-access-list test") at lib/routemap.c:1674
> #8  0x00007fb39591d3f3 in generic_set_add (index=0x55ab28203460, command=0x55ab276ad2aa "as-path exclude", arg=0x55ab281e4f70 "as-path-access-list test", errmsg=0x7ffd122db870 "",
>     errmsg_len=8192) at lib/routemap.c:533
> #9  0x000055ab2755e78e in lib_route_map_entry_set_action_rmap_set_action_exclude_as_path_modify (args=0x7ffd122db290) at bgpd/bgp_routemap_nb_config.c:2427
> #10 0x00007fb3958fe417 in nb_callback_modify (context=0x55ab28205aa0, nb_node=0x55ab27cb31e0, event=NB_EV_APPLY, dnode=0x55ab28202690, resource=0x55ab27c32148, errmsg=0x7ffd122db870 "",
>     errmsg_len=8192) at lib/northbound.c:1538
> #11 0x00007fb3958ff0ab in nb_callback_configuration (context=0x55ab28205aa0, event=NB_EV_APPLY, change=0x55ab27c32110, errmsg=0x7ffd122db870 "", errmsg_len=8192) at lib/northbound.c:1888
> #12 0x00007fb3958ff5e4 in nb_transaction_process (event=NB_EV_APPLY, transaction=0x55ab28205aa0, errmsg=0x7ffd122db870 "", errmsg_len=8192) at lib/northbound.c:2016
> #13 0x00007fb3958fddba in nb_candidate_commit_apply (transaction=0x55ab28205aa0, save_transaction=true, transaction_id=0x0, errmsg=0x7ffd122db870 "", errmsg_len=8192)
>     at lib/northbound.c:1356
> #14 0x00007fb3958fdef0 in nb_candidate_commit (context=..., candidate=0x55ab27c2c9a0, save_transaction=true, comment=0x0, transaction_id=0x0, errmsg=0x7ffd122db870 "", errmsg_len=8192)
>     at lib/northbound.c:1389
> #15 0x00007fb3959045ba in nb_cli_classic_commit (vty=0x55ab281f6680) at lib/northbound_cli.c:57
> #16 0x00007fb395904b5a in nb_cli_apply_changes_internal (vty=0x55ab281f6680, xpath_base=0x7ffd122dfd10 "/frr-route-map:lib/route-map[name='routemaptest']/entry[sequence='1']",
>     clear_pending=false) at lib/northbound_cli.c:184
> #17 0x00007fb395904ebf in nb_cli_apply_changes (vty=0x55ab281f6680, xpath_base_fmt=0x0) at lib/northbound_cli.c:240
> --Type <RET> for more, q to quit, c to continue without paging--
> #18 0x000055ab27557d2e in set_aspath_exclude_access_list_magic (self=0x55ab2775c300 <set_aspath_exclude_access_list_cmd>, vty=0x55ab281f6680, argc=5, argv=0x55ab28204c80,
>     as_path_filter_name=0x55ab28202040 "test") at bgpd/bgp_routemap.c:6397
> #19 0x000055ab2754bdea in set_aspath_exclude_access_list (self=0x55ab2775c300 <set_aspath_exclude_access_list_cmd>, vty=0x55ab281f6680, argc=5, argv=0x55ab28204c80)
>     at ./bgpd/bgp_routemap_clippy.c:856
> #20 0x00007fb39589435d in cmd_execute_command_real (vline=0x55ab281e61f0, vty=0x55ab281f6680, cmd=0x0, up_level=0) at lib/command.c:1003
> #21 0x00007fb3958944be in cmd_execute_command (vline=0x55ab281e61f0, vty=0x55ab281f6680, cmd=0x0, vtysh=0) at lib/command.c:1062
> #22 0x00007fb395894a0c in cmd_execute (vty=0x55ab281f6680, cmd=0x55ab28200f20 "set as-path exclude as-path-access-list test", matched=0x0, vtysh=0) at lib/command.c:1228
> #23 0x00007fb39595242c in vty_command (vty=0x55ab281f6680, buf=0x55ab28200f20 "set as-path exclude as-path-access-list test") at lib/vty.c:625
> #24 0x00007fb3959541c5 in vty_execute (vty=0x55ab281f6680) at lib/vty.c:1388
> #25 0x00007fb3959563db in vtysh_read (thread=0x7ffd122e2bb0) at lib/vty.c:2400
> #26 0x00007fb39594b785 in event_call (thread=0x7ffd122e2bb0) at lib/event.c:1996
> #27 0x00007fb3958d1365 in frr_run (master=0x55ab27b56d70) at lib/libfrr.c:1231
> #28 0x000055ab2747f1cc in main (argc=3, argv=0x7ffd122e2e08) at bgpd/bgp_main.c:555

Fixes: 094dcc3 ("bgpd: fix "bgp as-pah access-list" with "set aspath exclude" set/unset issues")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
donaldsharp pushed a commit that referenced this pull request Sep 17, 2024
Level 2 adjacency list is not supposed to be always set.

> #0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007f9f0353274f in core_handler (signo=6, siginfo=0x7ffe95260770, context=0x7ffe95260640) at lib/sigevent.c:258
> #2  <signal handler called>
> #3  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> #4  0x00007f9f0324e537 in __GI_abort () at abort.c:79
> #5  0x00007f9f035744ea in _zlog_assert_failed (xref=0x7f9f0362c6c0 <_xref.15>, extra=0x0) at lib/zlog.c:789
> #6  0x00007f9f034d25ee in listnode_head (list=0x0) at lib/linklist.c:316
> #7  0x000055cd65aaa481 in lib_interface_state_isis_adjacencies_adjacency_get_next (args=0x7ffe95261730) at isisd/isis_nb_state.c:101
> #8  0x00007f9f034feadd in nb_callback_get_next (nb_node=0x55cd673c0190, parent_list_entry=0x55cd67570d30, list_entry=0x55cd6758f8a0) at lib/northbound.c:1748
> #9  0x00007f9f0350bf07 in __walk (ys=0x55cd675782b0, is_resume=false) at lib/northbound_oper.c:1264
> #10 0x00007f9f0350deaa in nb_op_walk_start (ys=0x55cd675782b0) at lib/northbound_oper.c:1741
> #11 0x00007f9f0350e079 in nb_oper_iterate_legacy (xpath=0x55cd67595c60 "/frr-interface:lib", translator=0x0, flags=0, cb=0x0, cb_arg=0x0, tree=0x7ffe952621b0) at lib/northbound_oper.c:1803
> #12 0x00007f9f03507661 in show_yang_operational_data_magic (self=0x7f9f03634a80 <show_yang_operational_data_cmd>, vty=0x55cd675a61f0, argc=4, argv=0x55cd6758eab0,
>     xpath=0x55cd67595c60 "/frr-interface:lib", json=0x0, xml=0x0, translator_family=0x0, with_config=0x0) at lib/northbound_cli.c:1576
> #13 0x00007f9f035037f0 in show_yang_operational_data (self=0x7f9f03634a80 <show_yang_operational_data_cmd>, vty=0x55cd675a61f0, argc=4, argv=0x55cd6758eab0)
>     at ./lib/northbound_cli_clippy.c:906
> #14 0x00007f9f0349435d in cmd_execute_command_real (vline=0x55cd6758e490, vty=0x55cd675a61f0, cmd=0x0, up_level=0) at lib/command.c:1003
> #15 0x00007f9f03494477 in cmd_execute_command (vline=0x55cd67585340, vty=0x55cd675a61f0, cmd=0x0, vtysh=0) at lib/command.c:1053
> #16 0x00007f9f03494a0c in cmd_execute (vty=0x55cd675a61f0, cmd=0x55cd67579040 "do show yang operational-data /frr-interface:lib", matched=0x0, vtysh=0) at lib/command.c:1228
> #17 0x00007f9f0355239d in vty_command (vty=0x55cd675a61f0, buf=0x55cd67579040 "do show yang operational-data /frr-interface:lib") at lib/vty.c:625
> #18 0x00007f9f03554136 in vty_execute (vty=0x55cd675a61f0) at lib/vty.c:1388
> #19 0x00007f9f0355634c in vtysh_read (thread=0x7ffe952647a0) at lib/vty.c:2400
> #20 0x00007f9f0354b6f6 in event_call (thread=0x7ffe952647a0) at lib/event.c:1996
> #21 0x00007f9f034d1365 in frr_run (master=0x55cd67204da0) at lib/libfrr.c:1231
> #22 0x000055cd65a3236e in main (argc=7, argv=0x7ffe952649c8, envp=0x7ffe95264a08) at isisd/isis_main.c:354

Fixes: 2a1c520 ("isisd: split northbound callbacks into multiple files")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants