FRRouting 9.1-0~ubuntu22.04.1 - bgpd segfault #15543

Open
2 tasks done
f0o opened this issue Mar 13, 2024 · 24 comments
Labels
triage Needs further investigation

Comments

@f0o

f0o commented Mar 13, 2024

Description

bgpd segfaults frequently seemingly out of nowhere:

Mar 13 14:28:48 rt2 kernel: bgpd[1050426]: segfault at 4100000018 ip 00007f33d40575ff sp 00007ffc404043b8 error 4 in libfrr.so.0.0.0[7f33d3fe4000+b1000]
Mar 13 14:32:43 rt2 kernel: bgpd[1057483]: segfault at 4100000018 ip 00007ffb0fc085ff sp 00007ffde55466b8 error 4 in libfrr.so.0.0.0[7ffb0fb95000+b1000]
Mar 13 14:52:42 rt2 kernel: bgpd[1057775]: segfault at 4100000018 ip 00007f43ce71f5ff sp 00007fff499747f8 error 4 in libfrr.so.0.0.0[7f43ce6ac000+b1000]

Config is super slim, IBGP full mesh with 4 nodes sharing full-tables (~2.4M routes in kernel).

Hardware is 46G RAM and 56 threads (Xeon Gold 6132) running Ubuntu 22.04 LTS. None of it is pegged; the box is pretty idle.

Version

FRRouting 9.1 (rt2) on Linux(5.15.0-97-generic).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--localstatedir=/var/run/frr' '--sbindir=/usr/lib/frr' '--sysconfdir=/etc/frr' '--with-vtysh-pager=/usr/bin/pager' '--libdir=/usr/lib/x86_64-linux-gnu/frr' '--with-moduledir=/usr/lib/x86_64-linux-gnu/frr/modules' '--disable-dependency-tracking' '--enable-rpki' '--disable-scripting' '--enable-pim6d' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--disable-protobuf' '--disable-zeromq' '--enable-ospfapi' '--enable-bgp-vnc' '--enable-multipath=256' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' 'build_alias=x86_64-linux-gnu' 'PYTHON=python3'

How to reproduce

Unclear, it's only handling 3 peers with full ipv4 tables in IBGP, no filtering or VRF or anything fancy done.

It does happen every so often seemingly without triggers

Expected behavior

Not segfault

Actual behavior

Segfault

Additional context

Happy to provide more logs, I saw some core_handler memstat print outs close to the segfault line

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@f0o f0o added the triage Needs further investigation label Mar 13, 2024
@f0o
Author

f0o commented Mar 13, 2024

For what it's worth, here are some more segfaults:
rt1:

Mar 13 14:13:24 rt1 kernel: bgpd[511742]: segfault at 4100000018 ip 00007fef132255ff sp 00007fff85f705b8 error 4 in libfrr.so.0.0.0[7fef131b2000+b1000]
Mar 13 14:14:51 rt1 kernel: bgpd[564036]: segfault at 4100000018 ip 00007f98157135ff sp 00007ffde3bfc378 error 4 in libfrr.so.0.0.0[7f98156a0000+b1000]
Mar 13 14:15:59 rt1 kernel: bgpd[564120]: segfault at 4100000018 ip 00007eff6f0ee5ff sp 00007fff1abc5af8 error 4 in libfrr.so.0.0.0[7eff6f07b000+b1000]
Mar 13 15:42:16 rt1 kernel: bgpd[564243]: segfault at 4100000018 ip 00007f08647325ff sp 00007ffe193d93b8 error 4 in libfrr.so.0.0.0[7f08646bf000+b1000]
Mar 13 15:43:22 rt1 kernel: bgpd[573542]: segfault at 4100000018 ip 00007f0481bcf5ff sp 00007ffe9a43b6b8 error 4 in libfrr.so.0.0.0[7f0481b5c000+b1000]
Mar 13 16:22:18 rt1 kernel: bgpd[573691]: segfault at 4100000018 ip 00007f1f9092e5ff sp 00007fffe4ca34b8 error 4 in libfrr.so.0.0.0[7f1f908bb000+b1000]
Mar 13 16:41:48 rt1 kernel: bgpd[576472]: segfault at 4100000018 ip 00007fe8f846f5ff sp 00007ffe09d198f8 error 4 in libfrr.so.0.0.0[7fe8f83fc000+b1000]
Mar 13 16:42:52 rt1 kernel: bgpd[579869]: segfault at 4100000018 ip 00007f67e08845ff sp 00007fffa533ad78 error 4 in libfrr.so.0.0.0[7f67e0811000+b1000]
Mar 13 16:49:03 rt1 kernel: bgpd[581033]: segfault at 4100000018 ip 00007f72b54335ff sp 00007ffd190c0538 error 4 in libfrr.so.0.0.0[7f72b53c0000+b1000]
Mar 13 16:56:09 rt1 kernel: bgpd[582663]: segfault at 4100000018 ip 00007f573380c5ff sp 00007fff7a868fb8 error 4 in libfrr.so.0.0.0[7f5733799000+b1000]
Mar 13 16:58:08 rt1 kernel: bgpd[583803]: segfault at 4100000018 ip 00007fcc47fc25ff sp 00007ffd37da1438 error 4 in libfrr.so.0.0.0[7fcc47f4f000+b1000]

rt2:

Mar 13 14:28:48 rt2 kernel: bgpd[1050426]: segfault at 4100000018 ip 00007f33d40575ff sp 00007ffc404043b8 error 4 in libfrr.so.0.0.0[7f33d3fe4000+b1000]
Mar 13 14:32:43 rt2 kernel: bgpd[1057483]: segfault at 4100000018 ip 00007ffb0fc085ff sp 00007ffde55466b8 error 4 in libfrr.so.0.0.0[7ffb0fb95000+b1000]
Mar 13 14:52:42 rt2 kernel: bgpd[1057775]: segfault at 4100000018 ip 00007f43ce71f5ff sp 00007fff499747f8 error 4 in libfrr.so.0.0.0[7f43ce6ac000+b1000]
Mar 13 15:49:21 rt2 kernel: bgpd[1060279]: segfault at 4100000018 ip 00007f30d21775ff sp 00007fffed91fab8 error 4 in libfrr.so.0.0.0[7f30d2104000+b1000]

Nothing worth noting happened during these timestamps, no traffic was being pushed or anything else. The systems are pretty much idling right now

@donaldsharp
Member

You are going to need to add debug symbols and give us the decode. As it stands we don't know where/how FRR was compiled and as such we cannot do anything with the segfault data as given. You'll need to provide this to us

@f0o
Author

f0o commented Mar 13, 2024

@donaldsharp this FRR is from the official FRR-Repos (https://deb.frrouting.org/frr jammy frr-stable)

but yeah just instruct me with what debug flags you need and I'll toss them in there. As it stands the segfault happens every few minutes

@donaldsharp
Member

Install the debug symbols from there then. Then, when the next crash happens, give us the decode and anything else in the log file.

@f0o
Author

f0o commented Mar 13, 2024

@donaldsharp got a helper to decode the segfault? or do you want me to load up frr into valgrind? or what? bit lost here with terminologies
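
As a starting point for such a helper, here is a minimal Python sketch that pulls the fields out of one of the kernel segfault lines quoted above. It assumes the x86-64 Linux message format shown in this thread; the function and key names are made up for illustration. The offset it computes is relative to the start of the mapping the kernel printed, which is not necessarily the address that a symbolizer such as addr2line expects (the mapping's file offset may need to be added), so treat it as a first hint only.

```python
import re

# Matches the x86-64 Linux kernel "segfault at ..." lines quoted in this thread.
SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<start>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

# One line copied verbatim from the report above.
SAMPLE = (
    "Mar 13 14:28:48 rt2 kernel: bgpd[1050426]: segfault at 4100000018 "
    "ip 00007f33d40575ff sp 00007ffc404043b8 error 4 "
    "in libfrr.so.0.0.0[7f33d3fe4000+b1000]"
)


def decode_segfault(line):
    """Return the decoded fields of one kernel segfault line, or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    start = int(m["start"], 16)
    err = int(m["err"], 16)  # the kernel prints the error code in hex
    return {
        "process": m["proc"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        "object": m["obj"],
        # Instruction pointer relative to the start of the reported mapping.
        "offset_in_mapping": ip - start,
        # x86 page-fault error code bits.
        "page_present": bool(err & 0x1),
        "write_access": bool(err & 0x2),
        "user_mode": bool(err & 0x4),
    }


if __name__ == "__main__":
    info = decode_segfault(SAMPLE)
    print(info["object"], hex(info["offset_in_mapping"]))
```

Every segfault line posted in this thread, on both rt1 and rt2, gives the same offset (0x735ff) into the libfrr.so.0.0.0 mapping, so the crash hits the same instruction each time. `error 4` decodes to a user-mode read of a page that is not mapped.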

@IvayloJ

IvayloJ commented Mar 14, 2024

I guess Donald means to do this:

apt install frr-dbgsym

(Make sure you install frr-dbgsym from the same repository you installed the frr package from.) Then restart frr, wait for the next crash, and post the logs, the config, and anything else you can see here. That way the logs may show where and why the crash happens. The log from valgrind ($HOME/bgpd-valgrind.log) can also be useful:

valgrind -s --leak-check=full --trace-children=yes --log-file="$HOME/bgpd-valgrind.log" bgpd -d -f /etc/frr/bgpd.conf -F traditional -u frr -g frr -A 127.0.0.1

But since your bgpd is heavily loaded (and it is, you receive/announce a couple of full BGP internet tables, right?), this will be hard: the bgpd process will be very slow and will consume a lot of CPU when started under the valgrind debugger.

@f0o
Author

f0o commented Mar 15, 2024

@IvayloJ how can I run that without watchfrr messing with it?

@f0o
Author

f0o commented Mar 15, 2024

@IvayloJ I tried disabling bgpd in daemons and then ran the cmd you gave me, but it immediately returned:

==92699== Memcheck, a memory error detector
==92699== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==92699== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==92699== Command: /usr/lib/frr/bgpd -d -F traditional -A 127.0.0.1 -M snmp -M rpki
==92699== Parent PID: 5046
==92699==
==92699==
==92699== HEAP SUMMARY:
==92699==     in use at exit: 4,964,002 bytes in 122,181 blocks
==92699==   total heap usage: 294,464 allocs, 172,283 frees, 17,345,061 bytes allocated
==92699==
==92699== 40 bytes in 1 blocks are possibly lost in loss record 52,630 of 71,416
==92699==    at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==92699==    by 0x4D83CD9: cap_init (in /usr/lib/x86_64-linux-gnu/libcap.so.2.44)
==92699==    by 0x492EC34: zprivs_init (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==92699==    by 0x4904CB4: frr_init (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==92699==    by 0x1F82DF: main (in /usr/lib/frr/bgpd)
==92699==
==92699== LEAK SUMMARY:
==92699==    definitely lost: 0 bytes in 0 blocks
==92699==    indirectly lost: 0 bytes in 0 blocks
==92699==      possibly lost: 40 bytes in 1 blocks
==92699==    still reachable: 4,963,962 bytes in 122,180 blocks
==92699==         suppressed: 0 bytes in 0 blocks
==92699== Reachable blocks (those to which a pointer was found) are not shown.
==92699== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==92699==
==92699== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==92699== could not unlink /tmp/vgdb-pipe-from-vgdb-to-92699-by-root-on-???
==92699== could not unlink /tmp/vgdb-pipe-to-vgdb-from-92699-by-root-on-???
==92699== could not unlink /tmp/vgdb-pipe-shared-mem-vgdb-92699-by-root-on-???

I don't think it even connected to any peers.

@f0o
Author

f0o commented Mar 15, 2024

@IvayloJ I added bgpd_wrap="valgrind -s --leak-check=full --trace-children=yes --log-file=/home/f0o/bgpd-valgrind.log " in /etc/frr/daemons

This made bgpd churn at 100% CPU and be absolutely unusable. But it also didn't segfault after ~2hrs at a rock-solid 100% CPU.

I removed that line and it immediately segfaulted on startup.

Sounds like a race condition to me.

@f0o
Author

f0o commented Mar 15, 2024

perf.zip

Managed to capture some perf samples if that helps

@IvayloJ

IvayloJ commented Mar 15, 2024

As I wrote, it will be hard, because your bgpd is heavily loaded with a couple of full BGP internet tables. Regardless, if you want to try valgrind, you first have to stop all frr processes...

--- log in as root (sudo su -) or execute all commands with sudo ---
systemctl stop frr (or /etc/init.d/frr stop)
--- wait until all frr processes are gone (ps -ax | grep frr) ---
--- first start zebra: ---
zebra -d -f /etc/frr/zebra.conf -F traditional -u frr -g frr -A 127.0.0.1 -s 90000000
--- then start bgpd under the valgrind debugger: ---
valgrind -s --leak-check=full --trace-children=yes --log-file="$HOME/bgpd-valgrind.log" bgpd -d -f /etc/frr/bgpd.conf -F traditional -u frr -g frr -A 127.0.0.1

But as I said, it will be hard and most likely will not work, because your bgpd is heavily loaded. I can 100% confirm that frr 8.1 to 9.1 works without unusual crashes (on Debian 8/9/10/11/12, which is very close to your Ubuntu, as well as my custom builds on Slackware 13 to 15); some of those installs handle 20+ peers and up to 5 full internet BGP IPv4+IPv6 tables, plus RPKI checks and nearly 100k more prefixes.
So let's start with something simple: can you post your config file (/etc/frr/bgpd.conf)? Change or clear your password if you have one, and change the IP addresses if you are paranoid about security :) (maybe something unusual in the config triggers the crash). As Donald wrote, install frr-dbgsym; it will give more useful (human-readable) debug output.

@f0o
Author

f0o commented Mar 15, 2024

I got frr-dbgsym installed - the segfault kernel message hasn't changed a bit tho.

Sure config is as simple as it can be:

ip route 10.1.0.0/24 blackhole 254
ip route 10.2.0.0/24 blackhole 254
!
interface lo
 ip address 192.168.0.0/32
 ip ospf area 0.0.0.0
exit
!
interface vlan123
 description Loopback Interconnect
 ip address 192.168.0.0 peer 192.168.0.1/32
 ip ospf area 0.0.0.0
 ip ospf dead-interval 3
 ip ospf hello-interval 1
 ip ospf network point-to-point
 ip ospf retransmit-interval 1
exit
!
interface vlan456
 description TempNetwork
 ip address 192.168.1.10/24
exit
!
router bgp 65001
 bgp router-id 192.168.0.0
 bgp suppress-fib-pending
 bgp graceful-restart
 bgp bestpath as-path multipath-relax
 neighbor IBGP peer-group
 neighbor IBGP remote-as 65001
 neighbor IBGP description Core: IBGP
 neighbor IBGP update-source lo
 neighbor TMP peer-group
 neighbor TMP remote-as 65001
 neighbor TMP update-source vlan456
 neighbor 192.160.0.1 peer-group IBGP
 neighbor 192.168.1.20 peer-group TMP
 neighbor 192.168.1.30 peer-group TMP
 !
 address-family ipv4 unicast
  redistribute connected route-map IPV4-EXPORT-TO-IBGP
  redistribute static route-map IPV4-EXPORT-TO-IBGP
  neighbor IBGP next-hop-self
  neighbor IBGP soft-reconfiguration inbound
  neighbor OLD next-hop-self
  neighbor OLD soft-reconfiguration inbound
 exit-address-family
exit
!
router ospf
 ospf router-id 192.168.0.0
 redistribute connected route-map IPV4-OSPF-CONNECTED
 redistribute static route-map IPV4-OSPF-CONNECTED
exit
!
ip prefix-list LOCAL-AGGREGATES seq 100 permit 10.2.0.0/24
ip prefix-list LOCAL-AGGREGATES seq 200 permit 10.1.0.0/24
ip prefix-list LOOPBACK seq 5 permit 192.168.0.0/32
!
route-map IPV4-OSPF-CONNECTED permit 10
 match ip address prefix-list LOOPBACK
exit
!
route-map IPV4-EXPORT-TO-IBGP permit 100
 description Tag locally originated aggregates
 match ip address prefix-list LOCAL-AGGREGATES
 set community 0:400
 set local-preference 400
 set origin igp
exit
!
route-map IPV4-EXPORT-TO-IBGP permit 200
 description Tag rest as internal
 set community 0:500
 set local-preference 400
 set origin igp
exit
!

I stripped out the unused route-maps that are there for later use and changed the IPs and ASNs.
The gist is the same.

The peers in peer-group TMP also run FRR, but 5.x, on my Gentoo boxes since the dawn of time; they push full tables to these newer routers, which are supposed to replace them.

@f0o
Author

f0o commented Mar 15, 2024

==15473== Invalid read of size 8
==15473==    at 0x49425FF: srcdest_rnode_prefixes (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x4946852: ??? (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x49714FF: vbprintfrr (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x4974895: zlog_msg_text (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x497511B: ??? (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x49742F1: zlog_tls_buffer_flush (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x4974BEF: ??? (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x4974C81: vzlogx (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x1FA372: ??? (in /usr/lib/frr/bgpd)
==15473==    by 0x2E2AD7: ??? (in /usr/lib/frr/bgpd)
==15473==    by 0x49687DC: ??? (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x4955743: event_call (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==  Address 0x4000000018 is not stack'd, malloc'd or (recently) free'd
==15473==
==15473==
==15473== Process terminating with default action of signal 11 (SIGSEGV)
==15473==  Access not within mapped region at address 0x4000000018
==15473==    at 0x49425FF: srcdest_rnode_prefixes (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x4946852: ??? (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x49714FF: vbprintfrr (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x4974895: zlog_msg_text (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x497511B: ??? (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x49742F1: zlog_tls_buffer_flush (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x497438E: zlog_tls_buffer_fini (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x49404EF: ??? (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x4B9751F: ??? (in /usr/lib/x86_64-linux-gnu/libc.so.6)
==15473==    by 0x49425FE: srcdest_rnode_prefixes (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x4946852: ??? (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==    by 0x49714FF: vbprintfrr (in /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0.0.0)
==15473==  If you believe this happened as a result of a stack
==15473==  overflow in your program's main thread (unlikely but
==15473==  possible), you can try to increase the size of the
==15473==  main thread stack using the --main-stacksize= flag.
==15473==  The main thread stack size used in this run was 8388608.
==15473==
==15473== HEAP SUMMARY:
==15473==     in use at exit: 2,895,062,138 bytes in 21,675,844 blocks
==15473==   total heap usage: 38,035,811 allocs, 16,359,967 frees, 4,059,611,293 bytes allocated
==15473==

Valgrind just caught a sigsegv

@f0o
Author

f0o commented Mar 15, 2024

I found the 0x4000000018 stackdump via valgrind finally:

==22432== LEAK SUMMARY:
==22432==    definitely lost: 110 bytes in 11 blocks
==22432==    indirectly lost: 0 bytes in 0 blocks
==22432==      possibly lost: 8,328 bytes in 20 blocks
==22432==    still reachable: 5,081,118,723 bytes in 39,195,266 blocks
==22432==                       of which reachable via heuristic:
==22432==                         newarray           : 76,472 bytes in 644 blocks
==22432==         suppressed: 0 bytes in 0 blocks
==22432==
==22432== ERROR SUMMARY: 15 errors from 14 contexts (suppressed: 0 from 0)
==22432==
==22432== 2 errors in context 1 of 14:
==22432== Invalid read of size 8
==22432==    at 0x49425FF: srcdest_rnode_prefixes (srcdest_table.c:259)
==22432==    by 0x4946852: printfrr_rn.lto_priv.0 (srcdest_table.c:307)
==22432==    by 0x49714FF: UnknownInlinedFun (glue.c:220)
==22432==    by 0x49714FF: vbprintfrr (vfprintf.c:583)
==22432==    by 0x4974895: zlog_msg_text (zlog.c:738)
==22432==    by 0x497511B: zlog_syslog (zlog_targets.c:438)
==22432==    by 0x49742F1: zlog_tls_buffer_flush (zlog.c:397)
==22432==    by 0x4974BEF: vzlog_tls (zlog.c:500)
==22432==    by 0x4974C81: vzlogx (zlog.c:622)
==22432==    by 0x1FA372: zlog_ref.lto_priv.1 (zlog.h:84)
==22432==    by 0x2E2AD7: bgp_zebra_route_notify_owner (bgp_zebra.c:2630)
==22432==    by 0x49687DC: zclient_read (zclient.c:4425)
==22432==    by 0x4955743: event_call (event.c:1970)
==22432==  Address 0x4000000018 is not stack'd, malloc'd or (recently) free'd
==22432==
==22432== ERROR SUMMARY: 15 errors from 14 contexts (suppressed: 0 from 0)

@f0o
Author

f0o commented Mar 15, 2024

Downgraded to 8.1 provided from Ubuntu 22.04 repos (not the FRRouting repos) and no more segfaults!

So there's some regression in 9.1

@ton31337
Member

Are you able to compile from master? It should (?) be fixed there. At least from what I see in the Valgrind trace, the related function is zlog_tls_buffer_fini().

@IvayloJ

IvayloJ commented Mar 18, 2024

@f0o Looking at your valgrind log and your config, maybe there really is a problem, probably in the redistribute. Are you redistributing the full internet BGP table into OSPF, or from OSPF into BGP? Is there some external script on that machine that may change the kernel route tables in the meantime? Does only bgpd crash, or did you see zebra/ospfd crash too? I guess this is in production and it is probably hard for you to keep debugging, but if you can compile frr from source (as @ton31337 is asking), it will be very useful for catching and fixing this. I have never used OSPF, nor redistribute for a large number of routes, so I don't have much experience there and never had such crashes (I only ever work with BGP for large numbers of internet routes). So I don't have a test setup to simulate your case in a controlled environment.

@ton31337 It is probably related, but to me the cause looks like the redistribution of large tables combined with improper thread/process memory locking/checking, or something like that. An illegal 8-byte read looks like a pointer, and bgp_zebra.c:2630 is a call to zlog_err() with *dest as an argument... My first guess in the dark is that something (a thread/process) freed the prefix pointer (because it is gone from the route table, for example) while another thread/process tried to use that pointer at the same time. And because of the huge number of prefixes and their flapping on the internet, it shows up at random intervals and more often.

@f0o
Author

f0o commented Mar 19, 2024

Hi @IvayloJ

I do run OVS, which has its own bug with full tables where it blocks execution; might that be related?

I don't redistribute full tables but I do import them into VRFs.

The odd part is that the downgrade to 8.1 really fixed it entirely, without any other modification to the config or system. So 8.1->9.1 introduced a regression somewhere. It is also only bgpd that crashes; zebra and ospfd are both happy.

@IvayloJ

IvayloJ commented Mar 19, 2024

Hi @f0o
I am not closely familiar with OVS (I guess Open vSwitch), nor do I use it, but I have heard about some high CPU load issues there, and yes, it could be related (a very theoretical guess: high load on all/most CPU cores can badly hurt thread/process synchronization). What does "blocks execution" mean?

> I dont redistribute full tables but I do import them into VRFs.

I am a bit confused now, because I don't see VRFs in your config. Nor do I see any other RTs you work with, except main (254).

Between 8.1 and 9.1 there are a lot of changes (commits). In a large code base with so many functions (like frr), it is very common for a small change somewhere to have a great impact on a completely different place in the code, and it is never easy to catch. If 8.1 is good for your setup, you can keep going with it (for me it still works too, for years already), but 9.1 has some important fixes and improvements. At some point you will have to upgrade, no choice (and you could hit this issue again if nobody fixes it in the meantime). If my guesses are right, maybe you can build a stress test setup for your case with a couple of virtual machines and a little bash scripting....

Make the config as close as possible to the setup where the issue occurs (connect the same number of virtual machines over BGP to the test one). Make the test machine redistribute a RT. Write a little script/program that puts (let's say 100k) routes in a table in an endless loop (if a route does not exist in the table, install it). Write another script, again an endless loop, that removes a single random route from the same table on every iteration, with a sleep of a few milliseconds. Let it run for hours, and if there is no problem, play with the sleep times. This way you will have something very close to your scenario, I hope.
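
The two loops described above could be sketched roughly like this in Python. This is only an illustration under assumptions: the table ID, the 198.18.0.0/15 test range, and all function names are made up, and how FRR picks the table up for redistribution is left out. It uses `ip route replace`, which rewrites a route even when it already exists, so it churns a bit more than "install only if missing". By default it only prints the commands (dry run).

```python
import ipaddress
import itertools
import random
import subprocess
import time

TABLE_ID = 100                # hypothetical kernel routing table for the test
TEST_RANGE = "198.18.0.0/15"  # assumed lab range (RFC 2544 benchmarking block)


def make_prefixes(count):
    """Return `count` distinct /32 prefixes from TEST_RANGE as strings."""
    subnets = ipaddress.ip_network(TEST_RANGE).subnets(new_prefix=32)
    return [str(net) for net in itertools.islice(subnets, count)]


def install_cmd(prefix):
    # `replace` creates the route if missing and overwrites it otherwise.
    return ["ip", "route", "replace", "blackhole", prefix, "table", str(TABLE_ID)]


def delete_cmd(prefix):
    return ["ip", "route", "del", "blackhole", prefix, "table", str(TABLE_ID)]


def run(cmd, dry_run=True):
    """Print the command in dry-run mode, otherwise execute it (needs root)."""
    if dry_run:
        print(" ".join(cmd))
        return 0
    # Deleting an already-missing route fails; that is expected here.
    return subprocess.run(cmd, check=False).returncode


def installer_loop(prefixes, passes=None, dry_run=True):
    """Re-install every prefix; loops forever when passes is None."""
    done = 0
    while passes is None or done < passes:
        for prefix in prefixes:
            run(install_cmd(prefix), dry_run)
        done += 1


def remover_loop(prefixes, sleep_s=0.005, iterations=None, dry_run=True):
    """Remove one random prefix per iteration, sleeping in between."""
    done = 0
    while iterations is None or done < iterations:
        run(delete_cmd(random.choice(prefixes)), dry_run)
        time.sleep(sleep_s)
        done += 1


if __name__ == "__main__":
    routes = make_prefixes(5)
    installer_loop(routes, passes=1)
    remover_loop(routes, iterations=3)
```

For a real run, each loop would go in its own process on a lab VM, as root, with `dry_run=False`, `passes=None` and `iterations=None`, and with `make_prefixes(100_000)`. Spawning one `ip` process per route is slow, so the achievable churn rate will be limited.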

@f0o
Author

f0o commented Mar 19, 2024

@IvayloJ

The OpenVSwitch issue was resolved recently with a patch that I'm running right now. OVS did iterate through all routes with every change; I'm not sure whether that iteration had any blocking effect on the kernel interface that would make bgpd have issues adding/removing routes.

Regarding the VRFs: the configs have changed since the downgrade to 8.1, which effectively fixed any and all segfaults, so VRFs have now been introduced and these routers are in production.

The segfault occurred without any VRFs or redistribute. Once I added a transit into the mix the bgpd was segfaulting every few minutes very reliably.

@IvayloJ

IvayloJ commented Mar 19, 2024

@f0o I am already a little lost in this issue, but anyway my mind has never been in the state "found" :)

> The segfault occurred without any VRFs or redistribute. Once I added a transit into the mix the bgpd was segfaulting every few minutes very reliably.

What do you mean by "Once I added a transit into the mix"? Define this exactly. You can even write out the commands you ran, or whatever describes it more precisely.

@f0o
Author

f0o commented Mar 20, 2024

@IvayloJ

"Once I added a transit into the mix" as in: I added another BGP peer that isn't IBGP and supplies me with full tables.

ip prefix-list SANITIZE seq 90 deny 10.0.0.0/8 le 32
ip prefix-list SANITIZE seq 91 deny 192.0.2.0/24 le 32
ip prefix-list SANITIZE seq 92 deny 192.168.0.0/16 le 32
ip prefix-list SANITIZE seq 100 permit 0.0.0.0/0 ge 8 le 25

route-map TRANSIT-IN deny 1
 match rpki invalid
 set community 0:200 0:201
 set local-preference 0
exit
!
route-map TRANSIT-IN permit 100
 set community 0:200 0:201
 set local-preference 100
 set origin igp
exit
!
route-map TRANSIT-OUT permit 100
 set community none
 set extcommunity none
exit
!
router bgp 65001
 neighbor EXAMPLE peer-group
 neighbor EXAMPLE remote-as 12345
 neighbor 4.3.2.1 peer-group EXAMPLE
 address-family ipv4 unicast
  neighbor EXAMPLE next-hop-self
  neighbor EXAMPLE soft-reconfiguration inbound
  neighbor EXAMPLE prefix-list SANITIZE in
  neighbor EXAMPLE route-map TRANSIT-IN in
  neighbor EXAMPLE route-map TRANSIT-OUT out
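
For readers following the SANITIZE prefix-list above, here is a small Python sketch of how such a list is evaluated as I understand the FRR semantics: entries are checked in sequence order, the first match decides, an entry with neither `ge` nor `le` needs an exact length match, and anything unmatched is denied. The function names are made up, and this is an approximation for illustration, not FRR code.

```python
import ipaddress

# (seq, action, network, ge, le) copied from the SANITIZE prefix-list above.
SANITIZE = [
    (90, "deny", "10.0.0.0/8", None, 32),
    (91, "deny", "192.0.2.0/24", None, 32),
    (92, "deny", "192.168.0.0/16", None, 32),
    (100, "permit", "0.0.0.0/0", 8, 25),
]


def entry_matches(route, network, ge, le):
    """True if `route` falls under `network` and its length fits ge/le."""
    net = ipaddress.ip_network(network)
    if not route.subnet_of(net):
        return False
    if ge is None and le is None:
        # No range given: only the exact prefix length matches.
        return route.prefixlen == net.prefixlen
    if ge is not None and route.prefixlen < ge:
        return False
    if le is not None and route.prefixlen > le:
        return False
    return True


def evaluate(prefix, entries=SANITIZE):
    """Return 'permit' or 'deny' for one IPv4 prefix string."""
    route = ipaddress.ip_network(prefix)
    for _seq, action, network, ge, le in sorted(entries, key=lambda e: e[0]):
        if entry_matches(route, network, ge, le):
            return action
    return "deny"  # nothing matched: implicit deny


if __name__ == "__main__":
    for candidate in ("10.1.0.0/24", "8.8.8.0/24", "1.1.1.0/26", "0.0.0.0/0"):
        print(candidate, evaluate(candidate))
```

Under these assumptions, private and documentation space is dropped first, and everything else is accepted only with a prefix length between /8 and /25, so a default route or a /26 from the transit would be rejected.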

@ton31337
Member

> perf.zip
>
> Managed to capture some perf samples if that helps

Which perf version did you use?

@f0o
Author

f0o commented Mar 26, 2024

perf version 5.15.143

4 participants