[VNET] "zebra" crashes after VNET removal #4615

Closed
volodymyrsamotiy opened this issue May 18, 2020 · 3 comments

Comments

@volodymyrsamotiy
Collaborator

Description

The "zebra" process crashes after a VNET is removed.
This is a known FRR issue in version 7.2.1, which is currently used in the "201911" branch.
The issue is already fixed in FRR 7.2.1-s3, which is in the "master" branch, and that version will be cherry-picked to the "201911" branch as well.

Steps to reproduce the issue:

  • Create 1 VNET
root@sonic:/home/admin# cat vnet.conf.json
{
    "VXLAN_TUNNEL": {
        "tunnel_v4": {
            "src_ip": "10.1.0.32"
        }
    },
    "VNET": {
        "Vnet1": {
            "vxlan_tunnel": "tunnel_v4",
            "vni": "10001",
            "peer_list": ""
        }
    }
}
root@sonic:/home/admin# config load vnet.conf.json
  • Remove VNET
redis-cli -n 4 del "VNET|Vnet1"
  • Observe that the "zebra" process has crashed ("pidof" prints nothing)
root@sonic:/home/admin# pidof zebra
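
The steps above can be scripted for repeated runs. The following is a minimal sketch, not part of the original report: the helper names, the `settle_seconds` delay, and the `-y` flag on `config load` (to skip the confirmation prompt) are assumptions added for illustration. It has to run as root on the switch itself.

```python
import json
import subprocess
import time


def build_vnet_config(src_ip="10.1.0.32", tunnel="tunnel_v4", vnet="Vnet1", vni="10001"):
    """Return the same structure as vnet.conf.json in the steps above."""
    return {
        "VXLAN_TUNNEL": {tunnel: {"src_ip": src_ip}},
        "VNET": {vnet: {"vxlan_tunnel": tunnel, "vni": vni, "peer_list": ""}},
    }


def repro_commands(path="vnet.conf.json", vnet="Vnet1"):
    """Commands from the report as argv lists ('-y' is an added assumption)."""
    return [
        ["config", "load", "-y", path],
        ["redis-cli", "-n", "4", "del", "VNET|" + vnet],
    ]


def zebra_running(run=subprocess.run):
    """True if pidof finds a zebra process; pidof prints nothing otherwise."""
    result = run(["pidof", "zebra"], capture_output=True, text=True)
    return result.returncode == 0 and bool(result.stdout.strip())


def reproduce(path="vnet.conf.json", settle_seconds=10):
    """Create the VNET, remove it, then report whether zebra is still alive."""
    with open(path, "w") as f:
        json.dump(build_vnet_config(), f, indent=4)
    for argv in repro_commands(path):
        subprocess.run(argv, check=True)
        # Hypothetical delay so each change can propagate before the next step.
        time.sleep(settle_seconds)
    return zebra_running()
```

On an affected image, `reproduce()` would be expected to return `False`, matching the empty `pidof zebra` output above.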

Describe the results you received:

"zebra" crashed after removing a VNET.

root@sonic:/home/admin# grep -B 7 "exited: zebra" /var/log/syslog
May 18 10:15:55.390252 sonic NOTICE swss#vrfmgrd: :- doTask: Removed vrf netdev Vnet1
...
May 18 10:15:56.754103 sonic INFO bgp#supervisord: fpmsyncd Connection lost, reconnecting...
May 18 10:15:56.754103 sonic INFO bgp#supervisord: fpmsyncd Waiting for fpm-client connection...
May 18 10:16:03.478007 sonic INFO bgp#supervisord 2020-05-18 10:15:56,753 INFO exited: zebra (terminated by SIGABRT (core dumped); not expected)
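
To check many syslog files for the same symptom, the supervisord exit line can be matched programmatically. This is a rough sketch based only on the line format shown above; the function name and regular expression are illustrative, and it covers only signal-terminated exits.

```python
import re

# Matches supervisord lines such as:
#   exited: zebra (terminated by SIGABRT (core dumped); not expected)
EXIT_RE = re.compile(
    r"exited: (?P<proc>\S+) \(terminated by (?P<signal>SIG\w+)"
    r"(?P<core> \(core dumped\))?; not expected\)"
)


def find_unexpected_exits(lines):
    """Return (process, signal, core_dumped) for each unexpected signal exit."""
    exits = []
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            exits.append(
                (match.group("proc"), match.group("signal"), match.group("core") is not None)
            )
    return exits


SAMPLE = [
    "May 18 10:15:55.390252 sonic NOTICE swss#vrfmgrd: :- doTask: Removed vrf netdev Vnet1",
    "May 18 10:16:03.478007 sonic INFO bgp#supervisord 2020-05-18 10:15:56,753 INFO "
    "exited: zebra (terminated by SIGABRT (core dumped); not expected)",
]

print(find_unexpected_exits(SAMPLE))  # [('zebra', 'SIGABRT', True)]
```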

Describe the results you expected:

VNET removal should complete without any failures.

Additional information you deem important (e.g. issue happens only occasionally):

Output of show version:

SONiC Software Version: SONiC.201911.94-e2e3dde3
Distribution: Debian 9.12
Kernel: 4.9.0-11-2-amd64
Build commit: e2e3dde3
Build date: Fri May 15 04:36:14 UTC 2020
Built by: johnar@jenkins-worker-8

Platform: x86_64-mlnx_msn3700c-r0
HwSKU: ACS-MSN3700C
ASIC: mellanox
Serial Number: MT1852X03894
Uptime: 11:32:53 up  2:20,  1 user,  load average: 1.03, 0.48, 0.46
@lguohan
Collaborator

lguohan commented May 20, 2020

fixed in 2778363

@lguohan lguohan closed this as completed May 20, 2020
@volodymyrsamotiy
Collaborator Author

The issue is still observed on the latest 201911 image:
https://sonic-jenkins.westus2.cloudapp.azure.com/job/mellanox/job/buildimage-mlnx-201911/100/

@volodymyrsamotiy
Collaborator Author

Fixed by #3763
