frr 7.5: mac-ip type2 route removed after 30s (seems to be related to arp cache && new EVPN-MH feature) #9347
Comments
I have tested increasing net.ipv4.neigh.default.base_reachable_time_ms to 60s-240s; it doesn't help (only the interval before removal increases, the type2 route is still removed).
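For context on why raising the sysctl only stretches the interval: according to the Linux arp(7) man page, a neighbor entry stays valid for a random time between base_reachable_time/2 and 3*base_reachable_time/2 before it can leave REACHABLE. A minimal sketch of that arithmetic follows; the helper name `reachable_window_s` is made up for illustration, and 30000 ms is the kernel default as far as I know, not a value taken from this setup.

```python
def reachable_window_s(base_reachable_time_ms):
    """Return (min, max) seconds a neighbor entry is considered valid.

    Per arp(7), the kernel picks a random value between
    base_reachable_time/2 and 3*base_reachable_time/2.
    """
    base_s = base_reachable_time_ms / 1000
    return base_s / 2, base_s * 3 / 2


if __name__ == "__main__":
    # 30000 ms is assumed to be the default; 60000 and 240000 are the
    # values tried in this comment (60s-240s).
    for base_ms in (30000, 60000, 240000):
        low, high = reachable_window_s(base_ms)
        print(f"base={base_ms} ms -> valid for {low:.0f}s to {high:.0f}s")
```

So a larger base value delays the REACHABLE to STALE transition, but does not prevent it, which would match the route still being removed, only later.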
I just tested with frr 8.0, and it seems to work: the type2 route is not removed and the state does not go to local_inactive. Is 7.5.1 missing some backported fixes? (I would like to use the official debian11 package, as libyang2 is not officially available.)
Maybe this is related to this fix? c7bfd08
I can confirm that with a dirty hack like
it's working fine. (I don't use MH.)
Hi @aderumier, did you try to backport #7722?
Yes, indeed, it should fix it. (As I mentioned, I have just tested forcing it to always false, and it works.) BTW, I have noticed other strange bugs with evpn, but I can't reproduce them; they seem to be random: sometimes the mac or mac-ip route never becomes active in evpn, although it is reachable in the kernel arp cache. Maybe it's a race with live migration, I'm not sure. (I don't have debug logs from when this happened.) For now, I have rolled back to 7.4; the evpn-MH code seems really new and maybe a little bit buggy.
I'm closing it, all seems to be fine in 8.0.1
Hi,
I'm currently testing evpn with frr 7.5 (7.5.1 or the latest stable branch), using a config that works with frr 7.4.
The setup is a proxmox hypervisor cluster (kernel 5.11) with a symmetric evpn setup with vrf.
The vm has ip 192.168.0.2, mac 02:02:99:7b:a8:c0, with gateway 192.168.0.1 (anycast ip, an svi bridge on each hypervisor, vni 300).
During the test, I'm pinging the svi ip 192.168.0.1 from the vm 192.168.0.2.
It seems that since the EVPN-MH implementation && "zebra: support for MAC-IP sync routes"
b169fd6
the type2 mac:ip route of the vm is removed really fast (after 30s).
This seems to coincide with the arp cache entry on the svi going to STALE/DELAY state; at that moment,
the local-inactive flag is set and the type2 route is removed.
After 6s, the bridge sends an arp request to find the mac address of 192.168.0.2, the arp cache entry goes back to REACHABLE state,
and the type2 route is added again (so the network breaks for around 6s).
I didn't have this behaviour with frr 7.4, where the arp entry could be in STALE or DELAY state without any problem and the bridge could then send the arp request to refresh its cache.
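To line up the neighbor state changes with the route withdrawal, something like the sketch below could be run over periodic samples of `ip neigh show` output. It is only an illustration: the interface name `vmbr300` and the sample lines are made up to mirror the timings described above (they are not captured output), and `parse_neigh_line` / `state_transitions` are hypothetical helper names.

```python
# Neighbor (NUD) state keywords as printed by `ip neigh show`.
NUD_STATES = {
    "INCOMPLETE", "REACHABLE", "STALE", "DELAY",
    "PROBE", "FAILED", "NOARP", "PERMANENT",
}


def parse_neigh_line(line):
    """Parse one `ip neigh show` line into a dict, or None if it is empty."""
    tokens = line.split()
    if not tokens:
        return None
    entry = {"ip": tokens[0], "dev": None, "lladdr": None, "state": None}
    for i in range(1, len(tokens)):
        tok = tokens[i]
        if tok in ("dev", "lladdr") and i + 1 < len(tokens):
            entry[tok] = tokens[i + 1]
        elif tok in NUD_STATES:
            entry["state"] = tok
    return entry


def state_transitions(samples, ip):
    """Return (timestamp, old_state, new_state) for each change seen for `ip`.

    `samples` is an iterable of (timestamp_seconds, line) pairs.
    """
    transitions = []
    previous = None
    for timestamp, line in samples:
        entry = parse_neigh_line(line)
        if entry is None or entry["ip"] != ip:
            continue
        if previous is not None and entry["state"] != previous:
            transitions.append((timestamp, previous, entry["state"]))
        previous = entry["state"]
    return transitions


# Illustrative samples only (hypothetical device name, timings from the report).
SAMPLES = [
    (0, "192.168.0.2 dev vmbr300 lladdr 02:02:99:7b:a8:c0 REACHABLE"),
    (30, "192.168.0.2 dev vmbr300 lladdr 02:02:99:7b:a8:c0 STALE"),
    (36, "192.168.0.2 dev vmbr300 lladdr 02:02:99:7b:a8:c0 REACHABLE"),
]

if __name__ == "__main__":
    for ts, old, new in state_transitions(SAMPLES, "192.168.0.2"):
        print(f"t={ts}s: {old} -> {new}")
```

Comparing these timestamps with the zebra logs should show whether the local-inactive flag is set exactly when the entry leaves REACHABLE.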
network config
frr config
initial state
after 30s
logs