zebra: FRR restart leads to zebra mlag core (backport #20225)#20671

Merged
donaldsharp merged 1 commit into stable/10.5 from mergify/bp/stable/10.5/pr-20225
Feb 3, 2026

mergify bot commented Feb 3, 2026

Issue: At higher mroute scale (around 900) in a
PIM MLAG active-active setup, a crash was observed on
a) restarting the frr service on the standby
b) enabling/disabling PIM active-active

Root Cause: During a bulk delete event, the message read from the
socket carried around 500 mroutes.
The stream allocated for decoding the protobuf message was
32768 bytes, but the decoded message for 500 mroutes needs 34000 bytes,
so we run out of space while adding the 482nd mroute to the stream.
The loop already checks the size before writing every mroute,
but that check compared against the whole allocated stream size instead of
the remaining space in the stream, which the STREAM_WRITEABLE API reports.
The change checks against the remaining space instead of
the total allocated size of the stream.
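
The difference between the two checks can be illustrated with a small standalone sketch. It does not use FRR's stream library: `struct toy_stream`, `TOY_SIZE`, `TOY_WRITEABLE`, `fits_buggy`, `fits_fixed` and `fill` are hypothetical names, and the uniform 68 bytes per mroute is an assumption derived from the figures above (34000 bytes / 500 mroutes), not a value taken from the code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a fixed-size write buffer; NOT FRR's struct stream. */
struct toy_stream {
	size_t size; /* total bytes allocated */
	size_t endp; /* bytes already written */
};

/* Hypothetical helpers: total allocation vs. space still available. */
#define TOY_SIZE(s) ((s)->size)
#define TOY_WRITEABLE(s) ((s)->size - (s)->endp)

/* Assumed per-mroute encoding size: 34000 bytes / 500 mroutes. */
#define ENTRY_LEN 68

/* Buggy check: compares against the total size, so it always passes. */
static int fits_buggy(const struct toy_stream *s, size_t len)
{
	return len <= TOY_SIZE(s);
}

/* Fixed check: compares against the space that is actually left. */
static int fits_fixed(const struct toy_stream *s, size_t len)
{
	return len <= TOY_WRITEABLE(s);
}

/*
 * Try to append n entries, consulting `fits` before each write.
 * Returns the number of entries written. Sets *overflowed to 1 if the
 * check let through a write that would have run past the buffer; this
 * is the point where real code would assert or crash.
 */
static size_t fill(struct toy_stream *s, size_t n,
		   int (*fits)(const struct toy_stream *, size_t),
		   int *overflowed)
{
	size_t written = 0;

	assert(s->endp <= s->size);
	*overflowed = 0;
	for (size_t i = 0; i < n; i++) {
		if (!fits(s, ENTRY_LEN))
			break; /* check stopped us cleanly */
		if (s->endp + ENTRY_LEN > s->size) {
			*overflowed = 1; /* check passed, but no room */
			break;
		}
		s->endp += ENTRY_LEN;
		written++;
	}
	return written;
}
```

Under these assumptions, calling `fill` on a 32768-byte buffer with 500 entries writes 481 entries either way. With `fits_buggy` the 482nd write is let through and flagged as an overflow; with `fits_fixed` the loop stops cleanly before it.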

Testing: Tested with 900 mroute scale on the PIM MLAG AA setup with FRR restart.
No crash was observed.

Ticket: #4633514


This is an automatic backport of pull request #20225 done by Mergify.

Signed-off-by: Utkarsh Srivastava <usrivastava@nvidia.com>
(cherry picked from commit 0ba1656)

greptile-apps bot commented Feb 3, 2026

Target branch is not in the allowed branches list.

frrbot bot added the zebra label Feb 3, 2026
donaldsharp merged commit 4adb875 into stable/10.5 Feb 3, 2026
21 checks passed
mergify bot deleted the mergify/bp/stable/10.5/pr-20225 branch February 3, 2026 20:32