zebra: FRR restart leads to zebra mlag core (backport #20225) #20671
Merged
donaldsharp merged 1 commit into stable/10.5 from Feb 3, 2026
Issue: At higher mroute scale (around 900) in a
PIM MLAG active-active setup, a crash was observed on
a) restarting the frr service on standby
b) enabling/disabling PIM active-active
Root Cause: During a bulk delete event, the message read from the
socket carried around 500 mroutes.
The stream allocated for decoding the protobuf message
was 32768 bytes, but the decoded message for 500 mroutes needs about 34000 bytes,
so we run out of space while adding the 482nd mroute to the stream.
The loop already checks the size before writing every mroute,
but that check was against the whole allocated stream size instead of
the remaining space in the stream, as reported by the STREAM_WRITEABLE API.
The change makes the check compare against the remaining space instead of
the total size of the stream that was allocated.
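
To make the difference between the two checks concrete, here is a minimal, self-contained sketch. It does not use FRR's stream library or the actual zebra MLAG code: `toy_stream`, `toy_writeable`, `toy_put` and both `fill_*` helpers are hypothetical names invented for illustration. The 68-byte entry size is an assumption derived from the figures above (34000 bytes / 500 mroutes), treating every encoded mroute as the same size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a write stream (hypothetical, not FRR's struct stream):
 * 'size' is the allocated capacity, 'endp' is where the next write lands. */
struct toy_stream {
	size_t size;
	size_t endp;
};

/* Remaining writable space: the quantity the fixed check compares against. */
static size_t toy_writeable(const struct toy_stream *s)
{
	assert(s != NULL && s->endp <= s->size);
	return s->size - s->endp;
}

/* Account for one entry; returns false where a real write would overflow. */
static bool toy_put(struct toy_stream *s, size_t len)
{
	if (len > toy_writeable(s))
		return false;
	s->endp += len;
	return true;
}

/* Old-style guard: compares the entry against the total allocated size,
 * which never changes, so it cannot notice the stream filling up. */
static size_t fill_checking_total_size(struct toy_stream *s, size_t entry_len,
				       size_t count, bool *overflowed)
{
	size_t written = 0;

	*overflowed = false;
	for (size_t i = 0; i < count; i++) {
		if (entry_len > s->size)
			break;
		if (!toy_put(s, entry_len)) {
			*overflowed = true; /* the guard let an overflow through */
			break;
		}
		written++;
	}
	return written;
}

/* Fixed guard: compares the entry against the remaining space and stops
 * cleanly before the write that would not fit. */
static size_t fill_checking_remaining(struct toy_stream *s, size_t entry_len,
				      size_t count, bool *overflowed)
{
	size_t written = 0;

	*overflowed = false;
	for (size_t i = 0; i < count; i++) {
		if (entry_len > toy_writeable(s))
			break;
		if (!toy_put(s, entry_len)) {
			*overflowed = true;
			break;
		}
		written++;
	}
	return written;
}
```

Under these assumptions, filling a 32768-byte stream with 500 entries of 68 bytes stores 481 entries with either guard (481 * 68 = 32708, leaving 60 bytes). The total-size guard still attempts the 482nd write and reports an overflow, while the remaining-space guard stops before it.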
Testing: Tested with 900 mroute scale on the PIM MLAG AA setup with FRR restart.
No crash was observed.
Ticket: #4633514
Signed-off-by: Utkarsh Srivastava <usrivastava@nvidia.com>
(cherry picked from commit 0ba1656)
Target branch is not in the allowed branches list.
This is an automatic backport of pull request #20225 done by Mergify.