staticd: Fix SRv6 SID use-after-free on locator deletion#20660
staticd: Fix SRv6 SID use-after-free on locator deletion#20660ton31337 merged 2 commits intoFRRouting:masterfrom
Conversation
When a locator is removed, we iterate over the list of SIDs and uninstall them from the data plane. However, the validity flag of the SID remains set, and the locator pointer still points to the freed locator memory. This causes issues in other parts of the code where we check the SID validity flag. Since the flag is still set, we assume the SID is valid and attempt to access the locator pointer, resulting in use-after-free crashes. Fix this by: - Unsetting the STATIC_FLAG_SRV6_SID_VALID flag to mark the SID as invalid when its locator is deleted - Setting the locator pointer to NULL to indicate that the SID no longer has a locator associated with it This ensures that when the locator is removed, associated SIDs are properly marked as invalid and cannot accidentally reference freed memory. Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Add a test case that reproduces and validates the fix for the locator deletion crash issue. Without the fix, removing a locator leaves SIDs with the validity flag set and a dangling pointer to the freed locator. Attempting to remove those SIDs from configuration causes a use-after-free crash when the code checks the validity flag and accesses the freed locator pointer. The test reproduces this by: 1. Removing a locator 2. Removing specific SIDs (crash point without the fix) 3. Re-adding the locator 4. Re-adding the removed SIDs With the fix, the test passes and verifies that the locator removal/re-addition workflow handles SID lifecycle correctly, preventing future regressions. Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Greptile OverviewGreptile SummaryThis PR fixes a critical heap-use-after-free bug in the staticd daemon's SRv6 SID handling code. The bug occurred when a locator was deleted and then individual SIDs were removed from configuration. Root Cause:
However, the SID objects still maintained:
Later, when a SID was deleted from configuration via The Fix:
Test Coverage:
This test validates the fix prevents the crash and ensures proper state management during locator lifecycle operations. Confidence Score: 5/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Config as Configuration
participant StaticZebra as static_zebra.c
participant StaticSrv6 as static_srv6.c
participant Locator as Locator Memory
participant SID as SID Object
Note over Config,SID: Locator Deletion Flow
Config->>StaticZebra: Delete locator MAIN
StaticZebra->>StaticZebra: static_zebra_process_srv6_locator_delete()
loop For each SID using this locator
StaticZebra->>StaticZebra: Check STATIC_FLAG_SRV6_SID_SENT_TO_ZEBRA
StaticZebra->>StaticZebra: static_zebra_srv6_sid_uninstall(sid)
Note over StaticZebra,SID: NEW FIX - Clear stale references
StaticZebra->>SID: sid->locator = NULL
StaticZebra->>SID: UNSET_FLAG(STATIC_FLAG_SRV6_SID_VALID)
end
StaticZebra->>Locator: Free locator memory
Note over Locator: Memory freed
Note over Config,SID: SID Deletion Flow (after locator freed)
Config->>StaticSrv6: Delete SID from config
StaticSrv6->>StaticSrv6: static_srv6_sid_del(sid)
StaticSrv6->>StaticSrv6: Check STATIC_FLAG_SRV6_SID_VALID
alt Flag is SET (without fix)
StaticSrv6->>StaticZebra: static_zebra_release_srv6_sid(sid)
StaticZebra->>StaticZebra: is_srv6_sid_localonly(sid)
StaticZebra->>Locator: Access sid->locator->prefix
Note over StaticZebra,Locator: CRASH - use-after-free
else Flag is UNSET (with fix)
Note over StaticSrv6: Skip release - no crash
end
|
|
ci:rerun |
ahsalam
left a comment
There was a problem hiding this comment.
LGTM. Thanks!
This fixes the issue reported in SONiC sonic-net/sonic-buildimage#22690. Thanks
GaladrielZhao
left a comment
There was a problem hiding this comment.
Looks good. Thanks for the fix.
|
@Mergifyio backport dev/10.6 stable/10.5 stable/10.4 stable/10.3 stable/10.2 |
✅ Backports have been createdDetails
Cherry-pick of 66527dd has failed: Cherry-pick of 2a5118f has failed: To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
Cherry-pick of 66527dd has failed: Cherry-pick of 2a5118f has failed: To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
Cherry-pick of 66527dd has failed: Cherry-pick of 2a5118f has failed: To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
Cherry-pick of 66527dd has failed: Cherry-pick of 2a5118f has failed: To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally |
Port upstream FRR pull request into SONiC: FRRouting/frr#20660 Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Port upstream FRR pull request into SONiC: FRRouting/frr#20660 Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
…se_after_free_10.4 staticd: Fix SRv6 SID use-after-free on locator deletion (backport #20660 for 10.4)
…se_after_free_10.5 staticd: Fix SRv6 SID use-after-free on locator deletion (backport #20660 for 10.5)
…se_after_free_10.3 staticd: Fix SRv6 SID use-after-free on locator deletion (backport #20660 for 10.3)
staticd: Fix SRv6 SID use-after-free on locator deletion (backport #20660)
When a locator is removed, we iterate over the list of SIDs and uninstall them from the data plane. However, the validity flag of
the SID remains set, and the locator pointer still points to the freed locator memory.
This causes issues in other parts of the code where we check the SID validity flag. Since the flag is still set, we assume the SID
is valid and attempt to access the locator pointer, resulting in use-after-free crashes.
Fix this by:
This ensures that when the locator is removed, associated SIDs are properly marked as invalid and cannot accidentally reference freed memory.