Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

isisd: Fix memory leaks when the transition of neighbor state from non-UP to DOWN #15716

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

zhou-run
Copy link
Contributor

@zhou-run zhou-run commented Apr 9, 2024

When receiving a hello packet, if the neighbor state transitions directly from a non-ISIS_ADJ_UP state (such as ISIS_ADJ_INITIALIZING) to ISIS_ADJ_DOWN state, the neighbor entry cannot be deleted. If the neighbor is removed or the neighbor's System ID changes, it may result in memory leakage in the neighbor entry.

Test Scenario:
LAN link between Router A and Router B is established. Router A does not configure neighbor authentication, while Router B is configured with neighbor authentication. When the neighbor entry on Router B ages out, the neighbor state on Router A transitions to INIT. If Router B is then removed, the neighbor state on Router A transitions to DOWN and persists.

Signed-off-by: zhou-run zhou.run@h3c.com

@frrbot frrbot bot added the isis label Apr 9, 2024
@riw777 riw777 self-requested a review April 9, 2024 14:32
Copy link
Member

@riw777 riw777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

@zhou-run
Copy link
Contributor Author

Copy link
Member

@odd22 odd22 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@odd22
Copy link
Member

odd22 commented Apr 16, 2024

Does the failed test make sense, or in other words, should the neighbor not be deleted when transitioning from UP to DOWN state? https://ci1.netdef.org/browse/FRR-PULLREQ3-TOPO5U18ARM8-2673 https://ci1.netdef.org/browse/FRR-PULLREQ3-TOPO8U18AMD64-2673 https://ci1.netdef.org/browse/FRR-PULLREQ3-ASAN7-2673 https://ci1.netdef.org/browse/FRR-PULLREQ3-TOPO1DEB10AMD64-2673

I re-run the failed test to be sure it is not a false positive.

Now, regarding your question, the neighbor is created when you add it in the configuration and when IS-IS Hello messages are received on the given interface and are deleted when you remove it in the configuration. I think, we must keep the context when the interface goes from UP to DOWN to speed up the process when the interface goes from DOWN to UP. In particular, when an interface (frequent when optical connector SFP reach end of life) flap (going Down, then Up, then Down ...), if you remove the context, the code will delete dynamic memory allocation and then recreate it and so on. This could decrease performance and at the end the risk of a memory corruption. So, I think it is preferable to keep the context if the interface goes DOWN and only remove the context when the neighbor is removed from the configuration

@donaldsharp donaldsharp self-requested a review April 16, 2024 15:29
@zhou-run
Copy link
Contributor Author

Does the failed test make sense, or in other words, should the neighbor not be deleted when transitioning from UP to DOWN state? https://ci1.netdef.org/browse/FRR-PULLREQ3-TOPO5U18ARM8-2673 https://ci1.netdef.org/browse/FRR-PULLREQ3-TOPO8U18AMD64-2673 https://ci1.netdef.org/browse/FRR-PULLREQ3-ASAN7-2673 https://ci1.netdef.org/browse/FRR-PULLREQ3-TOPO1DEB10AMD64-2673

I re-run the failed test to be sure it is not a false positive.

Now, regarding your question, the neighbor is created when you add it in the configuration and when IS-IS Hello messages are received on the given interface and are deleted when you remove it in the configuration. I think, we must keep the context when the interface goes from UP to DOWN to speed up the process when the interface goes from DOWN to UP. In particular, when an interface (frequent when optical connector SFP reach end of life) flap (going Down, then Up, then Down ...), if you remove the context, the code will delete dynamic memory allocation and then recreate it and so on. This could decrease performance and at the end the risk of a memory corruption. So, I think it is preferable to keep the context if the interface goes DOWN and only remove the context when the neighbor is removed from the configuration

Does 'remove the context when the neighbor is removed from the configuration' refer to the command 'clear isis neighbor [WORD]'?

@odd22
Copy link
Member

odd22 commented Apr 30, 2024

Does 'remove the context when the neighbor is removed from the configuration' refer to the command 'clear isis neighbor [WORD]'?
No. I'm referring to CLI no isis router WORD at the interface level which delete the IS-IS configuration on a given interface.

@riw777
Copy link
Member

riw777 commented May 7, 2024

this says there are linter problems, but it doesn't show that the are ... let's try rerunning ci to see if we can find them

@riw777
Copy link
Member

riw777 commented May 7, 2024

ci:rerun

@zhou-run
Copy link
Contributor Author

Does 'remove the context when the neighbor is removed from the configuration' refer to the command 'clear isis neighbor [WORD]'?
No. I'm referring to CLI no isis router WORD at the interface level which delete the IS-IS configuration on a given interface.

Does this mean that when the IS-IS configuration is not removed on the specified interface, we do not need to consider potential memory leaks caused by frequent changes in the System ID due to neighbor updates?
Also, if neighbors are not deleted, should the neighbor state transition from DOWN to UP again also ensure that it goes through the INITIALIZING state?

@ton31337
Copy link
Member

@Mergifyio backport stable/10.0

Copy link

mergify bot commented May 17, 2024

backport stable/10.0

🟠 Waiting for conditions to match

  • merged [📌 backport requirement]

@ton31337
Copy link
Member

@frrbot rereview

Copy link
Member

@ton31337 ton31337 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix frrbot styling issues found.

Copy link
Member

@ton31337 ton31337 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Squash to a single commit all 3 commits.

…n-UP to DOWN

When receiving a hello packet, if the neighbor state transitions directly from a non-ISIS_ADJ_UP state (such as ISIS_ADJ_INITIALIZING) to ISIS_ADJ_DOWN state, the neighbor entry cannot be deleted. If the neighbor is removed or the neighbor's System ID changes, it may result in memory leakage in the neighbor entry.

Test Scenario:
LAN link between Router A and Router B is established. Router A does not configure neighbor authentication, while Router B is configured with neighbor authentication. When the neighbor entry on Router B ages out, the neighbor state on Router A transitions to INIT. If Router B is then removed, the neighbor state on Router A transitions to DOWN and persists.

Signed-off-by: zhou-run <166502045+zhou-run@users.noreply.github.com>

fix frrbot styling issues found.

fix frrbot styling issues found.

Signed-off-by: zhou-run <166502045+zhou-run@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants