Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deletion of tunnel map entries when node has non-zero cluster ID. #27353

Merged
merged 1 commit into from Aug 10, 2023

Conversation

giorio94
Copy link
Member

@giorio94 giorio94 commented Aug 8, 2023

Given that tunnel mapping entries are currently inserted without specifying the cluster ID, we should do the same during the removal phase. Otherwise, we fail to remove the entries if the node is associated with a non-zero cluster ID. Let's additionally update the test so that it would catch future regressions.

Fix deletion of tunnel map entries when node has non-zero cluster ID.

Given that tunnel map entries are currently inserted without
specifying the cluster ID, we should do the same during the removal
phase. Otherwise, we fail to remove the entries if the node is
associated with a non-zero cluster ID. Let's additionally update
the test so that it would catch future regressions.

Fixes: cda8767 ("bpf: Make tunnel map APIs aware of ClusterID")
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
@giorio94 giorio94 added kind/bug This is a bug in the Cilium logic. area/daemon Impacts operation of the Cilium daemon. release-note/bug This PR fixes an issue in a previous release of Cilium. needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Aug 8, 2023
@giorio94 giorio94 requested review from a team as code owners August 8, 2023 15:05
@giorio94 giorio94 requested a review from brb August 8, 2023 15:05
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.14.1 Aug 8, 2023
@giorio94
Copy link
Member Author

giorio94 commented Aug 8, 2023

/test

Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why we didn't see this in the logs earlier. From my reading, under the hood, the deleteTunnelMapping() invokes BPF_MAP_DELETE_ELEM which should return an error if the entry doesn't exist.

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Aug 10, 2023
@brb brb removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Aug 10, 2023
@giorio94
Copy link
Member Author

I wonder why we didn't see this in the logs earlier. From my reading, under the hood, the deleteTunnelMapping() invokes BPF_MAP_DELETE_ELEM which should return an error if the entry doesn't exist.

My guess is that it is because we don't remove nodes during the clustermesh tests, and in all the other cases we are likely not setting a ClusterID different from zero (as for instance in the unit test that I modified). I had tested this bug in a local kind environment, and indeed the error log was output after deleting a CiliumNode resource.

@brb
Copy link
Member

brb commented Aug 10, 2023

I had tested this bug in a local kind environment, and indeed the error log was output after deleting a CiliumNode resource.

👍

Do you see any risk with changing the approach to propagating the CID to the tunnel map when adding/deleting instead of passing it as zero?

@giorio94
Copy link
Member Author

I had tested this bug in a local kind environment, and indeed the error log was output after deleting a CiliumNode resource.

+1

Do you see any risk with changing the approach to propagating the CID to the tunnel map when adding/deleting instead of passing it as zero?

My understanding is that the datapath is currently doing the lookups using CID=0, so that would not work.

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Aug 10, 2023
@nebril nebril added this to Needs backport from main in 1.14.2 Aug 10, 2023
@nebril nebril removed this from Needs backport from main in 1.14.1 Aug 10, 2023
@ldelossa ldelossa merged commit 679a162 into cilium:main Aug 10, 2023
60 checks passed
@tklauser tklauser mentioned this pull request Aug 22, 2023
22 tasks
@tklauser tklauser added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Aug 22, 2023
@joestringer joestringer added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Aug 25, 2023
@joestringer joestringer moved this from Needs backport from main to Backport done to v1.14 in 1.14.2 Aug 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/daemon Impacts operation of the Cilium daemon. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
1.14.2
Backport done to v1.14
Development

Successfully merging this pull request may close these issues.

None yet

6 participants