Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpf: remove special handle for ICMPv6 echo targeting router IPv6 #24921

Merged
merged 1 commit into from Apr 27, 2023

Conversation

jschwinger233
Copy link
Member

@jschwinger233 jschwinger233 commented Apr 17, 2023

Previously we don't have any network interface whose ipv6 is set to router ipv6, so the special handle for icmp6 echo whose destination address is router ipv6 was required, as kernel couldn't handle such icmp6 echo; but it's no more the case since PR #24208 was merged, now cilium_host has router ipv6 instead of native ipv6, making kernel capable of dealing with this icmp6.

Therefore, this commit removes code relevant to this special handle, including several functions such as icmp6_send_echo_reply, tail_icmp6_send_echo_reply, __icmp6_send_echo_reply.

Macro SKIP_ICMPV6_ECHO_HANDLING and CILIUM_CALL_SEND_ICMP6_ECHO_REPLY are also redundant and deleted.

Signed-off-by: Zhichuan Liang gray.liang@isovalent.com

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Apr 17, 2023
@jschwinger233 jschwinger233 added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label Apr 17, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Apr 17, 2023
@jschwinger233 jschwinger233 added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Apr 17, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Apr 17, 2023
@jschwinger233 jschwinger233 added feature/ipv6 Relates to IPv6 protocol support area/misc Impacts miscellaneous areas of the code not otherwise owned by another area. labels Apr 17, 2023
@jschwinger233 jschwinger233 changed the title bpf: remove special handle for icmp6 echo targeting router IPv6 bpf: remove special handle for ICMPv6 echo targeting router IPv6 Apr 17, 2023
@jschwinger233 jschwinger233 marked this pull request as ready for review April 17, 2023 10:52
@jschwinger233 jschwinger233 requested a review from a team as a code owner April 17, 2023 10:52
@jschwinger233 jschwinger233 changed the title bpf: remove special handle for ICMPv6 echo targeting router IPv6 [WIP] bpf: remove special handle for ICMPv6 echo targeting router IPv6 Apr 17, 2023
@jschwinger233 jschwinger233 changed the title [WIP] bpf: remove special handle for ICMPv6 echo targeting router IPv6 bpf: remove special handle for ICMPv6 echo targeting router IPv6 Apr 17, 2023
@jschwinger233 jschwinger233 force-pushed the gray/icmp6-router branch 3 times, most recently from c0d2992 to a651fbc Compare April 18, 2023 04:22
@@ -78,7 +78,6 @@ enum {

#define CILIUM_CALL_DROP_NOTIFY 1
#define CILIUM_CALL_ERROR_NOTIFY 2
#define CILIUM_CALL_SEND_ICMP6_ECHO_REPLY 3
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we just leave a gap here, or change all the macro definitions below.

For example, we change #define CILIUM_CALL_HANDLE_ICMP6_NS 4 to #define CILIUM_CALL_HANDLE_ICMP6_NS 3.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure if that can cause issues on upgrade. Maybe @ti-mo knows?

Copy link
Contributor

@ti-mo ti-mo Apr 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the ping! Unfortunately, I think we may need to renumber to cause a change in CILIUM_CALL_SIZE, which will force map migration to recreate the prog array from scratch. That is, CILIUM_CALL_SIZE needs to differ from whatever 1.13 version is used during the upgrade test. v1.13.2 has CILIUM_CALL_SIZE = 40, so map migration will still take place if we drop it back down to 42 on main. (To be clear, we'll be fine if we don't touch CILIUM_CALL_SIZE and leave a gap here as well.)

#20691 is relevant here, we should be able to get that done over the coming weeks. This should remove another potential cause of missed tail calls and undefined behaviour during minor upgrades when tail call maps are currently being re-used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your information @ti-mo , that was very helpful. Just one thing to confirm, does changing CILIUM_CALL_SIZE break the upgrade test? The test suite Tests upgrade and downgrade from a Cilium stable image to master failed because migrate-svc had restarted unexpectedly, do you have any clue where I can look into?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To pass upgrade test suite, I leave a gap here with additional comments to explain what happened here lol

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it fail on every run with the change to CILIUM_CALL_SIZE included? The run you linked seems to fail on downgrade. Either way, the prog array gets recreated when switching between main/master and 1.13, so there shouldn't be any missed tail calls.

Looking at the cilium-metrics-list.md files in the sysdump seems to corroborate this, there are no cilium_drop_ entries with reason=Missed tail call (these are normally echoed in the test stdout), but there are other drop reasons that might be of interest in finding the cause.

You may be seeing a non-tail call-related flake that may or may not be caused by this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your help Timo, I ran K8sUpdates locally with new CILIUM_CALL_SIZE for 3 times and I'm afraid K8sUpdates fails on every run LOL

@@ -78,7 +78,6 @@ enum {

#define CILIUM_CALL_DROP_NOTIFY 1
#define CILIUM_CALL_ERROR_NOTIFY 2
#define CILIUM_CALL_SEND_ICMP6_ECHO_REPLY 3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure if that can cause issues on upgrade. Maybe @ti-mo knows?

@pchaigno
Copy link
Member

/test

@pchaigno pchaigno added release-note/misc This PR makes changes that have no direct user impact. and removed area/misc Impacts miscellaneous areas of the code not otherwise owned by another area. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. labels Apr 18, 2023
Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jschwinger233 jschwinger233 force-pushed the gray/icmp6-router branch 2 times, most recently from 721d0cd to 96fac18 Compare April 25, 2023 10:52
@@ -78,7 +78,15 @@ enum {

#define CILIUM_CALL_DROP_NOTIFY 1
#define CILIUM_CALL_ERROR_NOTIFY 2
#define CILIUM_CALL_SEND_ICMP6_ECHO_REPLY 3
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, this is a bit much for a comment, as I can just git-blame to see what previous MACRO was there.

Tho, I don't really feel to strongly about it, I think you could just say the bits about the gap, and the concerns around the backporting.

This is a total nit, so feel free to ignore.

Copy link
Contributor

@ldelossa ldelossa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, love when code gets simpler 👍

@jschwinger233
Copy link
Member Author

/test

Previously we didn't have any network interface whose IPv6 was set to
router IPv6, so the special handle for ICMPv6 echo whose destination
address is router IPv6 was required, as kernel couldn't handle such
ICMPv6 echo; but it's no more the case since PR cilium#24208 was merged, now
cilium_host has router IPv6 instead of native IPv6, making kernel
capable of handling this ICMPv6.

Therefore, this commit removes code relevant to this special handle,
including several functions such as icmp6_send_echo_reply,
tail_icmp6_send_echo_reply, __icmp6_send_echo_reply.

Macro SKIP_ICMPV6_ECHO_HANDLING and CILIUM_CALL_SEND_ICMP6_ECHO_REPLY
are also obsolete and deleted.

Deletion of macro CILIUM_CALL_SEND_ICMP6_ECHO_REPLY leaves a gap in the
sequence of numbers, and we don't renumber the other macros in order to
pass the CI tests, otherwise the K8sUpdates test suite would fail due to
"migrate-svc restart count values do not match"

Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
@jschwinger233
Copy link
Member Author

/test

@jschwinger233
Copy link
Member Author

/test-1.25-5.4

@jschwinger233
Copy link
Member Author

jschwinger233 commented Apr 27, 2023

/test-1.27-net-next

Don't know why there are 3 CI jobs not triggered by /test

@jschwinger233
Copy link
Member Author

/test-1.27-net-next

@jschwinger233
Copy link
Member Author

/test-1.26-4.19

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Apr 27, 2023
@jschwinger233
Copy link
Member Author

The failed CI job for k8s-1.27-kernel-net-next (test-1.27-net-next) is expected because we reverted the 1.27 support.

@pchaigno pchaigno merged commit ee15ae6 into cilium:main Apr 27, 2023
58 of 59 checks passed
@netcelli-tux
Copy link

Does this fix IPv6 Neighbour Solicitation #16285?

When I deploy an EKS IPv6-only cluster, I see connectivity issues between pods with Cilium 1.13.3. But when I deploy Cilium 1.14.0-snapshot.2, everything works fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature/ipv6 Relates to IPv6 protocol support ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants