node: Handle arpinging when remote node is in different L2 #14201
test-me-please
Looks good to me.
One thing that is not obvious to me is whether the refcount optimization for not pinging gateways can lead to a situation where the gateway's entry has expired in the ARP cache (I'm assuming this is why we are arpinging) and there is no way to reinsert the entry because its refcount is > 0.
Thanks for the reviews!
@kkourt The ARP entries cannot expire in the local cache, because we set
We are arpinging because when BPF NodePort running in XDP or TC gets a request that needs to be forwarded to another node (because of backend selection), it needs the L2 address of the nexthop to that node (the gateway or the node itself). If the L2 address is not present in the neighbor subsystem, we have to drop the request, as it's not possible to drive ARP resolution from the BPF program. To avoid that, cilium-agent arpings once it learns about a new k8s node.
Previously, insertNeighbor() assumed that a remote node is in the same L2 subnet, i.e. directly reachable without a gateway. However, this is not the case for all deployments.

This commit adds a check to detect whether the remote node is in the same L2. If it is not, the gateway IP address (nexthop) is arpinged instead of the remote node IP address.

The missing bit in this commit is refcounting to avoid redundant arpings and neigh removals when the gateway is used to access more than one remote node.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
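The nexthop selection described in the commit message above can be illustrated roughly as follows. This is only a minimal sketch, not the code from this PR: `arpTarget` is a hypothetical helper, and it assumes that "same L2" can be approximated by checking whether the remote IP falls inside a statically known local subnet. A real agent would more likely ask the kernel routing table for the nexthop.

```go
package main

import (
	"fmt"
	"net"
)

// arpTarget is a hypothetical helper (not part of Cilium). It returns the IP
// address that should be arpinged in order to reach remote.
//
// Assumption: a remote node is treated as being in the same L2 when its IP
// is inside localCIDR. Otherwise the gateway is the nexthop.
func arpTarget(localCIDR, gateway, remote string) (string, error) {
	_, subnet, err := net.ParseCIDR(localCIDR)
	if err != nil {
		return "", fmt.Errorf("parsing local subnet: %w", err)
	}
	remoteIP := net.ParseIP(remote)
	if remoteIP == nil {
		return "", fmt.Errorf("invalid remote node IP %q", remote)
	}
	if subnet.Contains(remoteIP) {
		// Directly reachable: resolve the L2 address of the node itself.
		return remoteIP.String(), nil
	}
	gwIP := net.ParseIP(gateway)
	if gwIP == nil {
		return "", fmt.Errorf("invalid gateway IP %q", gateway)
	}
	// Different L2: resolve the L2 address of the gateway instead.
	return gwIP.String(), nil
}

func main() {
	// Example addresses are made up for illustration.
	for _, remote := range []string{"192.168.1.20", "10.0.5.7"} {
		target, err := arpTarget("192.168.1.0/24", "192.168.1.1", remote)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("remote node %s: arping %s\n", remote, target)
	}
}
```

With the made-up addresses in `main`, the first node is on-link and is arpinged directly, while the second is reached through the gateway.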
To avoid redundant pings and neigh entry removals when the entry is still used by another node (this happens when two or more nodes are reachable through the same gateway).

Signed-off-by: Martynas Pumputis <m@lambda.lt>
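The refcounting idea from this commit message could look something like the sketch below. The type `neighRefs` and its methods are hypothetical names chosen for illustration and are not taken from the PR. The sketch assumes a single goroutine (no locking) and that a node is removed before it is re-inserted with a different nexthop.

```go
package main

import "fmt"

// neighRefs is a hypothetical structure (not Cilium's) that counts how many
// remote nodes share each nexthop.
type neighRefs struct {
	nextHopByNode map[string]string // node name -> nexthop IP
	refCount      map[string]int    // nexthop IP -> number of nodes using it
}

func newNeighRefs() *neighRefs {
	return &neighRefs{
		nextHopByNode: make(map[string]string),
		refCount:      make(map[string]int),
	}
}

// insert records that node is reached via nextHop. It returns true only when
// this is the first node using nextHop, i.e. when an arping is needed.
// A repeated insert for an already tracked node is ignored.
func (n *neighRefs) insert(node, nextHop string) bool {
	if _, tracked := n.nextHopByNode[node]; tracked {
		return false
	}
	n.nextHopByNode[node] = nextHop
	n.refCount[nextHop]++
	return n.refCount[nextHop] == 1
}

// remove drops node's reference to its nexthop. The boolean is true only when
// no other node uses that nexthop, i.e. when the neigh entry can be removed.
func (n *neighRefs) remove(node string) (string, bool) {
	nextHop, tracked := n.nextHopByNode[node]
	if !tracked {
		return "", false
	}
	delete(n.nextHopByNode, node)
	n.refCount[nextHop]--
	if n.refCount[nextHop] > 0 {
		return nextHop, false
	}
	delete(n.refCount, nextHop)
	return nextHop, true
}

func main() {
	refs := newNeighRefs()
	// Two made-up nodes behind the same made-up gateway.
	fmt.Println("arping needed:", refs.insert("node-a", "192.168.1.1"))
	fmt.Println("arping needed:", refs.insert("node-b", "192.168.1.1"))
	_, last := refs.remove("node-a")
	fmt.Println("remove neigh entry:", last)
	_, last = refs.remove("node-b")
	fmt.Println("remove neigh entry:", last)
}
```

In this sketch only the first insert triggers an arping and only the last remove deletes the neigh entry, which is the behavior the commit message describes for nodes sharing a gateway.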
b554271 to bec216a
Previously, all checks were green. I've just pushed the comment change to address https://github.com/cilium/cilium/pull/14201/files/4ce23957a6661d8f985651aca1dc00e3a7c0833a#diff-14a5efc51ce7ea3959385022d4800b355c766c60e8b7ba0e6052afced05ddbf0.
See commit msgs
Fix: #12824