Possible connectivity disruption on agent restart with WireGuard + native routing #31979

Open
2 tasks done
giorio94 opened this issue Apr 16, 2024 · 0 comments
Labels
area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. feature/wireguard Relates to Cilium's Wireguard feature kind/bug This is a bug in the Cilium logic. sig/agent Cilium agent related.

giorio94 commented Apr 16, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

A temporary connectivity disruption can occur on agent restart when Cilium is configured in native routing mode with WireGuard encryption enabled. The list of Allowed IPs is recreated from scratch upon reception of the node event for each remote node, which can remove entries for valid endpoints that have not yet been discovered at that point through the CiliumEndpoint CRD or the corresponding kvstore representation. This issue does not affect the current implementation in tunnel mode: in that case we encrypt encapsulated traffic, whose source and destination addresses always correspond to Node Internal IPs, which are immediately added as Allowed IPs.

A possible solution would be to restore the list of Allowed IPs for each peer from the WireGuard state after agent restart, and then run a GC pass to remove the stale entries once ipcache synchronization has completed. The ipcache synchronization should account for CiliumEndpoint synchronization (if the CiliumEndpoint CRD is enabled), kvstore synchronization (if kvstore mode is enabled), and clustermesh synchronization (when clustermesh is enabled).

Cilium Version

Tested on tip of main, but likely all versions are affected

Code of Conduct

  • I agree to follow this project's Code of Conduct
giorio94 added the kind/bug, sig/agent, feature/wireguard, and area/encryption labels on Apr 16, 2024