Add pod2pod strict mode for WireGuard #21856
Commit 690e4daeb0252ee5a248889db024d3fbfb7210a4 does not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin
690e4da to 299bcc8
299bcc8 to b5ba062
Thanks a lot for this contribution. I focused only on the agent side of things and have a few comments.
Thanks!
Thanks for running the tests. I hope I fixed all of the linting errors. I will try to investigate the other errors as soon as they arise.
The CI failure looks legit. The IPsec tests, which are executed after the newly added WG tests, started to fail. In particular, the connectivity between Pods:
@3u13r Did you check cilium-agent logs to see whether they hit the same symptoms?
As far as I can tell, the actual strict mode tests are skipped in all 3 failed CI runs, correct? Then my interpretation is that it's unlikely to be the strict mode test itself, but rather the only other line that has changed in the tests, i.e. the deletion of the
Do we actually need those?
Yes, we need to delete the
Yep, but that would be a ticking bomb, as WG and IPsec tests might eventually end up running in the same job. Anyway, if the transition of the tests from Ginkgo to ci-e2e / BPF unit tests (I think both can be rewritten into BPF unit tests) happens soon, then ACK this approach.
@3u13r could you make the suggested change (delete the ciliumnode obj only in the strict WG tests)? The feature freeze for v1.14 is next Fri (16th June). It would be nice to get this feature merged before that date.
Due to changes in CI, this will need rebasing to get CI to pass. From Martynas' last comment it looks like there's feedback remaining to address, but otherwise this seems close. We've initiated the feature freeze for v1.14 now so it'll miss that release series, but please do fix up the PR; we should be able to push this forward towards merging in a week or two once we've completed the branching process. Thanks for your continued interest in contributing to Cilium :-)
e92152d to 2e765ad
Sadly, I've been sick over the past week, and the obvious changes didn't work at first. With the current version I'm able to execute the K8sDatapath tests on netnext (executing the strict mode tests) and without netnext (executing the direct routing IPsec test).
/test
@3u13r If you'd like to push this PR forward, can you please rebase and push? The original base this PR is on has an issue with tests. This will be fixed with a rebase.
When pod2pod encryption is enabled, there is a brief time window in which one pod may send unencrypted data to another one. This happens when a new pod is created but the information about the new endpoint has not yet propagated to the other nodes.
To prevent this from happening, we block all unencrypted pod2pod traffic. This is done via a filter in the datapath. The filter is configured at the same time the datapath is set up, since we cannot rely on data that is only eventually updated at runtime.
The filter drops any unencrypted TCP/UDP egress traffic which originates from and is sent to the PodCIDR and also leaves the node.
Signed-off-by: Benedict Schlueter <benedict.schlueter@inf.ethz.ch>
Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
Co-authored-by: Benedict Schlueter <benedict.schlueter@inf.ethz.ch>
2e765ad to 269df16
/test
Glad to see this finally merged 🎉
Yay 🥳. Thanks a lot for the feedback and discussions from everyone! I'm excited that the first step is merged. Next tasks include IPv6 support, migrating the tests, node-to-node strict mode, ...
Previously, the strict encrypt check [1] was running in bpf_overlay (in addition to bpf_host). That particular check assumed that no unencrypted pod-to-pod packet should be seen by bpf_overlay. However, after the previous commit that is no longer the case. So, remove the check, and only keep the one in bpf_host.
A nice side effect of the previous commit is that for WG+tunnel we automatically enforce the strict mode w/o relying on strict_allow(). I.e., any tunnel-encapsulated traffic is going to be dropped until cilium-agent has propagated the destination node's IP addr into WG's allowed-ips list for that node.
This commit also drops the WG strict mode test case for tunneling, as the test configuration is no longer applicable, and the test is going to be migrated to the CLI connectivity suite.
[1]: #21856
Signed-off-by: Martynas Pumputis <m@lambda.lt>
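The side effect described in that commit message follows from how WireGuard picks a peer for outgoing traffic: a packet whose destination matches no peer's allowed-ips has nowhere to go and is dropped. The snippet below is only a toy model of that lookup, not Cilium or WireGuard code; `select_peer`, the peer names, and the addresses are hypothetical and chosen for illustration.

```python
import ipaddress


def select_peer(dst, allowed_ips):
    """Return the peer whose allowed-ips contain dst, or None.

    allowed_ips maps a (hypothetical) peer name to a list of CIDR strings.
    The most specific matching prefix wins. None models a dropped packet.
    """
    addr = ipaddress.ip_address(dst)
    best = None  # (peer name, prefix length) of the best match so far
    for peer, prefixes in allowed_ips.items():
        for prefix in prefixes:
            net = ipaddress.ip_network(prefix)
            if addr in net and (best is None or net.prefixlen > best[1]):
                best = (peer, net.prefixlen)
    return best[0] if best else None


# Only node-a's pod range is known at first.
allowed = {"node-a": ["10.0.1.0/24"]}

# Tunnel traffic towards node-b's node IP has no matching peer yet.
before = select_peer("192.168.0.2", allowed)

# Model the agent adding node-b's node IP to that peer's allowed-ips.
allowed["node-b"] = ["192.168.0.2/32"]
after = select_peer("192.168.0.2", allowed)
```

In this model `before` is `None` (the packet would be dropped) and `after` is `"node-b"`, which mirrors the claim that tunnel traffic is dropped until the destination node's IP is in the allowed-ips list.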
When pod2pod encryption is enabled, there is a brief time window in which one pod may send unencrypted data to another one. This happens when a new pod is created but the information about the new endpoint has not yet propagated to the other nodes.
To prevent this from happening, we block all unencrypted pod2pod traffic. This is done via a filter in the datapath. The filter is configured at the same time the datapath is set up, since we cannot rely on data that is only eventually updated at runtime.
The filter drops any unencrypted TCP/UDP egress traffic which originates from and is sent to the PodCIDR and also leaves the node.
Signed-off-by: Benedict Schlueter <bs@edgeless.systems>
Signed-off-by: Leonard Cohnen <lc@edgeless.systems>
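The drop rule in the description above can be written down as a small predicate. This is a minimal sketch of the stated conditions only, not the actual datapath code (which lives in Cilium's BPF programs); the `Packet` type, the `strict_mode_drops` name, and the example CIDR are assumptions made for illustration, and only IPv4 is considered.

```python
import ipaddress
from dataclasses import dataclass

TCP, UDP = 6, 17  # IANA protocol numbers


@dataclass(frozen=True)
class Packet:
    """Hypothetical view of an egress packet, for illustration only."""
    src: str
    dst: str
    proto: int
    encrypted: bool
    leaves_node: bool


def strict_mode_drops(pkt, pod_cidr):
    """Return True if the described filter would drop pkt.

    Drop only when all conditions hold: the packet is unencrypted,
    it is TCP or UDP, it leaves the node, and both source and
    destination addresses fall inside the PodCIDR.
    """
    cidr = ipaddress.ip_network(pod_cidr)
    if pkt.encrypted or not pkt.leaves_node:
        return False
    if pkt.proto not in (TCP, UDP):
        return False
    src_in = ipaddress.ip_address(pkt.src) in cidr
    dst_in = ipaddress.ip_address(pkt.dst) in cidr
    return src_in and dst_in


POD_CIDR = "10.0.0.0/16"  # example value, an assumption
leaked = Packet("10.0.1.5", "10.0.2.7", TCP, encrypted=False, leaves_node=True)
tunneled = Packet("10.0.1.5", "10.0.2.7", TCP, encrypted=True, leaves_node=True)
```

Here `strict_mode_drops(leaked, POD_CIDR)` evaluates to `True` and `strict_mode_drops(tunneled, POD_CIDR)` to `False`. Because the PodCIDR is fixed when the datapath is set up, the check does not depend on endpoint information that may still be propagating.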