egress gateway: fix non-tunnel mode #17517
test-1.16-netnext |
test-me-please |
a47bdc1 to 89c7059
/test |
89c7059 to 622e80f
/test |
5c3cc36 to bf2e33c
/test |
/test-gke |
Travis CI hit:
Restarted. |
4aaaafe to 1fd7359
/test |
/test-gke |
Looks good overall, only minor stuff below 🙂
When a client uses an egress gateway node, it forwards traffic to that node via a vxlan tunnel. If the datapath is configured in non-tunnel mode (direct routing), replies from the gateway to the client do not go via the tunnel. These replies are then dropped by iptables because none of Cilium's FORWARD rules match them.

This patch identifies these packets (i.e., from egress gw to client) and steers them via the vxlan tunnel after rev-SNAT is performed, even when the datapath is configured in non-tunnel mode.

A suggestion by Paul and Martynas (@brb) was to use the following condition to identify said packets:

> if rev-SNATed IP ∈ native CIDR && rev-SNATed IP !∈ node pod CIDR => send to tunnel

This patch, instead, checks the egress gateway policy map. This seems like a safer approach, because all packets that match the contents of this map in the forward direction will be forwarded to the gw node.

Fixes: #17386
Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
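The difference between the two ways of identifying reply packets described above can be sketched in plain userspace C. This is only an illustrative model, not the datapath code: the `cidr4` and `egress_policy` types, the linear table scan, and all function names are hypothetical, and it assumes that "rev-SNATed IP" refers to the client pod address restored as the reply's destination.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IPv4 CIDR in host byte order: base address plus prefix length. */
struct cidr4 {
	uint32_t addr;
	uint8_t prefix_len; /* assumed to be in the range 0..32 */
};

/* Hypothetical egress policy entry: traffic from client_ip to dst is sent
 * to a gateway node. This stands in for the egress gateway policy map. */
struct egress_policy {
	uint32_t client_ip;
	struct cidr4 dst;
};

/* Build a host-order IPv4 address from its four octets. */
static uint32_t ip4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

static bool in_cidr(uint32_t ip, struct cidr4 c)
{
	/* A shift by 32 is undefined in C, so /0 is handled explicitly. */
	uint32_t mask = c.prefix_len == 0 ? 0 : UINT32_MAX << (32 - c.prefix_len);

	return (ip & mask) == (c.addr & mask);
}

/* Suggested heuristic: the restored client address is routable natively
 * but does not belong to a pod on this node. */
static bool tunnel_by_cidr(uint32_t client_ip, struct cidr4 native,
			   struct cidr4 node_pods)
{
	return in_cidr(client_ip, native) && !in_cidr(client_ip, node_pods);
}

/* Policy-based check: a reply (reply_src -> reply_dst) is tunneled only if
 * the forward direction (reply_dst -> reply_src) matches a policy entry. */
static bool tunnel_by_policy(const struct egress_policy *tbl, size_t n,
			     uint32_t reply_src, uint32_t reply_dst)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].client_ip == reply_dst && in_cidr(reply_src, tbl[i].dst))
			return true;
	}
	return false;
}
```

In this model, with a native CIDR of 10.0.0.0/8 and a node pod CIDR of 10.0.0.0/24, a reply destined to a remote pod such as 10.1.2.4 satisfies the CIDR heuristic whether or not any policy selects that pod, whereas the table lookup returns true only for clients whose forward traffic has a matching entry.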
The original patch (06e1f1c) for this test included an additional policy in test/k8sT/manifests/egress-nat-policy.yaml:

> apiVersion: cilium.io/v2alpha1
> kind: CiliumEgressNATPolicy
> metadata:
>   name: egress-to-black-hole
> spec:
>   egress:
>   - podSelector:
>       matchLabels:
>         zgroup: testDSClient
>     namespaceSelector:
>       matchLabels:
>         ns: cilium-test
>   # Route everything to a black hole.
>   # It shouldn't affect in-cluster traffic.
>   destinationCIDRs:
>   - 0.0.0.0/0
>   egressSourceIP: 1.1.1.1 # It's a black hole

This policy was meant to test b8c757a, which aimed to address #16147. That patch, however, led to a verification error, so it was excluded from this PR.

Signed-off-by: Yongkun Gui <ygui@google.com>
Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
While trying to implement @pchaigno's changes and running the Egress test locally, the test fails. It is not yet clear whether this is a flake or is due to the changes. However, because there are users waiting for the fix in this PR, and after discussing it offline, we will leave the changes to the test for a follow-up PR. Hence, I'm pushing a new branch with only white-space/comment changes so that we do not have to rerun the tests. The previous tests were successful, except for the Travis test, which is a known flake. The diff between the version the tests ran against and the new one is:

diff --git a/bpf/lib/nodeport.h b/bpf/lib/nodeport.h
index f7b5507f68..ff8787e724 100644
--- a/bpf/lib/nodeport.h
+++ b/bpf/lib/nodeport.h
@@ -1968,7 +1968,7 @@ static __always_inline int rev_nodeport_lb4(struct __ctx_buff *ctx, int *ifindex
csum_l4_offset_and_flags(tuple.nexthdr, &csum_off);
#if defined(ENABLE_EGRESS_GATEWAY) && !defined(TUNNEL_MODE)
- /* traffic from clients to egress gateway nodes, reaches said gateways
+ /* Traffic from clients to egress gateway nodes reaches said gateways
* by a vxlan tunnel. If we are not using TUNNEL_MODE, we need to
* identify reverse traffic from the gateway to clients and also steer
* it via the vxlan tunnel to avoid issues with iptables dropping these
@@ -2011,6 +2011,7 @@ static __always_inline int rev_nodeport_lb4(struct __ctx_buff *ctx, int *ifindex
#ifdef TUNNEL_MODE
{
struct remote_endpoint_info *info;
+
info = ipcache_lookup4(&IPCACHE_MAP, ip4->daddr, V4_CACHE_KEY_LEN);
if (info != NULL && info->tunnel_endpoint != 0) {
tunnel_endpoint = info->tunnel_endpoint;
diff --git a/test/k8sT/Egress.go b/test/k8sT/Egress.go
index c28df5537a..17caeefef1 100644
--- a/test/k8sT/Egress.go
+++ b/test/k8sT/Egress.go
@@ -1,4 +1,4 @@
-// Copyright 2017-2021 Authors of Cilium
+// Copyright 2021 Authors of Cilium
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
@@ -44,7 +44,6 @@ var _ = SkipDescribeIf(func() bool {
namespaceSelector string = "ns=cilium-test"
)
- // TODO(anfernee): Check a better way to deduplicate it with similar snippet in DatapathConfiguration.go
runEchoServer := func() {
// Run echo server on outside node
originalEchoPodPath := helpers.ManifestGet(kubectl.BasePath(), "echoserver-hostnetns.yaml") |
1fd7359 to 091c5fc
After Paul's ✅ and reasoning on #17517 (comment), marking ready-to-merge. |
Fixes: #17386
When a client uses an egress gateway node, it forwards traffic to that node via a vxlan tunnel. If the datapath is configured in non-tunnel mode (direct routing), replies from the gateway to the client do not go via the tunnel. These replies are then dropped by iptables because none of Cilium's FORWARD rules match them.
This patch identifies these packets (i.e., from egress gw to client) and steers them via the vxlan tunnel after rev-SNAT is performed, even when the datapath is configured in non-tunnel mode.
This PR also adds a test for egress gateway functionality.
Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>