Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

egress gateway: fix non-tunnel mode #17517

Merged
merged 2 commits into from
Oct 12, 2021

Conversation

kkourt
Copy link
Contributor

@kkourt kkourt commented Oct 1, 2021

Fixes: #17386

When a client uses an egress gateway node, it forwards traffic via a vxlan tunnel to the egress gateway node. If datapath is configured in non-tunnel mode (direct routing), replies from the gateway to the client do not go via the tunnel. This causes these replies to be dropped by iptables because no Cilium's FORWARD rule matches them

This patch identifies above packets (i.e., from egress gw to client), and steers them via the vlxan tunnel after rev-SNAT is performed even when datapath is configured in non-tunnel mode.

PR also adds a test for egress gateway functionality.

Signed-off-by: Kornilios Kourtis kornilios@isovalent.com

egress gateway: fix non-tunnel (direct routing) mode

@kkourt kkourt added kind/bug This is a bug in the Cilium logic. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. feature/egress-gateway Impacts the egress IP gateway feature. labels Oct 1, 2021
@kkourt kkourt requested a review from a team October 1, 2021 17:18
@kkourt kkourt requested a review from a team as a code owner October 1, 2021 17:18
@kkourt kkourt requested a review from borkmann October 1, 2021 17:18
@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Oct 1, 2021
@kkourt kkourt requested a review from tklauser October 1, 2021 17:18
@kkourt kkourt marked this pull request as draft October 1, 2021 17:19
@kkourt
Copy link
Contributor Author

kkourt commented Oct 1, 2021

test-1.16-netnext

@kkourt
Copy link
Contributor Author

kkourt commented Oct 4, 2021

test-me-please

@kkourt kkourt force-pushed the pr/kkourt/egress-gw-non-tunnel branch from a47bdc1 to 89c7059 Compare October 4, 2021 08:43
@kkourt
Copy link
Contributor Author

kkourt commented Oct 4, 2021

/test

@kkourt kkourt added release-note/bug This PR fixes an issue in a previous release of Cilium. release-note/ci This PR makes changes to the CI. labels Oct 4, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 4, 2021
@kkourt kkourt changed the title fix egress gateway in non-tunnel (direct routing) mode egress gw fixes: exclude in-cluster traffic, non-tunnel mode Oct 4, 2021
@kkourt kkourt force-pushed the pr/kkourt/egress-gw-non-tunnel branch from 89c7059 to 622e80f Compare October 4, 2021 12:42
@kkourt
Copy link
Contributor Author

kkourt commented Oct 4, 2021

/test

@kkourt kkourt changed the title egress gw fixes: exclude in-cluster traffic, non-tunnel mode egress gw fix non-tunnel mode Oct 4, 2021
@kkourt kkourt force-pushed the pr/kkourt/egress-gw-non-tunnel branch 3 times, most recently from 5c3cc36 to bf2e33c Compare October 4, 2021 22:04
@kkourt
Copy link
Contributor Author

kkourt commented Oct 4, 2021

/test

@kkourt
Copy link
Contributor Author

kkourt commented Oct 5, 2021

/test-gke

@brb
Copy link
Member

brb commented Oct 8, 2021

Travis CI hit:

docker: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit.

Restarted.

@kkourt kkourt force-pushed the pr/kkourt/egress-gw-non-tunnel branch from 4aaaafe to 1fd7359 Compare October 8, 2021 15:53
@kkourt
Copy link
Contributor Author

kkourt commented Oct 8, 2021

/test

@kkourt
Copy link
Contributor Author

kkourt commented Oct 10, 2021

/test-gke

bpf/lib/nodeport.h Show resolved Hide resolved
Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, only minor stuff below 🙂

bpf/lib/nodeport.h Outdated Show resolved Hide resolved
test/helpers/kubectl.go Show resolved Hide resolved
test/k8sT/Egress.go Outdated Show resolved Hide resolved
test/k8sT/Egress.go Outdated Show resolved Hide resolved
test/k8sT/manifests/egress-ip-deployment.yaml Show resolved Hide resolved
test/k8sT/manifests/egress-ip-deployment.yaml Show resolved Hide resolved
test/k8sT/Egress.go Show resolved Hide resolved
test/k8sT/Egress.go Show resolved Hide resolved
test/k8sT/Egress.go Show resolved Hide resolved
test/k8sT/Egress.go Show resolved Hide resolved
kkourt and others added 2 commits October 12, 2021 12:58
When a client uses an egress gateway node, it forwards traffic via a
vxlan tunnel to the egress gateway node. If datapath is configured in
non-tunnel mode (direct routing), replies from the gateway to the client
do not go via the tunnel. This causes these replies to be dropped
by iptables because no Cilium's FORWARD rule matches them

This patch identifies above packets (i.e., from egress gw to client),
and steers them via the vlxan tunnel after rev-SNAT is performed even
when datapath is configured in non-tunnel mode.

A suggestion by Paul and Martynas (@brb) was to use the following
condition to identify said packets:
> if rev-SNATed IP ∈ native CIDR && rev-SNATed IP !∈ node pod CIDR => send to tunnel

This patch, instead, checks the egress gateway policy map. This seems
like a safer approach, because all packets that match contents of above
map in the forward direction will be forwarded to the gw node.

Fixes: #17386

Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
The original patch
(06e1f1c)
for this test included an additional policy in test/k8sT/manifests/egress-nat-policy.yaml:

> apiVersion: cilium.io/v2alpha1
> kind: CiliumEgressNATPolicy
> metadata:
>   name: egress-to-black-hole
> spec:
>   egress:
>   - podSelector:
>       matchLabels:
>         zgroup: testDSClient
>     namespaceSelector:
>       matchLabels:
>         ns: cilium-test
>   # Route everything to a black hole.
>   # It shouldn't affect in-cluster traffic.
>   destinationCIDRs:
>   - 0.0.0.0/0
>   egressSourceIP: 1.1.1.1 # It's a black hole

which was meant to test
b8c757a,
which aimed to address #16147.

The above patch, however, lead to a verification error so it was
excluded from this PR.

Signed-off-by: Yongkun Gui <ygui@google.com>
Signed-off-by: Kornlios Kourtis <kornilios@isovalent.com>
@kkourt
Copy link
Contributor Author

kkourt commented Oct 12, 2021

While trying to implement @pchaigno changes and running the Egress test locally, the test fail. It is not clear yet if this is a flake or due to the changes. Because, however, there are users waiting for the fix in this PR, and after discussing it offline, we will leave the changes to the test for a follow up PR.

Hence, I'm pushing a new branch with only white-space/comment changes so that we will not have to rerun tests. Previous tests were successful, except the Travis test which is a known flake.

egress-gw-sshots-1
egress-gw-sshots-2

The diff between the version that the tests run and the new one is:

diff --git a/bpf/lib/nodeport.h b/bpf/lib/nodeport.h
index f7b5507f68..ff8787e724 100644
--- a/bpf/lib/nodeport.h
+++ b/bpf/lib/nodeport.h
@@ -1968,7 +1968,7 @@ static __always_inline int rev_nodeport_lb4(struct __ctx_buff *ctx, int *ifindex
        csum_l4_offset_and_flags(tuple.nexthdr, &csum_off);
 
 #if defined(ENABLE_EGRESS_GATEWAY) && !defined(TUNNEL_MODE)
-       /* traffic from clients to egress gateway nodes, reaches said gateways
+       /* Traffic from clients to egress gateway nodes reaches said gateways
         * by a vxlan tunnel. If we are not using TUNNEL_MODE, we need to
         * identify reverse traffic from the gateway to clients and also steer
         * it via the vxlan tunnel to avoid issues with iptables dropping these
@@ -2011,6 +2011,7 @@ static __always_inline int rev_nodeport_lb4(struct __ctx_buff *ctx, int *ifindex
 #ifdef TUNNEL_MODE
                {
                        struct remote_endpoint_info *info;
+
                        info = ipcache_lookup4(&IPCACHE_MAP, ip4->daddr, V4_CACHE_KEY_LEN);
                        if (info != NULL && info->tunnel_endpoint != 0) {
                                tunnel_endpoint = info->tunnel_endpoint;
diff --git a/test/k8sT/Egress.go b/test/k8sT/Egress.go
index c28df5537a..17caeefef1 100644
--- a/test/k8sT/Egress.go
+++ b/test/k8sT/Egress.go
@@ -1,4 +1,4 @@
-// Copyright 2017-2021 Authors of Cilium
+// Copyright 2021 Authors of Cilium
 //
 // Licensed under the Apache License, Version 2.0 (the "License");
 // you may not use this file except in compliance with the License.
@@ -44,7 +44,6 @@ var _ = SkipDescribeIf(func() bool {
                namespaceSelector string = "ns=cilium-test"
        )
 
-       // TODO(anfernee): Check a better way to deduplicate it with similar snippet in DatapathConfiguration.go
        runEchoServer := func() {
                // Run echo server on outside node
                originalEchoPodPath := helpers.ManifestGet(kubectl.BasePath(), "echoserver-hostnetns.yaml")

@kkourt kkourt force-pushed the pr/kkourt/egress-gw-non-tunnel branch from 1fd7359 to 091c5fc Compare October 12, 2021 11:18
@kkourt kkourt mentioned this pull request Oct 12, 2021
@kkourt kkourt requested a review from pchaigno October 12, 2021 11:25
Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

FYI, the diff is also available here 👇 , by clicking on force-pushed:
image

@kkourt
Copy link
Contributor Author

kkourt commented Oct 12, 2021

After Paul's ✅ and reasoning on #17517 (comment), marking ready-to-merge.

@kkourt kkourt added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 12, 2021
@errordeveloper errordeveloper merged commit 0ed817c into master Oct 12, 2021
@errordeveloper errordeveloper deleted the pr/kkourt/egress-gw-non-tunnel branch October 12, 2021 11:54
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.10.5 Oct 12, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Backport done to v1.10 in 1.10.5 Oct 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature/egress-gateway Impacts the egress IP gateway feature. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. release-note/ci This PR makes changes to the CI. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
No open projects
1.10.5
Backport done to v1.10
Development

Successfully merging this pull request may close these issues.

egress gw: Non-tunnel mode is broken
9 participants