
bpf: fix in-cluster connectivity for externalTrafficPolicy=Local #12311

Merged

6 commits merged into master from pr/traffic-policy-local on Jul 2, 2020

Conversation

@borkmann (Member) commented Jun 26, 2020

See commit messages.

Fixes: #11746
Fixes: #11724
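
For context, the Kubernetes feature this PR fixes: a Service sets externalTrafficPolicy: Local to preserve the client source IP by restricting externally reachable backends to node-local ones. A minimal illustrative manifest (the nginx6 name and port mirror the example output later in this PR; the manifest itself is not part of the change):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx6
spec:
  type: NodePort
  externalTrafficPolicy: Local   # only node-local backends serve external traffic
  selector:
    app: nginx6
  ports:
  - port: 80
    targetPort: 80
```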

@borkmann borkmann added release-note/misc This PR makes changes that have no direct user impact. area/kube-proxy-free labels Jun 26, 2020
@borkmann borkmann requested review from gandro and brb June 26, 2020 22:13
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.8.1 Jun 26, 2020
@coveralls commented Jun 26, 2020

Coverage Status

Coverage increased (+0.007%) to 36.917% when pulling 775b097 on pr/traffic-policy-local into 65e68d0 on master.

@borkmann borkmann force-pushed the pr/traffic-policy-local branch 2 times, most recently from 3be22fd to ee7e264 Compare June 29, 2020 21:34
@borkmann borkmann added the priority/high This is considered vital to an upcoming release. label Jun 29, 2020
@borkmann borkmann changed the title bpf: allow in-cluster connectivity for externalTrafficPolicy=Local bpf: fix in-cluster connectivity for externalTrafficPolicy=Local Jun 29, 2020
@borkmann borkmann force-pushed the pr/traffic-policy-local branch 2 times, most recently from 824df7d to d345586 Compare June 30, 2020 09:59
@gandro (Member) left a comment

Overall looks good so far! One issue with regard to HealthCheckNodePort

pkg/service/service.go (review thread, outdated, resolved)
@borkmann borkmann force-pushed the pr/traffic-policy-local branch 4 times, most recently from 672c047 to e7ac2a0 Compare June 30, 2020 12:23
@borkmann borkmann marked this pull request as ready for review June 30, 2020 12:24
@borkmann borkmann requested reviews from teams as code owners June 30, 2020 12:24
@borkmann (Member Author)

test-me-please

@borkmann (Member Author)

test-focus K8sService*

@borkmann borkmann requested a review from a team as a code owner June 30, 2020 12:45
@borkmann (Member Author) commented Jul 1, 2020

(don't merge yet, still debugging something)

@borkmann (Member Author) commented Jul 1, 2020

test-me-please

@borkmann (Member Author) commented Jul 1, 2020

test-focus K8sService*

@borkmann (Member Author) commented Jul 1, 2020

(ready to merge once CI is green)

@borkmann (Member Author) commented Jul 1, 2020

The Runtime-4.9 failure looks unrelated:

	 START: proxy_test.go:306: DNSProxyTestSuite.TestRejectNonMatchingRefusedResponse
	 proxy_test.go:329:
	     c.Assert(err, IsNil, Commentf("DNS request from test client failed when it should succeed"))
	 ... value *net.OpError = &net.OpError{Op:"dial", Net:"tcp", Source:net.Addr(nil), Addr:(*net.TCPAddr)(0xc00041a1e0), Err:(*poll.TimeoutError)(0x2f8d460)} ("dial tcp [::]:36405: i/o timeout")
	 ... DNS request from test client failed when it should succeed
	 
	 START: proxy_test.go:219: DNSProxyTestSuite.TearDownTest
	 PASS: proxy_test.go:219: DNSProxyTestSuite.TearDownTest	0.000s
	 
	 FAIL: proxy_test.go:306: DNSProxyTestSuite.TestRejectNonMatchingRefusedResponse

@borkmann (Member Author) commented Jul 1, 2020

test-focus hit a VirtualBox error:

23:26:58  Progress state: VBOX_E_INVALID_OBJECT_STATE
23:26:58  VBoxManage: error: Machine delete failed
23:26:58  VBoxManage: error: Medium '/root/VirtualBox VMs/ubuntu-18.04.4-amd64-60_1593638818730_5500/ubuntu-18.04.4-amd64-60-disk001.vmdk' is locked for reading by another task
23:26:58  VBoxManage: error: Details: code VBOX_E_INVALID_OBJECT_STATE (0x80bb0007), component MediumWrap, interface IMedium
23:26:58  VBoxManage: error: Context: "RTEXITCODE handleUnregisterVM(HandlerArg*)" at line 162 of file VBoxManageMisc.cpp

@borkmann (Member Author) commented Jul 1, 2020

4.19 bailed on HealthCheckNodePort ... looking into it:
https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-Kernel/2366/

@borkmann borkmann requested a review from a team as a code owner July 1, 2020 23:50
@borkmann (Member Author) commented Jul 1, 2020

test-me-please

Finally, wire up the Kubernetes side to add surrogate entries for cluster-internal
communication in the externalTrafficPolicy=Local case. We only filter local
backends for ScopeExternal; ScopeInternal keeps all backends.

Example from nginx deployment with externalTrafficPolicy=Local, listing from
node apoc:

  # ./cilium/cilium service list
  ID   Frontend                 Service Type   Backend
  4    10.100.125.187:80        ClusterIP      1 => 10.217.0.252:80
                                               2 => 10.217.1.86:80
  5    192.168.178.29:30465     NodePort       1 => 10.217.0.252:80
  6    0.0.0.0:30465            NodePort       1 => 10.217.0.252:80
  7    192.168.178.29:30465/i   NodePort       1 => 10.217.0.252:80
                                               2 => 10.217.1.86:80
  8    0.0.0.0:30465/i          NodePort       1 => 10.217.0.252:80
                                               2 => 10.217.1.86:80

  # ./cilium/cilium bpf lb list
  SERVICE ADDRESS          BACKEND ADDRESS
  10.100.125.187:80        0.0.0.0:0 (4) [ClusterIP]
                           10.217.0.252:80 (4)
                           10.217.1.86:80 (4)
  192.168.178.29:30465     0.0.0.0:0 (5) [NodePort, Local]
                           10.217.0.252:80 (5)
  192.168.178.29:30465/i   0.0.0.0:0 (7) [NodePort, Local]
                           10.217.0.252:80 (7)
                           10.217.1.86:80 (7)
  0.0.0.0:30465            0.0.0.0:0 (6) [NodePort, Local]
                           10.217.0.252:80 (6)
  0.0.0.0:30465/i          10.217.1.86:80 (8)
                           10.217.0.252:80 (8)
                           0.0.0.0:0 (8) [NodePort, Local]

  # kubectl get pods --all-namespaces -o wide
  NAMESPACE     NAME                           READY   STATUS    RESTARTS   AGE    IP             NODE   NOMINATED NODE   READINESS GATES
  default       nginx6-7d4b5d6bdf-7tpk4        1/1     Running   0          127m   10.217.1.86    tank   <none>           <none>
  default       nginx6-7d4b5d6bdf-wd9hf        1/1     Running   0          127m   10.217.0.252   apoc   <none>           <none>
  [...]

As can be seen, the ClusterIP has all backends, the external-facing NodePort
only has local backends, whereas the internal-facing NodePort has all backends.
The 0.0.0.0:30465 entry could still be optimized away, since genCartesianProduct()
has no awareness of the logic in ParseService() which adds the surrogate
entries.
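
The scope-based filtering described above can be sketched as follows. This is an illustrative reimplementation, not Cilium's actual pkg/service code; the `Backend` type and `selectBackends` function are made-up names for the example:

```go
package main

import "fmt"

// Backend is a simplified stand-in for a service backend: its address and
// the node the backing pod runs on.
type Backend struct {
	Addr     string
	NodeName string
}

// Service scopes: external frontends vs. the "/i" internal surrogate entries.
const (
	ScopeExternal = iota
	ScopeInternal
)

// selectBackends mirrors the idea in this PR: with externalTrafficPolicy=Local,
// the external scope keeps only backends local to this node, while the internal
// surrogate scope keeps all backends so in-cluster connectivity still works.
func selectBackends(backends []Backend, scope int, trafficPolicyLocal bool, nodeName string) []Backend {
	if !trafficPolicyLocal || scope == ScopeInternal {
		return backends
	}
	var local []Backend
	for _, b := range backends {
		if b.NodeName == nodeName {
			local = append(local, b)
		}
	}
	return local
}

func main() {
	// Backend layout from the nginx example: one pod on apoc, one on tank.
	backends := []Backend{
		{Addr: "10.217.0.252:80", NodeName: "apoc"},
		{Addr: "10.217.1.86:80", NodeName: "tank"},
	}
	// External scope on node apoc: only the apoc-local backend survives.
	fmt.Println(len(selectBackends(backends, ScopeExternal, true, "apoc")))
	// Internal surrogate scope: all backends remain.
	fmt.Println(len(selectBackends(backends, ScopeInternal, true, "apoc")))
}
```

This matches the listings above: the plain NodePort frontends carry one backend on apoc, while the "/i" surrogate frontends carry both.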

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
... given the kube-proxy-free setup supports them now, let's run these. Also
add tests from a third node to make sure requests fail when sent to a node
that has no local backend.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add an 'umbrella' section on client source IP preservation with links
to the related sections.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@borkmann (Member Author) commented Jul 1, 2020

test-me-please

EDIT: net-next failed provisioning:
https://jenkins.cilium.io/job/Cilium-PR-K8s-oldest-net-next/1066/

@borkmann (Member Author) commented Jul 2, 2020

test-focus K8sService*

@joestringer (Member)

retest-net-next

@joestringer (Member) left a comment

docs/cli/bpf LGTM. I didn't look as closely at the services side.

@joestringer joestringer merged commit 30f44a5 into master Jul 2, 2020
@joestringer joestringer deleted the pr/traffic-policy-local branch July 2, 2020 04:21
@joestringer joestringer mentioned this pull request Jul 2, 2020
26 tasks
@travisghansen (Contributor)

#11724 (comment)

@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.8 in 1.8.1 Jul 2, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.8 to Backport done to v1.8 in 1.8.1 Jul 2, 2020
Labels
priority/high This is considered vital to an upcoming release. release-note/misc This PR makes changes that have no direct user impact.
Projects
No open projects
1.8.1
Backport done to v1.8
8 participants