v1.7 backports 2020-04-29 #11233

vadorovsky · 2020-04-29T17:37:47Z

examples/getting-started: add bind mount for docker socket to docker-compose.yml #11023 -- examples/getting-started: add bind mount for docker socket to docker-compose.yml (@tklauser)
daemon: set k8s options as soon as possible #11101 -- daemon: set k8s options as soon as possible (@aanm)
docs: Pin Hubble version to v0.5 branch #11121 -- docs: Pin Hubble version to v0.5 branch (@gandro)
policymap, fragmap: clean up doc, map size configuration #10964 -- policymap, fragmap: clean up doc, map size configuration (@qmonnet)
dns: Log WriteMsgIP details. #11037 -- dns: Log WriteMsgIP details. (@jrajahalme)
Fix cilium-health healthchecking remote endpoint in ENI mode #11073 -- Fix cilium-health healthchecking remote endpoint in ENI mode (@christarazi)
Add Kubernetes IPAM mode #10407 -- Add Kubernetes IPAM mode (@tgraf) - only the documentation change, without mentioning "kubernetes" mode
Switch to native-routing in GKE guide #11079 -- Switch to native-routing in GKE guide (@tgraf)
cilium: add NodePortBindProtection to allow for opting out bind rejection #11188 -- cilium: add NodePortBindProtection to allow for opting out bind rejection (@borkmann)
Auto-detects EndpointSlice support via apimachinery #11206 -- Auto-detects EndpointSlice support via apimachinery (@Weil0ng)
k8s: Provide fallback for EndpointSlices detection if discovery API is not available #11253 -- k8s: Provide fallback for EndpointSlices detection if discovery API is not available (@tgraf)

Once this PR is merged, you can update the PR labels via:

$ for pr in 11023 11101 11121 10964 11037 11073 10407 11079 11188 11206 11253; do contrib/backporting/set-labels.py $pr done 1.7; done

tklauser

LGTM for my change.

christarazi

My changes seem to be missing 7c783f5 from #11073

jrfastab · 2020-04-29T18:17:39Z

If possible it would be best to merge #11109 first to get things in order.

Also PRs are going to conflict so this will need some reworks after above is merged.

vadorovsky · 2020-04-29T19:11:56Z

My changes seem to be missing 7c783f5 from #11073

@christarazi Yes, you are right. It didn't apply properly, so I thought it can be dropped, since it only changes variable names. But now I see that those variables are there on 1.7 branch and I could just solve the conflict. Sorry for that, I will reapply.

joestringer · 2020-04-29T19:31:11Z

Thanks for following up on these backports :)

FWIW I'd like to hold off on further 1.7 backports until we get #11109 merged and v1.7.3 out, users
are hitting issues quite frequently (in particular the remote-node identity stuff) and other than the PRs that are also in #11109, I don't think any of the remaining backports here are immediately critical.

Hopefully we should be able to get v1.7.3 out today or tomorrow and proceed with the rest of these backports from there.

vadorovsky · 2020-04-29T20:27:46Z

Sure, let's wait for #11109. This pull request is huge anyway and I think that there will be a lot to fix in ginkgo tests. And I will regenerate modules and cmdref before the next push.

joestringer · 2020-04-30T19:24:46Z

By the way I meant to post an update, but the above PR was merged and we put out v1.7.3. No need to block on anything for this PR now.

qmonnet · 2020-05-02T11:13:07Z

For what it's worth the backport for the fix I did for the eppolicymap looks good to me 👍

vadorovsky · 2020-05-04T11:56:03Z

@tgraf Documentation changes from #11079 (which I'm backporting here) seem to depend on documentation from #10407 - specifically on the Documentation/concepts/ipam/kubernetes.rst. How do you feel about backporting #10407? Or maybe we should backport only documentation from #10407?

vadorovsky · 2020-05-04T11:59:42Z

@christarazi All your commits from your PR should be there. Sorry again!

vadorovsky · 2020-05-04T13:14:32Z

never-tell-me-the-odds

vadorovsky · 2020-05-04T14:55:05Z

Doc test fail because of the dependency mentioned here #11233 (comment)

Now I'm sure that we have to backport #10407 too, at least partially.

vadorovsky · 2020-05-04T20:32:28Z

never-tell-me-the-odds

vadorovsky · 2020-05-05T08:58:20Z

never-tell-me-the-odds

vadorovsky · 2020-05-05T09:00:11Z

never-tell-me-the-odds

joestringer · 2020-05-05T20:24:23Z

All of the tests except for the known failing ones (See also https://github.com/cilium/cilium/projects/85) are passing. @mrostecki are you looking for any further feedback on any of the backports?

vadorovsky · 2020-05-05T20:52:55Z

@joestringer I would appreciate a feedback from @tgraf and @christarazi.

…compose.yml [ upstream commit 2d6e3be ] The cilium-docker plugin needs access to the docker socket, thus a bind mount is needed. Fixes #10962 Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit c9f94e7 ] Configure k8s as soon as possible so that k8s.IsEnabled() has the right behavior from the start of the Cilium agent initialization. Fixes: 604dab4 ("pkg/k8s: detect k8s mode if env variable K8S_NODE_NAME is set") Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit 1cb4020 ] Hubble is undergoing some big architectural changes with Cilium 1.8 which will also impact the way it is deployed. Notably, Hubble v0.5 will be the last version of Hubble to support Cilium 1.7. Therefore, the documentation in Cilium 1.7 needs to be updated to point to the v0.5 branch, as the master branch of `github.com/cilium/hubble` will not contain the Hubble stand-alone server Helm charts (required to run Hubble with Cilium 1.7) going forward. The Hubble documentation will need a significant overhaul for the Cilium 1.8 release. However, this is going to be done at a later date and is out of scope for this PR. This PR is soley intended to ensure the Cilium 1.7 documentation will stay functional going forward. Hubble v0.5 is compatible with Cilium 1.8, therefore the `latest` docs are remaining functional as well. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit 5e40659 ] In eppolicymap.CreateWithName(), if the policy map creation in the buildMap.Do() block fails for whatever reason, we return early and EpPolicyMap remains nil. Calling OpenOrCreate() a few lines below attempts to dereference that pointer to take its lock. Ouch. EpPolicyMap used to be created unconditionally, even if the policy map creation would fail. This was changed in commit b384ce8. We could restore this behaviour, or we can simply exit early if inner map creation fails, by producing a fatal error message. Let's do the latter, as there is no point in proceeding with the creation of endpoint policy map if creating the inner map failed. Fixes: b384ce8 ("eppolicymap: Factor out for better testing") Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit d07eec5 ] Add detail to WriteMsgIP log messages in hopes to understand why we sometimes see a lot of these logs. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit dbedb10 ] Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit 560bda1 ] Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit 26308b6 ] In ENI mode, rules and routes are created for each endpoint (and therefore pod) in order to direct traffic to the appropriate ENI device. This logic was only running in the CNI plugin. Since cilium-health is not a pod, its rules and routes needed for traffic to flow were missing. This commit implements creating the necessary rules and routes for cilium-health in ENI mode. This commit also unifies the installation logic of the rules and routes in `pkg/aws/eni/routing`, allowing the CNI plugin and the cilium-health creation code to share a common implementation. Fixes #10810 Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit ced4ad0 ] This documents an IPAM mode which was already supported and widely used but difficult to discover. It requires the PodCIDR for all enabled address familied to be provided via the k8s Node resource and then uses the standard hostscope allocator. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit 787c276 ] The PodCIDR is always populated on GKE. Switch to native routing mode while allocating IPs from Node.Spec.PodCIDR[s] As we transition to native-routing mode: * We can enable endpoint routes to avoid traversing the cilium-host interface * We can disable the local node route as it is not required * We can disable local routing blacklisting as it is known that no local routes will conflict with the PodCIDR Hide all of this behind a new option global.gke.enabled Deriving the cluster CIDR as part of the guide also allows to disable masquerading in the entire cluster. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit 5cf8383 ] The GKE reference guide will switch over to using direct-routing by default. Cover both modes in the GKE CI. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit db3838b ] Instead of instructing to only restart unmanaged pods in kube-system, extend this to restart all unmanaged pods which are not running in host networking mode. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit 3c19db9 ] This step does not seem required at this point. Remove it. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit 82aac9a ] Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit c0ef36f ] When running native-routing mode, the source IP from host traffic is unlikely the IP assigned to the cilium-host interface. In the remote-node identity legacy mode, only the IP of the cilium-host interface is considered host. All other IPs are considered world. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

…tion [ upstream commit 77f580d ] Add an agent knob for expert users who wish to run backend Pods on the same ports as the NodePort service is exposed. We recently resolved this by making cgroup BPF hooks netns aware in the kernel and Cilium via work from d7bf451 ("bpf: fix init namespace handling in sock lb programs"). This works for 5.7 kernels. However, if there is a need for older kernels to deploy Pods in this way (or to opt out of the bind hooks entirely for different reasons), then one can control this setting through --node-port-bind-protection. The knob defaults to true (same as we do today), so that expert users can opt-out from it if they need to. With bind protection enabled: # ./daemon/cilium-agent [...] # bpftool cgroup tree CgroupPath ID AttachType AttachFlags Name /sys/fs/cgroup/unified 48126 connect4 48122 connect6 48127 post_bind4 <---+ bind prevention enabled by default 48123 post_bind6 <---` 48128 sendmsg4 48124 sendmsg6 48129 recvmsg4 48125 recvmsg6 [...] With bind protection disabled: # ./daemon/cilium-agent [...] --node-port-bind-protection=false # bpftool cgroup tree CgroupPath ID AttachType AttachFlags Name /sys/fs/cgroup/unified 49607 connect4 49604 connect6 49608 sendmsg4 49605 sendmsg6 49609 recvmsg4 49606 recvmsg6 [...] Reported-by: Rene Zbinden <rene.zbinden@postfinance.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit d474e71 ] Add related documentation and options for helm to aid configuration for the bindProtection setting. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

…n-matching. [ upstream commit 2950ea4 ] For some clusters (e.g., gke-1.17.5), EndpointSlice is disabled although the cluster version is >= 1.17. In those cases, version-matching would turn on endpointslice controller while that api is not available, resulting in cilium agent not functioning. Signed-off-by: Weilong Cui <cuiwl@google.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit 902bac8 ] Discovery of API groups requires the API services of the apiserver to be healthy. Such API services can depend on the readiness of regular pods which require Cilium to function correctly. By treating failure to discover API groups as fatal, a critial loop can be entered in which Cilium cannot start because the API groups can't be discovered and th API groups will only become discoverable once Cilium is up. Warn about the lack of discovery ability and fall back to probing the API directly. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

[ upstream commit 6fcf0cd ] The discovery request consists of several API calls to the apiserver. It has already been observed that the k8s client can stsrt throttling during startup phase. Disable the use of the discovery API by default and rely on individual API probing but allow users to benefit from it by opting into it. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

joestringer · 2020-05-07T23:06:39Z

never-tell-me-the-odds

joestringer · 2020-05-08T18:15:17Z

Failures seem to be provisioning issues. I will rekick.

https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-K8s/3078/
https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-Validated/19602/

joestringer · 2020-05-08T18:15:31Z

restart-ginkgo

joestringer · 2020-05-08T18:15:41Z

test-missed-k8s

christarazi · 2020-05-08T22:33:43Z

Looks like the build went green 🎉

vadorovsky requested a review from a team as a code owner April 29, 2020 17:37

maintainer-s-little-helper bot added backport/1.7 kind/backports This PR provides functionality previously merged into master. labels Apr 29, 2020

tklauser approved these changes Apr 29, 2020

View reviewed changes

vadorovsky marked this pull request as draft April 29, 2020 17:43

christarazi requested changes Apr 29, 2020

View reviewed changes

vadorovsky force-pushed the pr/v1.7-backport-2020-04-29 branch 3 times, most recently from 1b1415f to 655c5bf Compare May 4, 2020 11:18

vadorovsky force-pushed the pr/v1.7-backport-2020-04-29 branch from 655c5bf to 5af5dd5 Compare May 4, 2020 12:06

vadorovsky force-pushed the pr/v1.7-backport-2020-04-29 branch from 5af5dd5 to 2208547 Compare May 4, 2020 20:32

vadorovsky force-pushed the pr/v1.7-backport-2020-04-29 branch from 2208547 to 53bfb3a Compare May 5, 2020 08:52

vadorovsky marked this pull request as ready for review May 5, 2020 08:58

vadorovsky force-pushed the pr/v1.7-backport-2020-04-29 branch from 53bfb3a to 8888078 Compare May 5, 2020 08:59

joestringer requested a review from christarazi May 5, 2020 20:23

tklauser and others added 20 commits May 7, 2020 16:03

dns: Log WriteMsgIP details.

1713672

[ upstream commit d07eec5 ] Add detail to WriteMsgIP log messages in hopes to understand why we sometimes see a lot of these logs. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

Refactor variable names to appease golint

6bb3372

[ upstream commit dbedb10 ] Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

Add touch ups to ENI docs

71cd95f

[ upstream commit 560bda1 ] Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

doc: Remove clusterrolebinding step from GKE guide

a939bd5

[ upstream commit 3c19db9 ] This step does not seem required at this point. Remove it. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

doc: Document GKE IPAM and datapath modes

4df1e73

[ upstream commit 82aac9a ] Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>

joestringer force-pushed the pr/v1.7-backport-2020-04-29 branch from aa10d31 to 7842a12 Compare May 7, 2020 23:06

joestringer mentioned this pull request May 8, 2020

v1.7 backport for #11333 #11414

Merged

joestringer merged commit 77c16fe into v1.7 May 8, 2020

joestringer deleted the pr/v1.7-backport-2020-04-29 branch May 8, 2020 23:17

tklauser mentioned this pull request May 15, 2020

Unknown IPAM backend kubernetes #11539

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v1.7 backports 2020-04-29 #11233

v1.7 backports 2020-04-29 #11233

vadorovsky commented Apr 29, 2020 •

edited

tklauser left a comment

christarazi left a comment

jrfastab commented Apr 29, 2020 •

edited

vadorovsky commented Apr 29, 2020

joestringer commented Apr 29, 2020

vadorovsky commented Apr 29, 2020

joestringer commented Apr 30, 2020

qmonnet commented May 2, 2020

vadorovsky commented May 4, 2020

vadorovsky commented May 4, 2020

vadorovsky commented May 4, 2020

vadorovsky commented May 4, 2020

vadorovsky commented May 4, 2020

vadorovsky commented May 5, 2020

vadorovsky commented May 5, 2020

joestringer commented May 5, 2020

vadorovsky commented May 5, 2020

joestringer commented May 7, 2020

joestringer commented May 8, 2020

joestringer commented May 8, 2020

joestringer commented May 8, 2020

christarazi commented May 8, 2020

v1.7 backports 2020-04-29 #11233

v1.7 backports 2020-04-29 #11233

Conversation

vadorovsky commented Apr 29, 2020 • edited

tklauser left a comment

Choose a reason for hiding this comment

christarazi left a comment

Choose a reason for hiding this comment

jrfastab commented Apr 29, 2020 • edited

vadorovsky commented Apr 29, 2020

joestringer commented Apr 29, 2020

vadorovsky commented Apr 29, 2020

joestringer commented Apr 30, 2020

qmonnet commented May 2, 2020

vadorovsky commented May 4, 2020

vadorovsky commented May 4, 2020

vadorovsky commented May 4, 2020

vadorovsky commented May 4, 2020

vadorovsky commented May 4, 2020

vadorovsky commented May 5, 2020

vadorovsky commented May 5, 2020

joestringer commented May 5, 2020

vadorovsky commented May 5, 2020

joestringer commented May 7, 2020

joestringer commented May 8, 2020

joestringer commented May 8, 2020

joestringer commented May 8, 2020

christarazi commented May 8, 2020

vadorovsky commented Apr 29, 2020 •

edited

jrfastab commented Apr 29, 2020 •

edited