-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.7 backports 2020-04-29 #11233
v1.7 backports 2020-04-29 #11233
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for my change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible it would be best to merge #11109 first to get things in order. Also PRs are going to conflict so this will need some reworks after above is merged. |
@christarazi Yes, you are right. It didn't apply properly, so I thought it can be dropped, since it only changes variable names. But now I see that those variables are there on 1.7 branch and I could just solve the conflict. Sorry for that, I will reapply. |
Thanks for following up on these backports :) FWIW I'd like to hold off on further 1.7 backports until we get #11109 merged and v1.7.3 out, users Hopefully we should be able to get v1.7.3 out today or tomorrow and proceed with the rest of these backports from there. |
Sure, let's wait for #11109. This pull request is huge anyway and I think that there will be a lot to fix in ginkgo tests. And I will regenerate modules and cmdref before the next push. |
By the way I meant to post an update, but the above PR was merged and we put out v1.7.3. No need to block on anything for this PR now. |
For what it's worth the backport for the fix I did for the eppolicymap looks good to me 👍 |
1b1415f
to
655c5bf
Compare
@christarazi All your commits from your PR should be there. Sorry again! |
655c5bf
to
5af5dd5
Compare
never-tell-me-the-odds |
Doc test fail because of the dependency mentioned here #11233 (comment) Now I'm sure that we have to backport #10407 too, at least partially. |
5af5dd5
to
2208547
Compare
never-tell-me-the-odds |
2208547
to
53bfb3a
Compare
never-tell-me-the-odds |
53bfb3a
to
8888078
Compare
never-tell-me-the-odds |
All of the tests except for the known failing ones (See also https://github.com/cilium/cilium/projects/85) are passing. @mrostecki are you looking for any further feedback on any of the backports? |
@joestringer I would appreciate a feedback from @tgraf and @christarazi. |
[ upstream commit c9f94e7 ] Configure k8s as soon as possible so that k8s.IsEnabled() has the right behavior from the start of the Cilium agent initialization. Fixes: 604dab4 ("pkg/k8s: detect k8s mode if env variable K8S_NODE_NAME is set") Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 1cb4020 ] Hubble is undergoing some big architectural changes with Cilium 1.8 which will also impact the way it is deployed. Notably, Hubble v0.5 will be the last version of Hubble to support Cilium 1.7. Therefore, the documentation in Cilium 1.7 needs to be updated to point to the v0.5 branch, as the master branch of `github.com/cilium/hubble` will not contain the Hubble stand-alone server Helm charts (required to run Hubble with Cilium 1.7) going forward. The Hubble documentation will need a significant overhaul for the Cilium 1.8 release. However, this is going to be done at a later date and is out of scope for this PR. This PR is soley intended to ensure the Cilium 1.7 documentation will stay functional going forward. Hubble v0.5 is compatible with Cilium 1.8, therefore the `latest` docs are remaining functional as well. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 5e40659 ] In eppolicymap.CreateWithName(), if the policy map creation in the buildMap.Do() block fails for whatever reason, we return early and EpPolicyMap remains nil. Calling OpenOrCreate() a few lines below attempts to dereference that pointer to take its lock. Ouch. EpPolicyMap used to be created unconditionally, even if the policy map creation would fail. This was changed in commit b384ce8. We could restore this behaviour, or we can simply exit early if inner map creation fails, by producing a fatal error message. Let's do the latter, as there is no point in proceeding with the creation of endpoint policy map if creating the inner map failed. Fixes: b384ce8 ("eppolicymap: Factor out for better testing") Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit d07eec5 ] Add detail to WriteMsgIP log messages in hopes to understand why we sometimes see a lot of these logs. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit dbedb10 ] Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 560bda1 ] Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 26308b6 ] In ENI mode, rules and routes are created for each endpoint (and therefore pod) in order to direct traffic to the appropriate ENI device. This logic was only running in the CNI plugin. Since cilium-health is not a pod, its rules and routes needed for traffic to flow were missing. This commit implements creating the necessary rules and routes for cilium-health in ENI mode. This commit also unifies the installation logic of the rules and routes in `pkg/aws/eni/routing`, allowing the CNI plugin and the cilium-health creation code to share a common implementation. Fixes #10810 Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit ced4ad0 ] This documents an IPAM mode which was already supported and widely used but difficult to discover. It requires the PodCIDR for all enabled address familied to be provided via the k8s Node resource and then uses the standard hostscope allocator. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 787c276 ] The PodCIDR is always populated on GKE. Switch to native routing mode while allocating IPs from Node.Spec.PodCIDR[s] As we transition to native-routing mode: * We can enable endpoint routes to avoid traversing the cilium-host interface * We can disable the local node route as it is not required * We can disable local routing blacklisting as it is known that no local routes will conflict with the PodCIDR Hide all of this behind a new option global.gke.enabled Deriving the cluster CIDR as part of the guide also allows to disable masquerading in the entire cluster. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 5cf8383 ] The GKE reference guide will switch over to using direct-routing by default. Cover both modes in the GKE CI. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit db3838b ] Instead of instructing to only restart unmanaged pods in kube-system, extend this to restart all unmanaged pods which are not running in host networking mode. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 3c19db9 ] This step does not seem required at this point. Remove it. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 82aac9a ] Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit c0ef36f ] When running native-routing mode, the source IP from host traffic is unlikely the IP assigned to the cilium-host interface. In the remote-node identity legacy mode, only the IP of the cilium-host interface is considered host. All other IPs are considered world. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
…tion [ upstream commit 77f580d ] Add an agent knob for expert users who wish to run backend Pods on the same ports as the NodePort service is exposed. We recently resolved this by making cgroup BPF hooks netns aware in the kernel and Cilium via work from d7bf451 ("bpf: fix init namespace handling in sock lb programs"). This works for 5.7 kernels. However, if there is a need for older kernels to deploy Pods in this way (or to opt out of the bind hooks entirely for different reasons), then one can control this setting through --node-port-bind-protection. The knob defaults to true (same as we do today), so that expert users can opt-out from it if they need to. With bind protection enabled: # ./daemon/cilium-agent [...] # bpftool cgroup tree CgroupPath ID AttachType AttachFlags Name /sys/fs/cgroup/unified 48126 connect4 48122 connect6 48127 post_bind4 <---+ bind prevention enabled by default 48123 post_bind6 <---` 48128 sendmsg4 48124 sendmsg6 48129 recvmsg4 48125 recvmsg6 [...] With bind protection disabled: # ./daemon/cilium-agent [...] --node-port-bind-protection=false # bpftool cgroup tree CgroupPath ID AttachType AttachFlags Name /sys/fs/cgroup/unified 49607 connect4 49604 connect6 49608 sendmsg4 49605 sendmsg6 49609 recvmsg4 49606 recvmsg6 [...] Reported-by: Rene Zbinden <rene.zbinden@postfinance.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit d474e71 ] Add related documentation and options for helm to aid configuration for the bindProtection setting. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
…n-matching. [ upstream commit 2950ea4 ] For some clusters (e.g., gke-1.17.5), EndpointSlice is disabled although the cluster version is >= 1.17. In those cases, version-matching would turn on endpointslice controller while that api is not available, resulting in cilium agent not functioning. Signed-off-by: Weilong Cui <cuiwl@google.com> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 902bac8 ] Discovery of API groups requires the API services of the apiserver to be healthy. Such API services can depend on the readiness of regular pods which require Cilium to function correctly. By treating failure to discover API groups as fatal, a critial loop can be entered in which Cilium cannot start because the API groups can't be discovered and th API groups will only become discoverable once Cilium is up. Warn about the lack of discovery ability and fall back to probing the API directly. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
[ upstream commit 6fcf0cd ] The discovery request consists of several API calls to the apiserver. It has already been observed that the k8s client can stsrt throttling during startup phase. Disable the use of the discovery API by default and rely on individual API probing but allow users to benefit from it by opting into it. Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
aa10d31
to
7842a12
Compare
never-tell-me-the-odds |
Failures seem to be provisioning issues. I will rekick. https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-K8s/3078/ |
restart-ginkgo |
test-missed-k8s |
Looks like the build went green 🎉 |
Once this PR is merged, you can update the PR labels via: