v1.8 backports 2020-06-11 #12027
Conversation
[ upstream commit 749d03a ] Many tests are failing when configuring a device while BPF-based NodePort is disabled (cf. #11972). A bug in the loader causes it to call init.sh with the mode set to direct instead of e.g., vxlan. The cilium_vxlan is then not created (or even removed if it existed) which causes connectivity disruptions in multiple scenarios (mostly multi-node). This commit fixes the wrong assumption that multiple devices are only set when NodePort is enabled. That assumption might have been correct when that code was introduced (in 3a83623). Fixes: #11972 Fixes: 3a83623 ("bpf: add support for local NodePort via tunnel") Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: André Martins <andre@cilium.io>
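A minimal sketch of the mode-selection bug described above, under assumed names (`selectInitMode` and its parameters are illustrative, not Cilium's actual loader API): the buggy code treated "devices are configured" as implying NodePort and forced `direct` mode; the fix keys the decision on the tunneling configuration so `cilium_vxlan` is still created.

```go
package main

import "fmt"

// selectInitMode is a hypothetical sketch of the loader's decision.
// The bug: len(devices) > 0 was assumed to mean NodePort, forcing
// "direct" mode. The fix: honor the configured tunnel mode (e.g.
// "vxlan") even when devices are set, so the tunnel device is created.
func selectInitMode(tunnelMode string, devices []string) string {
	if tunnelMode != "disabled" {
		// Keep the tunnel mode regardless of configured devices.
		return tunnelMode
	}
	return "direct"
}

func main() {
	// A device is configured, but tunneling is vxlan: init.sh must
	// still be called with "vxlan", not "direct".
	fmt.Println(selectInitMode("vxlan", []string{"eth0"}))
}
```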
[ upstream commit 009fb34 ] The endpoint slice implementation did not account for the fact that a service can map to multiple endpoint slices. Wrongly assuming that only a single endpoint slice exists for a given service can cause Cilium to fail the translation of a service to a backend when 2 or more endpoint slices exist. In some rare occurrences where a single endpoint slice was deleted and another one was recreated for the same service, it could cause Cilium to stop doing service translation entirely. The unit test added easily replicates the issue: it fails on current master, where the 2nd endpoint slice added overwrites the 1st:
```
c.Assert(endpoints.String(), check.Equals, "2.2.2.2:8080/TCP,2.2.2.3:8080/TCP")
Expected :string = "2.2.2.2:8080/TCP,2.2.2.3:8080/TCP"
Actual   :string = "2.2.2.3:8080/TCP"
```
Fixes: c3b5ca6 ("add support for k8s endpoint slice") Signed-off-by: André Martins <andre@cilium.io>
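A hedged sketch of the shape of the fix, with illustrative names (`endpointsPerService`, `upsert`, `merged` are not Cilium's real types): endpoints are tracked per slice name under each service, so adding a second slice no longer overwrites the first, and the service's backends are the union across all of its slices.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// endpointsPerService maps service -> endpoint slice name -> backends.
// Keying by slice name is the crux: two slices of the same service
// coexist instead of one replacing the other.
type endpointsPerService map[string]map[string][]string

func (e endpointsPerService) upsert(svc, slice string, backends []string) {
	if e[svc] == nil {
		e[svc] = map[string][]string{}
	}
	e[svc][slice] = backends
}

// merged returns the union of backends across all slices of a service.
func (e endpointsPerService) merged(svc string) string {
	var all []string
	for _, backends := range e[svc] {
		all = append(all, backends...)
	}
	sort.Strings(all)
	return strings.Join(all, ",")
}

func main() {
	eps := endpointsPerService{}
	eps.upsert("default/svc", "slice-1", []string{"2.2.2.2:8080/TCP"})
	eps.upsert("default/svc", "slice-2", []string{"2.2.2.3:8080/TCP"})
	// Both slices survive: prints the merged backend set.
	fmt.Println(eps.merged("default/svc"))
}
```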
[ upstream commit 0d89f05 ] Encryption fixes: when setting the encrypt ctx->mark field, we need to skip resetting it with the source identity. It's not needed in the encryption case anyway, because we already checked that the endpoint is remote before encoding the encryption signal. Next, if the FIB lookup is not available, we discover the route at init time and encode it in the ENCRYPT_IFACE define. If this field is non-zero, we should use it. Otherwise, in some configurations where there is no egress route in the main routing table, the packet will be dropped. Fixes: 86db0fd ("cilium: encryption, use fib_lookup to rewrite dmac/smac") Fixes: f25d8b9 ("bpf: Preserve source identity for hairpin via stack") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit b41bbc0 ] We observed in the K8sWatcher for "ciliumendpoints" that the call to ConvertToCiliumEndpointAddFunc was taking an endpoint event with a valid Encryption field and converting it to '0'. To fix this, we make the translation more explicit. Fixes: 720c0b0 ("pkg/k8s: do not DeepCopy when converting to CiliumEndpoint") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 57deb48 ] This option check is unnecessary in the hostns case, because either bpf_sock or kube-proxy will always handle the translation of VIPs to backends that the host firewall requires. Fixes: f74087e ("daemon: Introduce enable-host-firewall option") Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit f8da3b9 ] Make 0532f88 ("bpf: avoid using fib lookup in nodeport for dsr v4/v6 hairpin case") generic such that also the tc BPF case can benefit from it when only a single external device was specified or detected on the node. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 4673688 ] - Update instructions to enable Hubble. - Update minikube and kind guides to use experimental-install.yaml to deploy Cilium instead of quick-install.yaml so that Hubble is enabled. - Delete outdated {hubble-install,getting-started-next-steps}.rst. - Set SKIP_LINT for html-netlify target. There is another status check that performs linting, and this lets us ignore inconsistent spell results for now. Fixes #11151 Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> Signed-off-by: André Martins <andre@cilium.io>
…l7 test. [ upstream commit 9d263a2 ] Add monitoring to gain more insight. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 49dca5c ] Signed-off-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 566de37 ] The ineffassign dependency was removed from the VM image in cilium/packer-ci-build#211, in preparation for the switch to golangci-lint implemented by #11528. However, (1) the VM image update was merged before the golangci-lint PR was, and (2) the golangci-lint PR won't be backported to v1.8, whereas the VM image update has already been. We therefore need to reinstall the ineffassign dependency in the dev. VM until #11528 is merged and in order to backport this fix to v1.8. Fixes: cilium/packer-ci-build#211 Fixes: #11917 Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 121cd37 ] The `prometheus.io/{scrape,port}` annotations are used to tell Prometheus to automatically scrape the Hubble metrics on the given port. Unfortunately, we cannot add this annotation directly on the cilium-agent pod, as it is already used for the cilium-agent metrics and the Prometheus annotation only supports a single port. Therefore, this commit annotates the `hubble-metrics` service instead and changes the Helm chart to create the `hubble-metrics` service whenever Hubble metrics are enabled. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit b910da8 ] This commit slightly restructures the "Monitoring & Metrics" section in the documentation and adds the reference for the available Hubble metrics. The document is restructured to explain how to enable the Cilium/ Cilium Operator metrics in Helm separately from the Hubble metrics provided by embedded Hubble. The rationale for this split is that Cilium metrics are used to monitor Cilium itself, while Hubble metrics are used to monitor network traffic of applications in the cluster. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit c16e111 ] Wait for each POD incrementing their policy revision also after deleting a policy, as otherwise there is a race between policy deletion and the following test traffic that expects policy deletion to have taken effect already. This fixes CI flakes following policy deletes like: 08:35:50 STEP: Deleting Egress deny all clusterwide policy 08:35:51 STEP: Testing ingress connectivity from "app2" to "app1" in "2020060908341k8spolicytestclusterwidepoliciestestclusterwidecon" namespace FAIL: Ingress connectivity should be allowed for service in 2020060908341k8spolicytestclusterwidepoliciestestclusterwidecon namespace Expected command: kubectl exec -n 2020060908341k8spolicytestclusterwidepoliciestestclusterwidecon app2-88b7f8c4b-cvg64 -- curl --path-as-is -s -D /dev/stderr --fail --connect-timeout 5 --max-time 20 http://10.110.187.138/public -w "time-> DNS: '%{time_namelookup}(%{remote_ip})', Connect: '%{time_connect}',Transfer '%{time_starttransfer}', total '%{time_total}'" To succeed, but it failed: Exitcode: 28 Stdout: time-> DNS: '0.000013()', Connect: '0.000000',Transfer '0.000000', total '5.000688' Stderr: command terminated with exit code 28 Signed-off-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 6530325 ] Update proxy image to build with Envoy 1.13.2. This fixes CVE-2020-11080 by rejecting HTTP/2 SETTINGS frames with too many parameters. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit c9ea0e9 ] Let's avoid pointless runs of the sockops tests where the feature is disabled due to kernel limitations. This should save us a couple of runs in CI. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 941e328 ] This makes sure that cancelation signals or timeouts are propagated properly. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: André Martins <andre@cilium.io>
…flag [ upstream commit f6d62b5 ] The flag no longer has any effect and is scheduled for removal in Cilium 1.8. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit a961b04 ] Support instances that are not part of a Virtual Machine Scale Set. Cilium's only assumption about machines running in a VMSS was limited to interface discovery. Also listing standard (non-VMSS) network interfaces from the same Azure ResourceGroup is all we need for those instances to work properly with Cilium. Signed-off-by: Benjamin Pineau <benjamin.pineau@datadoghq.com> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 742c318 ] Standalone (non-VMSS) instances are slightly different from VMSS instances in that regard: the interfaces are attached to the instances, rather than being part of the instance definition. Signed-off-by: Benjamin Pineau <benjamin.pineau@datadoghq.com> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 20c092c ] Otherwise values set during helm install get wiped out. Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 2ed8851 ] Include Envoy-supplied error detail in NACK warning logs to facilitate debugging. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 0de65a1 ] We are occasionally seeing log archives without most of the logs which is caused by this timeout being hit. Let's increase it and see if this helps. Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 7fdedb1 ] This change adds two commands which are meant to gather Cilium logs in case more granular methods fail. Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 2cd23fe ] Pull the latest version which includes a new field LinkIndex in Addr struct [1]. This is going to be used for retrieving link by ip addr. [1]: vishvananda/netlink#543 Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 298cb13 ] The param specifies to which devices bpf_host.o should be attached. Currently, it's used by BPF NodePort, host-fw and BPF masquerading. Also, mark --device as deprecated. If a user specifies both, cilium-agent will log a fatal message. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit bd31f7d ] Previously, the constraint made it impossible to have BPF NodePort exposed via multiple devices, as a request to a NodePort service which was received by a non-XDP device (handled by TC) was dropped due to the constraint. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 1fbaf82 ] Attach XDP NodePort to the device which is used for direct routing among nodes. In a multi-device setup, if we attach to a device other than the direct routing device, a request to a NodePort svc whose real destination is a remote backend won't be redirected to another device (only possible via hairpinning). So, we need to attach to a device which can forward to another node, i.e. the direct routing device. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit dcbce7e ] Otherwise, we would need to run the daemon unit test suite as privileged due to the bpftool dependency for the feature detection in initKubeProxyReplacementOptions(). Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 0b5891a ] Previously, when running with "--k8s-require-ipv{4,6}-pod-cidr=false", retrieveNodeInformation() might have failed to retrieve its own (Cilium)Node object on a busy cluster. To avoid this, return an error if the BPF NodePort device auto-detection might happen. This makes the retry mechanism not give up on waiting for the object. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: André Martins <andre@cilium.io>
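A minimal sketch of the control-flow change described above, with illustrative names (`retrieveNodeInformation`'s signature here is a stand-in, not the daemon's real one): when the node object is not yet available but device auto-detection may still need it, returning an error keeps the surrounding retry loop waiting instead of silently succeeding with no node information.

```go
package main

import (
	"errors"
	"fmt"
)

// retrieveNodeInformation sketches the fix: a missing node object is an
// error whenever NodePort device auto-detection might depend on it,
// so the caller's retry mechanism keeps waiting for the object.
func retrieveNodeInformation(node *string, mayAutoDetectDevices bool) (string, error) {
	if node == nil {
		if mayAutoDetectDevices {
			// Returning an error keeps the retry loop going.
			return "", errors.New("node not yet available, retrying")
		}
		// Previously this path succeeded silently, even on a busy
		// cluster where the object simply hadn't arrived yet.
		return "", nil
	}
	return *node, nil
}

func main() {
	_, err := retrieveNodeInformation(nil, true)
	fmt.Println(err != nil) // caller will retry
}
```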
[ upstream commit 6beb6b3 ] Disable the kube-proxy replacement on the CI 4.19 job until #11915 has been merged, but keep bpf_sock to avoid bpf_lxc complexity issues. Also, get rid of the non-working kernel version setting which complicates passing the KERNEL env var to the ginkgo test runner. Finally, disable the ipsec + vxlan test on 4.19, as it is broken with the contemporary 4.19 setup (TODO: investigate). Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit b2c73e7 ] When Reinitialize()-ing the datapath, transient iptables rules are set up to avoid dropping packets while Cilium's rules are not in place. On rare occasions, a failure to add those rules has been observed (see issue #11276), leading to an early exit from Reinitialize() and a failure to set up the daemon. But those transient rules are just there to lend a hand and keep packets flowing for a very small window of time: it does not actually matter much if we fail to install them, and it should not stop the reinitialization of the daemon. Let's simply log a warning and carry on if we fail to add those rules. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 66aa41d ] We do the same thing for IPv4 and IPv6, with just the name of the program changing. Then we do nearly the same thing for flushing and deleting a chain. Let's refactor (no functional change). Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 052fa31 ] When Cilium reinitialises its daemon, transient iptables rules are set up to avoid dropping packets while the regular rules are not in place. On rare occasions, setting up those transient rules has been found to fail, for an unknown reason (see issue #11276). The error message states that the "Chain already exists", even though we try to flush and remove any leftover from previous transient rules before adding the new ones. It sounds likely that removing the leftovers is failing, but we were not able to understand why, because we quieten the function to avoid spurious warnings the first time we try to remove them (when none exists yet). It would be helpful to get more information to understand what happens on those rare occasions where setting up transient rules fails. Let's find a way to get more logs, without making too much noise. We cannot warn unconditionally in remove() since we want removal in the normal case to remain quiet. What we can do is log when the "quiet" flag is not passed, _or_ when the error is different from the chain being not found, i.e. when the error is different from the one we want to silence on start up. This means matching on the error message returned by ip(6)tables. It looks fragile, but this message has not changed since 2009, so it should be relatively stable and pretty much the same on all supported systems. Since remove() is used for chains other than transient rules too, we also match on the chain name to make sure we are dealing with transient rules before ignoring the "quiet" flag. This additional logging can be removed once we reproduce and fix the issue. Alternative approaches could be:
- Uncoupling the remove() function for transient rules from the one for regular rules, to avoid matching on the chain name (but that sounds worse).
- Logging on failure for all rules even when the "quiet" flag is passed, but at "info" level instead of "warning". This would still require a modified version of runProg(), along with a modified version of CombinedOutput() in package "exec". Here I chose to limit the number of logs and keep the changes local.
- Listing the chain before trying to remove it, so we only try to remove it if it exists, but this would likely add unnecessary complexity and latency.
Should help with (but does not solve): #11276 Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: André Martins <andre@cilium.io>
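The logging rule the commit message describes can be sketched as a small predicate. This is a hedged illustration, not the actual Cilium code: the chain name and the "No chain" error substring are assumptions modeled on the description above (the real message comes from ip(6)tables).

```go
package main

import (
	"fmt"
	"strings"
)

// transientChain is an illustrative stand-in for the exported
// transient-rules chain name the commit mentions.
const transientChain = "CILIUM_TRANSIENT_FORWARD"

// shouldWarn implements the rule: warn when the "quiet" flag is not
// passed, or, for the transient-rules chain only, when the error is
// anything other than the expected "chain not found" from ip(6)tables.
func shouldWarn(quiet bool, chain, iptablesErr string) bool {
	if !quiet {
		return true
	}
	if chain == transientChain && !strings.Contains(iptablesErr, "No chain") {
		// Unexpected failure while removing transient-rule leftovers:
		// log it even though the caller asked for quiet.
		return true
	}
	return false
}

func main() {
	// Expected startup error: stays silent.
	fmt.Println(shouldWarn(true, transientChain, "No chain/target/match by that name"))
	// Unexpected error on the transient chain: warn despite "quiet".
	fmt.Println(shouldWarn(true, transientChain, "Resource temporarily unavailable"))
}
```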
[ upstream commit e78405a ] In an attempt to catch #11276 in CI, let's add any message related to a failure to flush or delete the chain for iptables transient rules to the list of badLogMessages we want to catch. We need to filter on the name of the chain for transient rules to avoid false positives, which requires exporting that name. We also need to modify the error log message, to avoid adding four distinct logs to the list (the combinations of iptables/ip6tables and flush/delete). Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: André Martins <andre@cilium.io>
👍 for my changes!
LGTM for my changes
This will fix the hubble-relay version issue:
[ upstream commit ea52a9e ] In particular, add the new hubble-relay values file to the list of files that we fix these values up for in the make target. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: André Martins <andre@cilium.io>
test-backport-1.8
All failures are because of the hubble-relay image pull issue which is likely because of a lack of
EDIT: Hmm, not that simple; it appears to be already tagged in dockerhub. Maybe there's another lurking issue in the image generation. Trying it out locally with a GKE cluster to see what kind of infrastructure issue it might be.
Counter-intuitively, in the CI framework we build images from the current tree and tag them as `latest`, even if the current tree is, for instance, `v1.8` (with version `v1.8.0-rc2`). At the same time, the Helm charts in the tree specify that they will pull the current tag from `VERSION`, for instance right now `v1.8.0-rc2`. For the `cilium` and `operator` images, we *override* the Helm chart defaults with `latest` to pick up the built/cached images from the first step above. However, we weren't doing this for Hubble Relay, which meant that it tried to pull the image with the tag `v1.8.0-rc2`, but this image didn't exist in the local registry. To fix this and get the v1.8 tree to properly build with a version of hubble-relay from the v1.8 branch, we must tag, cache and deploy images that represent the latest v1.8 development as `latest`, and pull that image from the CI framework. :exploding_head: Signed-off-by: Joe Stringer <joe@cilium.io>
Given that all tests are currently passing (except K8sHubble), I will push the fix for the hubble issue and focus-test only the K8sHubble tests. If that + smoke test pass, we should be good to go.
test-focus K8sHubbleTest
Hit known flake #11955 in Travis, ignoring. |
Once this PR is merged, you can update the PR labels via:
@genbit this PR didn't apply cleanly; does the 1.8 branch already contain such changes?