New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.8 backport 2020-06-16 #12103
v1.8 backport 2020-06-16 #12103
Conversation
[ upstream commit b330436 ] Restart command in documentation pipes complete output of the awk into xargs which make creates invalid kubectl command. This commit updates xargs to use per line mode and also makes it omit empty input to not execute empty "kubectl delete pod" on the last line. Signed-off-by: Arthur Evstifeev <aevstifeev@gitlab.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 4764e6d ] Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit f3b474a ] Since recently, allocations results returned without CIDRs became fatal for CNI (CNI cmdAdd() would fail due to interfaceAdd() failing, due to RoutingInfo.parse() erroring out on `len(cidrs) == 0`). That (now mandatory) `IPv4CIDRs` is used when masquerading is enabled, to source-route traffic from the pods to its attached VPC CIDRs through a dedicated local table (that I assume is exempt from masquerade). But on Azure we don't collect and dont't propagate subnets CIDRs, and that bit of information is missing from ciliumnodes azure crd/ipam. Adding masquerade support to Azure CNI (therefore, collecting and bubbling down per-address CIDRs) would be a nice feature to have in the longer term. But we might be late in the feature freeze to implement that on time to unbreak Azure CNI before 1.8 release; and there's no reason to insist on having CIDRs when masquerade is disabled anyway. Signed-off-by: Benjamin Pineau <benjamin.pineau@datadoghq.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit da09664 ] Add info to make people aware that they don't need CNI chaning anymore. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 831c5ee ] Clarify that this is only needed when kube-proxy replacement is disabled. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 5637e0e ] PR #11991 added additional BPF maps to be influenced by the bpf-map-dynamic-size-ratio flag. Adjust the upgrade guide to document these changes, as well as the newly added neighMax option. Also correct the ratio given in the architecture section, so it corresponds to 0.25%, not 2.5%. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 325b06a ] Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit bdb6247 ] Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 17b597d ] Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit b2762cc ] Such device is denoted with "(DR)", e.g.: KubeProxyReplacement: Strict [eth0 (DR), eth1] [NodePort (SNAT, 30000-32767, XDP: DISABLED), HostPort, ExternalIPs, HostReachableServices (TCP, UDP), SessionAffinity] Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit fe4e456 ] Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit f1f4e94 ] Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 1d190eb ] v1.5 is going out of style with v1.8, drop the upgrade notes. Users can refer to the v1.7 upgrade notes if they need to upgrade from v1.5. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit c216dfb ] Update the example output of the commands to the way they look wiht Cilium 1.8.0-rc3. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit bed67f6 ] Just for DRY. Also, make the log messages in initKubeProxyReplacementOptions() more consistent and less ambiguous. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 0e3f827 ] Move all the checks which do not depend on option.Config.{DirectRoutingDevice,Devices} above the device auto detection. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 11650b3 ] We no longer consider that the first device among option.Config.Devices is going to be used for direct routing. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 5c2c8b1 ] The second helper function is supposed to call the device detection, and the third to finish the initialization. It's required, because the host-fw also requires the auto detection, and extending the former helper to consider host-fw is a bit brittle, as any premature return would prevent from the auto-detection for the host-fw. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit e4bd54c ] Enable the device auto-detection when BPF NodePort is disabled, but host-fw is enabled. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit d876cdf ] Previously, when falling back to iptables-based masquerading, if a user had specified --egress-masquerade-interface, the agent considered it as part of BPF masquerading options. This made it obviously to fail. Fix: b7b962b ("daemon: Fallback to iptables if BPF masq cannot be enabled") Reported-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit f5b7027 ] To avoid breaking existing guides, fallback to iptables-based masquerading if --egress-masquerade-interface is set. In v1.9, we plan to add a support for the flag which will make the transition from BPF to iptables -based masquerading transparent when the flag is set. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 4530c0e ] These are shell sessions for highlighting purposes, not bash scripts. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 6dfa5f1 ] These instructions will *not* import on bird 1.x. Make this more clear. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 26a4b29 ] The previous instructions hinted at a successful export, but only proved that bird parsed the config correctly. Validate the peering & export via the bird commandline. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 00ef71b ] This pulls in cilium/ebpf#81 which fixes a crash when trying to initialize BPF per ring buffers for offline CPUs: level=fatal msg="Cannot initialise BPF perf ring buffer sockets" error="failed to create perf ring for CPU 2: can't create perf event: can't create perf event: no such device" startTime="2020-06-15 12:15:09.153912253 +0000 UTC m=+129.850487215" subsys=monitor-agent Reported-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit d1f2b48 ] This updates the script to allow the user to specify which namespace Cilium is deployed in. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 348686d ] We have --enable-{ipv4,ipv6}={true,false} setting. The --disable-ipv4 is legacy and hidden for some time already. Schedule for removal. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit d03996c ] This fixes a bug where during a Cilium upgrade test full connectivity is lost. The major reason why it is happening is basically the test for ... if strings.Contains(bpfFilter.Name, "bpf_netdev.o") [...] ... which is buggy due to multi-dev support since we now end up with something like bpf_netdev_eno1.o instead of bpf_netdev.o . This means that tc BPF progs never get removed from the devices, yet we're tearing down everything else and once we remove all files from the BPF fs then in case of tail call maps, the kernel will purge all BPF programs inside tail call maps due to circular dependency avoidance. However, this means that those still-attached programs go into a drop-all mode due to missed tail calls. Fix this in four ways: i) Match on "bpf_netdev" prefix instead so we catch old but also new names, ii) Reorder BPF fs cleanup with tc BPF prog removal, meaning, we first need to remove all progs from the devices and only then clean up BPF fs, iii) When calling with --bpf-state only, then also remove progs from devices, and iv) Last but not least, add also other tc prog names. Step ii) to iv) is necessary since there can still be a small race window where we missed tail calls right before the device BPF prog removal. 'K8sUpdates Tests upgrade and downgrade from a Cilium stable image to master' test case is then fixed. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 83b43e9 ] In XDP, we can only xmit via XDP_TX, so a multi-device setup would otherwise be broken. Tell the user to specify the device manually in case there were multiple devices detected. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 90b8062 ] It is otherwise just confusing to users to figure out whether they need to react to this or not. Only really warn if we cannot auto-derive it from the k8s Node IP since this case would be unexpected. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 71e994c ] Add few notes on how to deal with nodes that have multiple devices which are auto-detected. Also make it clear as a limitation. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 8b73d6c ] Add a small paragraph for helping users to verify whether enabled or not. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 087345c ] Make users aware that they might need to tweak their map sizes and link to the guide which has more details on it. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit dfa8bb8 ] Commit b2c9b07 ("bpf, sock: reduce xlation for externalTrafficPolicy=Local to host_id") skips service translation for remote externalTrafficPolicy=Local services (aka not being HOST_ID ipcache scope). However, later on, we relaxed bpf_lxc to compile legacy service xlation back in for services with externalIPs via 6426c86 ("bpf, external ip: fix service xlation for containers"), which would then go and try to xlate such services locally. Issue is that here we don't differentiate between the different service types so we would go and attempt to xlate. Given we don't populate service maps with remote backends that traffic gets blackholed on bpf_lxc. Instead, only xlate externalIP services at that layer and skip the rest, so we let externalTrafficPolicy=Local dsts pass through and be xlated at the remote LB. We /already/ have this kind of behavior when we have netns cookies since the bpf_lxc xlation is compiled out entirely there, but not for older kernels, thus, fix it to make both consistent. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 64dddf7 ] On Linux 4.19 (and any kernel with 4096 BPF instructions limit), when running with kube-proxy-replacement=strict, IPv4, IPv6, fragment support (and in debug mode), Cilium fails to load bpf_overlay and bpf_host with: Prog section '2/19' rejected: Argument list too long (7) - Type: 3 - Attach Type: 0 - Instructions: 4203 (107 over limit) - License: GPL Error filling program arrays! 2/19 is section CILIUM_CALL_NODEPORT_REVNAT. We can reduce the number of instructions in that section by breaking the IPv4 and IPv6 cases with tail calls. Instead of introducing a new tail call on each path, we can turn CILIUM_CALL_NODEPORT_REVNAT into CILIUM_CALL_IPV{4,6}_NODEPORT_REVNAT. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit c862a71 ] Given v4/v6 has been split off into different tail calls, it enables us to significantly bump the collision retries on older kernels. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 874bb0e ] ... as otherwise it won't work on multi-dev node. Revert back to auto-detection after XDP tests have run. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit bc5e161 ] Re-enable it such that we can run CI with kube-proxy-free BPF code alongside kube-proxy. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit cb65fca ] Prior tests have been run with the latest Cilium image. Using the old one for the initial cleanup is broken since it will not clean-up everything. For example the 1.7 cleanup method will try to remove tc attached filters with the name bpf_netdev.o, but for 1.8/latest they are names bpf_netdev_<dev>.o which will then break connectivity after flushing all files from BPF fs as tail call maps are purged which gets us into drop-all due to missed tail calls. Run new (and old) to make sure there are no quirks from old version left behind that the new one didn't catch. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 0fa8ac7 ] ... and instead rely on kube-proxy's strict replacement. Also disable tests that are not working on 4.19 due to missing kernel features on NodePort side. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 67281df ] This adds a note about setting the `--node-ip` correctly in kube-proxy free deployments with kubeadm. Without `--node-ip` set correctly, Cilium's multi-device detection will not pick up all needed interfaces. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 0c4037c ] Use parsed-literal, so SCM_WEB gets expanded properly in the kubectl apply command. Also add a missing AGE column value in the `kubectl get pods` output and remove all trailing whitespaces. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ upstream commit 015ffbd ] Remove ginkgo test source dependency on pkg/iptables, as that only compiles on Linux. This allows running ginkgo from OSX against GKE, for example. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
test-backport-1.8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for my changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#11978 LGTM 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for my changes
retest-4.19 |
v1.8 backports 2020-06-16 (no conflicts, had to take in #11929 to get #11981):
--node-ip
kubelet option #12095 -- docs: Add note about--node-ip
kubelet option (@gandro)Once this PR is merged, you can update the PR labels via: