-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.10 backports 2021-05-12 #16103
v1.10 backports 2021-05-12 #16103
Conversation
Two PRs didn't get their commits correctly picked up by the scripts. I manually found them in the log and grabbed the shas. I'm not sure what confused the scripts. Is this common?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My commits look good, but it seems like all the commits in the PR are not signed off by you, but rather the author references vagrant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@glibsm my commit (216ddbb) seems to have been picked up correctly as 4c2874b. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My commits look good 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for my commit (apart from Signed-off-by
already pointed out by Chris)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OpenShift GSG changes LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My commit is OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except the sign-off as pointed out by others
Thanks for pointing it out, that's really bizarre since I didn't even shut down this vagrant since last backport. Strange. I'll force push a fix |
[ upstream commit b99403f ] Provide minor improvements of the text or its formatting at various places of the documentation. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit d35c94a ] The host firewall is now supported when running with endpoint routes enabled. We can remove the note from the getting started guide. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit cb4c6ad ] Detail why we won't loop when jumping from bpf_lxc to bpf_host and back for ingress to the pods with host firewall enabled. Suggested-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit 9ab5c8e ] When endpoint routes are enabled, the host firewall requires the remote-node identity to be enabled as well to function properly. The host identity is used to detect traffic that should be sent through bpf_host on egress/ingress of bpf_lxc. If the remote-node identity is disabled, the host identity then also matches remote-node IPs. Reported-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit bf48876 ] HOST_EP_ID was defined as a static data variable in a695f53 ("Endpoint for host") because the ID of the host endpoint wasn't known at the time we generated the datapath header files. Commit 81f0626 ("datapath: Include the host endpoint ID in all endpoint headers") changed this by making the host endpoint ID available globally to the agent via the node package and by setting the ID sooner on startup. We can therefore define the HOST_EP_ID as a normal constant now, without needing static data. That in turns allows us to revert commit 04cbfa1 ("bpf: Lift constraint for inline asm of static tail calls"). We however still use a template-specific value in bpf_host for the BPF program injected in the policy map (see TEMPLATE_HOST_EP_ID). We could instead directly use HOST_EP_ID for the program's name, but we would then have to ignore that symbol when replacing symbols in the ELF [1]. To ignore that symbol we would also have to ignore similar symbols corresponding to pod endpoints (i.e., all "1/xxxx" symbols), opening the door to silent errors. It's probably not worth it since there's no benefit to using HOST_EP_ID for the program's name directly. 1 - https://github.com/cilium/cilium/blob/v1.10.0-rc0/pkg/datapath/loader/cache.go#L39 Suggested-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit 421314d ] * Correct some minor errors in scripts. * Update openshift-install output for latest stable version. * Add instructions on how to delete the cluster. * Minor grammatical tweaks. * Minor formatting tweaks. Signed-off-by: Tom Payne <tom@isovalent.com> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit 6ae08fd ] For more optimal packet sizes in large MTU setups, set the cilium_wg0 MTU based on the detected MTU minus overhead. Additionally set the route MTU on container side to detected MTU minus wireguard overhead when wireguard is enabled. Tested manually with "--mtu 1400", causing cilium_wg0 and route MTU to be set to 1320. Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit e147250 ] go.mod is currently using a revoked wireguard version. To fix this we will update to the latest wireguard commit. More info: WireGuard/wgctrl-go#102 Reported-by: Vlad Ungureanu <vladu@palantir.com> Suggested-by: Nicolas Busseneau <nicolas@isovalent.com> Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit 2e0603b ] To make it easier debug or share configurations across users and developers, Cilium will store its viper.Config as well as agent's configuration in files under `/var/run/cilium`. It will store up to the last 3 previous configurations. Another use case for it will be to check the previous configuration run by Cilium in case certain steps need to be executed, for example, to clean up state left from a previous run that will not be cleaned up by a new set of flags. Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit 4eafc09 ] Following the vote and topic raised on 2021-05-05 SIG-Datapath, since the primary community meeting is moved to Wednesday, the SIG-Datapath meeting is moved to Thursday. Q: When would you prefer for SIG-Datapath to meet? Votes: Tuesday 8am PT, 5pm CET 3 Tuesday 9am PT, 6pm CET 3 Thursday 8am PT, 5pm CET 6 Thursday 9am PT, 6pm CET 2 Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit f7bd191 ] This pulls in vishvananda/netlink#637 and vishvananda/netlink#648 with data race fixes. Fixes cilium#15438 Signed-off-by: Tobias Klauser <tobias@cilium.io> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit 9fa9672 ] ValidateEndpointsAreCorrect ensures that all Docker containers have a corresponding Cilium endpoints. The information necessary to match containers with Cilium endpoints is however only available once the endpoints are ready (have received labels, identity, and other information). We therefore need to wait for all endpoints to be ready before running ValidateEndpointsAreCorrect. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit f578ba5 ] PR cilium#15125 broke race detector pipelines due to the `RACE` environment variable not being present, when it is required for `SkipRaceDetectorEnabled` to work properly. A quick inspection leads me to believe `LOCKDEBUG` also is required for package `lock`. I am less convinced about `BASE_IMAGE` being required, but I'm adding it back anyway in a sort of spiritual revert until we figure it out. Fixes cilium#15214: the Kafka tests were actually already disabled for race detector pipelines, as per cilium#13757. They were only running the pipelines due to cilium#15125 re-enabling them incidentally. Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit 437e2bb ] This is meant as a temporary workaround for cilium#16007. Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit 3b350ce ] Improve the AWS VPC CNI plugin chaining page. Also, make use of the Cilium CLI to check for the status of the installation and for performing the connectivity test. Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit 1208496 ] This is also mirror the Wireguard documentation. Signed-off-by: Alexandre Perrin <alex@kaworu.ch> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit 6f695aa ] Signed-off-by: Alexandre Perrin <alex@kaworu.ch> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit 12dcef5 ] Signed-off-by: Alexandre Perrin <alex@kaworu.ch> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit bbec696 ] Previously, the tests would leave state on the host machine. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit ebd3833 ] This will be used in the subsequent commit, when creating a XFRM policy specifically to allow to-proxy traffic. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Glib Smaga <code@gsmaga.com>
[ upstream commit a9f18f3 ] This is needed to fix the L7 ingress policy case. In tunneling mode when a packet is received on the destination node, it makes two passes through the stack. The first pass decrypts the packet because it matches the XFRM IN policy with mark 0xd00, indicating it needs decryption. The second pass through, since L7 ingress policy is enabled, the packet mark is set to 0x200 meaning the packet is destined to the proxy. The problem occurs because there is only one XFRM IN policy matching on mark 0xd00. Since the packet mark is 0x200, the match fails and the packet is dropped by the stack. Therefore, we add a new XFRM policy that matches packets destined for the proxy so that they're allowed. Why doesn't this happen in direct routing mode? The reason is because the skb extension bits[1] are cleared in DR, whereas they are not in tunneling. When the bits are toggled on, then this causes extra logic to be executed in the kernel inside `__xfrm_policy_check()`. This logic upon a policy lookup failure drops the packet [2]. When the bits are cleared, there is no logic to cause a drop upon policy lookup failure. Why are the skb extension bits cleared in DR and not in tunneling? Because the packet path traversal in DR is `cilium_host` -> `cilium_net` -> stack, where the veth pair of `cilium_host` and `cilium_net` calls the kernel `veth_forward_skb()`, which eventually calls `skb_scrub_packet()` where the extension bits are cleared. The path for tunneling is `cilium_{vxlan,geneve}` -> stack, where there is no veth pair traversal, and thus no call to `skb_scrub_packet()`. Hence why we only create a new XFRM policy in tunneling mode. (This was debugged with the help of the following bpftrace script: https://gist.github.com/christarazi/4bb48eb623a03f25026be21856ea10fb) [1]: https://elixir.bootlin.com/linux/v5.12.2/source/net/xfrm/xfrm_policy.c#L3558 [2]: https://elixir.bootlin.com/linux/v5.12.2/source/net/xfrm/xfrm_policy.c#L3590 Co-authored-by: John Fastabend <chris@isovalent.com> Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Glib Smaga <code@gsmaga.com>
d66e0c8
to
38ba079
Compare
test-backport-1.10 |
Triggering manually because test-backport trigger wasn't set for that job. Now fixed. |
Once this PR is merged, you can update the PR labels via: