Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.15 Backports 2024-01-30 #30529

Merged
merged 41 commits into from
Jan 31, 2024
Merged

v1.15 Backports 2024-01-30 #30529

merged 41 commits into from
Jan 31, 2024

Conversation

joamaki
Copy link
Contributor

@joamaki joamaki commented Jan 30, 2024

PRs skipped due to conflicts:

Once this PR is merged, a GitHub action will update the labels of these PRs:

 29025 29023 29024 30340 30286 30154 30282 30167 29423 29938 30220 30299 30403 30052 30410 30399 30219 30236 30407 30445 30477 30488 30422 30442 30365 30503 30416 30329 30483 30504 30247

dylandreimerink and others added 30 commits January 30, 2024 14:51
[ upstream commit 00b1e5e ]

Once a service gets selected we start leader election. However, if
we lose the lease for some reason, we don't retry getting it until
the service is deselected and reselect, recreated or the agent
restarts. This commit surrounds the lease leader election logic with
a loop that ends when the context is cancelled.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit bc65ca3 ]

Commit 9c1d1de ("egressgateway: Redirect from bpf_overlay to egress gw SNAT netdev")
introduced some EgressGW code into bpf_overlay. Cover it in the complexity
configs.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit e2760e6 ]

When egress_gw_fib_lookup_and_redirect() in to-netdev selects the final
egress interface for a packet (based on its desired EgressIP), the FIB
lookup potentially returns BPF_FIB_LKUP_RET_NO_NEIGH.

For 5.10+ kernels this is gracefully handled in fib_do_redirect() by
redirecting to the neigh subsystem. But for older kernels we have no
possibility to fall back to the NEIGH map, and the packet would just get
dropped with DROP_NO_FIB / BPF_FIB_MAP_NO_NEIGH.

Have egress_gw_fib_lookup_and_redirect() catch this case, and just let the
packet continue on the current egress interface. Users that strictly
require the *correct* egress interface need to run a 5.10+ kernel.

We'll update the relevant code in from-overlay with a subsequent patch.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 918e47b ]

With a previous patch, egress_gw_fib_lookup_and_redirect() now potentially
doesn't redirect the packet, and just returns CTX_ACT_OK instead. Handle
this by forwarding the packet to the stack, as was done prior to
9c1d1de ("egressgateway: Redirect from bpf_overlay to egress gw SNAT netdev").

Ideally this happens just once per connection - the pass through the stack
should trigger a fresh ARP resolution, and subsequent traffic can obtain a
L2 resolution from the FIB lookup.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit b09561c ]

The only functions left in egress_policies.h are SRv6 related.
Let's rename this to 'srv6.h' and update references to the old file
name.

Signed-off-by: ldelossa <louis.delos@gmail.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 2de0fea ]

Include a trace reason for SRv6 encapsulation and decapsulation.

This greatly improves the debugging process, indicating whether SRv6
VPN related packets are processed by our datapath.

Signed-off-by: ldelossa <louis.delos@gmail.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit a6bfb79 ]

Consider encap/decap as egress/ingress (respectively) and both as
unknown reply ct status.

Signed-off-by: Alexandre Perrin <alex@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 5eee9de ]

The Node{Add,Update,Delete} functions of the linux node handler are
already guarded in order not to execute the underlying logic if the node
subsystem is not yet fully initialized. Once initialized, all updates
are then automatically replayed.

Yet, this does not apply to the NodeValidateImplementation and
AllNodeValidateImplementation functions, which can also be invoked
asynchronously, leading to a panic if not fully initialized (even
without panicing, we would be enforcing an incorrect configuration,
possibly disrupting existing connections):

github.com/cilium/cilium/pkg/datapath/linux.(*linuxNodeHandler).nodeUpdate(0xc0022be1a0, 0x0, 0xc001936480, 0x0)
	/go/src/github.com/cilium/cilium/pkg/datapath/linux/node.go:1030 +0x142d
github.com/cilium/cilium/pkg/datapath/linux.(*linuxNodeHandler).NodeValidateImplementation(_, {{0xc000f10720, 0x1b}, {0xc00068b2d8, 0x13}, {0xc000d6a3c0, 0x4, 0x4}, 0xc0005fc0e8, {0x0, ...}, ...})
	/go/src/github.com/cilium/cilium/pkg/datapath/linux/node.go:1337 +0xc8
github.com/cilium/cilium/pkg/node/manager.(*manager).backgroundSync.func1({0x4019e80, 0xc0022be1a0})
	/go/src/github.com/cilium/cilium/pkg/node/manager/manager.go:342 +0x9a
github.com/cilium/cilium/pkg/node/manager.(*manager).Iter(0x3251f40?, 0xc001f0bdb8)
	/go/src/github.com/cilium/cilium/pkg/node/manager/manager.go:174 +0xdb
github.com/cilium/cilium/pkg/node/manager.(*manager).backgroundSync(0xc00083c460, {0x400c390, 0xc00135b630})
	/go/src/github.com/cilium/cilium/pkg/node/manager/manager.go:341 +0x4ab
github.com/cilium/workerpool.(*WorkerPool).run.func1()

Let's fix this by also checking the initialization status there.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit f299dc1 ]

Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 7cdadbc ]

Global variable `countErrors` converted to the function local.

Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 000edce ]

Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 5b04e08 ]

Ensure errs are checked for the calls below:

- deleteNodeIPSecOutRoute
- replaceNodeIPSecOutRoute

Signed-off-by: Fernand Galiana <fernand.galiana@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit d84a650 ]

Signed-off-by: Ludovic Ortega <ludovic.ortega@adminafk.fr>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit d721b8b ]

Previously, the restoreIPCache() method would return an error on
new installs because it was checking for the presence of the
"file or dir missing" error but this error was being wrapped by
another method in the call tree.

This PR updates the restoreIPCache() method to use errors.Is()
that reports whether any error in err's tree matches the target and
thus reports a nil error on new installs when the "cilium_ipcache"
file does not exist.

Fixes: #29328

Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 945ad0c ]

Currently we try to remove IPv6 proxy rules if the IPv6 option is
disabled. This is to clean up those rules if a previously running agent
has installed them but was restarted with a configuration change. This
can fail if the underlying kernel has no IPv6 support. This commit fixes
this, by allowing the necessary netlink syscall to fail with
EAFNOSUPPORT.

Fixes: #29965

Signed-off-by: Robin Gögge <r.goegge@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 09f18fd ]

Signed-off-by: Filip Nikolic <oss.filipn@gmail.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit db14f4b ]

Fixes: #30051
Signed-off-by: Yingnan Zhang <342144303@qq.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 3932a4b ]

Fix up some ctx_load_bytes() usage to return a drop reason, and not the
raw kernel errno.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 87d948e ]

The Cilium nodeinit startup script lays down a temporary CNI config in order to
be able to restart a version of containerd that doesn't allow a missing CNI
config.

This commit fixes an issue with missing double quotes in the temporary config
which causes an error in containerd and leads to NotReady Kubernetes nodes

I also considered heredoc or escaping the quote characters but settled on single
quoting as I think its the most readable one line solution without needing to
deal with the indentation issue with heredoc

Signed-off-by: Tom Cowling <952241+tlcowling@users.noreply.github.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit a388c42 ]

Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.2 to 3.1.3.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](pallets/jinja@3.1.2...3.1.3)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
[ upstream commit 068dc47 ]

Signed-off-by: Cilium Imagebot <noreply@cilium.io>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit f30fd60 ]

This commit adds a warning to the Egress Gateway documentation to help user avoid deploying a known bad configuration.

Co-authored-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: soggiest <nicholas@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 746b0f3 ]

Signed-off-by: Nico Vibert <nicolas.vibert@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 7141c22 ]

Remove a misplaces ls alias that caused cilium-dbg blf auth ls to flush
the map.

Signed-off-by: Maartje Eyskens <maartje@eyskens.me>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 0983a57 ]

This is to update the Gateway API version to align with the conformance
tests running in CI.

Signed-off-by: Tam Mach <tam.mach@cilium.io>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit e777df1 ]

#26215 changed how we do
egressGW-specific routing on the gateway node - instead of installing
custom IP rules, we rely on the node's routing setup.
#30286 then fixed up a corner-case on
older kernels.

Reflect both parts in the docs.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 3092ed1 ]

Signed-off-by: Dmitry Kharitonov <dmitry@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 36c7d66 ]

On modern kernels, bpf_host currently builds a
tail_nodeport_nat_ingress_ipv4() that includes the code for RevSNAT,
HostFW and RevDNAT. But at least for the 6.1 kernel we're scratching at the
complexity limit and hitting verifier troubles in main [0] and 1.15 [1].
It's currently unclear why we can't reliably(!) reproduce those troubles in
CI.

Take the pressure off by splitting the tail-call into two parts, whenever
the HostFW is enabled - leaving RevSNAT and HostFW in the first part, while
RevDNAT is handled in a separate tail-call. This prevents the trace_ctx
from the RevSNAT to reach the nodeport_add_tunnel_encap() call in the
RevDNAT code, but that's acceptable for now.

[0]: #30266
[1]: #30093
Co-developed-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 9bf7a8e ]

In cases where the endpoint regen fails, such as in a complexity issue,
the endpoints reporter should be put into a degraded state.

However, currently it will degrade the endpoint hr, and the immediately
clear the degraded state with hr.Ok(...).

This fixes that to only clear if there is no error while regenerating.

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
[ upstream commit 5581963 ]

This commit brings two fixes to the script that we use to determine to
which version we should upgrade/downgrade in some CI workflows.

The first fix is the most important one. When looking for the closest
patch version, make the script return the value in VERSION instead of
decrementing it. The rationale is that for stable branches, VERSION
already points to the latest patch release, there is no need to decrease
it further! This fix does not affect the output for the calculation of
the previous minor version number.

The second fix is simply the addition of an error message in case the
minor version number is 0, to get some explicit error instead of a
silent failure if we ever reach Cilium 2.0.0.

Updated samples of numbers from VERSION and the corresponding values
returned:

    VERSION         Previous minor  Previous patch release

    1.14.3          v1.13           v1.14.3
    1.14.1          v1.13           v1.14.1
    1.14.0          v1.13           <error>
    1.14.1-dev      v1.13           v1.14.1
    1.15.0-dev      v1.14           <error>
    1.13.90         v1.12           <error>
    2.0.1           <error>         v2.0.1

Fixes: 56dfec2 ("contrib/scripts: Support patch releases in print-downgrade-version.sh")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>
Copy link
Contributor

@viktor-kurchenko viktor-kurchenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thank you!

@nvibert
Copy link
Contributor

nvibert commented Jan 30, 2024

LGTM !

Copy link
Contributor

@derailed derailed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joamaki LGTM Thanks for these updates Jussi!

@gandro gandro mentioned this pull request Jan 30, 2024
1 task
@aanm aanm marked this pull request as ready for review January 30, 2024 17:19
@aanm aanm requested review from a team as code owners January 30, 2024 17:19
Signed-off-by: Cilium Imagebot <noreply@cilium.io>
@joestringer
Copy link
Member

(All tests passed. docs-builder triggered an update on top)

@joestringer
Copy link
Member

/test-backport-1.15

Copy link
Member

@sayboras sayboras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks and looks good for my commits.

Copy link
Contributor

@ldelossa ldelossa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM

@@ -221,6 +221,7 @@
# after the less specific paths, otherwise the ownership for the specific paths
# is not properly picked up in Github.
* @cilium/tophat
/.github/workflows/*perf*.yaml @cilium/sig-scalability @cilium/github-sec @cilium/ci-structure
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to block the PR on this but I wanted to point out to @marseel / @cilium/sig-scalability that in general we expect most groups to apply their review feedback during the primary PR against main then the backport is a sanity check, so for this case @cilium/sig-scalability does not need to feel obliged to review all changes to stable branch modifications of this workflow. Perhaps as a followup we can just drop this change and defer to the existing rule.

Suggested change
/.github/workflows/*perf*.yaml @cilium/sig-scalability @cilium/github-sec @cilium/ci-structure

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is totally fine with me - modification of CODEOWNERS was part of the commit where I introduced a new test, which seems reasonable in general, but I guess for backports we should get rid of it. I will open followup PR.

@jrajahalme
Copy link
Member

@joamaki Looks like the first three PRs listed in the description are not included in this backport?

@joamaki
Copy link
Contributor Author

joamaki commented Jan 31, 2024

@joamaki Looks like the first three PRs listed in the description are not included in this backport?

Looks like they were in v1.15 already. Fixed the labels.

@joamaki joamaki removed the request for review from jrajahalme January 31, 2024 11:44
Copy link
Member

@aanm aanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed the remaining backported PRs.

@aanm aanm merged commit 0f2a341 into v1.15 Jan 31, 2024
217 checks passed
@aanm aanm deleted the pr/v1.15-backport-2024-01-30 branch January 31, 2024 12:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.15 This PR represents a backport for Cilium 1.15.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet