Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.15 Backports 2024-01-02 #30079

Merged
merged 25 commits into from
Jan 12, 2024

Conversation

pippolo84
Copy link
Member

@pippolo84 pippolo84 commented Jan 2, 2024

Once this PR is merged, a GitHub action will update the labels of these PRs:

 29720 29957 29880 29954 29934 29813 29246 29896 29616 29815 29826 29674 30000 29865 29343 29848 30013

mhofstetter and others added 25 commits January 2, 2024 09:08
[ upstream commit 4838ca2 ]

Currently, the test `TestGetIdentity` fails with the following error.

```
--- FAIL: TestGetIdentity (0.85s)
    --- FAIL: TestGetIdentity/Multiple_identities (0.26s)
        identity_test.go:226: Identity not found in the store
```

The reason is that the watch of the backend (started in its own
goroutine) might not be ready when the test starts to create test
identities.

Therefore, this commit introduces that the test waits with its
continuation until the watch is ready.

Fixes: cilium#23856

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 8c2dfda ]

This commit changes the log fields for the "stale identity observed"
message to include the full context (i.e. trace observation point,
source and destination addresses etc) for identifying which flow caused
the message to be emitted. In addition, the "old" and "new" identity
field names are replaced with the better fitting "userspace" and
"datapath" terminology.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit caef2d3 ]

This was reported by Paul in [1]. After investigating the sysdump, it is
clear that Hubble should not complain about such packets.

[1] cilium#15283 (comment)

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 4196935 ]

This limits the amount of "stale identities observed" messages to one
every 30 seconds. This is particularly important if monitor aggregation
is disabled, as otherwise any unknown discrepancy fills the log buffers
and thus makes it hard to e.g. debug CI flakes.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 0d35af0 ]

To make it easy for callers, ipv4_handle_fragmentation() should always
return a DROP_* reason on error. But for errors from l4_load_ports() we're
currently just propagating those raw errors back.

Return a drop reason instead. This also makes us consistent with the
non-fragment path in ipv4_load_l4_ports().

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit cab1568 ]

This changes BGP Controller's event handling from hive hooks
and workerpool to hive jobs, mainly to avoid creating resource.Store
in the Start() hive hook of the BGP controller.

Since creating resource.Store() blocks until the store is initialized,
we should avoid calling it in the Start() hive hook to not
slow down the startup process and to allow progressing with the
startup even when the CRD is not yet installed.

Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 08017c2 ]

This changes event handling in BGP CP resource.Store wrappers
(DiffStore, BGPCPResourceStore) from hive hooks and a goroutine
to a hive job, mainly to avoid creating resource.Store in the
Start() hive hook.

This also makes them wrappers rather than a super set of the
resource.Store, exposing only API that is actually used within
the BGP CP, and allowing all methods to return an error if the store
has not yet initialized.

Since creating resource.Store() blocks until the store is initialized,
we should avoid calling it in the Start() hive hook to not
slow down the startup process and to allow progressing with the
startup even when the CRD is not yet installed.

Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 1bf697f ]

The IPsec upgrade test started timing out from time to time on main's
CI. This commit bumps the timeout a bit to avoid such failures.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit a573bb4 ]

Commit 10f04fd (endpoint: Resolve named
ports for redirects) fixed redirect creation for L7 policies using a
named port, but failed to use the resolved destination port also in proxy
stats. This commit does that.

Fixes: cilium#29023
Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit b825850 ]

Modularize the agent stale CiliumEndpoint init procedure in an
independent cell.  The logic is functionally equivalent to the previous
implementation.

The tests have been refactored to use the hive and cell framework, thus
implementing unit-style integration tests.

As an additional benefit, this remove the direct dependency from the CEP
and CES stores managed in the related k8s watchers.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit e5b1de4 ]

GetIndexer and SetIndexer were meant to ease the testability of the
previous non-modularized implementation of the agent endpoint GC. Since
they are not needed anymore, it is possible to safely remove them.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 7f0e373 ]

During external workloads initialization, the clustermesh-apiserver
allocates a new identity for the external workload, which is then
propagated through etcd to the agent running on the external workload;
yet, the agent eventually overrides it with the default value (1).

While this appears to be harmless from a functional point of view, as
the clustermesh-apiserver configures the original identity as part of
the associated the CiliumNode and CiliumEndpoint resources (which are
then watched by the other agents), it also triggers an error, due to
the unexpected mismatch:

  CEW: Invalid identity 1 in ...

Let's fix this by avoiding to override the originally assigned identity
from the external workload agent.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit a495abd ]

This commit modifies calls to `send_trace_notify` in the datapath to
drop trace events related to encrypted packets when monitor aggregation
is enabled. More specifically, this commit ensures that whenever
`send_trace_notify` is called with a `trace_reason` of `TRACE_REASON_ENCRYPTED`,
the `monitor` argument is set to zero. A Coccinelle script is provided
in this commit to add a build-time check for this requirement moving forward.

This change helps to reduce the overall CPU usage of Cilium Agents when IPSec
encryption is enabled, by reducing the number of trace events emitted by
the datapath. Normally monitor aggregation can be used in order to
reduce the number of trace events, however IPSec-related trace events
are not able to be aggregated since they lack the necessary connection
tracking information. See the function `emit_trace_notify` in `bpf/lib/trace.h`
for more information.

This same change was applied in `bpf/bpf_lxc.c` in commit 3e52822.

Thank you to Lorenz who added a workaround for passing the verifier with
Clang 10.

See also: cilium#27168

Co-authored-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Ryan Drew <ryan.drew@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit e95f2e0 ]

Add an initial bpf_network configuration. The defines are taken from a
sysdump for a failing CI run that seems to exhibit verifier problems.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 3d3f69c ]

Datapath equalness of the old and new entry must be evaluated before
entries are merged, afterwards the entries will be the same and the key
is not added in ChangeState.Adds, which may lead to a policy map entry
not being updated during incremental updates.

Fixes: cilium#26331

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit d24fa08 ]

d9a4594 ("etcd: rework lease manager to additionally create
sessions") appears to have caused a regression in the duration of the
endpoint tests, due to a slightly different implementation of the
isConnectedAndHasQuorum() etcd client method, which is in turn
executed before starting a Watcher (through Connected()).

The regression is the combination of multiple causes:

* Incorrect endpoint tests tear down ordering, due to the interplay of
  TearDownTest and Cleanup; specifically, the identity allocator manager
  (which internally starts the watch operation) is stopped after closing
  the kvstore client;
* The identity allocator does not propagate a context to the watch
  operation, but instead relies on a external channel to stop it;
* The etcd Connected() function does no longer terminate immediately
  when the client is closed (i.e., when the session gets closed).
  The reason is that the session is no longer guaranteed to be unique.
* The etcd client appears to retry multiple times certain operations
  if it is closing, but the given context is not canceled (as in this
  case, as the given context is never canceled). This is the reason for
  the extra delay, and comes with 100 repetitions of the warning:

{"level":"warn","ts":"...","logger":"etcd-client","caller":"v3/retry_interceptor.go:62",
 "msg":"retrying of unary invoker failed","target":"...","attempt":0,
 "error":"rpc error: code = Canceled desc = grpc: the client connection is closing"}

Let's adopt the easiest fix here, and make sure that the client is
closed after closing the manager. This ensures correct tear down
ordering and prevents the etcd client retry process.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit babba79 ]

Historically, the Cilium helm chart allowed to override the routing mode
leveraged in combination with {gke,aksbyocni}.enabled. This is no longer
possible since aff16b2 ("Change routing-mode and tunnel interaction.").

According to the Cilium documentation [1,2], this appears to be the
correct behavior, as the routing mode must be respectively set to native
and tunnel in these cases. Hence, let's validate that users didn't
configure a different routing mode, to avoid falling back silently,
which may be confusing.

[1]: https://docs.cilium.io/en/stable/network/concepts/routing/#id6
[2]: https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/#install-cilium (AKS tab)

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit b26d9be ]

Previously, when determing a keyid before the rotation, the doc suggested to
run "cut -c 1". This returns only the first digit (e.g., if keyid is "15",
then "1" is returned). This breaks the rotation 15=>1.

Fixes: 42ef7f3 ("docs: Update IPsec key rotation command")
Reported-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 70ec61c ]

With the newer, asynchronous APIs, the ipcache takes on the
responsibility of updating the policy engine (i.e. the SelectorCache)
whenever it allocates an identity. However, the legacy APIs don't do
that. This commit changes that.

By centralizing this in the ipcache, we can (in a subsequent commit)
stop regenerating all endpoints whenever an identity is allocated. This
is wasteful, since local identities can only ever be allocated by the
ipcache now.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit afc3b5f ]

AllocateIdentity() takes a parameter, `notifyOwner`, that when true,
tells the allocator to propagate those updates to the policy subsystem.

However, the local (i.e. CIDR) identity allocator ignores this field,
and *always* propagates identity updates to the policy subsystem as well
as **always regenerating all endpoints**. This is needless, since the
ipcache always updates the policy system as well.

This change is safe since the ipcache is the only caller that passes
`notifyOwner = false`, and the ipcache updates the policy system itself.
All other callers to `AllocateIdentity()` set `notifyOwner = true`.

Fixes: cilium#29681

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 7ebaf63 ]

This should be fixed now.

Fixes: cilium#29681

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 9f845ab ]

Commit bd5ec0b introduced
pre-declaring metrics label value ranges for k8s watcher metrics.
This also performs a check when the metric is updated to check whether
the metric label value is within the (optional) declared range.

However, k8s watchers would require declaring all possible CRD types in
the scope.

We could make this change, but this list would be hard to maintain, and
more importantly, we probably don't want to initialize metrics for CRDs
that are not necessarily going to be used (i.e. CES if disabled).

This reverts this change, until a better solution for this type of
metric is developed.

Reported-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit c084bc0 ]

When a metric declared with XWithLabels(...) is updated with a vector
label value that is outside the expected range of values for that label
then a error log is emitted (with the metric being still emitted).

This improves this by:

* Switch error log to warning, as these logs do not indicate a serious
  runtime error but are meant to help developers catch use of undeclared
  label values.
* Add information about what metric/label/value caused the violation.

As well, this DRY's up the label value checking code.

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 0bf036e ]

Hive uses pflag and viper to parse configuration flags from multiple sources.
If a flag is set via command-line then the pflag parser is invoked to get to
the destination type as defined in the FlagSet ("flags.StringSlice" [1]), however
if the flag comes from environment or config-map, then the parsing was done by
a mapstructure hook [2].

This is all well and good as long as these two ways of parsing into say []string
are aligned with each other. And they were. Unfortunately these were not aligned
with the pre-Hive way of parsing which used viper.GetStringSlice and similar
methods. Specifically viper.GetStringSlice is implemented ([3]) via cast.ToStringSlice,
which uses strings.Fields that splits by whitespace instead of by commas.

So to summarize the different ways a StringSlice can be parsed:

- [1]: flags.StringSlice: parses with csv.Reader (split by comma)
- [2]: stringToSliceHookFunc: splits by comma
- [3]: viper.GetStringSlice: splits by whitespaces

So while arguably the first two are more consistent, we can't just flip
from splitting by spaces to splitting by commas as that creates a huge foot-gun
when fields are moved from option.Config to individual hive.Config structs.

To solve this we allow splitting both ways by using two mapstructure
hooks that process the values before they're pushed to the config struct:

- mapstructure.StringToSliceHookFunc(",") splits first string by commas.
  This only impacts input coming from environment, configmap or flags.String and
  going to []string.

- fixupStringSliceHookFunc takes []string coming from flags.StringSlice or from
  the StringToSliceHookFunc and resplits it by whitespace if it was of length 1.

With this, we have unified the parsing of []string across all the config input
methods:

  "foo,bar,baz" => []string{"foo", "bar", "baz"}
  "foo bar baz" => []string{"foo", "bar", "baz"}
  "foo bar,baz" => []string{"foo bar", "baz"}

[1]: https://github.com/spf13/pflag/blob/master/string_slice.go#L27
[2]: https://github.com/mitchellh/mapstructure/blob/main/decode_hooks.go#L104
[3]: https://github.com/spf13/viper/blob/9154b900c34ad9d88897f7e5288ce43f457f698b/viper.go#L1067

Fixes: cilium#29210
Fixes: b407ffc ("hive: Reimplement on top of dig")
Signed-off-by: Jussi Maki <jussi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 1571cce ]

BGP Manager to unset running flag when Stop is called. This is to fix
flaky tests where reconcile gets called after Stop, which recreates BGP
servers at shutdown stage.

Signed-off-by: harsimran pabla <hpabla@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
@pippolo84 pippolo84 added kind/backports This PR provides functionality previously merged into master. backport/1.15 This PR represents a backport for Cilium 1.15.x of a PR that was merged to main. labels Jan 2, 2024
@pippolo84
Copy link
Member Author

/test-backport-1.15

@pippolo84 pippolo84 marked this pull request as ready for review January 2, 2024 10:16
@pippolo84 pippolo84 requested review from a team as code owners January 2, 2024 10:16
@pippolo84
Copy link
Member Author

Conformance Ginkgo flake tracked here and Conformance IPsec E2E here, rerunning.

Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My commits look good, thanks!

Copy link
Contributor

@learnitall learnitall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@julianwiedmann julianwiedmann removed the request for review from brb January 11, 2024 12:21
@julianwiedmann julianwiedmann added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jan 11, 2024
@dylandreimerink dylandreimerink merged commit 5e41b7c into cilium:v1.15 Jan 12, 2024
61 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.15 This PR represents a backport for Cilium 1.15.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet