New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor k8s identities GC into a cell.Module #22892
Refactor k8s identities GC into a cell.Module #22892
Conversation
dea7575
to
8e99618
Compare
operator/cmd/k8s_identity_gc_test.go
Outdated
} | ||
|
||
// signal that endpoints are sync-ed, otherwise identities gc won't start | ||
close(watchers.CiliumEndpointsSynced) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should add Resource[*CiliumEndpoints]
and get rid of this. The test also doesn't seem to check whether we correctly keep identities that have a CiliumEndpoint associated with them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll also need to have the CiliumEndpoints indexed by identity ID (e.g. with something like: https://github.com/joamaki/cilium/blob/pr/joamaki/operator-resource-trials/pkg/k8s/resource/index.go).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I'll make this the priority for the next architecture-related PR.
8e99618
to
484ae47
Compare
I've merged the kvstore mode gc into the same cell and reworked the code. The changes regarding the unit test are not there yet. |
070876d
to
d44050e
Compare
d44050e
to
e890124
Compare
The unit test has been extended to check that identities used by an endpoint are not gc-ed. |
716d684
to
b5839ff
Compare
Added this commit to prevent modification of identity objects returned (and owned) by the resource informer. |
4dfee60
to
c4fab44
Compare
Add Cilium Identities as a shared resource in the proper cell. This will enable the modularization of all subsystems that depend on CiliumIdentities events streaming. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Add a new cell.Module to implement the CiliumIdentity GC. This improves the testability of the GC code through unit tests, without relying on more complicated (and slow) controlplane tests. Besides, the modularization of the GC cleans up the operator bootstrap code, since all the GC specific parts are now moved to their cell. The cell supports both GC modes: CRD allocated identities or kvstore allocated identities. The code is moved inside the `identity` package and has the following structure: operator/identity/ ├── cell.go ├── crd_gc.go ├── gc.go ├── heartbeat.go └── kvstore_gc.go where: - `cell.go` contains all the exported API for the new cell, alongside its dependencies and configuration. - `gc.go` contains the common part of both gc modes. - `crd_gc.go` and `heartbeat.go` contain the CRD allocation mode specific GC part. - `kvstore`contains the kvstore allocation mode specific GC part. The new cell relies on `resource.Resource[*v2.CiliumIdentity]` to listen for CiliumIdentity related events and react accordingly. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
The heartbeat store used to keep track of the last insert/update timestamps of each identities is an implementation detail related only to the inner workings of the identities GC. The commit changes all the related APIs to make them unexported. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Unlike global configuration options, cell-specific configuration options (i.e. the ones defined through cell.Config(...)) will not be loaded from `agentOption` or `operatorOption`, but from the `*viper.Viper` object bound to the test hive. `modCellConfig` function exposes the operator hive viper struct to each controlplane test, so to allow changing those options as needed. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Add a unit test for the CiliumIdentities GC operating in CRD allocation mode. The new unit test verifies that an Identity is correctly gc-ed when it has neither been updated in the last heartbeat cycle nor used by a CiliumEndpoint. Besides, the test adds a check for leaked goroutines using goleak. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
The global node watcher relies on an unexported rate limiting queue that starts a goroutine to listen for related events. Being unexported, there are no means to stop that goroutine when running unit tests like the identities gc one in operator/identity. To avoid leaking goroutines, an exported API is added so to stop the node queue before the test termination. This is a temporary solution before refactoring the watchers package to use the new "hive and cells" architecture. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Controlplane test identities GC has been superseded by the identities GC unit test in operator/identity. The two test are functionally equivalent, so the controlplane one is removed. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Do not start the heartbeat updater for CRD mode identities GC when the related interval is set to 0. The heartbeat updater single purpose is to mark all identities that have been just created, in order to avoid the GC to collect them immediately after creation. When the GC is disabled, there is no point in starting it. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
To avoid repeating log entries, inner logging of updates and deletions ops on identities is removed. Also, the log entry related to the collection of an identity is lowered to debug level to avoid flooding the log too much. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
In order to gracefully shutdown the identities GC, all critical errors are reported to the upper layer instead of relying on a call to Fatal(). Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
All the code in operator/identity is there only to implement the Cilium identities gc. Renaming it makes its purpose clearer and also avoids the name clashing with pkg/identity package. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Since the GC does not export any API, it is possible not to return the internal object and just register its start and stop hooks to hive.Lifecycle directly. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Even if (prometheus.Gauge).Set requires a float64 value, declaring the counters for successful and unsuccessful gc runs as such might feel confusing to the casual reader. The type is then changed to int and a type conversion to float64 is added at each Set call site. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
The identities returned by the underlying identity resource informer are owned by the informer itself, thus they should be deep copied before being modified. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
This commit regenerates the cmdref documentation files following the refactor of the identites gc into a cell. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
c4fab44
to
45485f3
Compare
Rebased onto master to fix build-commits test failure and updated operator cmdref. |
/test |
Port the current implementation of the CiliumIdentity GC to a separate
cell.Module
.This makes possible to improve the testing of the GC code through unit tests, without relying on more complicated (and slow) controlplane tests.
Besides, the modularization of the GC cleans up the operator bootstrap code, since all the GC specific parts are now moved to their cell.
The new cell contains the code for both GC modes: CRD allocated identities and kvstore allocated identities.
The GC is moved inside the
identity
package and has the following structure:where:
cell.go
contains all the exported API for the new cell, alongside its dependencies and configuration.gc.go
contains the common part of both gc modes.crd_gc.go
andheartbeat.go
contain the CRD allocation mode specific GC part.kvstore
contains the kvstore allocation mode specific GC part.The new cell relies on
resource.Resource[*v2.CiliumIdentity]
to listen for CiliumIdentity related events and react accordingly.Note for the reviewers
This is probably better to review commit by commit. The most impactful one is 1a482f8d3a08147410e65d91adbc6e49515a49b4, but despite the number of files modified, it aims to add the minimum amount of changes needed to modularize the identities GC code in a separate cell.
Further cleanups are left in the subsequent commits.