Ignore annotations made by other Kopf-based operators #539
What do these changes do?
Inject Kopf's identity into annotations of all Kopf-based operators. Use that identity to detect and ignore annotations belonging to other Kopf-based operators regardless of their identities.
This reduces the ping-pong effect that arises when multiple operators handle the same resources, which can overload the K8s API and inflate the resources' sizes.
Description
Problem:
Kopf implements additional constructs on top of the regular watch-and-react flow of K8s. To make this work across multiple reaction cycles, Kopf persists its own state in the resource itself. Since #331, the state is persisted in the resource's annotations in addition to (or instead of) the status stanza.
If multiple Kopf-based operators handle the same resources, they can cause ping-pong behaviour:
the state of the operator A as the essential payload;
This can quickly overload the K8s API with a high rate of requests (the ping-pong happens fast) and blow up the size of the resources.
The problem becomes more complicated when the operators have different identities: operators in the cluster can then be unaware that other operators are Kopf-based too, and that their annotations exist only for state persistence, i.e. they carry nothing useful to react to.
It is impossible for operator developers to configure their operators to be aware of every other Kopf-based operator, unless the developers control each and every one of those operators.
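To make the feedback loop concrete, here is a toy model of the problem, not Kopf's actual code. It assumes two hypothetical operators with the prefixes a.example.com and b.example.com, each treating every annotation except its own as essence and storing a JSON snapshot of that essence under its own */last-handled-configuration key. The helper names (essence, handle, simulate) are made up for this illustration.

```python
import json
from typing import Dict, List


def essence(annotations: Dict[str, str], own_prefix: str) -> Dict[str, str]:
    # Assumption: an operator only knows its own prefix, so every other
    # annotation (including other operators' state) counts as essence.
    return {k: v for k, v in annotations.items()
            if not k.startswith(own_prefix + "/")}


def handle(annotations: Dict[str, str], own_prefix: str) -> bool:
    """Store a snapshot of the essence; return True if a patch was needed."""
    snapshot = json.dumps(essence(annotations, own_prefix), sort_keys=True)
    key = f"{own_prefix}/last-handled-configuration"
    if annotations.get(key) == snapshot:
        return False  # nothing changed since the last handling
    annotations[key] = snapshot  # this "patch" wakes up the other operator
    return True


def simulate(rounds: int) -> List[int]:
    """Let both operators react in turns; record the size after each patch."""
    annotations: Dict[str, str] = {}
    sizes: List[int] = []
    for _ in range(rounds):
        for prefix in ("a.example.com", "b.example.com"):
            if handle(annotations, prefix):
                sizes.append(len(json.dumps(annotations)))
    return sizes


if __name__ == "__main__":
    # Every handling causes a patch, and every patch makes the object bigger,
    # because each snapshot embeds (and re-escapes) the other one.
    print(simulate(5))
```

In this model neither operator ever settles: each snapshot contains the other operator's snapshot, so every handling produces a new, larger value.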
Solution:
To solve the problem of other Kopf-based operators' persistence being detected as meaningful content (i.e. the "essence"), these annotations should be filtered out of the essence and become invisible to any other Kopf-based operator, just as already happens with the whole .status stanza.
However, Kopf's annotations are stored in a space shared with other non-Kopf-based operators, controllers, applications, and human-made annotations; all of these should be considered essential content, and their changes should be detected. Currently, there is no way to distinguish the annotations by their type or origin.
To filter out the annotations of Kopf-based operators, they all should be marked or tainted, and then these marks/taints should be used to ignore them.
Such detection should be independent of the operator's identity, i.e. shared by all operators made on this framework that produces these annotations.
Why so?
Looking back into history, the annotations store the information that previously (before #331) was stored in the status stanza, which is fully ignored when extracting the body's essence and diff — unless some specific fields are explicitly marked as of interest by having a field-handler for that field.
The state-persisting annotations should behave the same: they should all be ignored, no matter from which operator they come, and only be noticed when explicitly monitored. This is not the case now.
Exclusion logic:
In this PR, the following logic of annotation detection is used:
Ignore all annotations that have a prefix which contains "kopf" as a separate item (dot- or dash-separated), e.g. kopf.zalando.org/*, my-kopf-operator.com/*, etc. UPD: Narrowed to the following, to allow mentioning Kopf without losing the annotations from diffs, and to keep to specific namespaces only. After all, it can be just a coincidence that the annotations contain "kopf".
Reserved prefixes: Ignore all annotations whose prefix is a reserved prefix or a subdomain of a reserved prefix. Currently, only kopf.zalando.org is reserved, but the list can be extended. Examples: kopf.zalando.org/*, demo.kopf.zalando.org/*, etc., but not kopf.my-operator.com/*, my-kopf-operator.com/* or com.example.kopf/*.
Arbitrary prefixes: Ignore all annotations that have a prefix for which a standalone annotation named */kopf-managed exists, e.g. example.com/kopf-managed. The value of the marking annotation is irrelevant for detection. This annotation is automatically created every time any other annotation with that prefix is stored by Kopf, but only if the annotation is not identifiable by the prefix itself (i.e. the annotation is not detectable by the 1st rule).
This logic only works with prefixed annotations.
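The two rules (reserved prefixes and arbitrary prefixes) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from this PR: the function names (is_kopf_annotation, store, essential_annotations) and the marker value "yes" are made up here, and only the kopf.zalando.org prefix and the kopf-managed marker name come from the description above.

```python
from typing import Dict

RESERVED_PREFIXES = ("kopf.zalando.org",)  # the only reserved prefix so far
MARKER_NAME = "kopf-managed"               # the marker name from the description


def _prefix_of(key: str) -> str:
    """Return the part before the slash, or '' for a non-prefixed key."""
    prefix, slash, _ = key.partition("/")
    return prefix if slash else ""


def _is_reserved(prefix: str) -> bool:
    # A reserved prefix itself or any subdomain of it.
    return any(prefix == r or prefix.endswith("." + r)
               for r in RESERVED_PREFIXES)


def is_kopf_annotation(key: str, annotations: Dict[str, str]) -> bool:
    prefix = _prefix_of(key)
    if not prefix:
        return False  # non-prefixed annotations cannot be detected
    if _is_reserved(prefix):
        return True   # rule: reserved prefixes
    # Rule: arbitrary prefixes, marked by a standalone */kopf-managed key.
    return f"{prefix}/{MARKER_NAME}" in annotations


def store(annotations: Dict[str, str], key: str, value: str) -> None:
    """Store an annotation and add the marker if the prefix needs one."""
    annotations[key] = value
    prefix = _prefix_of(key)
    if prefix and not _is_reserved(prefix):
        # The marker's value is irrelevant; "yes" is an arbitrary choice.
        annotations.setdefault(f"{prefix}/{MARKER_NAME}", "yes")


def essential_annotations(annotations: Dict[str, str]) -> Dict[str, str]:
    """Keep only the annotations that should take part in essence and diffs."""
    return {k: v for k, v in annotations.items()
            if not is_kopf_annotation(k, annotations)}
```

For example, after store(annotations, "example.com/handler1", "{}"), both example.com/handler1 and example.com/kopf-managed are excluded from essential_annotations(annotations), while a human-made annotation such as team stays visible.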
For non-prefixed annotations, the implementation would be too complex. Instead, the non-prefixed storages (both diffbase & progress) are deprecated, and a warning is issued when the prefix is set to None. By default, the prefix is kopf.zalando.org, so it would work "out of the box" without any issues as before.
An example (based on a code snippet from #517):
Reproduction:
Start 2 operators in parallel:
With the unfixed code: The "ping-pong" issue is observed immediately: the object size increases to approximately 188KB in a matter of a second, and the operator becomes unable to PATCH the resource since the API fails with "Unprocessable Entity" — and the throttling begins.
With the fixed code: Go to kopf.storage.conventions.StorageKeyMarkingConvention and modify the "known marker" (from kopf-managed to kopf-managed-2). Observe the issue as if the code were unfixed. Once the marker is restored, all those garbage annotations disappear on the first successful handling cycle.
Risks:
If the user modifies the resource and deletes the */kopf-managed annotation in the case of a non-Kopf-branded prefix, this would immediately notify all Kopf-based operators about a change, and they could "notice" the addition of all annotations in that group. It is the same kind of risk as deleting */last-handled-configuration, which would trigger the on-creation handlers. It is expected that users do not modify annotations they do not understand.
The existing state-persisting annotations remain the same, so both upgrades from any previous version and downgrades (rollbacks) to any previous version are possible.
Discarded ideas:
There was a draft implementation which added kopf-- infixes to the annotation names (e.g. example.com/kopf--handler1.subhanderA-HaSh) and used them to detect both the Kopf-based prefixes and even non-prefixed annotations. The solution was quite complex, but what is worse, it would require storing duplicate annotations even for */last-handled-configuration (i.e. */kopf--last-handled-configuration). This would pollute the annotations during an upgrade/downgrade for no good reason, would break the upgrade path through versions, and would steal an extra 6 characters from the 63 available for the annotation names. This schema didn't have any significant advantages over having just one Kopf-branding annotation per prefix.
Another idea was to inject the markers into the content of the annotations, such as //kopf at the end or #kopf at the beginning of the annotation's value. There are no promises on what exactly is stored in the annotations or how, so there is no need for it to be JSON. However, JSON is a common convention and we should avoid breaking it. This approach would also break backward compatibility.
Another idea was to explicitly configure the operators to be aware of other operators by listing their prefixes. This wouldn't work: the developers (1) do not want to control the operators' environment that closely, and (2) are not able to control the environment when third-party Kopf-based operators are involved. It all should work "out of the box".