🌱 Use Patch instead of Update for finalizer operations #2328
base: main
Conversation
The ClusterExtension and ClusterCatalog finalizer code used Update, while the ClusterExtensionRevision used Apply, so the ClusterExtension and ClusterCatalog finalizer code was rewritten to match the ClusterExtensionRevision. There is now common finalizer code between them.

Ping @joelanford
The catalogd pod is panicking... it only fails due to the summary generated; it's not "noticeable" during a regular run!

If we want to remove conflicts, we should use Patch, but when adding/removing a finalizer we do not touch removed fields (including the annotation). The controller-runtime cache is a write-through cache, so an update or patch operation does not perform any direct modification on the cache; all changes are made by the informer. In what situations did we see conflicts, given that only a single controller is responsible for a resource type?
This is not a conflict issue. The problem is that we are using Update(), which uses the cached version of the resource to make the changes. The cached version no longer includes the last-applied-configuration annotation, so the Update() removes it on the server. This causes a problem with, e.g., kubectl applying a ClusterExtension a second time: if you apply a CE twice without this change (i.e. with the current code), the second apply does not behave as expected because the annotation is gone.
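To make the distinction concrete, here is a stdlib-only Go sketch (the helper name and shape are hypothetical, not this PR's actual code) of building a JSON merge patch that rewrites only metadata.finalizers, so server-side fields absent from the client cache survive:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildAddFinalizerPatch builds a JSON merge patch body that rewrites only
// metadata.finalizers. Because the patch carries nothing else, server-side
// fields missing from the client's cache (such as the
// last-applied-configuration annotation) are left untouched; Update(), by
// contrast, sends the whole cached object. Hypothetical helper for
// illustration only.
func buildAddFinalizerPatch(current []string, finalizer string) ([]byte, error) {
	for _, f := range current {
		if f == finalizer {
			return nil, nil // already present; no patch needed
		}
	}
	updated := append(append([]string{}, current...), finalizer)
	patch := map[string]any{
		"metadata": map[string]any{
			"finalizers": updated,
		},
	}
	return json.Marshal(patch)
}

func main() {
	p, err := buildAddFinalizerPatch(
		[]string{"olm.operatorframework.io/cleanup-unpack-cache"},
		"example.io/other")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(p))
}
```

The resulting body would be sent with a merge-patch content type (in controller-runtime terms, `client.RawPatch(types.MergePatchType, patchJSON)`).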
Codecov Report ❌ Additional details and impacted files:

@@ Coverage Diff @@
## main #2328 +/- ##
==========================================
- Coverage 74.23% 70.51% -3.72%
==========================================
Files 91 92 +1
Lines 7239 7255 +16
==========================================
- Hits 5374 5116 -258
- Misses 1433 1700 +267
- Partials 432 439 +7
        return false, fmt.Errorf("marshalling patch to add finalizer: %w", err)
    }
    // Note: Patch will update obj with the server response, including ResourceVersion
    if err := c.Patch(ctx, obj, client.RawPatch(types.MergePatchType, patchJSON)); err != nil {
Should we use SSA here?
That could be a good improvement; it would simplify the reconcile logic a lot, i.e. we would not need to care about appending/sorting finalizers at all.

Thanks for explaining it, makes sense. Given that, I think we should update the PR description to reflect it; currently we have:

Given the explanation above, we are not reducing conflicts or improving performance; we are ensuring correctness, i.e. not removing the annotation.
    Applier              Applier
    RevisionStatesGetter RevisionStatesGetter
    Finalizers           crfinalizer.Finalizers
    FinalizerHandlers    map[string]FinalizerHandler
A controller should set a single finalizer; more than one is strange. I would suggest reverting the change.
I think there is more than one for CEs right now because there are multiple finalizer concerns:
- Cleaning up the unpacked bundle image directory
- Shutting down the content manager informers (helm-only)
I'm not sure how the current helm-to-boxcutter migration deals with the content manager informer shutdown. Do those informers get cleaned up when boxcutter takes over, or when the CE is deleted (potentially much later)? Ideally we'd shut down the helm-only content manager informers as soon as boxcutter takes over.
(but that's getting a bit outside the scope of this PR)
In this case, there ARE two pre-existing finalizers on the ClusterExtension
> I'm not sure how the current helm-to-boxcutter migration deals with the content manager informer shutdown. Do those informers get cleaned up when boxcutter takes over, or when the CE is deleted (potentially much later)? Ideally we'd shut down the helm-only content manager informers as soon as boxcutter takes over.

In order to switch from the Helm applier to the Boxcutter applier, the pod has to restart, so everything is effectively shut down. The content-manager finalizer is a no-op under Boxcutter, but the finalizer still needs to be removed.
    u.SetGroupVersionKind(gvk)
    u.SetName(obj.GetName())
    u.SetNamespace(obj.GetNamespace())
    u.SetFinalizers(newFinalizers)
it has to be u.SetFinalizers(finalizers)
No; this is based on the assumption that the arguments to EnsureFinalizers represent the only finalizers we want in the list. That is not the intent.
The EnsureFinalizers API ensures that the given finalizers are present. The intent is not to set the finalizers to only be those given, and is not intended to remove finalizers.
Because this is a list, we can't patch individual items into the list, we need to specify the whole list (at least that's how the SSA API is working). I verified this in the code by calling EnsureFinalizers() separately for each finalizer string, rather than in bulk. Only the last finalizer showed up when it was u.SetFinalizers(finalizers). Both finalizers showed up when the code was as above.
Yes, I realize the name EnsureFinalizers is a bit confusing, and should probably be AddFinalizers, but I was following the naming scheme (and behavior) of the removed code, which was inconsistent.
> Because this is a list, we can't patch individual items into the list, we need to specify the whole list (at least that's how the SSA API is working).

I need to disagree here, because Finalizers is defined as a set with a merge patch strategy:
https://github.com/kubernetes/apimachinery/blob/master/pkg/apis/meta/v1/types.go#L256-L259
Therefore, we do not patch item by item; we set all finalizers we need to own and make SSA requests.
BTW, my understanding here is that the finalizers argument contains all finalizers we would like to set, not just one.
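As a concrete illustration of "set all finalizers we need to own" in each apply, here is a stdlib-only sketch (hypothetical helper, not this repo's code) of the minimal apply configuration an SSA request sends for finalizer management:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildApplyConfig sketches the minimal object a server-side apply request
// sends for finalizer management: identity fields plus the complete set of
// finalizers this field manager wants to own. Because finalizers merge as a
// set, omitting a previously owned finalizer from a later apply relinquishes
// (and removes) it. Hypothetical helper, assuming a cluster-scoped resource.
func buildApplyConfig(apiVersion, kind, name string, finalizers []string) ([]byte, error) {
	obj := map[string]any{
		"apiVersion": apiVersion,
		"kind":       kind,
		"metadata": map[string]any{
			"name":       name,
			"finalizers": finalizers,
		},
	}
	return json.Marshal(obj)
}

func main() {
	b, err := buildApplyConfig("olm.operatorframework.io/v1", "ClusterExtension",
		"my-ext", []string{"olm.operatorframework.io/cleanup-unpack-cache"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

In controller-runtime terms this payload would go through a `client.Apply`-style patch with a fixed field manager name, not through Update.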
> Therefore, we do not patch item by item, we set all finalizers we need to own and make SSA requests.
Agreed, that's how SSA works, but that's not how the API layer above it has to work.
The function as defined adds finalizers, that's the intent. The intent is not to make the finalizer list equal to the arguments.
I'm renaming the function to indicate the desired (existing) behavior. It is confusing to have the mechanism for removing finalizers be "ensuring nothing".
If we wanted the behavior as you suggest, then SetFinalizers() would be more appropriate. But it would be at the cost of flexibility, and potential issues if other controllers add their own finalizers.
    }

    // Update the passed object with the new finalizers
    obj.SetFinalizers(newFinalizers)
It has to be `obj.SetFinalizers(u.GetFinalizers())`.
    // RemoveFinalizer removes one or more finalizers from the object using server-side apply.
    // If none of the finalizers exist, this is a no-op.
    func RemoveFinalizer(ctx context.Context, c client.Client, obj client.Object, finalizers ...string) error {
There is no need for this function, because EnsureFinalizers(ctx, c, obj) is going to remove previously added finalizers that we owned.
The intent of EnsureFinalizers is to only add finalizers to the list of finalizers, not to remove them. The RemoveFinalizers removes finalizers individually.
In the SSA approach, you remove things by omitting them from the request, if you owned them previously. Hence, if we had somewhere in the code:

    EnsureFinalizers(ctx, client, ce, "foo", "bar")

then to remove bar, we can perform:

    EnsureFinalizers(ctx, client, ce, "foo")
But that is not the intent of the API; it is to add items individually, not to remove them. Just because the underlying mechanism is SSA doesn't mean the layered API has to behave that way.
IMHO, it's confusing to have "Ensure" remove items; "Ensure" means to check that something is true or correct. In this case "Ensure" is confusing, so I'm renaming the function.
EDIT: All our finalizers use the same prefix, and other controllers' finalizers should use a different prefix. So, now that I've thought about it some more, I think a single
@tmshort apologies, I did not mention it earlier: moving to SSA also requires migrating the information stored under managedFields. In our case, it is easy to reproduce:

    - apiVersion: olm.operatorframework.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:finalizers:
            .: {}
            v:"olm.operatorframework.io/cleanup-contentmanager-cache": {}
            v:"olm.operatorframework.io/cleanup-unpack-cache": {}
      manager: operator-controller
      operation: Update

This is expected, because we have added finalizers via Update().

To fix that we would need to migrate the ownership under managedFields. Instead of going through the SSA migration, I would suggest that we revert the removal of that annotation as the quick fix, and then introduce SSA if needed, plus detailed testing.
/hold

That being said, there is currently a failure in test-experimental-e2e which seems to be triggered by this change, since I'm not seeing the issue on main. I would need to fix that before this could be merged.

I will point out that managed fields is not a "bit of memory"; it can double the size of the cache. The memory savings was significant.
I agree, but managed fields are re-added to the cache with #2318.

Whoops, I meant last-applied-config.
/unhold |
Refactor all controllers to use Patch() instead of Update() when adding or removing finalizers, to improve performance and to avoid erroneously removing non-cached fields. Create shared finalizer utilities to eliminate code duplication across controllers.

This is necessary because we no longer cache the `last-applied-configuration` annotation, so when we add/remove the finalizers, we are removing that field from the metadata. This causes issues with clients when they don't see that annotation (e.g. apply the same ClusterExtension twice).

- Add shared finalizer.EnsureFinalizer() utilities
- Update ClusterCatalog, ClusterExtension, and ClusterExtensionRevision controllers to use Patch-based finalizer management
- Maintain early return behavior after adding finalizers on create
- Remove unused internal/operator-controller/finalizers package
- Update all unit tests to match new behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Todd Short <tshort@redhat.com>
…mework#2338)" This reverts commit 39cbdbe.
This also restores the non-caching of "last-applied-config".