-
Notifications
You must be signed in to change notification settings - Fork 68
🌱 Refactor ClusterExtension reconciler to use composable step-based pipeline
#2332
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
🌱 Refactor ClusterExtension reconciler to use composable step-based pipeline
#2332
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
ClusterExtension reconciler to use composable step-based pipelineClusterExtension reconciler to use composable step-based pipeline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the ClusterExtension reconciler from a monolithic 170-line reconcile method into a composable step-based pipeline architecture. Each reconciliation phase (finalizer handling, revision state retrieval, metadata resolution, bundle unpacking, and bundle application) is now a standalone function that can be configured differently for Helm vs Boxcutter workflows.
- Introduces
ReconcileStepFunctype andReconcileStepsexecutor for sequential step execution - Extracts reconciliation logic into individual step functions with context-based data passing
- Moves Boxcutter-specific code (
BoxcutterRevisionStatesGetter,MigrateStorage) into dedicated file - Configures applier-specific pipelines in main.go
- Updates tests to use functional options pattern for reconciler setup
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/operator-controller/controllers/clusterextension_controller.go | Removes monolithic reconcile method, adds ReconcileStepFunc infrastructure and step executor |
| internal/operator-controller/controllers/clusterextension_reconcile_steps.go | New file containing common reconciliation step functions with context-based data flow |
| internal/operator-controller/controllers/boxcutter_reconcile_steps.go | New file containing boxcutter-specific steps (MigrateStorage) and BoxcutterRevisionStatesGetter |
| cmd/operator-controller/main.go | Configures reconciliation step pipelines for both boxcutter and Helm appliers |
| internal/operator-controller/controllers/suite_test.go | Adds functional options pattern for test reconciler configuration |
| internal/operator-controller/controllers/clusterextension_controller_test.go | Refactors tests to use new functional options pattern for reconciler setup |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| for _, applyOpt := range opts { | ||
| applyOpt(ctrlBuilder) | ||
| } |
Copilot
AI
Nov 13, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The removal of the blank line at line 312 reduces code readability. There should be a blank line between the loop and the return statement to separate the loop logic from the function return.
Consider keeping the blank line for better visual separation.
| } | |
| } |
| func HandleFinalizers(f finalizer.Finalizer) ReconcileStepFunc { | ||
| return func(ctx context.Context, ext *ocv1.ClusterExtension) (context.Context, *ctrl.Result, error) { | ||
| l := log.FromContext(ctx) | ||
|
|
||
| l.Info("handling finalizers") | ||
| finalizeResult, err := f.Finalize(ctx, ext) | ||
| if err != nil { | ||
| setStatusProgressing(ext, err) | ||
| return ctx, nil, err | ||
| } | ||
| if finalizeResult.Updated || finalizeResult.StatusUpdated { | ||
| // On create: make sure the finalizer is applied before we do anything | ||
| // On delete: make sure we do nothing after the finalizer is removed | ||
| return ctx, &ctrl.Result{}, nil | ||
| } | ||
|
|
||
| if ext.GetDeletionTimestamp() != nil { | ||
| // If we've gotten here, that means the cluster extension is being deleted, we've handled all of | ||
| // _our_ finalizers (above), but the cluster extension is still present in the cluster, likely | ||
| // because there are _other_ finalizers that other controllers need to handle, (e.g. the orphan | ||
| // deletion finalizer). | ||
| return nil, nil, nil | ||
| } | ||
| return ctx, nil, nil | ||
| } | ||
| } | ||
|
|
||
| func RetrieveRevisionStates(r RevisionStatesGetter) ReconcileStepFunc { | ||
| return func(ctx context.Context, ext *ocv1.ClusterExtension) (context.Context, *ctrl.Result, error) { | ||
| l := log.FromContext(ctx) | ||
| l.Info("getting installed bundle") | ||
| revisionStates, err := r.GetRevisionStates(ctx, ext) | ||
| if err != nil { | ||
| setInstallStatus(ext, nil) | ||
| var saerr *authentication.ServiceAccountNotFoundError | ||
| if errors.As(err, &saerr) { | ||
| setInstalledStatusConditionUnknown(ext, saerr.Error()) | ||
| setStatusProgressing(ext, errors.New("installation cannot proceed due to missing ServiceAccount")) | ||
| return ctx, nil, err | ||
| } | ||
| setInstalledStatusConditionUnknown(ext, err.Error()) | ||
| setStatusProgressing(ext, errors.New("retrying to get installed bundle")) | ||
| return ctx, nil, err | ||
| } | ||
| return context.WithValue(ctx, revisionStatesKey{}, revisionStates), nil, nil | ||
| } | ||
| } | ||
|
|
||
| func RetrieveRevisionMetadata(r resolve.Resolver) ReconcileStepFunc { | ||
| return func(ctx context.Context, ext *ocv1.ClusterExtension) (context.Context, *ctrl.Result, error) { | ||
| l := log.FromContext(ctx) | ||
| revisionStates := ctx.Value(revisionStatesKey{}).(*RevisionStates) | ||
| var resolvedRevisionMetadata *RevisionMetadata | ||
| if len(revisionStates.RollingOut) == 0 { | ||
| l.Info("resolving bundle") | ||
| var bm *ocv1.BundleMetadata | ||
| if revisionStates.Installed != nil { | ||
| bm = &revisionStates.Installed.BundleMetadata | ||
| } | ||
| resolvedBundle, resolvedBundleVersion, resolvedDeprecation, err := r.Resolve(ctx, ext, bm) | ||
| if err != nil { | ||
| // Note: We don't distinguish between resolution-specific errors and generic errors | ||
| setStatusProgressing(ext, err) | ||
| setInstalledStatusFromRevisionStates(ext, revisionStates) | ||
| ensureAllConditionsWithReason(ext, ocv1.ReasonFailed, err.Error()) | ||
| return ctx, nil, err | ||
| } | ||
|
|
||
| // set deprecation status after _successful_ resolution | ||
| // TODO: | ||
| // 1. It seems like deprecation status should reflect the currently installed bundle, not the resolved | ||
| // bundle. So perhaps we should set package and channel deprecations directly after resolution, but | ||
| // defer setting the bundle deprecation until we successfully install the bundle. | ||
| // 2. If resolution fails because it can't find a bundle, that doesn't mean we wouldn't be able to find | ||
| // a deprecation for the ClusterExtension's spec.packageName. Perhaps we should check for a non-nil | ||
| // resolvedDeprecation even if resolution returns an error. If present, we can still update some of | ||
| // our deprecation status. | ||
| // - Open question though: what if different catalogs have different opinions of what's deprecated. | ||
| // If we can't resolve a bundle, how do we know which catalog to trust for deprecation information? | ||
| // Perhaps if the package shows up in multiple catalogs and deprecations don't match, we can set | ||
| // the deprecation status to unknown? Or perhaps we somehow combine the deprecation information from | ||
| // all catalogs? | ||
| SetDeprecationStatus(ext, resolvedBundle.Name, resolvedDeprecation) | ||
| resolvedRevisionMetadata = &RevisionMetadata{ | ||
| Package: resolvedBundle.Package, | ||
| Image: resolvedBundle.Image, | ||
| BundleMetadata: bundleutil.MetadataFor(resolvedBundle.Name, *resolvedBundleVersion), | ||
| } | ||
| } else { | ||
| resolvedRevisionMetadata = revisionStates.RollingOut[0] | ||
| } | ||
| return context.WithValue(ctx, resolvedRevisionMetadataKey{}, resolvedRevisionMetadata), nil, nil | ||
| } | ||
| } | ||
|
|
||
| func UnpackBundle(i imageutil.Puller, cache imageutil.Cache) ReconcileStepFunc { | ||
| return func(ctx context.Context, ext *ocv1.ClusterExtension) (context.Context, *ctrl.Result, error) { | ||
| l := log.FromContext(ctx) | ||
| l.Info("unpacking resolved bundle") | ||
| resolvedRevisionMetadata := ctx.Value(resolvedRevisionMetadataKey{}).(*RevisionMetadata) | ||
| imageFS, _, _, err := i.Pull(ctx, ext.GetName(), resolvedRevisionMetadata.Image, cache) | ||
| if err != nil { | ||
| revisionStates := ctx.Value(revisionStatesKey{}).(*RevisionStates) | ||
| // Wrap the error passed to this with the resolution information until we have successfully | ||
| // installed since we intend for the progressing condition to replace the resolved condition | ||
| // and will be removing the .status.resolution field from the ClusterExtension status API | ||
| setStatusProgressing(ext, wrapErrorWithResolutionInfo(resolvedRevisionMetadata.BundleMetadata, err)) | ||
| setInstalledStatusFromRevisionStates(ext, revisionStates) | ||
| return ctx, nil, err | ||
| } | ||
| return context.WithValue(ctx, imageFSKey{}, imageFS), nil, nil | ||
| } | ||
| } | ||
|
|
||
| func ApplyBundle(a Applier) ReconcileStepFunc { |
Copilot
AI
Nov 13, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exported functions HandleFinalizers, RetrieveRevisionStates, RetrieveRevisionMetadata, UnpackBundle, and ApplyBundle lack documentation. These are the core building blocks of the new reconciliation architecture and should have doc comments explaining their purpose, inputs, outputs, and any side effects.
Example for HandleFinalizers:
// HandleFinalizers returns a ReconcileStepFunc that manages finalizers for the ClusterExtension.
// It ensures finalizers are added on resource creation and properly handles cleanup on deletion.
// If finalizers are updated, reconciliation stops to allow the changes to be persisted.
func HandleFinalizers(f finalizer.Finalizer) ReconcileStepFunc {| return func(ctx context.Context, ext *ocv1.ClusterExtension) (context.Context, *ctrl.Result, error) { | ||
| l := log.FromContext(ctx) | ||
| l.Info("unpacking resolved bundle") | ||
| resolvedRevisionMetadata := ctx.Value(resolvedRevisionMetadataKey{}).(*RevisionMetadata) |
Copilot
AI
Nov 13, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The UnpackBundle step performs an unsafe type assertion on line 141: ctx.Value(resolvedRevisionMetadataKey{}).(*RevisionMetadata). If the previous RetrieveRevisionMetadata step is skipped or the context doesn't contain this key, this will panic.
Consider adding a nil check before the type assertion.
| resolvedRevisionMetadata := ctx.Value(resolvedRevisionMetadataKey{}).(*RevisionMetadata) | |
| val := ctx.Value(resolvedRevisionMetadataKey{}) | |
| resolvedRevisionMetadata, ok := val.(*RevisionMetadata) | |
| if !ok || resolvedRevisionMetadata == nil { | |
| return ctx, nil, errors.New("resolved revision metadata not found in context") | |
| } |
| // - A potentially modified context for the next step | ||
| // - An optional reconciliation result that if non-nil will stop reconciliation | ||
| // - Any error that occurred during reconciliation, which will be returned to the caller | ||
| type ReconcileStepFunc func(context.Context, *ocv1.ClusterExtension) (context.Context, *ctrl.Result, error) |
Copilot
AI
Nov 13, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ReconcileSteps type (line 67) lacks documentation. Since this is a central type in the refactored architecture, it should have a doc comment explaining its purpose and usage.
Consider adding:
// ReconcileSteps is an ordered collection of reconciliation steps that are executed sequentially.
// Each step receives the output context from the previous step, allowing data to flow through the pipeline.
type ReconcileSteps []ReconcileStepFunc| type ReconcileStepFunc func(context.Context, *ocv1.ClusterExtension) (context.Context, *ctrl.Result, error) | |
| type ReconcileStepFunc func(context.Context, *ocv1.ClusterExtension) (context.Context, *ctrl.Result, error) | |
| // ReconcileSteps is an ordered collection of reconciliation steps that are executed sequentially. | |
| // Each step receives the output context from the previous step, allowing data to flow through the pipeline. |
| func RetrieveRevisionMetadata(r resolve.Resolver) ReconcileStepFunc { | ||
| return func(ctx context.Context, ext *ocv1.ClusterExtension) (context.Context, *ctrl.Result, error) { | ||
| l := log.FromContext(ctx) | ||
| revisionStates := ctx.Value(revisionStatesKey{}).(*RevisionStates) |
Copilot
AI
Nov 13, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RetrieveRevisionStates step performs an unsafe type assertion on line 93: ctx.Value(revisionStatesKey{}).(*RevisionStates). If the previous RetrieveRevisionStates step is skipped or the context doesn't contain this key, this will panic.
Consider adding a nil check:
revisionStatesValue := ctx.Value(revisionStatesKey{})
if revisionStatesValue == nil {
return ctx, nil, errors.New("revision states not found in context")
}
revisionStates := revisionStatesValue.(*RevisionStates)| revisionStates := ctx.Value(revisionStatesKey{}).(*RevisionStates) | |
| revisionStatesValue := ctx.Value(revisionStatesKey{}) | |
| if revisionStatesValue == nil { | |
| return ctx, nil, errors.New("revision states not found in context") | |
| } | |
| revisionStates, ok := revisionStatesValue.(*RevisionStates) | |
| if !ok { | |
| return ctx, nil, errors.New("revision states in context has unexpected type") | |
| } |
| resolvedRevisionMetadata := ctx.Value(resolvedRevisionMetadataKey{}).(*RevisionMetadata) | ||
| imageFS := ctx.Value(imageFSKey{}).(fs.FS) | ||
| revisionStates := ctx.Value(revisionStatesKey{}).(*RevisionStates) |
Copilot
AI
Nov 13, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ApplyBundle step performs multiple unsafe type assertions (lines 159, 160, 161). If any of the preceding steps are skipped or fail to populate the context, these will panic:
ctx.Value(resolvedRevisionMetadataKey{}).(*RevisionMetadata)ctx.Value(imageFSKey{}).(fs.FS)ctx.Value(revisionStatesKey{}).(*RevisionStates)
Consider adding nil checks before all type assertions to prevent panics.
| resolvedRevisionMetadata := ctx.Value(resolvedRevisionMetadataKey{}).(*RevisionMetadata) | |
| imageFS := ctx.Value(imageFSKey{}).(fs.FS) | |
| revisionStates := ctx.Value(revisionStatesKey{}).(*RevisionStates) | |
| resolvedRevisionMetadataVal := ctx.Value(resolvedRevisionMetadataKey{}) | |
| if resolvedRevisionMetadataVal == nil { | |
| return ctx, nil, errors.New("missing resolvedRevisionMetadata in context") | |
| } | |
| resolvedRevisionMetadata, ok := resolvedRevisionMetadataVal.(*RevisionMetadata) | |
| if !ok { | |
| return ctx, nil, errors.New("invalid type for resolvedRevisionMetadata in context") | |
| } | |
| imageFSVal := ctx.Value(imageFSKey{}) | |
| if imageFSVal == nil { | |
| return ctx, nil, errors.New("missing imageFS in context") | |
| } | |
| imageFS, ok := imageFSVal.(fs.FS) | |
| if !ok { | |
| return ctx, nil, errors.New("invalid type for imageFS in context") | |
| } | |
| revisionStatesVal := ctx.Value(revisionStatesKey{}) | |
| if revisionStatesVal == nil { | |
| return ctx, nil, errors.New("missing revisionStates in context") | |
| } | |
| revisionStates, ok := revisionStatesVal.(*RevisionStates) | |
| if !ok { | |
| return ctx, nil, errors.New("invalid type for revisionStates in context") | |
| } |
|
|
||
| type revisionStatesKey struct{} | ||
| type resolvedRevisionMetadataKey struct{} |
Copilot
AI
Nov 13, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing documentation for the context value key types revisionStatesKey, resolvedRevisionMetadataKey, and imageFSKey. These types are used as context keys but lack any explanation of their purpose or what values they hold.
Consider adding doc comments:
// revisionStatesKey is the context key for storing *RevisionStates
type revisionStatesKey struct{}
// resolvedRevisionMetadataKey is the context key for storing *RevisionMetadata
type resolvedRevisionMetadataKey struct{}
// imageFSKey is the context key for storing fs.FS
type imageFSKey struct{}| type revisionStatesKey struct{} | |
| type resolvedRevisionMetadataKey struct{} | |
| // revisionStatesKey is the context key for storing *RevisionStates. | |
| type revisionStatesKey struct{} | |
| // resolvedRevisionMetadataKey is the context key for storing *RevisionMetadata. | |
| type resolvedRevisionMetadataKey struct{} | |
| // imageFSKey is the context key for storing fs.FS. |
| revisionStates := ctx.Value(revisionStatesKey{}).(*RevisionStates) | ||
| // Wrap the error passed to this with the resolution information until we have successfully | ||
| // installed since we intend for the progressing condition to replace the resolved condition | ||
| // and will be removing the .status.resolution field from the ClusterExtension status API | ||
| setStatusProgressing(ext, wrapErrorWithResolutionInfo(resolvedRevisionMetadata.BundleMetadata, err)) | ||
| setInstalledStatusFromRevisionStates(ext, revisionStates) |
Copilot
AI
Nov 13, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The UnpackBundle step retrieves revisionStates from context on line 144 only to use it in error handling (lines 149), but it's never used if unpacking succeeds. This type assertion could panic if the context doesn't contain the key.
Consider either:
- Adding a nil check before the type assertion
- Moving this retrieval inside the error handling block where it's actually needed
| revisionStates := ctx.Value(revisionStatesKey{}).(*RevisionStates) | |
| // Wrap the error passed to this with the resolution information until we have successfully | |
| // installed since we intend for the progressing condition to replace the resolved condition | |
| // and will be removing the .status.resolution field from the ClusterExtension status API | |
| setStatusProgressing(ext, wrapErrorWithResolutionInfo(resolvedRevisionMetadata.BundleMetadata, err)) | |
| setInstalledStatusFromRevisionStates(ext, revisionStates) | |
| var revisionStates *RevisionStates | |
| if v := ctx.Value(revisionStatesKey{}); v != nil { | |
| if rs, ok := v.(*RevisionStates); ok { | |
| revisionStates = rs | |
| } | |
| } | |
| // Wrap the error passed to this with the resolution information until we have successfully | |
| // installed since we intend for the progressing condition to replace the resolved condition | |
| // and will be removing the .status.resolution field from the ClusterExtension status API | |
| setStatusProgressing(ext, wrapErrorWithResolutionInfo(resolvedRevisionMetadata.BundleMetadata, err)) | |
| if revisionStates != nil { | |
| setInstalledStatusFromRevisionStates(ext, revisionStates) | |
| } |
| for _, opt := range opts { | ||
| opt(reconciler) | ||
| } | ||
| reconciler.ReconcileSteps = []controllers.ReconcileStepFunc{controllers.HandleFinalizers(reconciler.Finalizers), controllers.RetrieveRevisionStates(reconciler.RevisionStatesGetter)} |
Copilot
AI
Nov 13, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The newClientAndReconciler function unconditionally sets up ReconcileSteps starting with HandleFinalizers and RetrieveRevisionStates on line 94, but these steps depend on Finalizers and RevisionStatesGetter being non-nil respectively. If a test doesn't configure these via options, the steps will still be added but will fail at runtime.
Consider either:
- Checking if these dependencies are nil before adding their steps
- Ensuring all dependencies have sensible defaults
For example:
if reconciler.Finalizers != nil {
reconciler.ReconcileSteps = append(reconciler.ReconcileSteps, controllers.HandleFinalizers(reconciler.Finalizers))
}
if reconciler.RevisionStatesGetter != nil {
reconciler.ReconcileSteps = append(reconciler.ReconcileSteps, controllers.RetrieveRevisionStates(reconciler.RevisionStatesGetter))
}| reconciler.ReconcileSteps = []controllers.ReconcileStepFunc{controllers.HandleFinalizers(reconciler.Finalizers), controllers.RetrieveRevisionStates(reconciler.RevisionStatesGetter)} | |
| reconciler.ReconcileSteps = []controllers.ReconcileStepFunc{} | |
| if reconciler.Finalizers != nil { | |
| reconciler.ReconcileSteps = append(reconciler.ReconcileSteps, controllers.HandleFinalizers(reconciler.Finalizers)) | |
| } | |
| if reconciler.RevisionStatesGetter != nil { | |
| reconciler.ReconcileSteps = append(reconciler.ReconcileSteps, controllers.RetrieveRevisionStates(reconciler.RevisionStatesGetter)) | |
| } |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2332 +/- ##
==========================================
+ Coverage 74.23% 74.42% +0.18%
==========================================
Files 91 93 +2
Lines 7239 7300 +61
==========================================
+ Hits 5374 5433 +59
- Misses 1433 1434 +1
- Partials 432 433 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
perdasilva
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I think decomposing the reconciler into its main steps / functions makes sense as it makes things easier to test in isolation and fosters an additive architecture.
Although, I don't feel like a well structured monolith is necessarily a problem when the number of steps is relatively small, and the pipeline is pretty linear since that makes handling the code easier imo.
Another concern is passing values between the steps using context. I think there's a few drawbacks like bypassing compile type checks, and opaque/implicit dependencies. Would having a state object instead be a better alternative? Such an object might also provide a good data source for determining the underlying state and setting the conditions appropriately.
We should also call out that having two CE reconcilers is a temporary state. While I think this pipeline approach can make us more nimble, I think we should be reasonably confident that, given the simplicity of the pipeline, there aren't any simpler alternatives.
E.g., what if we decompose the steps into methods to help with isolation testing, adapt the component interfaces to be usable in both helm and boxcutter world (I think we might already be there to a large extent?), and, pass state object between the steps and drive status setting from what falls out of that.
Having said that, I don't think this is complexity we cannot live with. I'm just wondering if we should call out what problems it solves, and why its preferable to any other reasonable alternative. I'm also curious to see what others think.
| } | ||
| } | ||
|
|
||
| func RetrieveRevisionMetadata(r resolve.Resolver) ReconcileStepFunc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should name this in a way that makes it clear that this is the resolution step? E.g.
| func RetrieveRevisionMetadata(r resolve.Resolver) ReconcileStepFunc { | |
| func ResolveBundle(r resolve.Resolver) ReconcileStepFunc { |
or something like that
The motivation for this PR came out the work in #2281 where we have realized that handling conditions with Boxcutter differently would require introducing conditional executions depending on the backend type in one part of the reconcile loop, although the rest of the reconciliation is the same. In order to keep things easy to understand and test, I have divided the reconciliation into steps so that some of them could be swapped out/replaced depending on the used backend. Moreover, we are opening a possibility to add write simpler unit tests.
Golang context is used for sharing request-scoped cross-cutting concerns by many frameworks. The added
I am not sure what would be the difference to what is the proposed here? Can you provide some sketches? |
cmd/operator-controller/main.go
Outdated
| controllers.HandleFinalizers(ceReconciler.Finalizers), | ||
| controllers.RetrieveRevisionStates(ceReconciler.RevisionStatesGetter), | ||
| controllers.RetrieveRevisionMetadata(ceReconciler.Resolver), | ||
| controllers.UnpackBundle(ceReconciler.ImagePuller, ceReconciler.ImageCache), | ||
| controllers.ApplyBundle(ceReconciler.Applier), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need Finalizers, RevisionStatesGetter, Resolver, ImagePuller, etc. to still be fields on the ceReconciler? It seems like those details are no longer a direct concern of the reconciler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need
Finalizers,RevisionStatesGetter,Resolver,ImagePuller, etc. to still be fields on theceReconciler? It seems like those details are no longer a direct concern of the reconciler?
Correct, I would remove them if we agree about PR direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed, ptal.
Maybe we could use a state/result type struct to collect the the results of each step and use that instead of the context to pass results between subsequent steps, e.g.: Errors could either be embedded in the step results or just be raised by the step function itself through its return values like we have it now. If the only point of contention is condition setting, we could strip the steps of setting any conditions and just have one step at the end that does it based on the results. Then have one impl for helm and one for boxcutter and just configure the reconciler accordingly. If the folks feel a framework for a linear 5-6 step pipeline is overkill (I don't have super strong opinions here - I can live with either). We could still get the testing benefits by decomposing the reconcile function similarly to what you have done here but just call them in turn to eliminate the indirection.
I get that. I'm just wondering if we're better off using a state object to carry results between steps. The impression I got from searching around and talking to Gemini is that context values are typically used for more lightweight information such as request metadata (ids, users, headers, etc.) rather than intermediate computation results. I think we should call out the motivation in the description. As it is now, we've refactored things to make it explicit that storage migration isn't part of the helm reconciler. I hope all of this is cogent. I'm pretty beat. But we can take it up again tomorrow ^^ |
Having a dedicated state instead of the golang context:
I would not consider as a framework something that consists of 15 lines of code: func (steps *ReconcileSteps) Reconcile(ctx context.Context, ext *ocv1.ClusterExtension) (ctrl.Result, error) {
var res *ctrl.Result
var err error
nextStepCtx := ctx
for _, step := range *steps {
nextStepCtx, res, err = step(nextStepCtx, ext)
if err != nil {
return ctrl.Result{}, err
}
if res != nil {
return *res, nil
}
if nextStepCtx == nil {
return ctrl.Result{}, nil
}
}
return ctrl.Result{}, nil
}
Not really, because the wiring is fixed - by the introduced indirection, we are now able to add for example migration to boxcutter, and apply bundles and reports conditions in a different way in cases of Helm and Boxcutter backends.
Done. |
|
To help overcome the concern about the context value type safety, perhaps we could encapsulate the context value handling into a sub-internal package with 100% test coverage, where only typed accessor/getter functions are exported and available to our reconcile functions? The only downside at that point would be that we'd need |
|
In general, I see this design as an improvement over what we currently have because it tidies things up a bit, although I also think @perdasilva's comments have merit. Decomposing and separating out the set of functions and their order makes things easier to physically see. But I think it does potentially introduce surface area for unintended divergence between helm and boxcutter setups (but maybe our test suite has us covered there?). And I do agree with @perdasilva that context values are worse, in isolation, than our current monolithic structure where the right values are always available in the right places. We could define a single interface that has methods for each of the steps with the right signatures for data exchange. That would be brittle, but maybe the brittleness would be a benefit because it would force us to confront and account for the changes needed in one reconcile flavor vs another? |
|
Thank you, @pedjak, for this thoughtful refactoring! I really appreciate the effort to improve structure and testability. After reviewing the changes, I’m not fully sure this pattern is the best fit for our situation. I might be wrong, but since we may end up with a single runtime long-term (likely only Boxcutter), the extra flexibility could turn into unnecessary complexity. In that case, the benefits of the step framework might be short-lived, while the cost of maintaining and later unwinding it remains. The current Helm/Boxcutter reconcile is already quite simple and explicit. This PR adds several new types, files, and ~100 lines of framework logic. That feels like more machinery than we need right now. If/when Helm gets removed, the existing code path is very easy to clean up. With the framework pattern, we’d have more scaffolding to unwind, even though the composability is no longer needed. I also have some technical concerns around how A Simpler Alternative ( Common Approach for Reconciles )If the main goal is improving testability and separation of concerns, we could break the reconciliation flow into smaller helper methods with explicit parameters. This keeps things type-safe, idiomatic, and easier to follow, while still giving us good structure: func (r *ClusterExtensionReconciler) reconcile(ctx context.Context, ext *ocv1.ClusterExtension) (ctrl.Result, error) {
if res, err := r.handleFinalizers(ctx, ext); res != nil || err != nil {
return *res, err
}
if res, err := r.migrateStorageIfNeeded(ctx, ext); res != nil || err != nil {
return *res, err
}
states, err := r.getRevisionStates(ctx, ext)
if err != nil {
return r.handleRevisionStatesError(ctx, ext, err)
}
resolved, err := r.resolveBundle(ctx, ext, states)
if err != nil {
return r.handleResolutionError(ctx, ext, states, err)
}
imageFS, err := r.unpackBundle(ctx, ext, resolved)
if err != nil {
return r.handleUnpackError(ctx, ext, states, resolved, err)
}
return r.applyBundle(ctx, ext, states, resolved, imageFS)
} |
|
Hi @pedjak, Just to supplement the discussion — and please take this as a gentle suggestion: Another important point I'd like to raise is that we should be using controller-runtime's EnvTest (https://book.kubebuilder.io/reference/envtest) to validate our reconcilers, rather than relying on unit tests. Without EnvTest, we're not really exercising the reconciler in a realistic control-plane environment, so these changes won't fully address the underlying issue. Unit tests also require mocks that EnvTest makes unnecessary. If we follow the recommended EnvTest approach, would we really need the testing benefits this PR is trying to introduce? I'm honestly not sure we would, because unit-testing reconcilers isn't considered a good practice — EnvTest already provides the proper, realistic way to test them without the need for mocks or additional framework layers. I would gently suggest we hold this change for now, until we have:
Once we have that information, it will be much easier to decide whether this framework approach truly fits our future needs. |
864c12f to
1cf61b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
@joelanford @perdasilva I have added a dedicated reconcile state holder for exchanging data between the steps, ptal. IMO we do not need to perform |
Not really, it adds just a step function definition and ~15 lines of code that iterate over the provided steps. The rest of PR is just splitting the long existing reconcile function into a number of logically separated steps, boundaries were already visible by dedicated log messages like:
I mentioned already the motivation for this move in PR description - we have currently two backends and some apply and condition logic is going to be different. Instead of trying to add more conditional logic, splitting it into a chain if steps that could replaced/extended is good way to combat this problem and obtain additional benefits long-term.
It depends how you look at: an argument for the reconcile function is |
|
I think the addition of the state object is a good compromise here and resolves the runtime type check issue, which was my biggest concern. If we can get agreement amongst the reviewers here, I'm happy to move forward with this. |
…ipeline
Replaces the monolithic 170-line reconcile() method with a flexible
step-based architecture that executes discrete reconciliation phases in
sequence. Each phase (`HandleFinalizers`, `RetrieveRevisionStates`,
`RetrieveRevisionMetadata`, `UnpackBundle`, `ApplyBundle`) is now a standalone
function that can be composed differently for Helm vs Boxcutter workflows.
Changes:
- Introduce `ReconcileStepFunc` type and `ReconcileSteps` executor
- Extract reconcile logic into individual step functions in new file
`clusterextension_reconcile_steps.go`
- Move `BoxcutterRevisionStatesGetter` to `boxcutter_reconcile_steps.go`
alongside `MigrateStorage` step
- Configure step pipelines in `main.go` for each applier type
- Refactor tests to use functional options pattern for reconciler setup
1cf61b6 to
fa112fe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| type ReconcileStepFunc func(context.Context, *reconcileState, *ocv1.ClusterExtension) (*ctrl.Result, error) | ||
|
|
||
| // ReconcileSteps is an ordered collection of reconciliation steps that are executed sequentially. | ||
| // Each step receives the output context from the previous step, allowing data to flow through the pipeline. |
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation states "Each step receives the output context from the previous step, allowing data to flow through the pipeline" but the context is not modified between steps - it's the same context passed to all steps. The state flows through via the reconcileState parameter, not the context. This should be clarified to say "Each step receives the shared state from previous steps" or similar.
| // Each step receives the output context from the previous step, allowing data to flow through the pipeline. | |
| // Each step receives the shared state from previous steps, allowing data to flow through the pipeline. |
| // It takes a context and ClusterExtension object as input and executes each step in the ReconcileSteps slice. | ||
| // If any step returns an error, reconciliation stops and the error is returned. | ||
| // If any step returns a non-nil ctrl.Result, reconciliation stops and that result is returned. | ||
| // If any step returns a nil context, reconciliation stops successfully. |
Copilot
AI
Nov 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation states "If any step returns a nil context, reconciliation stops successfully" but the implementation doesn't check for nil context. Steps never return a context - they return (*ctrl.Result, error). This documentation line appears to be incorrect or copied from a different implementation pattern.
| // If any step returns a nil context, reconciliation stops successfully. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Description
Replaces the monolithic 170-line reconcile() method with a flexible step-based architecture that executes discrete reconciliation phases in sequence. Each phase (
HandleFinalizers,RetrieveRevisionStates,RetrieveRevisionMetadata,UnpackBundle,ApplyBundle) is now a standalone function that can be composed differently for Helm vs Boxcutter workflows.The motivation for this PR came out the work in #2281 where we have realized that handling conditions with Boxcutter differently would require introducing conditional executions depending on the backend type in one part of the reconcile loop, although the rest of the reconciliation is the same. In order to keep things easy to understand and test, I have divided the reconciliation into steps so that some of them could be swapped out/replaced depending on the used backend. Moreover, we are opening a possibility to add and write simpler unit tests.
Changes:
ReconcileStepFunctype andReconcileStepsexecutorclusterextension_reconcile_steps.goBoxcutterRevisionStatesGettertoboxcutter_reconcile_steps.goalongsideMigrateStoragestepmain.gofor each applier typeReviewer Checklist