-
Notifications
You must be signed in to change notification settings - Fork 68
⚠️ ClusterExtentionRevision conditions #2340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the ClusterExtensionRevision conditions system to better distinguish between availability and progression states. The changes introduce a new "Progressing" condition type and update condition reasons to be more descriptive of the revision lifecycle states.
Key changes:
- Introduces
Progressingcondition type alongside refactoredAvailableandSucceededconditions - Replaces generic condition reasons with specific lifecycle states (e.g.,
RollingOut,RolledOut,Retrying,ProbesSucceeded) - Adds comprehensive E2E tests for ClusterExtensionRevision condition behavior
- Removes enum validation for
CollisionProtectionfield in CRDs
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| api/v1/clusterextensionrevision_types.go | Refactors condition type constants and adds Progressing type, updates condition reason constants to reflect specific lifecycle states |
| internal/operator-controller/controllers/clusterextensionrevision_controller.go | Implements condition setting logic with new helper functions and updates reconciliation flow to properly set Progressing/Available conditions |
| internal/operator-controller/controllers/clusterextensionrevision_controller_test.go | Updates unit tests to validate new condition reasons and adds test coverage for error scenarios |
| internal/operator-controller/applier/boxcutter.go | Adds handling for Retrying reason in progressing condition |
| test/e2e/cluster_extension_revision_test.go | Adds comprehensive E2E test covering revision conditions, availability probes, and archiving |
| test/e2e/e2e_suite_test.go | Adds Kubernetes clientset for pod exec operations in E2E tests |
| manifests/experimental.yaml, manifests/experimental-e2e.yaml, helm/olmv1/base/operator-controller/crd/experimental/olm.operatorframework.io_clusterextensionrevisions.yaml | Adds Progressing column to kubectl output and removes CollisionProtection enum validation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
793f821 to
ae7d78d
Compare
ae7d78d to
8a492cb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Show resolved
Hide resolved
8a492cb to
033aedf
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2340 +/- ##
==========================================
- Coverage 74.23% 70.99% -3.25%
==========================================
Files 91 91
Lines 7239 7199 -40
==========================================
- Hits 5374 5111 -263
- Misses 1433 1656 +223
Partials 432 432
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
033aedf to
537ad96
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (2)
internal/operator-controller/controllers/clusterextensionrevision_controller_test.go:915
- Corrected spelling of 'InTransistion' to 'InTransition'.
func (m mockRevisionResult) InTransistion() bool {
internal/operator-controller/controllers/clusterextensionrevision_controller_test.go:953
- Corrected spelling of 'InTransistion' to 'InTransition'.
func (m mockPhaseResult) InTransistion() bool {
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
537ad96 to
d9d9924
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Outdated
Show resolved
Hide resolved
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Show resolved
Hide resolved
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Show resolved
Hide resolved
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Show resolved
Hide resolved
|
|
||
| func setRetryingConditions(cer *ocv1.ClusterExtensionRevision, message string) { | ||
| markAsProgressing(cer, ocv1.ClusterExtensionRevisionReasonRetrying, message) | ||
| markAsAvailableUnknown(cer, message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if Available is not already present on status we should not introduce it with unknown status.
0d09d95 to
31dfacc
Compare
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Show resolved
Hide resolved
31dfacc to
12cfcc5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Outdated
Show resolved
Hide resolved
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Outdated
Show resolved
Hide resolved
12cfcc5 to
c3c0e14
Compare
c3c0e14 to
fcb2a0f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if err != nil { | ||
| if rres != nil { | ||
| l.Error(err, "revision reconcile failed") | ||
| l.V(1).Info("reconcile failure report", "report", rres.String()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@perdasilva here we need keep debug logs.
I will clarify that in the code and create a test
OR we must remove those reports they are too verbose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pedjak @perdasilva You both have been working with this more recently.
I also don’t find the report very valuable in the logs — it’s confusing and hard to read.
What do you think about removing the report entirely?
And instead, formatting the message in a clearer way, similar to what we do in other places, so it becomes more meaningful? It would either solve the bug above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've re-added it (but only in debug mode). Let's leave it like that for this PR and address this separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got up and was looking at this today.
I’m thinking the best solution might be similar to what we do for crddiff.
We need a format to extract the report information and normalize it into both machine-readable messages and human-readable logs. Then we can use it for any situation—info or error—while normalizing and stripping out only the data that is actually validatable.
This definitely needs to be done in a separate PR, and we’ll need to discuss a better long-term approach for handling this.
Thank you for the attention !!!
c/c @pedjak
fcb2a0f to
2239a2c
Compare
2239a2c to
68a0700
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Per Goncalves da Silva <pegoncal@redhat.com> Co-authored-by: Predrag Knezevic <pknezevi@redhat.com Signed-off-by: Per Goncalves da Silva <pegoncal@redhat.com>
68a0700 to
98b7083
Compare
| // When Progressing is True and the Reason is Retrying, the ClusterExtension has encountered an error that could be resolved on subsequent reconciliation attempts. | ||
| // When Progressing is False and the Reason is Blocked, the ClusterExtension has encountered an error that requires manual intervention for recovery. | ||
| // <opcon:experimental:description> | ||
| // When Progressing is True and Reason is RolloutInProgress, the ClusterExtension has one or more ClusterExtensionRevisions in active roll out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not unifying here to have the same reason on the extension and revision? Right now we have RolloutInProgress on the extension and on the revision we have RollingOut,
|
|
||
| // ClusterExtensionRevision is the Schema for the clusterextensionrevisions API | ||
| // +kubebuilder:printcolumn:name="Available",type=string,JSONPath=`.status.conditions[?(@.type=='Available')].status` | ||
| // +kubebuilder:printcolumn:name="Progressing",type=string,JSONPath=`.status.conditions[?(@.type=='Progressing')].status` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without exposing the reason as well, status alone does not tell us enough.
| case ocv1.ClusterExtensionRevisionReasonRetrying: | ||
| return false, "", errors.New(progressingCondition.Message) | ||
| } | ||
| return false, progressingCondition.Message, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please move this line into the switch above as detault option.
Description
Summary
This PR refactors the ClusterExtensionRevision status conditions to provide clearer, more actionable feedback about revision lifecycle states. The changes introduce a new Progressing condition, simplify condition reasons, and ensure that revision-level errors (retrying states) are properly surfaced to the parent ClusterExtension.
This PR does introduce API breaking changes. But, only to experimental APIs. It should be ok to override.
Key Changes
Simplified condition types:
Updated condition reasons to be more semantic and actionable:
API Changes:
Modified the Boxcutter applier to propagate retrying errors from ClusterExtensionRevision to the parent ClusterExtension. When a revision is in a Progressing=True state with Reason=Retrying, the error is now surfaced to the ClusterExtension, providing better visibility into failed reconciliation attempts. When the revision is Progressing=True/Succeeded the applier returns success.
Added end-to-end tests for ClusterExtensionRevision lifecycle scenarios to validate the new condition states and transitions.
ProgressionreasonClusterExtensionRevision Status Conditions
Available Condition
Indicates whether the revision's objects are available and passing the object status probes.
Progressing Condition
Indicates whether the revision is progressing to its next state. It follows the same semantic as the ClusterExtension and Deployment
Progressingcondition.Not a part of this PR yet, but
Progressingcan also beBlockedfor non-retryable errors.Succeeded Condition
Indicates whether the revision has successfully completed its rollout.
Migration Notes
Reviewer Checklist