-
Notifications
You must be signed in to change notification settings - Fork 13
fix: respect failure limits and fix completion hooks #725
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughRefines per-metric failure evaluation and completion behavior, prevents duplicate OnVerificationComplete invocations via per-verification tracking, adjusts FailureLimit usage in health-check setup/tests, and extends certain test loops to append one extra measurement. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Disabled knowledge base sources:
📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (1)apps/workspace-engine/**/*.go📄 CodeRabbit inference engine (apps/workspace-engine/CLAUDE.md)
Files:
🧠 Learnings (1)📓 Common learnings🧬 Code graph analysis (1)apps/workspace-engine/pkg/oapi/oapi.go (2)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
🔇 Additional comments (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
apps/workspace-engine/pkg/oapi/oapi.go (1)
156-191: Dead code:anyFailedis never set to true.The
anyFailedvariable is declared on line 157 but is never assignedtrueanywhere in the function. The check on lines 186-188 is therefore unreachable. If this behavior is intentional (i.e., verification passes as long as no metric exceeds its failure limit), remove the dead code for clarity.func (rv *ReleaseVerification) Status() ReleaseVerificationStatus { if len(rv.Metrics) == 0 { return ReleaseVerificationStatusRunning } allCompleted := true - anyFailed := false for _, metric := range rv.Metrics { // Check if this metric has hit its failure limit failureLimit := metric.GetFailureLimit() failedCount := 0 for _, m := range metric.Measurements { if !m.Passed { failedCount++ } } if failedCount > failureLimit { return ReleaseVerificationStatusFailed } // Check if metric is complete if len(metric.Measurements) < metric.Count { allCompleted = false continue } } // If any metric is incomplete, still running if !allCompleted { return ReleaseVerificationStatusRunning } - // All metrics complete - if anyFailed { - return ReleaseVerificationStatusFailed - } - return ReleaseVerificationStatusPassed }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
apps/workspace-engine/pkg/oapi/oapi.go(1 hunks)apps/workspace-engine/pkg/workspace/jobdispatch/argocd.go(1 hunks)apps/workspace-engine/pkg/workspace/releasemanager/verification/manager_test.go(2 hunks)apps/workspace-engine/pkg/workspace/releasemanager/verification/scheduler.go(3 hunks)apps/workspace-engine/pkg/workspace/store/release_targets_test.go(0 hunks)
💤 Files with no reviewable changes (1)
- apps/workspace-engine/pkg/workspace/store/release_targets_test.go
🧰 Additional context used
📓 Path-based instructions (2)
apps/workspace-engine/**/*.go
📄 CodeRabbit inference engine (apps/workspace-engine/CLAUDE.md)
apps/workspace-engine/**/*.go: Do not add extraneous inline comments that state the obvious
Do not add comments that simply restate what the code does
Do not add comments for standard Go patterns (e.g., noting WaitGroup or semaphore usage)
Write comments that explain why, document complex logic/algorithms, provide non-obvious context, include TODO/FIXME, and document exported functions/types/methods
Files:
apps/workspace-engine/pkg/workspace/releasemanager/verification/manager_test.goapps/workspace-engine/pkg/workspace/jobdispatch/argocd.goapps/workspace-engine/pkg/workspace/releasemanager/verification/scheduler.goapps/workspace-engine/pkg/oapi/oapi.go
apps/workspace-engine/**/*_test.go
📄 CodeRabbit inference engine (apps/workspace-engine/CLAUDE.md)
Follow the existing test structure used in *_test.go files
Files:
apps/workspace-engine/pkg/workspace/releasemanager/verification/manager_test.go
🧠 Learnings (6)
📓 Common learnings
Learnt from: adityachoudhari26
Repo: ctrlplanedev/ctrlplane PR: 637
File: packages/events/src/kafka/client.ts:10-16
Timestamp: 2025-08-01T04:41:41.345Z
Learning: User adityachoudhari26 prefers not to add null safety checks for required environment variables when they are guaranteed to be present in their deployment configuration, similar to their preference for simplicity over defensive programming in test code.
Learnt from: adityachoudhari26
Repo: ctrlplanedev/ctrlplane PR: 601
File: e2e/tests/api/policies/retry-policy.spec.ts:23-24
Timestamp: 2025-06-24T23:52:50.732Z
Learning: The user adityachoudhari26 prefers not to add null safety checks or defensive programming in test code, particularly in e2e tests, as they prioritize simplicity and focus on the main functionality being tested rather than comprehensive error handling within the test itself.
📚 Learning: 2025-08-12T18:13:54.630Z
Learnt from: CR
Repo: ctrlplanedev/ctrlplane PR: 0
File: apps/workspace-engine/CLAUDE.md:0-0
Timestamp: 2025-08-12T18:13:54.630Z
Learning: Applies to apps/workspace-engine/**/*_test.go : Follow the existing test structure used in *_test.go files
Applied to files:
apps/workspace-engine/pkg/workspace/releasemanager/verification/manager_test.go
📚 Learning: 2025-08-12T18:13:54.630Z
Learnt from: CR
Repo: ctrlplanedev/ctrlplane PR: 0
File: apps/workspace-engine/CLAUDE.md:0-0
Timestamp: 2025-08-12T18:13:54.630Z
Learning: Applies to apps/workspace-engine/pkg/model/selector/**/*_test.go : Write comprehensive, data-driven tests for new condition types
Applied to files:
apps/workspace-engine/pkg/workspace/releasemanager/verification/manager_test.go
📚 Learning: 2025-08-12T18:13:54.630Z
Learnt from: CR
Repo: ctrlplanedev/ctrlplane PR: 0
File: apps/workspace-engine/CLAUDE.md:0-0
Timestamp: 2025-08-12T18:13:54.630Z
Learning: Applies to apps/workspace-engine/pkg/model/selector/**/*_test.go : Use table-driven tests for all condition types
Applied to files:
apps/workspace-engine/pkg/workspace/releasemanager/verification/manager_test.go
📚 Learning: 2025-08-12T18:13:54.630Z
Learnt from: CR
Repo: ctrlplanedev/ctrlplane PR: 0
File: apps/workspace-engine/CLAUDE.md:0-0
Timestamp: 2025-08-12T18:13:54.630Z
Learning: Applies to apps/workspace-engine/pkg/model/selector/**/*_test.go : Include edge cases in tests (empty values, special characters, unicode) for condition types
Applied to files:
apps/workspace-engine/pkg/workspace/releasemanager/verification/manager_test.go
📚 Learning: 2025-08-12T18:13:54.630Z
Learnt from: CR
Repo: ctrlplanedev/ctrlplane PR: 0
File: apps/workspace-engine/CLAUDE.md:0-0
Timestamp: 2025-08-12T18:13:54.630Z
Learning: Applies to apps/workspace-engine/pkg/model/selector/**/*_test.go : Test validation and matching logic separately for condition types
Applied to files:
apps/workspace-engine/pkg/workspace/releasemanager/verification/manager_test.go
🧬 Code graph analysis (3)
apps/workspace-engine/pkg/workspace/jobdispatch/argocd.go (1)
apps/workspace-engine/pkg/oapi/oapi.gen.go (1)
VerificationMetricSpec(756-772)
apps/workspace-engine/pkg/workspace/releasemanager/verification/scheduler.go (2)
apps/workspace-engine/pkg/workspace/store/store.go (1)
Store(42-68)apps/workspace-engine/pkg/workspace/releasemanager/verification/hooks.go (1)
VerificationHooks(8-23)
apps/workspace-engine/pkg/oapi/oapi.go (2)
apps/workspace-engine/pkg/oapi/oapi.gen.go (1)
ReleaseVerificationStatusFailed(118-118)apps/workspace-engine/pkg/workspace/releasemanager/verification/metrics/measurements.go (1)
Measurements(8-8)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Format
- GitHub Check: Typecheck
- GitHub Check: Lint
- GitHub Check: tests
- GitHub Check: build (linux/amd64)
- GitHub Check: workspace-engine-tests
🔇 Additional comments (7)
apps/workspace-engine/pkg/workspace/jobdispatch/argocd.go (1)
338-348: LGTM - Sensible default failure limit for ArgoCD health verification.The addition of
failureLimit = 5provides reasonable tolerance for transient failures during ArgoCD application health checks. WithCount: 5andFailureLimit: 5, this allows all measurements to potentially fail without triggering an immediate verification failure (since the status logic usesfailedCount > failureLimit).apps/workspace-engine/pkg/workspace/releasemanager/verification/manager_test.go (2)
382-390: Test correctly updated to match new failure threshold logic.The loop now iterates
<= FailureLimitto produceFailureLimit + 1failed measurements, which correctly triggers theFailedstatus under the updatedfailedCount > failureLimitcomparison inoapi.go.
433-441: Consistent update for mixed states test.Same adjustment as above, ensuring the failed verification has enough failures to trigger the
Failedstatus with the new threshold logic.apps/workspace-engine/pkg/oapi/oapi.go (1)
169-171: Verify the off-by-one boundary for failure limit.The change from
>=to>means that withFailureLimit: 0(the default meaning "no limit"), the checkfailedCount > 0will fail on the first failure. This may not be the intended behavior for "no limit".Consider whether
FailureLimit: 0should truly mean "no limit" (never fail due to failure count) or "fail on first failure". If "no limit" is intended, this check should skip entirely whenfailureLimit == 0:- if failedCount > failureLimit { + if failureLimit > 0 && failedCount > failureLimit { return ReleaseVerificationStatusFailed }apps/workspace-engine/pkg/workspace/releasemanager/verification/scheduler.go (3)
22-24: Good addition of per-verification completion tracking.The
completionHookFiredmap properly ensuresOnVerificationCompleteis invoked at most once per verification lifecycle, preventing duplicate notifications when multiple metrics complete concurrently.
93-93: Correct cleanup of completion hook state.Removing the
completionHookFiredentry when stopping a verification ensures that if the verification is restarted, the completion hook can fire again appropriately.
313-322: Thread-safe idempotent completion hook invocation.The check and set of
completionHookFired[verificationID]is properly protected by the mutex acquired at line 240, ensuring the hook fires exactly once even with concurrent metric completions.
52c0cc7 to
a9fd4d9
Compare
Summary by CodeRabbit
Bug Fixes
Changes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.