fix(control-plane): retry handler errors with exponential backoff in informer #1182
Merged
markturansky merged 5 commits into `alpha` on Apr 3, 2026
…ovision namespace on project delete

- Drop `ensureCredentialRoleBindings` from kube_reconciler: the runner authenticates via `BOT_TOKEN` (control-plane JWT), not a K8s SA token, so binding a non-existent `credential:token-reader` ClusterRole served no purpose and blocked session start
- Fix project_reconciler `EventDeleted` to call `DeprovisionNamespace` instead of logging a "namespace retained for safety" no-op

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Adds `credentials` as a valid resource type for `acpctl get`, `acpctl delete`, and `acpctl describe`, alongside existing verbs. Aliases: `credential`, `cred`, `creds`.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Adds `kind: Credential` handling to `acpctl apply -f / -k`. Supports create and patch semantics (created/configured/unchanged). The token field expands env vars (`$VAR` syntax), matching spec usage.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- `acpctl agent start` now supports `-o json`, returning the session object
- demo-github.sh: GitHub credential end-to-end demo script

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
…informer

Transient errors (e.g. TSA RoleBinding not yet propagated when the project reconciler runs `ensureRunnerSecrets`) caused events to be permanently dropped.

Add a `retryLoop` goroutine alongside `dispatchLoop`. Failed handlers are requeued onto a buffered `retryCh` with exponential backoff (2s base, 2x per attempt, 30s cap). After `retryMaxAttempts` (5) the error is logged as permanent and dropped.

Retry schedule for a single failure: 2s → 4s → 8s → 16s → 30s.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Caution: Review failed. Pull request was closed or merged during review.

📝 Walkthrough

This PR adds credential management support across acpctl (apply, get, delete, describe commands), introduces event handler retry logic with exponential backoff in the control plane informer, implements namespace deprovisioning on project deletion, and removes credential RoleBinding provisioning during session reconciliation.
Sequence Diagram(s)

sequenceDiagram
participant Handler as Event Handler
participant Dispatch as dispatchLoop
participant Retry as retryLoop
participant Scheduler as Timer/Scheduler
Dispatch->>Handler: Execute handler
alt Handler succeeds
Handler-->>Dispatch: nil
else Handler fails
Handler-->>Dispatch: error
Dispatch->>Dispatch: Calculate backoff delay<br/>(exponential with cap)
Dispatch->>Retry: Send retryEvent<br/>(attempt count, fireAt)
Retry->>Scheduler: Wait until fireAt
Scheduler-->>Retry: Timer fires
Retry->>Dispatch: Re-invoke dispatchEvent
Dispatch->>Handler: Execute handler (retry attempt)
end
Important: Pre-merge checks failed. Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 2 warnings)
✅ Passed checks (3 passed)
Summary
- Transient errors (e.g. TSA `RoleBinding` not yet propagated when the project reconciler runs `ensureRunnerSecrets`) caused events to be permanently dropped: the informer logged the error and moved on with no retry.
- Add a `retryLoop` goroutine alongside `dispatchLoop`. Failed handlers are requeued onto a buffered `retryCh` with exponential backoff: 2s → 4s → 8s → 16s → 30s (cap).
- After `retryMaxAttempts` (5) the error is logged as permanent.
- … `RoleBinding` isn't propagated by the time `ensureRunnerSecrets` runs.

Test plan

- … `namespace provisioned` and `ambient-runner-secrets created` without permanent failure
- … `handler failed, will retry` with `attempt` and `retry_in` fields, followed by eventual success
- … `handler failed after max retries`

🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
Improvements