HYPERFLEET-849 - feat: lifecycle delete for resources phase#104
Conversation
|
Skipping CI for Draft Pull Request. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds conditional resource deletion via a new resource lifecycle ( Sequence Diagram(s)sequenceDiagram
participant Exec as Executor
participant CELEval as CEL Evaluator
participant Transport as TransportClient
participant API as K8s/Maestro API
rect rgba(100,150,200,0.5)
Exec->>Transport: DiscoverResources()
Transport->>API: GET resources
API-->>Transport: objects / NotFound
Transport-->>Exec: discovered map
end
rect rgba(150,100,200,0.5)
Exec->>CELEval: evaluateLifecycleDeleteWhen(expression, execCtx)
CELEval-->>Exec: true / false / error
end
rect rgba(200,150,100,0.5)
alt CEL true (delete)
Exec->>Transport: DeleteResource(gvk, ns, name, DeleteOptions, target)
Transport->>API: DELETE (uses propagationPolicy for K8s)
API-->>Transport: OK / NotFound / async state
Exec->>Transport: DiscoverResources() (post-delete verify)
else CEL false (apply)
Exec->>Transport: ApplyResource(manifest)
Transport->>API: CREATE/PATCH
API-->>Transport: OK
end
end
rect rgba(150,200,100,0.5)
Transport-->>Exec: final discovered state (nil if removed)
Exec->>Exec: update execCtx.Resources, record ResourceErrors, set adapter.executionError (first failure)
Exec->>API: POST status update (Health/Finalized/resourceErrors)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
84afe44 to
5b129e3
Compare
There was a problem hiding this comment.
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/executor/types.go (1)
364-390:⚠️ Potential issue | 🟠 MajorRemove unreachable nil handling from trace or preserve nil values in the CEL map.
The code intentionally omits deleted resources (nil values) from the CEL
resourcesmap, making missing keys returnOptional.none()for optional access. However,internal/dryrun/trace.go:223explicitly handlesval == nil, which is unreachable sinceGetCELVariables()never includes nil values. Either remove the dead nil branch in trace.go, or preserve nil values in the resources map to match the trace.go expectations and enableresources.X == nullsemantics.The current omission-based design is consistent with all existing CEL expressions (
!resources.?clusterManifestWork.hasValue()throughout the config examples), but the trace.go code suggests an intention to support nil values that isn't actually fulfilled.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/executor/types.go` around lines 364 - 390, GetCELVariables currently omits nil entries so trace.go's nil branch is unreachable; update the resources map construction in GetCELVariables (the loop building "resources" and the assignment result["resources"]) to preserve deleted resources as explicit nil values (i.e., when val == nil set resources[name] = nil, and for nested maps include nested[nestedName] = nil when nestedRes == nil) so internal/dryrun/trace.go's val == nil handling works, or alternatively remove the dead nil-check in trace.go—prefer preserving nils to keep existing CEL semantics and enable resources.X == null checks.
🧹 Nitpick comments (1)
internal/maestroclient/client.go (1)
456-469: Add an explicit empty-name guard before deleting ManifestWork.
nameis used directly as the ManifestWork identifier; validating it early prevents ambiguous backend errors and makes failure mode deterministic.Proposed patch
func (c *Client) DeleteResource( ctx context.Context, _ schema.GroupVersionKind, _, name string, _ *transportclient.DeleteOptions, target transportclient.TransportContext, ) error { + if strings.TrimSpace(name) == "" { + return fmt.Errorf("resource name is required for DeleteResource") + } + transportCtx := c.resolveTransportContext(target) if transportCtx == nil || transportCtx.ConsumerName == "" { return fmt.Errorf( "maestro TransportContext with ConsumerName is required for DeleteResource") } return c.DeleteManifestWork(ctx, transportCtx.ConsumerName, name) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/maestroclient/client.go` around lines 456 - 469, The DeleteResource implementation uses the incoming name directly as the ManifestWork identifier; add an explicit guard at the top of DeleteResource to return a clear error when name is empty (e.g., "ManifestWork name is required") before resolving transport context or calling DeleteManifestWork so you fail fast and avoid ambiguous backend errors; reference the DeleteResource function, the name parameter, and the DeleteManifestWork call when applying this check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@charts/examples/maestro/adapter-task-config.yaml`:
- Around line 187-202: The Deleted gate/expressions currently treat absence of
agentNamespaceManifestWork as Deleted even when no deletion was requested;
update the three expressions (status, reason, message) that use
resources.?agentNamespaceManifestWork.hasValue() so they also require the
deletion flag (resources.?is_deleting) to be true before emitting
Deleted/ManifestWorkDeleted messages—e.g., change the ternary condition to check
both not hasValue() AND resources.?is_deleting (instead of just not hasValue());
keep the existing True/False, ManifestWorkDeleted/WaitingForMaestroDeletion and
the two message strings but only return the Deleted branch when
resources.?is_deleting is true.
In `@docs/adapter-authoring-guide.md`:
- Around line 739-747: The fenced code block containing the Reconciliation
example is missing a language label, triggering markdownlint rule MD040; update
the block's opening fence to include a language (use "text") so it becomes
```text instead of just ```, i.e. modify the fenced block that starts with the
lines "Reconciliation 1 (is_deleting=true):" / "Reconciliation 2
(configMapResource gone):" to use ```text as the opener.
In `@internal/configloader/validator.go`:
- Around line 618-620: The code calls v.validateCELExpression(...) during
ValidateSemantic() even when v.celEnv may be nil; guard that call by checking
v.celEnv before invoking v.validateCELExpression (e.g., if v.celEnv == nil then
record/append a validation error for the lifecycle when-expression or skip CEL
validation), referencing ValidateSemantic, v.celEnv, v.validateCELExpression,
and the lifecycle fields (FieldLifecycleWhen, FieldExpression, basePath) so the
method never dereferences a nil CEL environment.
In `@internal/criteria/evaluator.go`:
- Around line 221-224: Calls to ExtractValue() in post_action_executor.go and
precondition_executor.go currently only inspect the parse error and ignore the
in-band extraction error stored in result.Error; update both call sites so that
immediately after calling ExtractValue(...) you check if result.Error != nil and
propagate or return that error (same behavior as you do for the parse error)
instead of continuing with result.Value == nil, ensuring the caller handles
missing-field conditions consistently; reference the ExtractValue(...) call and
the result variable at each call site and align the error handling path with
existing parse-error handling logic.
In `@internal/dryrun/recording_transport_client.go`:
- Around line 61-67: The code stores unstructured.Unstructured objects backed by
the caller's map (overrides) and later DeleteResource mutates those stored
objects; fix by storing deep copies instead of the original backing map: after
building the unstructured with u := &unstructured.Unstructured{Object: obj} call
u = u.DeepCopy() (or store u.DeepCopy() into the resources map) so the stored
value is independent, and apply the same DeepCopy change to the other creation
sites mentioned (the other overrides population paths that create
unstructured.Unstructured before insertion) to prevent shared state bleeding
between runs; reference symbols: resources map, unstructured.Unstructured,
resourceKey, DeleteResource, and the overrides variable/DiscoveryOverrides
usage.
In `@internal/dryrun/trace.go`:
- Around line 190-194: The code is serializing and printing
rr.DiscoveredState.Object unconditionally, which can leak secrets; change the
logic to only marshal and pretty-print DiscoveredState when the tracer is in
verbose mode (check the tracer's Verbose flag) and skip or redact this output
otherwise; update the block that references rr.DiscoveredState,
DiscoveredState.Object and prettyJSON to be guarded by that Verbose check, and
make the same change to the other occurrence that prints pre-delete state (the
block at the other similar location).
In `@internal/executor/resource_executor.go`:
- Around line 647-663: The ResourceExecutor.recordResourceError helper currently
populates execCtx.Adapter.ResourceErrors but is only invoked by delete paths;
update the apply paths to call this helper so apply-time failures are recorded
too: when ApplyResource (or its wrapper like
executeResourceApply/executeResource) returns an error, call
re.recordResourceError(execCtx, resource, err) before returning, and likewise
call it when post-apply discovery fails so
execCtx.Adapter.ResourceErrors[resource.Name] is populated; ensure you reference
the existing recordResourceError, ExecutionContext.Adapter.ResourceErrors, and
the ApplyResource/error-handling sites to add these calls.
- Around line 507-514: The evaluateLifecycleDeleteWhen method in
ResourceExecutor should fail fast when a delete.when block is present but
missing an expression: instead of returning false, nil when
resource.Lifecycle.Delete.When is non-nil but
resource.Lifecycle.Delete.When.Expression == "" (current behavior), return a
descriptive error indicating the invalid lifecycle.delete.when configuration
(include resource identity from the provided resource or execCtx to aid
debugging). Update evaluateLifecycleDeleteWhen to check for a non-nil
Delete.When with an empty Expression and return an error rather than silently
treating it as no-delete.
- Around line 567-586: discoverResource can return (nil, nil) to indicate “no
discovery configured,” but current code treats any discovered==nil as already
deleted and skips DeleteResource; change the branching so only an actual
NotFound error (isNotFound == true) triggers marking the resource deleted and
returning success. In the block around discoverResource / isNotFound, keep the
existing behavior for discoverErr != nil && !isNotFound (fail), handle
isNotFound as the "already gone" case (set execCtx.Resources[resource.Name] =
nil, set OperationReason, log, return), and do NOT treat discovered == nil &&
discoverErr == nil as deleted—allow execution to continue to the DeleteResource
flow. Update the logic that references discoverResource in resource_executor.go
(the discoverResource call and the returned discovered/isNotFound checks)
accordingly.
In `@internal/k8sclient/apply.go`:
- Around line 130-132: The case for manifest.OperationDelete inside
ApplyManifest currently no-ops and can mask routing bugs; change it to fail fast
by returning a clear error instead of silent success. In the switch case for
manifest.OperationDelete within ApplyManifest, remove the empty block and return
an error (e.g. using fmt.Errorf) stating that delete operations should be
handled via DeleteResource, including relevant context (resource name/kind) if
available; do not attempt to perform deletion here. This ensures misrouted
delete requests are surfaced rather than silently ignored.
In `@test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh`:
- Around line 1-2: Add a UNIX shebang to the top of the dryrun-cel-showcase.sh
script so it can be executed directly; edit the file containing the line
starting with "go run ./cmd/adapter/main.go serve \" (dryrun-cel-showcase.sh) to
insert a shebang like "#!/usr/bin/env bash" as the first line and ensure the
file is made executable (chmod +x) in the repository or CI as needed.
In `@test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh`:
- Around line 1-9: Add a shebang as the very first line of the
dryrun-kubernetes.sh script (use #!/usr/bin/env bash) so the script can be
executed directly, ensure it remains before the existing go run command, and
then mark the file executable (chmod +x dryrun-kubernetes.sh) as part of the
fix; no other changes to the go run invocation or flags are required.
In `@test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh`:
- Around line 1-3: This script lacks a shebang, so make it executable by adding
a proper interpreter header (e.g., a bash/sh shebang) at the very top of the
file before the existing comments and the `go run ./cmd/adapter/main.go serve`
invocation; ensure the shebang is the first line and use a portable form such as
`/usr/bin/env bash` so the script can be run directly.
In `@test/testdata/dryrun/maestro/dryrun-maestro.sh`:
- Around line 1-3: Add a POSIX-compatible shebang as the first line of the
dryrun-maestro.sh script (use a portable form that invokes bash via env) so the
file runs reliably when executed, then ensure the script is executable (chmod
+x); leave the existing comment lines and the "go run ./cmd/adapter/main.go
serve" invocation unchanged.
---
Outside diff comments:
In `@internal/executor/types.go`:
- Around line 364-390: GetCELVariables currently omits nil entries so trace.go's
nil branch is unreachable; update the resources map construction in
GetCELVariables (the loop building "resources" and the assignment
result["resources"]) to preserve deleted resources as explicit nil values (i.e.,
when val == nil set resources[name] = nil, and for nested maps include
nested[nestedName] = nil when nestedRes == nil) so internal/dryrun/trace.go's
val == nil handling works, or alternatively remove the dead nil-check in
trace.go—prefer preserving nils to keep existing CEL semantics and enable
resources.X == null checks.
---
Nitpick comments:
In `@internal/maestroclient/client.go`:
- Around line 456-469: The DeleteResource implementation uses the incoming name
directly as the ManifestWork identifier; add an explicit guard at the top of
DeleteResource to return a clear error when name is empty (e.g., "ManifestWork
name is required") before resolving transport context or calling
DeleteManifestWork so you fail fast and avoid ambiguous backend errors;
reference the DeleteResource function, the name parameter, and the
DeleteManifestWork call when applying this check.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: a9399057-fb52-4dd4-b0ed-692bb9496b15
📒 Files selected for processing (59)
charts/README.mdcharts/examples/kubernetes/adapter-task-config.yamlcharts/examples/kubernetes/values.yamlcharts/examples/maestro-two-resources/README.mdcharts/examples/maestro-two-resources/adapter-config.yamlcharts/examples/maestro-two-resources/adapter-task-config.yamlcharts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yamlcharts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yamlcharts/examples/maestro-two-resources/values.yamlcharts/examples/maestro/adapter-task-config.yamlcharts/examples/maestro/values.yamlconfigs/adapter-task-config-template.yamldocs/adapter-authoring-guide.mdinternal/configloader/constants.gointernal/configloader/types.gointernal/configloader/validator.gointernal/configloader/validator_test.gointernal/criteria/evaluator.gointernal/criteria/evaluator_test.gointernal/dryrun/recording_transport_client.gointernal/dryrun/recording_transport_client_test.gointernal/dryrun/trace.gointernal/dryrun/trace_test.gointernal/executor/README.mdinternal/executor/executor_test.gointernal/executor/post_action_executor.gointernal/executor/precondition_executor.gointernal/executor/resource_executor.gointernal/executor/resource_executor_test.gointernal/executor/types.gointernal/executor/utils.gointernal/k8sclient/apply.gointernal/k8sclient/client.gointernal/k8sclient/interface.gointernal/k8sclient/mock.gointernal/maestroclient/client.gointernal/manifest/generation.gointernal/transportclient/interface.gointernal/transportclient/types.gotest/integration/executor/setup_test.gotest/integration/k8sclient/client_integration_test.gotest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-api-responses.jsontest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-discovery.jsontest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yamltest/testdata/dryrun/cel-showcase/dryrun-cel-showcase.shtest/testdata/dryrun/dryrun-delete-api-responses.jsontest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.jsontest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yamltest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.shtest/testdata/dryrun/kubernetes/dryrun-kubernetes-adatepr-task-config-invalid.yamltest/testdata/dryrun/kubernetes/dryrun-kubernetes-discovery.jsontest/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yamltest/testdata/dryrun/kubernetes/dryrun-kubernetes.shtest/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.jsontest/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yamltest/testdata/dryrun/maestro-delete/dryrun-maestro-delete.shtest/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yamltest/testdata/dryrun/maestro/dryrun-maestro-discovery.jsontest/testdata/dryrun/maestro/dryrun-maestro.sh
💤 Files with no reviewable changes (1)
- internal/k8sclient/interface.go
5b129e3 to
2b61ef5
Compare
There was a problem hiding this comment.
Actionable comments posted: 6
♻️ Duplicate comments (2)
internal/executor/resource_executor.go (2)
97-105:⚠️ Potential issue | 🟠 MajorPopulate
adapter.resourceErrorson apply-side failures too.Only delete failures call
recordResourceError(). Render/transport-target/apply/post-discovery/collision failures still only setadapter.executionError, so post-action CEL loses per-resource detail for the common resource-phase failures.Suggested change
if tplErr != nil { result.Status = StatusFailed result.Error = tplErr + re.recordResourceError(execCtx, resource, tplErr) return result, NewExecutorError(PhaseResources, resource.Name, "failed to render targetCluster template", tplErr) } @@ if err != nil { result.Status = StatusFailed result.Error = err + re.recordResourceError(execCtx, resource, err) return result, NewExecutorError(PhaseResources, resource.Name, "failed to render manifest", err) } @@ if err != nil { result.Status = StatusFailed result.Error = err + re.recordResourceError(execCtx, resource, err) execCtx.Adapter.ExecutionError = &ExecutionError{ @@ if discoverErr != nil { result.Status = StatusFailed result.Error = discoverErr + re.recordResourceError(execCtx, resource, discoverErr) execCtx.Adapter.ExecutionError = &ExecutionError{ @@ result.Status = StatusFailed result.Error = collisionErr + re.recordResourceError(execCtx, resource, collisionErr) execCtx.Adapter.ExecutionError = &ExecutionError{Also applies to: 131-164, 176-190, 216-227
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/executor/resource_executor.go` around lines 97 - 105, The review points out apply-side failures (rendering transport target, transport target creation, apply, post-discovery, collision) only set adapter.executionError instead of recording per-resource details; update the failure handling in resource_executor.go (the blocks around the TargetCluster render where tplErr is handled, and the other error paths at lines referenced for apply/post-discovery/collision) to call recordResourceError(adapter, resource.Name, error) (or the existing recordResourceError invocation used by deletes) whenever an error occurs, and ensure adapter.resourceErrors is populated consistently instead of only setting adapter.executionError; keep existing error returns and NewExecutorError usages but add the recordResourceError call before returning so post-action CEL can see per-resource errors.
512-516:⚠️ Potential issue | 🟠 MajorTreat
delete:withoutwhen:as invalid too.
Delete.When == nilstill falls back to apply here. If semantic validation is skipped, a malformed delete config recreates the resource during teardown instead of failing fast.Suggested change
if resource.Lifecycle.Delete.When == nil { - return false, nil + return false, fmt.Errorf( + "resource %q has lifecycle.delete configured but when.expression is missing", + resource.Name, + ) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/executor/resource_executor.go` around lines 512 - 516, The code currently only errors when Delete.When exists but Expression is empty, allowing a Delete block without When to be treated as valid; update the validation in the lifecycle handling (the code referencing resource.Lifecycle.Delete and resource.Lifecycle.Delete.When/Expression) to treat a non-nil resource.Lifecycle.Delete with a nil When as an error — i.e., if resource.Lifecycle.Delete != nil && resource.Lifecycle.Delete.When == nil return an error (similar to the existing message) so malformed delete configs fail fast instead of falling back to apply.
🧹 Nitpick comments (2)
internal/executor/README.md (1)
323-323: Clarify the exact DSL key path for delete conditions.Line 323 should explicitly reference
lifecycle.delete.when.expression(instead oflifecycle.delete.when) to avoid config ambiguity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/executor/README.md` at line 323, Update the README table entry that currently references the delete condition key path `lifecycle.delete.when` to the more explicit `lifecycle.delete.when.expression`; change the line describing the delete rule so it reads that the `delete` action triggers when `lifecycle.delete.when.expression` evaluates to true, ensuring the docs use the exact DSL key path (`lifecycle.delete.when.expression`) to avoid ambiguity.charts/examples/maestro-two-resources/README.md (1)
26-37: Consider adding language identifier to fenced code block.The reconciliation flow diagram could use a language specifier (e.g.,
```text) for consistency with markdown lint rules, though this is purely stylistic for non-code content.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/examples/maestro-two-resources/README.md` around lines 26 - 37, Add a language identifier to the fenced code block containing the reconciliation flow (the block that starts with "Reconciliation 1 (is_deleting=true):") by changing the opening fence from ``` to ```text so markdown linters recognize it as a plain-text block; ensure the closing fence remains ``` and leave the block contents unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@charts/examples/kubernetes/adapter-task-config.yaml`:
- Around line 259-290: The "Deleted" status expression currently returns True
whenever resource references (clusterNamespace, jobServiceAccount, job_role,
job_rolebinding, jobNamespace, deploymentNamespace) are absent; update the
status, reason, and message expressions in the "Deleted" condition to also
require a deletion guard (e.g., an is_deleting boolean) so the condition only
becomes True when deletion was actually requested; specifically, add an explicit
check like resources.?is_deleting.hasValue() && resources.is_deleting == true
(or the equivalent path your schema uses) combined with the existing all-empty
checks in the expressions for type "Deleted" so the status, reason
("ResourcesDeleted"/"DeletionInProgress"), and message only flip to the deleted
state when is_deleting is true.
In `@internal/dryrun/recording_transport_client.go`:
- Around line 175-216: DeleteResource currently ignores opts and always removes
K8s-targeted resources immediately; update DeleteResource on
DryrunTransportClient to honor opts.PropagationPolicy by treating a K8s delete
as asynchronous when the propagation policy is Foreground: if target == nil and
opts != nil and opts.PropagationPolicy (pointer) equals
metav1.DeletePropagationForeground, set deletionTimestamp on the stored object
(using resourceKey and c.resources) and keep it in the store; otherwise
(Background/Orphan or nil) perform the immediate delete as before and still
append the TransportRecord.
In `@internal/dryrun/trace_test.go`:
- Around line 150-151: The current asserts use compact JSON literals and can
miss differently formatted bodies; replace the two assert.NotContains(t, output,
`{"request":"data"}`) and assert.NotContains(t, output, `{"response":"data"}`)
checks with formatting-independent assertions that match the JSON keys
regardless of whitespace/formatting — for example use assert.NotRegexp with
regexp.MustCompile(`"request"\s*:\s*"data"`) and
regexp.MustCompile(`"response"\s*:\s*"data"`) against the output (referencing
the output variable and the existing assert.NotContains calls) so the test fails
if those body keys/values appear in any spacing/indentation.
In `@internal/executor/resource_executor.go`:
- Around line 47-52: preDiscoverAll is being called unconditionally and causes
an extra discovery roundtrip for every reconciliation; change the caller so
preDiscoverAll(ctx, resources, execCtx) runs only when any resource actually
needs delete-capable evaluation. Implement a short predicate before the call
that scans the resources slice (e.g., for _, r := range resources) and returns
true if a resource has a delete lifecycle configured (check r.Lifecycle != nil
&& r.Lifecycle.Delete != nil) or otherwise indicates delete-capable execution in
execCtx; only invoke re.preDiscoverAll(...) when that predicate is true. Keep
the preDiscoverAll implementation and its non-fatal error behavior unchanged.
- Around line 487-494: The pre-discovery block in re.discoverResource currently
swallows non-NotFound errors (logging and continuing), which causes
resources.?X.hasValue() to be false on unknown/error and can wrongly allow
deletes; change re.discoverResource (and its callers) to return non-NotFound
errors up instead of continuing, and update ExecuteAll to check for and handle
discovery errors before evaluating lifecycle.delete.when so execution
fails-closed on RBAC/network/other errors; reference re.discoverResource,
ExecuteAll, and lifecycle.delete.when when making the change.
In
`@test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml`:
- Around line 11-15: Update the dry-run example command in the header comment so
its --task-config, --dry-run-api-responses and --dry-run-discovery flags point
to the files that live in this scenario's directory (kubernetes-delete) rather
than the old pre-reorg paths; replace the three example paths with the matching
filenames in this directory (dryrun-kubernetes-delete-task-config.yaml,
dryrun-delete-api-responses.json, dryrun-kubernetes-delete-discovery.json) so
the example command correctly locates the scenario files.
---
Duplicate comments:
In `@internal/executor/resource_executor.go`:
- Around line 97-105: The review points out apply-side failures (rendering
transport target, transport target creation, apply, post-discovery, collision)
only set adapter.executionError instead of recording per-resource details;
update the failure handling in resource_executor.go (the blocks around the
TargetCluster render where tplErr is handled, and the other error paths at lines
referenced for apply/post-discovery/collision) to call
recordResourceError(adapter, resource.Name, error) (or the existing
recordResourceError invocation used by deletes) whenever an error occurs, and
ensure adapter.resourceErrors is populated consistently instead of only setting
adapter.executionError; keep existing error returns and NewExecutorError usages
but add the recordResourceError call before returning so post-action CEL can see
per-resource errors.
- Around line 512-516: The code currently only errors when Delete.When exists
but Expression is empty, allowing a Delete block without When to be treated as
valid; update the validation in the lifecycle handling (the code referencing
resource.Lifecycle.Delete and resource.Lifecycle.Delete.When/Expression) to
treat a non-nil resource.Lifecycle.Delete with a nil When as an error — i.e., if
resource.Lifecycle.Delete != nil && resource.Lifecycle.Delete.When == nil return
an error (similar to the existing message) so malformed delete configs fail fast
instead of falling back to apply.
---
Nitpick comments:
In `@charts/examples/maestro-two-resources/README.md`:
- Around line 26-37: Add a language identifier to the fenced code block
containing the reconciliation flow (the block that starts with "Reconciliation 1
(is_deleting=true):") by changing the opening fence from ``` to ```text so
markdown linters recognize it as a plain-text block; ensure the closing fence
remains ``` and leave the block contents unchanged.
In `@internal/executor/README.md`:
- Line 323: Update the README table entry that currently references the delete
condition key path `lifecycle.delete.when` to the more explicit
`lifecycle.delete.when.expression`; change the line describing the delete rule
so it reads that the `delete` action triggers when
`lifecycle.delete.when.expression` evaluates to true, ensuring the docs use the
exact DSL key path (`lifecycle.delete.when.expression`) to avoid ambiguity.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: dcc4f53e-a2df-4ad4-af76-6a6ba775ebf8
📒 Files selected for processing (59)
charts/README.mdcharts/examples/kubernetes/adapter-task-config.yamlcharts/examples/kubernetes/values.yamlcharts/examples/maestro-two-resources/README.mdcharts/examples/maestro-two-resources/adapter-config.yamlcharts/examples/maestro-two-resources/adapter-task-config.yamlcharts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yamlcharts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yamlcharts/examples/maestro-two-resources/values.yamlcharts/examples/maestro/adapter-task-config.yamlcharts/examples/maestro/values.yamlconfigs/adapter-task-config-template.yamldocs/adapter-authoring-guide.mdinternal/configloader/constants.gointernal/configloader/types.gointernal/configloader/validator.gointernal/configloader/validator_test.gointernal/criteria/evaluator.gointernal/criteria/evaluator_test.gointernal/dryrun/recording_transport_client.gointernal/dryrun/recording_transport_client_test.gointernal/dryrun/trace.gointernal/dryrun/trace_test.gointernal/executor/README.mdinternal/executor/executor_test.gointernal/executor/post_action_executor.gointernal/executor/precondition_executor.gointernal/executor/resource_executor.gointernal/executor/resource_executor_test.gointernal/executor/types.gointernal/executor/utils.gointernal/k8sclient/apply.gointernal/k8sclient/client.gointernal/k8sclient/interface.gointernal/k8sclient/mock.gointernal/maestroclient/client.gointernal/manifest/generation.gointernal/transportclient/interface.gointernal/transportclient/types.gotest/integration/executor/setup_test.gotest/integration/k8sclient/client_integration_test.gotest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-api-responses.jsontest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-discovery.jsontest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yamltest/testdata/dryrun/cel-showcase/dryrun-cel-showcase.shtest/testdata/dryrun/dryrun-delete-api-responses.jsontest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.jsontest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yamltest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.shtest/testdata/dryrun/kubernetes/dryrun-kubernetes-adatepr-task-config-invalid.yamltest/testdata/dryrun/kubernetes/dryrun-kubernetes-discovery.jsontest/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yamltest/testdata/dryrun/kubernetes/dryrun-kubernetes.shtest/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.jsontest/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yamltest/testdata/dryrun/maestro-delete/dryrun-maestro-delete.shtest/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yamltest/testdata/dryrun/maestro/dryrun-maestro-discovery.jsontest/testdata/dryrun/maestro/dryrun-maestro.sh
💤 Files with no reviewable changes (1)
- internal/k8sclient/interface.go
2b61ef5 to
4032a9a
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
internal/executor/resource_executor.go (1)
487-494:⚠️ Potential issue | 🟠 MajorPre-discovery swallows non-NotFound errors, potentially causing incorrect deletion.
When pre-discovery fails due to RBAC, network, or API errors (not NotFound), the resource is left absent from
execCtx.Resources. This makesresources.?X.hasValue()returnfalse, which could incorrectly unlock dependent deletes or allowFinalized=Trueto be reported when the actual state is unknown.Consider returning an error from
preDiscoverAllwhen any non-NotFound error occurs, or at minimum logging at WARN level instead of DEBUG.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/executor/resource_executor.go` around lines 487 - 494, The pre-discovery loop in preDiscoverAll currently swallows non-NotFound errors returned by re.discoverResource (checked via apierrors.IsNotFound) and leaves resources absent from execCtx.Resources, which can incorrectly allow deletions or report Finalized=True; modify preDiscoverAll to propagate any non-NotFound error returned by re.discoverResource up to the caller (i.e., return the error instead of continue), or at minimum change the log level to WARN and record the failure in execCtx so resources.?X.hasValue() reflects the unknown state; update references to discoverResource, preDiscoverAll, execCtx.Resources, and the apierrors.IsNotFound check accordingly so callers can handle transient RBAC/network/API errors rather than assuming absence.
🧹 Nitpick comments (1)
charts/examples/maestro-two-resources/README.md (1)
26-37: Add language specifier to fenced code block.The reconciliation sequence diagram should have a language specifier for consistent rendering. Static analysis flagged this.
📝 Proposed fix
-``` +```text Reconciliation 1 (is_deleting=true):🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/examples/maestro-two-resources/README.md` around lines 26 - 37, The fenced code block showing the reconciliation sequence lacks a language specifier; update the opening fence to include "text" (i.e., change the triple backticks preceding "Reconciliation 1 (is_deleting=true):" to ```text) so the block renders consistently and satisfies static analysis; leave the rest of the block content unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@internal/executor/resource_executor.go`:
- Around line 487-494: The pre-discovery loop in preDiscoverAll currently
swallows non-NotFound errors returned by re.discoverResource (checked via
apierrors.IsNotFound) and leaves resources absent from execCtx.Resources, which
can incorrectly allow deletions or report Finalized=True; modify preDiscoverAll
to propagate any non-NotFound error returned by re.discoverResource up to the
caller (i.e., return the error instead of continue), or at minimum change the
log level to WARN and record the failure in execCtx so resources.?X.hasValue()
reflects the unknown state; update references to discoverResource,
preDiscoverAll, execCtx.Resources, and the apierrors.IsNotFound check
accordingly so callers can handle transient RBAC/network/API errors rather than
assuming absence.
---
Nitpick comments:
In `@charts/examples/maestro-two-resources/README.md`:
- Around line 26-37: The fenced code block showing the reconciliation sequence
lacks a language specifier; update the opening fence to include "text" (i.e.,
change the triple backticks preceding "Reconciliation 1 (is_deleting=true):" to
```text) so the block renders consistently and satisfies static analysis; leave
the rest of the block content unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 0c179216-f30a-4d62-bbd3-d5271df27f76
📒 Files selected for processing (60)
charts/README.mdcharts/examples/kubernetes/adapter-task-config.yamlcharts/examples/kubernetes/values.yamlcharts/examples/maestro-two-resources/README.mdcharts/examples/maestro-two-resources/adapter-config.yamlcharts/examples/maestro-two-resources/adapter-task-config.yamlcharts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yamlcharts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yamlcharts/examples/maestro-two-resources/values.yamlcharts/examples/maestro/adapter-task-config.yamlcharts/examples/maestro/values.yamlconfigs/adapter-task-config-template.yamldocs/adapter-authoring-guide.mdinternal/configloader/constants.gointernal/configloader/types.gointernal/configloader/validator.gointernal/configloader/validator_test.gointernal/criteria/evaluator.gointernal/criteria/evaluator_test.gointernal/dryrun/recording_transport_client.gointernal/dryrun/recording_transport_client_test.gointernal/dryrun/trace.gointernal/dryrun/trace_test.gointernal/executor/README.mdinternal/executor/executor_test.gointernal/executor/post_action_executor.gointernal/executor/precondition_executor.gointernal/executor/resource_executor.gointernal/executor/resource_executor_test.gointernal/executor/types.gointernal/executor/utils.gointernal/executor/utils_test.gointernal/k8sclient/apply.gointernal/k8sclient/client.gointernal/k8sclient/interface.gointernal/k8sclient/mock.gointernal/maestroclient/client.gointernal/manifest/generation.gointernal/transportclient/interface.gointernal/transportclient/types.gotest/integration/executor/setup_test.gotest/integration/k8sclient/client_integration_test.gotest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-api-responses.jsontest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-discovery.jsontest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yamltest/testdata/dryrun/cel-showcase/dryrun-cel-showcase.shtest/testdata/dryrun/dryrun-delete-api-responses.jsontest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.jsontest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yamltest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.shtest/testdata/dryrun/kubernetes/dryrun-kubernetes-adatepr-task-config-invalid.yamltest/testdata/dryrun/kubernetes/dryrun-kubernetes-discovery.jsontest/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yamltest/testdata/dryrun/kubernetes/dryrun-kubernetes.shtest/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.jsontest/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yamltest/testdata/dryrun/maestro-delete/dryrun-maestro-delete.shtest/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yamltest/testdata/dryrun/maestro/dryrun-maestro-discovery.jsontest/testdata/dryrun/maestro/dryrun-maestro.sh
💤 Files with no reviewable changes (2)
- internal/executor/utils_test.go
- internal/k8sclient/interface.go
✅ Files skipped from review due to trivial changes (20)
- charts/examples/maestro/values.yaml
- test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh
- test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml
- test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml
- internal/manifest/generation.go
- test/testdata/dryrun/dryrun-delete-api-responses.json
- test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh
- test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json
- test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh
- test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml
- test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json
- internal/configloader/constants.go
- charts/examples/maestro-two-resources/values.yaml
- charts/examples/maestro-two-resources/adapter-config.yaml
- charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml
- charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yaml
- test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml
- internal/k8sclient/mock.go
- charts/examples/maestro-two-resources/adapter-task-config.yaml
- internal/executor/types.go
🚧 Files skipped from review as they are similar to previous changes (23)
- test/integration/executor/setup_test.go
- internal/executor/README.md
- internal/k8sclient/apply.go
- internal/transportclient/types.go
- internal/criteria/evaluator_test.go
- internal/executor/utils.go
- test/integration/k8sclient/client_integration_test.go
- test/testdata/dryrun/maestro/dryrun-maestro.sh
- internal/configloader/validator.go
- internal/transportclient/interface.go
- internal/dryrun/trace_test.go
- charts/examples/kubernetes/values.yaml
- test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh
- charts/README.md
- internal/maestroclient/client.go
- charts/examples/kubernetes/adapter-task-config.yaml
- test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml
- internal/executor/executor_test.go
- internal/dryrun/recording_transport_client.go
- internal/dryrun/recording_transport_client_test.go
- internal/executor/resource_executor_test.go
- internal/configloader/types.go
- docs/adapter-authoring-guide.md
4032a9a to
e0279ae
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
charts/examples/kubernetes/adapter-task-config.yaml (1)
259-290:⚠️ Potential issue | 🟡 MinorGuard
Deletedcondition withis_deletingto prevent false positives.The
Deletedcondition reportsTruewhenever all resources are absent, but this can occur on first reconciliation before any resources are created, or when preconditions skip the resource phase. This would incorrectly signalDeleted=Truewhen deletion was never requested.The
Finalizedcondition correctly guards withis_deleting(line 302), butDeleteddoes not.Suggested fix
- type: "Deleted" status: expression: | - !resources.?clusterNamespace.hasValue() + is_deleting + && !resources.?clusterNamespace.hasValue() && !resources.?jobServiceAccount.hasValue()Apply the same
is_deleting &&prefix to thereasonandmessageexpressions, similar to howFinalizedhandles it.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/examples/kubernetes/adapter-task-config.yaml` around lines 259 - 290, The Deleted condition currently marks resources as deleted whenever all resource references are absent, which can be a false positive; update the Deleted condition expressions (status, reason, message) to require the deletion phase by prefixing each boolean expression with is_deleting && (same guard used by the Finalized condition) so the Deleted condition only becomes True/ResourcesDeleted/"All cluster resources successfully deleted" when is_deleting is true and all resource values are absent.internal/executor/resource_executor.go (1)
156-170:⚠️ Potential issue | 🟡 MinorPopulate
adapter.resourceErrorsfor apply failures too.
recordResourceErroris only called from delete paths (lines 590, 625), but apply failures at lines 161-165 and 186-190 only setExecutionErrorwithout populatingResourceErrors. This meansadapter.?resourceErrors.?<name>in post-action CEL will be empty for the most common failure mode (apply errors).For consistency with the documented behavior in the adapter authoring guide, apply failures should also call
recordResourceError:Suggested fix
if err != nil { result.Status = StatusFailed result.Error = err - execCtx.Adapter.ExecutionError = &ExecutionError{ - Phase: string(PhaseResources), - Step: resource.Name, - Message: err.Error(), - } + re.recordResourceError(execCtx, resource, err) errCtx := logger.WithK8sResult(ctx, "FAILED")Apply the same change to the post-apply discovery failure path at lines 186-190.
Also applies to: 182-196, 663-679
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/executor/resource_executor.go` around lines 156 - 170, The apply failure paths that call transportClient.ApplyResource (and the post-apply discovery failure path) currently set execCtx.Adapter.ExecutionError but do not populate adapter.resourceErrors; update those failure blocks to also call recordResourceError(re.execCtx.Adapter, resource.Name, err) (or the existing recordResourceError helper signature used in delete paths) whenever an apply or post-apply discovery error occurs so adapter.resourceErrors.<resourceName> is populated for CEL; keep the existing setting of result.Status/ result.Error and ExecutionError, and ensure you add the recordResourceError call in the same branches that now log via re.log.Errorf.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@charts/examples/maestro-two-resources/README.md`:
- Around line 26-37: Update the fenced code block that starts with triple
backticks and the "Reconciliation 1 (is_deleting=true):" sequence to include a
language specifier (e.g., change the opening ``` to ```text) so the block is
marked as plain text; locate the triple-backtick fence surrounding the
reconciliation sequence and add "text" (or "plaintext") immediately after the
opening backticks.
In `@test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml`:
- Around line 16-18: Update the commented example command paths so they point to
files inside the maestro-delete subfolder; specifically change the values for
the flags --task-config, --dry-run-api-responses, and --dry-run-discovery in the
commented block to use the corresponding files under
test/testdata/dryrun/maestro-delete (so the example becomes runnable when
copy-pasted).
---
Duplicate comments:
In `@charts/examples/kubernetes/adapter-task-config.yaml`:
- Around line 259-290: The Deleted condition currently marks resources as
deleted whenever all resource references are absent, which can be a false
positive; update the Deleted condition expressions (status, reason, message) to
require the deletion phase by prefixing each boolean expression with is_deleting
&& (same guard used by the Finalized condition) so the Deleted condition only
becomes True/ResourcesDeleted/"All cluster resources successfully deleted" when
is_deleting is true and all resource values are absent.
In `@internal/executor/resource_executor.go`:
- Around line 156-170: The apply failure paths that call
transportClient.ApplyResource (and the post-apply discovery failure path)
currently set execCtx.Adapter.ExecutionError but do not populate
adapter.resourceErrors; update those failure blocks to also call
recordResourceError(re.execCtx.Adapter, resource.Name, err) (or the existing
recordResourceError helper signature used in delete paths) whenever an apply or
post-apply discovery error occurs so adapter.resourceErrors.<resourceName> is
populated for CEL; keep the existing setting of result.Status/ result.Error and
ExecutionError, and ensure you add the recordResourceError call in the same
branches that now log via re.log.Errorf.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 97ea9965-aa68-4a72-bd70-d537bf606479
📒 Files selected for processing (60)
charts/README.mdcharts/examples/kubernetes/adapter-task-config.yamlcharts/examples/kubernetes/values.yamlcharts/examples/maestro-two-resources/README.mdcharts/examples/maestro-two-resources/adapter-config.yamlcharts/examples/maestro-two-resources/adapter-task-config.yamlcharts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yamlcharts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yamlcharts/examples/maestro-two-resources/values.yamlcharts/examples/maestro/adapter-task-config.yamlcharts/examples/maestro/values.yamlconfigs/adapter-task-config-template.yamldocs/adapter-authoring-guide.mdinternal/configloader/constants.gointernal/configloader/types.gointernal/configloader/validator.gointernal/configloader/validator_test.gointernal/criteria/evaluator.gointernal/criteria/evaluator_test.gointernal/dryrun/recording_transport_client.gointernal/dryrun/recording_transport_client_test.gointernal/dryrun/trace.gointernal/dryrun/trace_test.gointernal/executor/README.mdinternal/executor/executor_test.gointernal/executor/post_action_executor.gointernal/executor/precondition_executor.gointernal/executor/resource_executor.gointernal/executor/resource_executor_test.gointernal/executor/types.gointernal/executor/utils.gointernal/executor/utils_test.gointernal/k8sclient/apply.gointernal/k8sclient/client.gointernal/k8sclient/interface.gointernal/k8sclient/mock.gointernal/maestroclient/client.gointernal/manifest/generation.gointernal/transportclient/interface.gointernal/transportclient/types.gotest/integration/executor/setup_test.gotest/integration/k8sclient/client_integration_test.gotest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-api-responses.jsontest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-discovery.jsontest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yamltest/testdata/dryrun/cel-showcase/dryrun-cel-showcase.shtest/testdata/dryrun/dryrun-delete-api-responses.jsontest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.jsontest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yamltest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.shtest/testdata/dryrun/kubernetes/dryrun-kubernetes-adatepr-task-config-invalid.yamltest/testdata/dryrun/kubernetes/dryrun-kubernetes-discovery.jsontest/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yamltest/testdata/dryrun/kubernetes/dryrun-kubernetes.shtest/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.jsontest/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yamltest/testdata/dryrun/maestro-delete/dryrun-maestro-delete.shtest/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yamltest/testdata/dryrun/maestro/dryrun-maestro-discovery.jsontest/testdata/dryrun/maestro/dryrun-maestro.sh
💤 Files with no reviewable changes (2)
- internal/executor/utils_test.go
- internal/k8sclient/interface.go
✅ Files skipped from review due to trivial changes (22)
- test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml
- internal/executor/README.md
- internal/criteria/evaluator.go
- internal/dryrun/trace_test.go
- test/testdata/dryrun/dryrun-delete-api-responses.json
- test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh
- charts/README.md
- charts/examples/kubernetes/values.yaml
- internal/configloader/constants.go
- test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json
- test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh
- test/testdata/dryrun/maestro/dryrun-maestro.sh
- test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json
- test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh
- charts/examples/maestro-two-resources/adapter-config.yaml
- internal/executor/executor_test.go
- test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh
- charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml
- test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml
- charts/examples/maestro-two-resources/values.yaml
- internal/executor/resource_executor_test.go
- charts/examples/maestro-two-resources/adapter-task-config.yaml
🚧 Files skipped from review as they are similar to previous changes (16)
- charts/examples/maestro/values.yaml
- internal/executor/utils.go
- internal/criteria/evaluator_test.go
- internal/transportclient/types.go
- test/integration/executor/setup_test.go
- internal/executor/post_action_executor.go
- test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml
- test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml
- test/integration/k8sclient/client_integration_test.go
- internal/maestroclient/client.go
- internal/k8sclient/mock.go
- internal/configloader/validator.go
- internal/k8sclient/apply.go
- internal/configloader/validator_test.go
- internal/k8sclient/client.go
- charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yaml
a678f72 to
b470fc5
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
charts/examples/maestro-two-resources/README.md (1)
26-37:⚠️ Potential issue | 🟡 MinorAdd a language identifier to the fenced block (MD040).
Line 26 opens a fenced code block without a language specifier; markdownlint flags this and CI/docs lint may fail.
Proposed fix
-``` +```text Reconciliation 1 (is_deleting=true): → Delete configMapManifestWork ✓ (condition met) → Delete namespaceManifestWork ✗ (configMapManifestWork still present) @@ Reconciliation 3 (both gone): → Finalized = True → API hard-delete triggered</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@charts/examples/maestro-two-resources/README.mdaround lines 26 - 37, The
fenced code block showing the "Reconciliation 1 (is_deleting=true)" example in
README.md lacks a language identifier which triggers markdownlint MD040; update
the opening fence to include a language (e.g., changetotext) so the
block becomes a proper fenced code block, locating the block by the line that
starts with "Reconciliation 1 (is_deleting=true)" and the surrounding
triple-backtick fences.</details> </blockquote></details> <details> <summary>internal/k8sclient/apply.go (1)</summary><blockquote> `130-132`: _⚠️ Potential issue_ | _🟠 Major_ **Fail fast if a delete reaches `ApplyManifest`.** This still returns success without deleting anything, so a routing bug leaves stale resources behind and hides the real failure. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@internal/k8sclient/apply.go` around lines 130 - 132, The ApplyManifest code currently ignores manifest.OperationDelete and returns success; change the manifest.OperationDelete case in ApplyManifest to fail fast by returning a clear error (e.g., fmt.Errorf or wrapped errors) stating deletes must be handled via DeleteResource so callers can detect the routing bug; reference the manifest.OperationDelete constant, the ApplyManifest function, and the DeleteResource path when composing the error message to make debugging straightforward. ``` </details> </blockquote></details> </blockquote></details> <details> <summary>🧹 Nitpick comments (1)</summary><blockquote> <details> <summary>internal/k8sclient/client.go (1)</summary><blockquote> `268-290`: **Consolidate duplicated delete flow to reduce drift risk.** `deleteResource` and `DeleteResource` duplicate object setup + NotFound/error wrapping. Consider a shared internal delete primitive (with optional propagation argument) so semantics stay aligned. As per coding guidelines, `-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.` Also applies to: 302-325 <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against the current code and only fix it if needed. In `@internal/k8sclient/client.go` around lines 268 - 290, The delete logic in deleteResource and DeleteResource is duplicated; extract a single internal helper (e.g., func (c *Client) doDelete(ctx context.Context, gvk schema.GroupVersionKind, namespace, name string, propagation *metav1.DeletePropagation) error) that centralizes creating the unstructured object, setting GroupVersionKind/Namespace/Name, calling c.client.Delete with an optional propagation policy, and normalizing apierrors.IsNotFound to nil or wrapping other errors into apperrors.K8sOperationError; then have both deleteResource and DeleteResource call this helper (pass nil or a specific propagation parameter as needed) to keep semantics aligned and avoid drift. ``` </details> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against the current code and only fix it if needed.
Inline comments:
In@internal/executor/precondition_executor.go:
- Around line 140-145: Update the struct tag on ActionBase.Name to enforce the
existing resourceNamePattern by adding validate:"required,resourcename" so
action names are valid CEL identifiers; this prevents silent failures when
buildCELOptions registers ActionBase.Name as a CEL variable (see
buildCELOptions) and avoids capture being skipped (see
precondition_executor.captureCtx handling). Locate the ActionBase struct
definition and replace the current validate:"required" tag on the Name field
with validate:"required,resourcename", then run/adjust existing unit/validation
tests that exercise ActionBase.Name to ensure the new constraint is enforced.In
@internal/k8sclient/client.go:
- Around line 307-310: Validate the user-supplied PropagationPolicy in
DeleteResource before casting: in DeleteResource, check the incoming opts != nil
and opts.PropagationPolicy string against the allowed metav1.DeletionPropagation
values ("Foreground","Background","Orphan") (or use a switch/lookup) and return
a clear error if it is invalid; only then convert with
metav1.DeletionPropagation(opts.PropagationPolicy) and proceed. Ensure the
validation targets the DeleteOptions.PropagationPolicy field and returns an
explicit error rather than letting the API server reject the request.
Duplicate comments:
In@charts/examples/maestro-two-resources/README.md:
- Around line 26-37: The fenced code block showing the "Reconciliation 1
(is_deleting=true)" example in README.md lacks a language identifier which
triggers markdownlint MD040; update the opening fence to include a language
(e.g., changetotext) so the block becomes a proper fenced code block,
locating the block by the line that starts with "Reconciliation 1
(is_deleting=true)" and the surrounding triple-backtick fences.In
@internal/k8sclient/apply.go:
- Around line 130-132: The ApplyManifest code currently ignores
manifest.OperationDelete and returns success; change the
manifest.OperationDelete case in ApplyManifest to fail fast by returning a clear
error (e.g., fmt.Errorf or wrapped errors) stating deletes must be handled via
DeleteResource so callers can detect the routing bug; reference the
manifest.OperationDelete constant, the ApplyManifest function, and the
DeleteResource path when composing the error message to make debugging
straightforward.
Nitpick comments:
In@internal/k8sclient/client.go:
- Around line 268-290: The delete logic in deleteResource and DeleteResource is
duplicated; extract a single internal helper (e.g., func (c *Client)
doDelete(ctx context.Context, gvk schema.GroupVersionKind, namespace, name
string, propagation *metav1.DeletePropagation) error) that centralizes creating
the unstructured object, setting GroupVersionKind/Namespace/Name, calling
c.client.Delete with an optional propagation policy, and normalizing
apierrors.IsNotFound to nil or wrapping other errors into
apperrors.K8sOperationError; then have both deleteResource and DeleteResource
call this helper (pass nil or a specific propagation parameter as needed) to
keep semantics aligned and avoid drift.</details> <details> <summary>🪄 Autofix (Beta)</summary> Fix all unresolved CodeRabbit comments on this PR: - [ ] <!-- {"checkboxId": "4b0d0e0a-96d7-4f10-b296-3a18ea78f0b9"} --> Push a commit to this branch (recommended) - [ ] <!-- {"checkboxId": "ff5b1114-7d8c-49e6-8ac1-43f82af23a33"} --> Create a new PR with the fixes </details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: Organization UI **Review profile**: CHILL **Plan**: Pro Plus **Run ID**: `5bf8e9fb-71c6-407a-b82f-313105d93a7d` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between e0279aeabb4143e585716250cef39b96bf2dfb1b and a678f72f665d3051fbc97048a7d5a0aacd3968b7. </details> <details> <summary>📒 Files selected for processing (60)</summary> * `charts/README.md` * `charts/examples/kubernetes/adapter-task-config.yaml` * `charts/examples/kubernetes/values.yaml` * `charts/examples/maestro-two-resources/README.md` * `charts/examples/maestro-two-resources/adapter-config.yaml` * `charts/examples/maestro-two-resources/adapter-task-config.yaml` * `charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml` * `charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yaml` * `charts/examples/maestro-two-resources/values.yaml` * `charts/examples/maestro/adapter-task-config.yaml` * `charts/examples/maestro/values.yaml` * `configs/adapter-task-config-template.yaml` * `docs/adapter-authoring-guide.md` * `internal/configloader/constants.go` * `internal/configloader/types.go` * `internal/configloader/validator.go` * `internal/configloader/validator_test.go` * `internal/criteria/evaluator.go` * `internal/criteria/evaluator_test.go` * `internal/dryrun/recording_transport_client.go` * `internal/dryrun/recording_transport_client_test.go` * `internal/dryrun/trace.go` * `internal/dryrun/trace_test.go` * `internal/executor/README.md` * `internal/executor/executor_test.go` * `internal/executor/post_action_executor.go` * `internal/executor/precondition_executor.go` * `internal/executor/resource_executor.go` * `internal/executor/resource_executor_test.go` * `internal/executor/types.go` * `internal/executor/utils.go` * `internal/executor/utils_test.go` * `internal/k8sclient/apply.go` * `internal/k8sclient/client.go` * `internal/k8sclient/interface.go` * `internal/k8sclient/mock.go` * `internal/maestroclient/client.go` * `internal/manifest/generation.go` * `internal/transportclient/interface.go` * `internal/transportclient/types.go` * `test/integration/executor/setup_test.go` * `test/integration/k8sclient/client_integration_test.go` * `test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-api-responses.json` * `test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-discovery.json` * `test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml` * `test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh` * `test/testdata/dryrun/dryrun-delete-api-responses.json` * `test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json` * `test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml` * `test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh` * `test/testdata/dryrun/kubernetes/dryrun-kubernetes-adatepr-task-config-invalid.yaml` * `test/testdata/dryrun/kubernetes/dryrun-kubernetes-discovery.json` * `test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml` * `test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh` * `test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json` * `test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml` * `test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh` * `test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml` * `test/testdata/dryrun/maestro/dryrun-maestro-discovery.json` * `test/testdata/dryrun/maestro/dryrun-maestro.sh` </details> <details> <summary>💤 Files with no reviewable changes (2)</summary> * internal/executor/utils_test.go * internal/k8sclient/interface.go </details> <details> <summary>✅ Files skipped from review due to trivial changes (21)</summary> * internal/dryrun/trace_test.go * test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml * test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh * internal/executor/README.md * test/testdata/dryrun/maestro/dryrun-maestro.sh * internal/transportclient/types.go * test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh * internal/configloader/constants.go * test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json * test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh * test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml * test/testdata/dryrun/dryrun-delete-api-responses.json * charts/examples/maestro-two-resources/adapter-config.yaml * internal/executor/executor_test.go * charts/examples/kubernetes/values.yaml * charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml * charts/examples/maestro-two-resources/values.yaml * test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json * charts/examples/maestro-two-resources/adapter-task-config.yaml * test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh * test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml </details> <details> <summary>🚧 Files skipped from review as they are similar to previous changes (11)</summary> * test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml * charts/examples/maestro/values.yaml * test/integration/executor/setup_test.go * internal/transportclient/interface.go * internal/criteria/evaluator.go * internal/configloader/validator_test.go * internal/maestroclient/client.go * internal/k8sclient/mock.go * charts/examples/kubernetes/adapter-task-config.yaml * internal/executor/resource_executor.go * internal/executor/resource_executor_test.go </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (6)
internal/dryrun/recording_transport_client.go (1)
186-216:⚠️ Potential issue | 🟡 Minor
Foregroundpropagation policy is not honored in dry-run deletes.The previous review noted that this implementation ignores
optsand treats K8s deletes as immediate removal regardless ofPropagationPolicy. ForForegrounddeletes, Kubernetes blocks until dependents are removed, which should be simulated by keeping the resource withdeletionTimestampset (similar to Maestro async behavior).Current code only distinguishes K8s vs Maestro via
target, but K8sForegroundshould behave like Maestro (async).Suggested fix
func (c *DryrunTransportClient) DeleteResource( ... ) error { c.mu.Lock() defer c.mu.Unlock() key := resourceKey(gvk, namespace, name) - if target != nil { + asyncDelete := target != nil || (opts != nil && opts.PropagationPolicy == "Foreground") + if asyncDelete { // Maestro transport: async deletion — mark with deletionTimestamp, keep in store. if obj, exists := c.resources[key]; exists { now := metav1.NewTime(time.Now()) obj.SetDeletionTimestamp(&now) } } else { // K8s transport: synchronous deletion — remove from store immediately. delete(c.resources, key) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/dryrun/recording_transport_client.go` around lines 186 - 216, DeleteResource in DryrunTransportClient currently ignores opts and always removes K8s-targeted resources immediately; update it to honor transportclient.DeleteOptions. Specifically, inspect opts.PropagationPolicy (or opts if nil defaulting to Background) and if it's Foreground treat K8s deletes like Maestro async: if the resource exists in c.resources set a deletionTimestamp via obj.SetDeletionTimestamp(&now) instead of deleting from c.resources; only remove the entry for non-Foreground policies. Keep recording the TransportRecord (operationDelete, GVK, Namespace, Name) as before.internal/k8sclient/client.go (1)
307-310:⚠️ Potential issue | 🟡 MinorValidate
PropagationPolicyat the public API boundary.The previous review flagged that
DeleteResourceis a public API and should validateopts.PropagationPolicybefore casting tometav1.DeletionPropagation. While the config loader validates these values during configuration, direct callers of this method can constructDeleteOptionswith arbitrary strings, deferring failure to the API server with a less clear error.Suggested validation
propagationPolicy := metav1.DeletePropagationBackground if opts != nil && opts.PropagationPolicy != "" { + switch opts.PropagationPolicy { + case "Background", "Foreground", "Orphan": + propagationPolicy = metav1.DeletionPropagation(opts.PropagationPolicy) + default: + return apperrors.KubernetesError("invalid propagationPolicy %q: must be Background, Foreground, or Orphan", opts.PropagationPolicy) + } - propagationPolicy = metav1.DeletionPropagation(opts.PropagationPolicy) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/k8sclient/client.go` around lines 307 - 310, DeleteResource currently casts opts.PropagationPolicy directly to metav1.DeletionPropagation which allows callers to pass arbitrary strings; update DeleteResource to validate opts.PropagationPolicy at the public API boundary by checking (when opts != nil && opts.PropagationPolicy != "") that the value equals one of the known metav1.DeletionPropagation constants ("Background","Foreground","Orphan") before calling metav1.DeletionPropagation; if it’s not one of those, return a clear error from DeleteResource indicating the invalid PropagationPolicy value so invalid inputs fail fast instead of delegating to the API server.charts/examples/kubernetes/adapter-task-config.yaml (1)
259-290:⚠️ Potential issue | 🟠 MajorGuard
Deletedwithis_deleting.This still reports
Deleted=Truewhenever all managed resources are absent, including first-run and skipped reconciliations where no delete was requested.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/examples/kubernetes/adapter-task-config.yaml` around lines 259 - 290, The Deleted condition currently evaluates to True whenever all managed resources are absent; update the three expressions under the Deleted block (status.expression, reason.expression, message.expression) to also require the deletion intent flag (is_deleting) so they only return True/"ResourcesDeleted"/success message when resources are absent AND is_deleting is true; locate the Deleted block (type: "Deleted") and add the is_deleting check to the combined boolean expression used in status, reason, and message so that absence alone on first-run or skipped reconciliations won't report Deleted.internal/executor/resource_executor.go (3)
47-56:⚠️ Potential issue | 🟠 MajorAvoid pre-discovery on apply-only reconciliations.
This adds an extra discovery roundtrip for every resource on every reconciliation, even when no resource has
lifecycle.delete. On larger tasks that doubles remote read traffic on the normal apply path.Suggested guard
- if err := re.preDiscoverAll(ctx, resources, execCtx); err != nil { - return nil, err - } + needsPreDiscovery := false + for _, resource := range resources { + if resource.Lifecycle != nil && resource.Lifecycle.Delete != nil { + needsPreDiscovery = true + break + } + } + if needsPreDiscovery { + if err := re.preDiscoverAll(ctx, resources, execCtx); err != nil { + return nil, err + } + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/executor/resource_executor.go` around lines 47 - 56, Pre-discovery is unconditionally invoked via re.preDiscoverAll(...) which doubles remote reads even when no resource uses lifecycle.delete; change the logic in resource_executor.go to first scan the incoming resources slice for any resource with a lifecycle.delete (or a lifecycle.when expression that can evaluate delete) or use execCtx flag indicating delete-capable reconciliation, and only call re.preDiscoverAll(ctx, resources, execCtx) when such delete semantics are present; otherwise skip the pre-discovery step to avoid the extra roundtrip.
157-169:⚠️ Potential issue | 🟠 MajorPopulate
adapter.resourceErrorsfor apply-path failures too.
recordResourceError()is only used on delete paths right now. Plain apply failures and post-apply discovery failures still leaveadapter.resourceErrorsempty, so post-action CEL misses the common resource-phase errors.Suggested fix
applyResult, err := transportClient.ApplyResource(ctx, renderedBytes, applyOpts, transportTarget) if err != nil { result.Status = StatusFailed result.Error = err - execCtx.Adapter.ExecutionError = &ExecutionError{ - Phase: string(PhaseResources), - Step: resource.Name, - Message: err.Error(), - } + re.recordResourceError(execCtx, resource, err) errCtx := logger.WithK8sResult(ctx, "FAILED") errCtx = logger.WithErrorField(errCtx, err) re.log.Errorf(errCtx, "Resource[%s] processed: FAILED", resource.Name) return result, NewExecutorError(PhaseResources, resource.Name, "failed to apply resource", err) } @@ if resource.Discovery != nil { discovered, discoverErr := re.discoverResource(ctx, resource, execCtx, transportTarget) if discoverErr != nil { result.Status = StatusFailed result.Error = discoverErr - execCtx.Adapter.ExecutionError = &ExecutionError{ - Phase: string(PhaseResources), - Step: resource.Name, - Message: discoverErr.Error(), - } + re.recordResourceError(execCtx, resource, discoverErr) errCtx := logger.WithK8sResult(ctx, "FAILED") errCtx = logger.WithErrorField(errCtx, discoverErr) re.log.Errorf(errCtx, "Resource[%s] discovery after apply failed: %v", resource.Name, discoverErr) return result, NewExecutorError( PhaseResources, resource.Name, "failed to discover resource after apply", discoverErr) }Also applies to: 180-195
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/executor/resource_executor.go` around lines 157 - 169, The apply-path error handling must also register the failure into the adapter's resourceErrors so post-action CEL can see resource-phase errors; in the transportClient.ApplyResource error branch (where result.Status is set to StatusFailed and execCtx.Adapter.ExecutionError is assigned) call recordResourceError(ctx, execCtx.Adapter, resource.Name, err) or otherwise append a ResourceError entry to execCtx.Adapter.resourceErrors with the same details; do the same in the post-apply discovery failure branch (the code handling discovery after apply around the block after applyResult) so both plain apply failures and post-apply discovery failures populate adapter.resourceErrors in addition to setting ExecutionError/Status.
485-492:⚠️ Potential issue | 🟠 MajorFail closed when Maestro pre-discovery cannot resolve
targetCluster.Skipping here leaves the resource absent from
execCtx.Resources, so sibling!resources.?name.hasValue()gates can evaluate against unknown state and unlock deletes or finalization incorrectly.Suggested fix
if resource.IsMaestroTransport() && resource.Transport.Maestro != nil { targetCluster, err := utils.RenderTemplate(resource.Transport.Maestro.TargetCluster, execCtx.Params) if err != nil { - re.log.Debugf(ctx, "Resource[%s] pre-discovery: failed to render targetCluster (skipping): %v", - resource.Name, err) - continue + return NewExecutorError( + PhaseResources, + resource.Name, + "pre-discovery failed to render Maestro targetCluster", + err, + ) } transportTarget = &maestroclient.TransportContext{ConsumerName: targetCluster} }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/executor/resource_executor.go` around lines 485 - 492, The current Maestro pre-discovery branch logs a RenderTemplate error for resource.Transport.Maestro.TargetCluster and continues, which silently drops the resource from execCtx.Resources and can allow incorrect gate evaluations; instead, in the block guarded by resource.IsMaestroTransport() (the code using transportTarget, resource.Transport.Maestro.TargetCluster, execCtx.Params and re.log.Debugf), stop processing and return an error (or propagate a wrapped error) when utils.RenderTemplate fails rather than calling continue — update the handler to return the error up the call stack so the executor fails closed and the resource is not silently omitted.
🧹 Nitpick comments (1)
charts/examples/kubernetes/adapter-task-config.yaml (1)
42-48: Short-circuitvalidationCheckwhen deletion is requested.
is_deletingis captured here, butvalidationCheckstill ignores it. If a cluster is marked for deletion beforeclusterReadyTTLbecomes true, the whole resources phase is skipped and none of the newlifecycle.deleterules run.Suggested update
- clusterNotReady || clusterReadyTTL + is_deleting || clusterNotReady || clusterReadyTTL🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@charts/examples/kubernetes/adapter-task-config.yaml` around lines 42 - 48, Update the validationCheck expression to short-circuit true when is_deleting is set so the resources phase still runs to execute lifecycle.delete rules during deletion; specifically, amend the validationCheck (where clusterReadyTTL is currently used) to OR in the is_deleting symbol (e.g., "is_deleting || <existing validation expression>") so deletion requests bypass the normal readiness gate and allow lifecycle.delete handlers to run.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@charts/examples/maestro/adapter-task-config.yaml`:
- Around line 27-33: The current readiness precondition is preventing
lifecycle.delete from running on a Ready cluster because it only allows
progression when readyConditionStatus == "False"; modify the precondition logic
to short-circuit when is_deleting is true so deletes are allowed, e.g., update
the precondition(s) that reference readyConditionStatus to include is_deleting
(use a logical OR: is_deleting OR readyConditionStatus == "False"), or add a
separate precondition that passes when is_deleting is true; target the
precondition block(s) that gate the resources/lifecycle.delete flow (references:
is_deleting, readyConditionStatus, lifecycle.delete).
In `@internal/executor/resource_executor_test.go`:
- Around line 1233-1258: The test
TestResourceExecutor_LifecycleDelete_CELUndeclaredVariable assumes undeclared
CEL variables evaluate to null, but cel-go treats them as evaluation errors;
update the test to reflect that EvaluateCEL/EvaluateSafe will set
CELResult.Error and the execution should fail. Specifically, keep using
newResourceExecutor and ExecuteAll but change assertions to require an error
(require.Error/require.NoNil on the CEL error or
results[0].Error/CELResult.Error), assert results[0].Status is a failure state
(e.g., StatusFailed), assert the operation is not treated as OperationCreate,
and still assert mock.DeleteCalled is false; alternatively remove the test if
your design intends to reject undefined variables altogether.
---
Duplicate comments:
In `@charts/examples/kubernetes/adapter-task-config.yaml`:
- Around line 259-290: The Deleted condition currently evaluates to True
whenever all managed resources are absent; update the three expressions under
the Deleted block (status.expression, reason.expression, message.expression) to
also require the deletion intent flag (is_deleting) so they only return
True/"ResourcesDeleted"/success message when resources are absent AND
is_deleting is true; locate the Deleted block (type: "Deleted") and add the
is_deleting check to the combined boolean expression used in status, reason, and
message so that absence alone on first-run or skipped reconciliations won't
report Deleted.
In `@internal/dryrun/recording_transport_client.go`:
- Around line 186-216: DeleteResource in DryrunTransportClient currently ignores
opts and always removes K8s-targeted resources immediately; update it to honor
transportclient.DeleteOptions. Specifically, inspect opts.PropagationPolicy (or
opts if nil defaulting to Background) and if it's Foreground treat K8s deletes
like Maestro async: if the resource exists in c.resources set a
deletionTimestamp via obj.SetDeletionTimestamp(&now) instead of deleting from
c.resources; only remove the entry for non-Foreground policies. Keep recording
the TransportRecord (operationDelete, GVK, Namespace, Name) as before.
In `@internal/executor/resource_executor.go`:
- Around line 47-56: Pre-discovery is unconditionally invoked via
re.preDiscoverAll(...) which doubles remote reads even when no resource uses
lifecycle.delete; change the logic in resource_executor.go to first scan the
incoming resources slice for any resource with a lifecycle.delete (or a
lifecycle.when expression that can evaluate delete) or use execCtx flag
indicating delete-capable reconciliation, and only call re.preDiscoverAll(ctx,
resources, execCtx) when such delete semantics are present; otherwise skip the
pre-discovery step to avoid the extra roundtrip.
- Around line 157-169: The apply-path error handling must also register the
failure into the adapter's resourceErrors so post-action CEL can see
resource-phase errors; in the transportClient.ApplyResource error branch (where
result.Status is set to StatusFailed and execCtx.Adapter.ExecutionError is
assigned) call recordResourceError(ctx, execCtx.Adapter, resource.Name, err) or
otherwise append a ResourceError entry to execCtx.Adapter.resourceErrors with
the same details; do the same in the post-apply discovery failure branch (the
code handling discovery after apply around the block after applyResult) so both
plain apply failures and post-apply discovery failures populate
adapter.resourceErrors in addition to setting ExecutionError/Status.
- Around line 485-492: The current Maestro pre-discovery branch logs a
RenderTemplate error for resource.Transport.Maestro.TargetCluster and continues,
which silently drops the resource from execCtx.Resources and can allow incorrect
gate evaluations; instead, in the block guarded by resource.IsMaestroTransport()
(the code using transportTarget, resource.Transport.Maestro.TargetCluster,
execCtx.Params and re.log.Debugf), stop processing and return an error (or
propagate a wrapped error) when utils.RenderTemplate fails rather than calling
continue — update the handler to return the error up the call stack so the
executor fails closed and the resource is not silently omitted.
In `@internal/k8sclient/client.go`:
- Around line 307-310: DeleteResource currently casts opts.PropagationPolicy
directly to metav1.DeletionPropagation which allows callers to pass arbitrary
strings; update DeleteResource to validate opts.PropagationPolicy at the public
API boundary by checking (when opts != nil && opts.PropagationPolicy != "") that
the value equals one of the known metav1.DeletionPropagation constants
("Background","Foreground","Orphan") before calling metav1.DeletionPropagation;
if it’s not one of those, return a clear error from DeleteResource indicating
the invalid PropagationPolicy value so invalid inputs fail fast instead of
delegating to the API server.
---
Nitpick comments:
In `@charts/examples/kubernetes/adapter-task-config.yaml`:
- Around line 42-48: Update the validationCheck expression to short-circuit true
when is_deleting is set so the resources phase still runs to execute
lifecycle.delete rules during deletion; specifically, amend the validationCheck
(where clusterReadyTTL is currently used) to OR in the is_deleting symbol (e.g.,
"is_deleting || <existing validation expression>") so deletion requests bypass
the normal readiness gate and allow lifecycle.delete handlers to run.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: eef87831-a2c8-4972-8c51-755bad61acdf
📒 Files selected for processing (60)
charts/README.mdcharts/examples/kubernetes/adapter-task-config.yamlcharts/examples/kubernetes/values.yamlcharts/examples/maestro-two-resources/README.mdcharts/examples/maestro-two-resources/adapter-config.yamlcharts/examples/maestro-two-resources/adapter-task-config.yamlcharts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yamlcharts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yamlcharts/examples/maestro-two-resources/values.yamlcharts/examples/maestro/adapter-task-config.yamlcharts/examples/maestro/values.yamlconfigs/adapter-task-config-template.yamldocs/adapter-authoring-guide.mdinternal/configloader/constants.gointernal/configloader/types.gointernal/configloader/validator.gointernal/configloader/validator_test.gointernal/criteria/evaluator.gointernal/criteria/evaluator_test.gointernal/dryrun/recording_transport_client.gointernal/dryrun/recording_transport_client_test.gointernal/dryrun/trace.gointernal/dryrun/trace_test.gointernal/executor/README.mdinternal/executor/executor_test.gointernal/executor/post_action_executor.gointernal/executor/precondition_executor.gointernal/executor/resource_executor.gointernal/executor/resource_executor_test.gointernal/executor/types.gointernal/executor/utils.gointernal/executor/utils_test.gointernal/k8sclient/apply.gointernal/k8sclient/client.gointernal/k8sclient/interface.gointernal/k8sclient/mock.gointernal/maestroclient/client.gointernal/manifest/generation.gointernal/transportclient/interface.gointernal/transportclient/types.gotest/integration/executor/setup_test.gotest/integration/k8sclient/client_integration_test.gotest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-api-responses.jsontest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-discovery.jsontest/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yamltest/testdata/dryrun/cel-showcase/dryrun-cel-showcase.shtest/testdata/dryrun/dryrun-delete-api-responses.jsontest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.jsontest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yamltest/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.shtest/testdata/dryrun/kubernetes/dryrun-kubernetes-adatepr-task-config-invalid.yamltest/testdata/dryrun/kubernetes/dryrun-kubernetes-discovery.jsontest/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yamltest/testdata/dryrun/kubernetes/dryrun-kubernetes.shtest/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.jsontest/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yamltest/testdata/dryrun/maestro-delete/dryrun-maestro-delete.shtest/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yamltest/testdata/dryrun/maestro/dryrun-maestro-discovery.jsontest/testdata/dryrun/maestro/dryrun-maestro.sh
💤 Files with no reviewable changes (2)
- internal/executor/utils_test.go
- internal/k8sclient/interface.go
✅ Files skipped from review due to trivial changes (23)
- test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml
- internal/transportclient/types.go
- test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh
- internal/dryrun/trace_test.go
- internal/executor/README.md
- test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh
- test/testdata/dryrun/maestro/dryrun-maestro.sh
- test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh
- charts/README.md
- internal/configloader/constants.go
- internal/k8sclient/apply.go
- charts/examples/kubernetes/values.yaml
- internal/configloader/validator_test.go
- test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh
- test/testdata/dryrun/dryrun-delete-api-responses.json
- test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json
- charts/examples/maestro-two-resources/README.md
- charts/examples/maestro-two-resources/adapter-config.yaml
- test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json
- charts/examples/maestro-two-resources/values.yaml
- test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml
- docs/adapter-authoring-guide.md
- charts/examples/maestro-two-resources/adapter-task-config.yaml
🚧 Files skipped from review as they are similar to previous changes (16)
- test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml
- internal/criteria/evaluator.go
- charts/examples/maestro/values.yaml
- internal/criteria/evaluator_test.go
- internal/manifest/generation.go
- test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml
- internal/maestroclient/client.go
- test/integration/k8sclient/client_integration_test.go
- internal/transportclient/interface.go
- internal/configloader/types.go
- internal/executor/precondition_executor.go
- internal/dryrun/trace.go
- internal/dryrun/recording_transport_client_test.go
- charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yaml
- charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml
- test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml
| if err != nil { | ||
| re.log.Debugf(ctx, "Resource[%s] pre-discovery: failed to render targetCluster (skipping): %v", | ||
| resource.Name, err) | ||
| continue |
There was a problem hiding this comment.
Warning
Blocking
Category: Bug
preDiscoverAll silently skips Maestro resources when RenderTemplate for targetCluster fails (line 489–491), leaving the resource absent from execCtx.Resources. If a downstream lifecycle.delete.when expression checks !resources.?X.hasValue(), this "absent because we couldn't render the template" state is indistinguishable from "absent because the resource doesn't exist" — potentially triggering an unwanted deletion.
The discovery error path already does the right thing (lines 497–506: propagate transient errors). The template rendering path should follow the same pattern:
targetCluster, err := utils.RenderTemplate(resource.Transport.Maestro.TargetCluster, execCtx.Params)
if err != nil {
re.log.Warnf(ctx, "Resource[%s] pre-discovery: failed to render targetCluster: %v",
resource.Name, err)
return NewExecutorError(PhaseResources, resource.Name, "pre-discovery: failed to render targetCluster", err)
}This way a broken template surfaces as a reconciliation error instead of silently becoming a deletion trigger.
|
Tip nit — non-blocking suggestion File: Category: Inconsistency The template example uses Consider aligning the template with the documented best practice: # before
expression: "preconditions.myPrecondition.deleted_time != null"
# after
expression: "has(preconditions.myPrecondition.deleted_time)" |
| // guarantees that the initial cluster state is visible to when expressions. | ||
| func (re *ResourceExecutor) preDiscoverAll( | ||
| ctx context.Context, | ||
| resources []configloader.Resource, |
There was a problem hiding this comment.
Tip
nit — non-blocking suggestion
Category: Improvement
preDiscoverAll runs on every reconciliation even when no resource has lifecycle.delete. For adapters that don't use this feature, it adds unnecessary discovery API calls per resource.
A guard at the call site would skip the pass entirely when not needed:
if re.hasLifecycleDelete(resources) {
if err := re.preDiscoverAll(ctx, resources, execCtx); err != nil {
return result, err
}
}Where hasLifecycleDelete is a simple loop checking if any resource has resource.Lifecycle != nil && resource.Lifecycle.Delete != nil.
| fmt.Fprintf(&b, " Kind: %-12s Namespace: %-12s Name: %s\n", rr.Kind, rr.Namespace, rr.ResourceName) | ||
|
|
||
| if rr.DiscoveredState != nil { | ||
| if stateBytes, err := json.Marshal(rr.DiscoveredState.Object); err == nil { |
There was a problem hiding this comment.
Tip
nit — non-blocking suggestion
Category: Bug
Lines 191 and 329 check rr.DiscoveredState != nil but don't guard against DiscoveredState.Object being nil (an *unstructured.Unstructured can have a nil inner map). Currently executeResourceDelete always populates .Object, but the trace shouldn't depend on that:
if rr.DiscoveredState != nil && rr.DiscoveredState.Object != nil {Same pattern applies at line 329.
| delete(c.resources, key) | ||
| } | ||
|
|
||
| c.Records = append(c.Records, TransportRecord{ |
There was a problem hiding this comment.
Tip
nit — non-blocking suggestion
Category: Improvement
DeleteResource unconditionally appends a TransportRecord even when the resource doesn't exist in the dry-run store (line 208–213). For K8s transport, the real API would return NotFound; here the delete silently succeeds and logs a record, which can make dry-run traces misleading.
Consider guarding the record or adding a status field:
if _, exists := c.resources[key]; !exists && target == nil {
// K8s transport: real API would return NotFound
return apierrors.NewNotFound(schema.GroupResource{Group: gvk.Group, Resource: gvk.Kind}, name)
}Alternatively, keep the current behavior but add an Exists field to TransportRecord so the trace can distinguish actual deletes from no-ops.
rafabene
left a comment
There was a problem hiding this comment.
Review Summary
1 blocking issue, 4 nits — overall a solid and well-tested implementation of lifecycle delete. The architecture is clean (pre-discovery pass, CEL evaluation, per-resource error tracking) and the test coverage is extensive.
Blocking
preDiscoverAllsilently swallows Maestro template rendering failures (resource_executor.go:489): WhenRenderTemplatefails fortargetCluster, the resource is silently skipped viacontinue, leaving it absent fromexecCtx.Resources. A downstream!resources.?X.hasValue()expression would then interpret this as "resource doesn't exist" and could trigger an unwanted deletion. The discovery error path (lines 497–506) already propagates transient errors correctly — the template rendering path should do the same.
Nits
- Template vs authoring guide CEL pattern mismatch (
adapter-task-config-template.yaml:98): Template suggestsdeleted_time != nullbut the authoring guide recommendshas(preconditions.myPrecondition.deleted_time)— semantically different in CEL. preDiscoverAllruns unconditionally (resource_executor.go:475): Adds unnecessary discovery API calls for adapters that don't uselifecycle.delete. A simple guard would skip the pass entirely.DiscoveredState.Objectnil guard missing (trace.go:191,329): Only the pointer is nil-checked, but the innermap[string]interface{}could also be nil.- Dry-run records delete for non-existent resources (
recording_transport_client.go:208): Real K8s API would return NotFound; dry-run silently succeeds and logs a misleading record.
See inline comments for details and suggested fixes.
| // instead of applying it. This enables dependency-ordered deletion driven by CEL expressions. | ||
| if resource.Lifecycle != nil && resource.Lifecycle.Delete != nil { | ||
| shouldDelete, delErr := re.evaluateLifecycleDeleteWhen(ctx, resource, execCtx) | ||
| if delErr != nil { |
There was a problem hiding this comment.
Warning
Blocking
Category: Bug
When evaluateLifecycleDeleteWhen returns an error (e.g., typo in CEL expression), executeResource returns with result.Operation still at zero value "". In ExecuteAll (lines 65-74), the check result.Operation == manifest.OperationDelete is false, so the error triggers fail-fast instead of continue:
if result.Operation == manifest.OperationDelete { // "" != "delete" → false
deleteErrs = append(deleteErrs, err)
continue // never reached
}
return results, errors.Join(...) // fail-fast instead of continueThis contradicts HYPERFLEET-849's requirement to "continue with the rest of resources deletion." A CEL evaluation error on resourceA's lifecycle.delete.when skips all subsequent resources.
Fix: set result.Operation before returning the error:
if delErr != nil {
result.Status = StatusFailed
result.Error = delErr
result.Operation = manifest.OperationDelete // ← add this
re.recordResourceError(execCtx, resource, delErr)
return result, NewExecutorError(...)
}| } | ||
| // Transient error (RBAC, network, API server): propagate so the reconciliation | ||
| // fails visibly rather than treating the resource as absent. | ||
| re.log.Warnf(ctx, "Resource[%s] pre-discovery failed: %v", resource.Name, err) |
There was a problem hiding this comment.
Tip
nit — non-blocking suggestion
Category: Bug
Upgrading my earlier comment about preDiscoverAll running unconditionally — the impact is a behavioral regression, not just wasted API calls.
preDiscoverAll runs before any apply, for ALL resources with discovery configured (not just those with lifecycle.delete). If a resource uses a CRD that doesn't exist yet (first reconciliation), the API server returns "no matches for kind 'X' in version 'v1alpha1'" — this is NOT apierrors.IsNotFound, so line 504 propagates it as a transient error, failing the entire reconciliation before any work is done.
Before this PR, the apply-first-then-discover flow would succeed: a prior resource in the list installs the CRD, then this resource is applied and discovered. Now, pre-discovery fails first.
The guard I suggested earlier (if re.hasLifecycleDelete(resources)) would also fix this regression — only pre-discover when lifecycle.delete is actually in use.
There was a problem hiding this comment.
Wow, this is an awesome finding I never though of
|
Warning Blocking File: Category: Bug
// SetError at types.go:334 — no guard, unconditional overwrite
ec.Adapter.ExecutionError = &ExecutionError{
Phase: reason, // "ResourceFailed" — not a valid phase constant
Message: message,
}
// Step field is lost (zero value "")This corrupts the error diagnostics. Post-action CEL expressions like the Health condition in the example configs: produce Fix options:
|
Add lifecycle.delete DSL block to per-resource config, enabling adapters to conditionally delete managed resources based on CEL expressions (e.g., deleted_at != null) with configurable K8s propagation policies. Key design decisions: - Post-delete rediscovery determines actual resource state after DeleteResource: NotFound → store nil (same-reconciliation cascade); still present (finalizers/Maestro async) → store object (next cycle) - TransportClient.DeleteResource added; K8s uses propagationPolicy, Maestro ignores it (ManifestWork handles cleanup semantics) - Nil entries in CEL resources map enable resources.X == null ordering expressions for dependency-ordered deletion Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> HYPERFLEET-849 - feat: chart examples for lifecycle delete Add and update chart examples demonstrating the lifecycle.delete DSL: - kubernetes: add adapter-task-config.yaml with delete example - maestro: add adapter-task-config.yaml with delete example - maestro-two-resources: new example showing dependency-ordered deletion of two resources (namespace + configmap) with Maestro ManifestWork - Update README and values files accordingly Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> HYPERFLEET-849 - feat: dry-run delete support and testdata reorganization - DryrunTransportClient.DeleteResource removes entries from in-memory store, accurately simulating instant deletion for single-pass testing - NewDryrunTransportClientWithOverrides pre-populates the store so overridden resources are discoverable before any ApplyResource call, enabling delete dry-runs where resources pre-exist on the cluster - GetResource returns apierrors.NewNotFound so post-delete rediscovery correctly detects NotFound via apierrors.IsNotFound - ResourceResult.DiscoveredState captures the pre-delete object; shown in text trace as "Pre-delete state:" block and in JSON as "discoveredState" field - Reorganize testdata into named scenario subdirectories with shell runner scripts; add kubernetes-delete and maestro-delete scenarios Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
b470fc5 to
c753ba0
Compare
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ciaranRoche The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
95b2701
into
openshift-hyperfleet:main
Add lifecycle.delete DSL block to per-resource config, enabling adapters to conditionally delete managed resources based on CEL expressions (e.g., deleted_at != null) with configurable K8s propagation policies.
There are a lot of files, but you can divide them mostly in:
whenCEL expressions to know if there is something to delete or not.Summary by CodeRabbit