Skip to content

HYPERFLEET-849 - feat: lifecycle delete for resources phase#104

Merged
openshift-merge-bot[bot] merged 1 commit into
openshift-hyperfleet:mainfrom
rh-amarin:HYPERFLEET-849
Apr 23, 2026
Merged

HYPERFLEET-849 - feat: lifecycle delete for resources phase#104
openshift-merge-bot[bot] merged 1 commit into
openshift-hyperfleet:mainfrom
rh-amarin:HYPERFLEET-849

Conversation

@rh-amarin
Copy link
Copy Markdown
Contributor

@rh-amarin rh-amarin commented Apr 14, 2026

Add lifecycle.delete DSL block to per-resource config, enabling adapters to conditionally delete managed resources based on CEL expressions (e.g., deleted_at != null) with configurable K8s propagation policies.

There are a lot of files, but you can divide them mostly in:

  • chart/examples
    • kubernetes
      • I changed how the namespace is deleted to depend on the other objects to be deleted to delete the namespace
      • This means the adapters need a second event cycle to delete completely the resources
    • maestro
      • This is as before, one single manifestwork is deleted, both the namespace and configmap within it are deleted
    • added a maestro-two-resources
      • This one splits into two manifestworks, one for the configmap one for the namespace
      • Deletion of the namespace waits for the first one to be deleted
      • So two cycles of events are required for deletion
  • test/tesdata/dryrun
    • refactored into its own subfolders and added shell scripts to trigger the examples easily
  • code
    • Reading the config from DSL and validating
    • Refactoring how errors work, since each resource in the resources phase can come with errors or not
    • Pre-discovery before any deletion action, so you can use when CEL expressions to know if there is something to delete or not.
    • Post-delete rediscovery determines actual resource state after DeleteResource: NotFound → store nil (same-reconciliation cascade); still present (finalizers/Maestro async) → store object (next cycle)

Summary by CodeRabbit

  • New Features
    • Conditional resource deletion with expression gating, configurable propagation policy, ordered multi-resource teardown, finalized-status reporting, and per-resource discovery state capture
  • Bug Fixes
    • Improved field-extraction/CEL error handling, clearer warnings/defaulting, and more robust precondition capture semantics
  • Documentation
    • Expanded authoring guide, new multi-resource examples, and Helm values/docs updated to camelCase
  • Tests
    • Added dry-run fixtures, helper scripts, and extensive deletion-flow unit/integration coverage
  • Chores
    • YAML/values key renames from snake_case to camelCase

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Apr 14, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds conditional resource deletion via a new resource lifecycle (lifecycle.delete) with a CEL when.expression and optional propagationPolicy. Config loader types and validator were extended to parse/validate lifecycle blocks. Executor now pre-discovers resources, evaluates delete expressions, routes matched resources to a delete workflow that calls the new TransportClient.DeleteResource (with DeleteOptions/propagationPolicy), re-discovers post-delete state, records per-resource errors while preserving the first-failure adapter.executionError, and updates finalized/health payloads. Multiple example charts, dry-run fixtures, transports (Kubernetes/Maestro), tests, docs, and utilities were added or updated.

Sequence Diagram(s)

sequenceDiagram
    participant Exec as Executor
    participant CELEval as CEL Evaluator
    participant Transport as TransportClient
    participant API as K8s/Maestro API

    rect rgba(100,150,200,0.5)
        Exec->>Transport: DiscoverResources()
        Transport->>API: GET resources
        API-->>Transport: objects / NotFound
        Transport-->>Exec: discovered map
    end

    rect rgba(150,100,200,0.5)
        Exec->>CELEval: evaluateLifecycleDeleteWhen(expression, execCtx)
        CELEval-->>Exec: true / false / error
    end

    rect rgba(200,150,100,0.5)
        alt CEL true (delete)
            Exec->>Transport: DeleteResource(gvk, ns, name, DeleteOptions, target)
            Transport->>API: DELETE (uses propagationPolicy for K8s)
            API-->>Transport: OK / NotFound / async state
            Exec->>Transport: DiscoverResources() (post-delete verify)
        else CEL false (apply)
            Exec->>Transport: ApplyResource(manifest)
            Transport->>API: CREATE/PATCH
            API-->>Transport: OK
        end
    end

    rect rgba(150,200,100,0.5)
        Transport-->>Exec: final discovered state (nil if removed)
        Exec->>Exec: update execCtx.Resources, record ResourceErrors, set adapter.executionError (first failure)
        Exec->>API: POST status update (Health/Finalized/resourceErrors)
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 45.65% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly identifies the feature (lifecycle delete for resources phase) and references the ticket (HYPERFLEET-849), accurately summarizing the main change in the pull request.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@rh-amarin rh-amarin force-pushed the HYPERFLEET-849 branch 5 times, most recently from 84afe44 to 5b129e3 Compare April 17, 2026 06:53
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 14

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/executor/types.go (1)

364-390: ⚠️ Potential issue | 🟠 Major

Remove unreachable nil handling from trace or preserve nil values in the CEL map.

The code intentionally omits deleted resources (nil values) from the CEL resources map, making missing keys return Optional.none() for optional access. However, internal/dryrun/trace.go:223 explicitly handles val == nil, which is unreachable since GetCELVariables() never includes nil values. Either remove the dead nil branch in trace.go, or preserve nil values in the resources map to match the trace.go expectations and enable resources.X == null semantics.

The current omission-based design is consistent with all existing CEL expressions (!resources.?clusterManifestWork.hasValue() throughout the config examples), but the trace.go code suggests an intention to support nil values that isn't actually fulfilled.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/executor/types.go` around lines 364 - 390, GetCELVariables currently
omits nil entries so trace.go's nil branch is unreachable; update the resources
map construction in GetCELVariables (the loop building "resources" and the
assignment result["resources"]) to preserve deleted resources as explicit nil
values (i.e., when val == nil set resources[name] = nil, and for nested maps
include nested[nestedName] = nil when nestedRes == nil) so
internal/dryrun/trace.go's val == nil handling works, or alternatively remove
the dead nil-check in trace.go—prefer preserving nils to keep existing CEL
semantics and enable resources.X == null checks.
🧹 Nitpick comments (1)
internal/maestroclient/client.go (1)

456-469: Add an explicit empty-name guard before deleting ManifestWork.

name is used directly as the ManifestWork identifier; validating it early prevents ambiguous backend errors and makes failure mode deterministic.

Proposed patch
 func (c *Client) DeleteResource(
 	ctx context.Context,
 	_ schema.GroupVersionKind,
 	_, name string,
 	_ *transportclient.DeleteOptions,
 	target transportclient.TransportContext,
 ) error {
+	if strings.TrimSpace(name) == "" {
+		return fmt.Errorf("resource name is required for DeleteResource")
+	}
+
 	transportCtx := c.resolveTransportContext(target)
 	if transportCtx == nil || transportCtx.ConsumerName == "" {
 		return fmt.Errorf(
 			"maestro TransportContext with ConsumerName is required for DeleteResource")
 	}
 	return c.DeleteManifestWork(ctx, transportCtx.ConsumerName, name)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/maestroclient/client.go` around lines 456 - 469, The DeleteResource
implementation uses the incoming name directly as the ManifestWork identifier;
add an explicit guard at the top of DeleteResource to return a clear error when
name is empty (e.g., "ManifestWork name is required") before resolving transport
context or calling DeleteManifestWork so you fail fast and avoid ambiguous
backend errors; reference the DeleteResource function, the name parameter, and
the DeleteManifestWork call when applying this check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@charts/examples/maestro/adapter-task-config.yaml`:
- Around line 187-202: The Deleted gate/expressions currently treat absence of
agentNamespaceManifestWork as Deleted even when no deletion was requested;
update the three expressions (status, reason, message) that use
resources.?agentNamespaceManifestWork.hasValue() so they also require the
deletion flag (resources.?is_deleting) to be true before emitting
Deleted/ManifestWorkDeleted messages—e.g., change the ternary condition to check
both not hasValue() AND resources.?is_deleting (instead of just not hasValue());
keep the existing True/False, ManifestWorkDeleted/WaitingForMaestroDeletion and
the two message strings but only return the Deleted branch when
resources.?is_deleting is true.

In `@docs/adapter-authoring-guide.md`:
- Around line 739-747: The fenced code block containing the Reconciliation
example is missing a language label, triggering markdownlint rule MD040; update
the block's opening fence to include a language (use "text") so it becomes
```text instead of just ```, i.e. modify the fenced block that starts with the
lines "Reconciliation 1 (is_deleting=true):" / "Reconciliation 2
(configMapResource gone):" to use ```text as the opener.

In `@internal/configloader/validator.go`:
- Around line 618-620: The code calls v.validateCELExpression(...) during
ValidateSemantic() even when v.celEnv may be nil; guard that call by checking
v.celEnv before invoking v.validateCELExpression (e.g., if v.celEnv == nil then
record/append a validation error for the lifecycle when-expression or skip CEL
validation), referencing ValidateSemantic, v.celEnv, v.validateCELExpression,
and the lifecycle fields (FieldLifecycleWhen, FieldExpression, basePath) so the
method never dereferences a nil CEL environment.

In `@internal/criteria/evaluator.go`:
- Around line 221-224: Calls to ExtractValue() in post_action_executor.go and
precondition_executor.go currently only inspect the parse error and ignore the
in-band extraction error stored in result.Error; update both call sites so that
immediately after calling ExtractValue(...) you check if result.Error != nil and
propagate or return that error (same behavior as you do for the parse error)
instead of continuing with result.Value == nil, ensuring the caller handles
missing-field conditions consistently; reference the ExtractValue(...) call and
the result variable at each call site and align the error handling path with
existing parse-error handling logic.

In `@internal/dryrun/recording_transport_client.go`:
- Around line 61-67: The code stores unstructured.Unstructured objects backed by
the caller's map (overrides) and later DeleteResource mutates those stored
objects; fix by storing deep copies instead of the original backing map: after
building the unstructured with u := &unstructured.Unstructured{Object: obj} call
u = u.DeepCopy() (or store u.DeepCopy() into the resources map) so the stored
value is independent, and apply the same DeepCopy change to the other creation
sites mentioned (the other overrides population paths that create
unstructured.Unstructured before insertion) to prevent shared state bleeding
between runs; reference symbols: resources map, unstructured.Unstructured,
resourceKey, DeleteResource, and the overrides variable/DiscoveryOverrides
usage.

In `@internal/dryrun/trace.go`:
- Around line 190-194: The code is serializing and printing
rr.DiscoveredState.Object unconditionally, which can leak secrets; change the
logic to only marshal and pretty-print DiscoveredState when the tracer is in
verbose mode (check the tracer's Verbose flag) and skip or redact this output
otherwise; update the block that references rr.DiscoveredState,
DiscoveredState.Object and prettyJSON to be guarded by that Verbose check, and
make the same change to the other occurrence that prints pre-delete state (the
block at the other similar location).

In `@internal/executor/resource_executor.go`:
- Around line 647-663: The ResourceExecutor.recordResourceError helper currently
populates execCtx.Adapter.ResourceErrors but is only invoked by delete paths;
update the apply paths to call this helper so apply-time failures are recorded
too: when ApplyResource (or its wrapper like
executeResourceApply/executeResource) returns an error, call
re.recordResourceError(execCtx, resource, err) before returning, and likewise
call it when post-apply discovery fails so
execCtx.Adapter.ResourceErrors[resource.Name] is populated; ensure you reference
the existing recordResourceError, ExecutionContext.Adapter.ResourceErrors, and
the ApplyResource/error-handling sites to add these calls.
- Around line 507-514: The evaluateLifecycleDeleteWhen method in
ResourceExecutor should fail fast when a delete.when block is present but
missing an expression: instead of returning false, nil when
resource.Lifecycle.Delete.When is non-nil but
resource.Lifecycle.Delete.When.Expression == "" (current behavior), return a
descriptive error indicating the invalid lifecycle.delete.when configuration
(include resource identity from the provided resource or execCtx to aid
debugging). Update evaluateLifecycleDeleteWhen to check for a non-nil
Delete.When with an empty Expression and return an error rather than silently
treating it as no-delete.
- Around line 567-586: discoverResource can return (nil, nil) to indicate “no
discovery configured,” but current code treats any discovered==nil as already
deleted and skips DeleteResource; change the branching so only an actual
NotFound error (isNotFound == true) triggers marking the resource deleted and
returning success. In the block around discoverResource / isNotFound, keep the
existing behavior for discoverErr != nil && !isNotFound (fail), handle
isNotFound as the "already gone" case (set execCtx.Resources[resource.Name] =
nil, set OperationReason, log, return), and do NOT treat discovered == nil &&
discoverErr == nil as deleted—allow execution to continue to the DeleteResource
flow. Update the logic that references discoverResource in resource_executor.go
(the discoverResource call and the returned discovered/isNotFound checks)
accordingly.

In `@internal/k8sclient/apply.go`:
- Around line 130-132: The case for manifest.OperationDelete inside
ApplyManifest currently no-ops and can mask routing bugs; change it to fail fast
by returning a clear error instead of silent success. In the switch case for
manifest.OperationDelete within ApplyManifest, remove the empty block and return
an error (e.g. using fmt.Errorf) stating that delete operations should be
handled via DeleteResource, including relevant context (resource name/kind) if
available; do not attempt to perform deletion here. This ensures misrouted
delete requests are surfaced rather than silently ignored.

In `@test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh`:
- Around line 1-2: Add a UNIX shebang to the top of the dryrun-cel-showcase.sh
script so it can be executed directly; edit the file containing the line
starting with "go run ./cmd/adapter/main.go serve \" (dryrun-cel-showcase.sh) to
insert a shebang like "#!/usr/bin/env bash" as the first line and ensure the
file is made executable (chmod +x) in the repository or CI as needed.

In `@test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh`:
- Around line 1-9: Add a shebang as the very first line of the
dryrun-kubernetes.sh script (use #!/usr/bin/env bash) so the script can be
executed directly, ensure it remains before the existing go run command, and
then mark the file executable (chmod +x dryrun-kubernetes.sh) as part of the
fix; no other changes to the go run invocation or flags are required.

In `@test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh`:
- Around line 1-3: This script lacks a shebang, so make it executable by adding
a proper interpreter header (e.g., a bash/sh shebang) at the very top of the
file before the existing comments and the `go run ./cmd/adapter/main.go serve`
invocation; ensure the shebang is the first line and use a portable form such as
`/usr/bin/env bash` so the script can be run directly.

In `@test/testdata/dryrun/maestro/dryrun-maestro.sh`:
- Around line 1-3: Add a POSIX-compatible shebang as the first line of the
dryrun-maestro.sh script (use a portable form that invokes bash via env) so the
file runs reliably when executed, then ensure the script is executable (chmod
+x); leave the existing comment lines and the "go run ./cmd/adapter/main.go
serve" invocation unchanged.

---

Outside diff comments:
In `@internal/executor/types.go`:
- Around line 364-390: GetCELVariables currently omits nil entries so trace.go's
nil branch is unreachable; update the resources map construction in
GetCELVariables (the loop building "resources" and the assignment
result["resources"]) to preserve deleted resources as explicit nil values (i.e.,
when val == nil set resources[name] = nil, and for nested maps include
nested[nestedName] = nil when nestedRes == nil) so internal/dryrun/trace.go's
val == nil handling works, or alternatively remove the dead nil-check in
trace.go—prefer preserving nils to keep existing CEL semantics and enable
resources.X == null checks.

---

Nitpick comments:
In `@internal/maestroclient/client.go`:
- Around line 456-469: The DeleteResource implementation uses the incoming name
directly as the ManifestWork identifier; add an explicit guard at the top of
DeleteResource to return a clear error when name is empty (e.g., "ManifestWork
name is required") before resolving transport context or calling
DeleteManifestWork so you fail fast and avoid ambiguous backend errors;
reference the DeleteResource function, the name parameter, and the
DeleteManifestWork call when applying this check.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: a9399057-fb52-4dd4-b0ed-692bb9496b15

📥 Commits

Reviewing files that changed from the base of the PR and between 0104564 and 5b129e3.

📒 Files selected for processing (59)
  • charts/README.md
  • charts/examples/kubernetes/adapter-task-config.yaml
  • charts/examples/kubernetes/values.yaml
  • charts/examples/maestro-two-resources/README.md
  • charts/examples/maestro-two-resources/adapter-config.yaml
  • charts/examples/maestro-two-resources/adapter-task-config.yaml
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yaml
  • charts/examples/maestro-two-resources/values.yaml
  • charts/examples/maestro/adapter-task-config.yaml
  • charts/examples/maestro/values.yaml
  • configs/adapter-task-config-template.yaml
  • docs/adapter-authoring-guide.md
  • internal/configloader/constants.go
  • internal/configloader/types.go
  • internal/configloader/validator.go
  • internal/configloader/validator_test.go
  • internal/criteria/evaluator.go
  • internal/criteria/evaluator_test.go
  • internal/dryrun/recording_transport_client.go
  • internal/dryrun/recording_transport_client_test.go
  • internal/dryrun/trace.go
  • internal/dryrun/trace_test.go
  • internal/executor/README.md
  • internal/executor/executor_test.go
  • internal/executor/post_action_executor.go
  • internal/executor/precondition_executor.go
  • internal/executor/resource_executor.go
  • internal/executor/resource_executor_test.go
  • internal/executor/types.go
  • internal/executor/utils.go
  • internal/k8sclient/apply.go
  • internal/k8sclient/client.go
  • internal/k8sclient/interface.go
  • internal/k8sclient/mock.go
  • internal/maestroclient/client.go
  • internal/manifest/generation.go
  • internal/transportclient/interface.go
  • internal/transportclient/types.go
  • test/integration/executor/setup_test.go
  • test/integration/k8sclient/client_integration_test.go
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-api-responses.json
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-discovery.json
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh
  • test/testdata/dryrun/dryrun-delete-api-responses.json
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-adatepr-task-config-invalid.yaml
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-discovery.json
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh
  • test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml
  • test/testdata/dryrun/maestro/dryrun-maestro-discovery.json
  • test/testdata/dryrun/maestro/dryrun-maestro.sh
💤 Files with no reviewable changes (1)
  • internal/k8sclient/interface.go

Comment thread charts/examples/maestro/adapter-task-config.yaml Outdated
Comment thread docs/adapter-authoring-guide.md Outdated
Comment thread internal/configloader/validator.go
Comment thread internal/criteria/evaluator.go
Comment thread internal/dryrun/recording_transport_client.go
Comment thread internal/k8sclient/apply.go
Comment thread test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh
Comment thread test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh
Comment thread test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh
Comment thread test/testdata/dryrun/maestro/dryrun-maestro.sh
@rh-amarin rh-amarin marked this pull request as ready for review April 17, 2026 12:15
@openshift-ci openshift-ci Bot requested review from aredenba-rh and crizzo71 April 17, 2026 12:15
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

♻️ Duplicate comments (2)
internal/executor/resource_executor.go (2)

97-105: ⚠️ Potential issue | 🟠 Major

Populate adapter.resourceErrors on apply-side failures too.

Only delete failures call recordResourceError(). Render/transport-target/apply/post-discovery/collision failures still only set adapter.executionError, so post-action CEL loses per-resource detail for the common resource-phase failures.

Suggested change
 		if tplErr != nil {
 			result.Status = StatusFailed
 			result.Error = tplErr
+			re.recordResourceError(execCtx, resource, tplErr)
 			return result, NewExecutorError(PhaseResources, resource.Name, "failed to render targetCluster template", tplErr)
 		}
@@
 	if err != nil {
 		result.Status = StatusFailed
 		result.Error = err
+		re.recordResourceError(execCtx, resource, err)
 		return result, NewExecutorError(PhaseResources, resource.Name, "failed to render manifest", err)
 	}
@@
 	if err != nil {
 		result.Status = StatusFailed
 		result.Error = err
+		re.recordResourceError(execCtx, resource, err)
 		execCtx.Adapter.ExecutionError = &ExecutionError{
@@
 		if discoverErr != nil {
 			result.Status = StatusFailed
 			result.Error = discoverErr
+			re.recordResourceError(execCtx, resource, discoverErr)
 			execCtx.Adapter.ExecutionError = &ExecutionError{
@@
 						result.Status = StatusFailed
 						result.Error = collisionErr
+						re.recordResourceError(execCtx, resource, collisionErr)
 						execCtx.Adapter.ExecutionError = &ExecutionError{

Also applies to: 131-164, 176-190, 216-227

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/executor/resource_executor.go` around lines 97 - 105, The review
points out apply-side failures (rendering transport target, transport target
creation, apply, post-discovery, collision) only set adapter.executionError
instead of recording per-resource details; update the failure handling in
resource_executor.go (the blocks around the TargetCluster render where tplErr is
handled, and the other error paths at lines referenced for
apply/post-discovery/collision) to call recordResourceError(adapter,
resource.Name, error) (or the existing recordResourceError invocation used by
deletes) whenever an error occurs, and ensure adapter.resourceErrors is
populated consistently instead of only setting adapter.executionError; keep
existing error returns and NewExecutorError usages but add the
recordResourceError call before returning so post-action CEL can see
per-resource errors.

512-516: ⚠️ Potential issue | 🟠 Major

Treat delete: without when: as invalid too.

Delete.When == nil still falls back to apply here. If semantic validation is skipped, a malformed delete config recreates the resource during teardown instead of failing fast.

Suggested change
 	if resource.Lifecycle.Delete.When == nil {
-		return false, nil
+		return false, fmt.Errorf(
+			"resource %q has lifecycle.delete configured but when.expression is missing",
+			resource.Name,
+		)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/executor/resource_executor.go` around lines 512 - 516, The code
currently only errors when Delete.When exists but Expression is empty, allowing
a Delete block without When to be treated as valid; update the validation in the
lifecycle handling (the code referencing resource.Lifecycle.Delete and
resource.Lifecycle.Delete.When/Expression) to treat a non-nil
resource.Lifecycle.Delete with a nil When as an error — i.e., if
resource.Lifecycle.Delete != nil && resource.Lifecycle.Delete.When == nil return
an error (similar to the existing message) so malformed delete configs fail fast
instead of falling back to apply.
🧹 Nitpick comments (2)
internal/executor/README.md (1)

323-323: Clarify the exact DSL key path for delete conditions.

Line 323 should explicitly reference lifecycle.delete.when.expression (instead of lifecycle.delete.when) to avoid config ambiguity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/executor/README.md` at line 323, Update the README table entry that
currently references the delete condition key path `lifecycle.delete.when` to
the more explicit `lifecycle.delete.when.expression`; change the line describing
the delete rule so it reads that the `delete` action triggers when
`lifecycle.delete.when.expression` evaluates to true, ensuring the docs use the
exact DSL key path (`lifecycle.delete.when.expression`) to avoid ambiguity.
charts/examples/maestro-two-resources/README.md (1)

26-37: Consider adding language identifier to fenced code block.

The reconciliation flow diagram could use a language specifier (e.g., ```text) for consistency with markdown lint rules, though this is purely stylistic for non-code content.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@charts/examples/maestro-two-resources/README.md` around lines 26 - 37, Add a
language identifier to the fenced code block containing the reconciliation flow
(the block that starts with "Reconciliation 1 (is_deleting=true):") by changing
the opening fence from ``` to ```text so markdown linters recognize it as a
plain-text block; ensure the closing fence remains ``` and leave the block
contents unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@charts/examples/kubernetes/adapter-task-config.yaml`:
- Around line 259-290: The "Deleted" status expression currently returns True
whenever resource references (clusterNamespace, jobServiceAccount, job_role,
job_rolebinding, jobNamespace, deploymentNamespace) are absent; update the
status, reason, and message expressions in the "Deleted" condition to also
require a deletion guard (e.g., an is_deleting boolean) so the condition only
becomes True when deletion was actually requested; specifically, add an explicit
check like resources.?is_deleting.hasValue() && resources.is_deleting == true
(or the equivalent path your schema uses) combined with the existing all-empty
checks in the expressions for type "Deleted" so the status, reason
("ResourcesDeleted"/"DeletionInProgress"), and message only flip to the deleted
state when is_deleting is true.

In `@internal/dryrun/recording_transport_client.go`:
- Around line 175-216: DeleteResource currently ignores opts and always removes
K8s-targeted resources immediately; update DeleteResource on
DryrunTransportClient to honor opts.PropagationPolicy by treating a K8s delete
as asynchronous when the propagation policy is Foreground: if target == nil and
opts != nil and opts.PropagationPolicy (pointer) equals
metav1.DeletePropagationForeground, set deletionTimestamp on the stored object
(using resourceKey and c.resources) and keep it in the store; otherwise
(Background/Orphan or nil) perform the immediate delete as before and still
append the TransportRecord.

In `@internal/dryrun/trace_test.go`:
- Around line 150-151: The current asserts use compact JSON literals and can
miss differently formatted bodies; replace the two assert.NotContains(t, output,
`{"request":"data"}`) and assert.NotContains(t, output, `{"response":"data"}`)
checks with formatting-independent assertions that match the JSON keys
regardless of whitespace/formatting — for example use assert.NotRegexp with
regexp.MustCompile(`"request"\s*:\s*"data"`) and
regexp.MustCompile(`"response"\s*:\s*"data"`) against the output (referencing
the output variable and the existing assert.NotContains calls) so the test fails
if those body keys/values appear in any spacing/indentation.

In `@internal/executor/resource_executor.go`:
- Around line 47-52: preDiscoverAll is being called unconditionally and causes
an extra discovery roundtrip for every reconciliation; change the caller so
preDiscoverAll(ctx, resources, execCtx) runs only when any resource actually
needs delete-capable evaluation. Implement a short predicate before the call
that scans the resources slice (e.g., for _, r := range resources) and returns
true if a resource has a delete lifecycle configured (check r.Lifecycle != nil
&& r.Lifecycle.Delete != nil) or otherwise indicates delete-capable execution in
execCtx; only invoke re.preDiscoverAll(...) when that predicate is true. Keep
the preDiscoverAll implementation and its non-fatal error behavior unchanged.
- Around line 487-494: The pre-discovery block in re.discoverResource currently
swallows non-NotFound errors (logging and continuing), which causes
resources.?X.hasValue() to be false on unknown/error and can wrongly allow
deletes; change re.discoverResource (and its callers) to return non-NotFound
errors up instead of continuing, and update ExecuteAll to check for and handle
discovery errors before evaluating lifecycle.delete.when so execution
fails-closed on RBAC/network/other errors; reference re.discoverResource,
ExecuteAll, and lifecycle.delete.when when making the change.

In
`@test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml`:
- Around line 11-15: Update the dry-run example command in the header comment so
its --task-config, --dry-run-api-responses and --dry-run-discovery flags point
to the files that live in this scenario's directory (kubernetes-delete) rather
than the old pre-reorg paths; replace the three example paths with the matching
filenames in this directory (dryrun-kubernetes-delete-task-config.yaml,
dryrun-delete-api-responses.json, dryrun-kubernetes-delete-discovery.json) so
the example command correctly locates the scenario files.

---

Duplicate comments:
In `@internal/executor/resource_executor.go`:
- Around line 97-105: The review points out apply-side failures (rendering
transport target, transport target creation, apply, post-discovery, collision)
only set adapter.executionError instead of recording per-resource details;
update the failure handling in resource_executor.go (the blocks around the
TargetCluster render where tplErr is handled, and the other error paths at lines
referenced for apply/post-discovery/collision) to call
recordResourceError(adapter, resource.Name, error) (or the existing
recordResourceError invocation used by deletes) whenever an error occurs, and
ensure adapter.resourceErrors is populated consistently instead of only setting
adapter.executionError; keep existing error returns and NewExecutorError usages
but add the recordResourceError call before returning so post-action CEL can see
per-resource errors.
- Around line 512-516: The code currently only errors when Delete.When exists
but Expression is empty, allowing a Delete block without When to be treated as
valid; update the validation in the lifecycle handling (the code referencing
resource.Lifecycle.Delete and resource.Lifecycle.Delete.When/Expression) to
treat a non-nil resource.Lifecycle.Delete with a nil When as an error — i.e., if
resource.Lifecycle.Delete != nil && resource.Lifecycle.Delete.When == nil return
an error (similar to the existing message) so malformed delete configs fail fast
instead of falling back to apply.

---

Nitpick comments:
In `@charts/examples/maestro-two-resources/README.md`:
- Around line 26-37: Add a language identifier to the fenced code block
containing the reconciliation flow (the block that starts with "Reconciliation 1
(is_deleting=true):") by changing the opening fence from ``` to ```text so
markdown linters recognize it as a plain-text block; ensure the closing fence
remains ``` and leave the block contents unchanged.

In `@internal/executor/README.md`:
- Line 323: Update the README table entry that currently references the delete
condition key path `lifecycle.delete.when` to the more explicit
`lifecycle.delete.when.expression`; change the line describing the delete rule
so it reads that the `delete` action triggers when
`lifecycle.delete.when.expression` evaluates to true, ensuring the docs use the
exact DSL key path (`lifecycle.delete.when.expression`) to avoid ambiguity.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: dcc4f53e-a2df-4ad4-af76-6a6ba775ebf8

📥 Commits

Reviewing files that changed from the base of the PR and between 0104564 and 2b61ef5.

📒 Files selected for processing (59)
  • charts/README.md
  • charts/examples/kubernetes/adapter-task-config.yaml
  • charts/examples/kubernetes/values.yaml
  • charts/examples/maestro-two-resources/README.md
  • charts/examples/maestro-two-resources/adapter-config.yaml
  • charts/examples/maestro-two-resources/adapter-task-config.yaml
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yaml
  • charts/examples/maestro-two-resources/values.yaml
  • charts/examples/maestro/adapter-task-config.yaml
  • charts/examples/maestro/values.yaml
  • configs/adapter-task-config-template.yaml
  • docs/adapter-authoring-guide.md
  • internal/configloader/constants.go
  • internal/configloader/types.go
  • internal/configloader/validator.go
  • internal/configloader/validator_test.go
  • internal/criteria/evaluator.go
  • internal/criteria/evaluator_test.go
  • internal/dryrun/recording_transport_client.go
  • internal/dryrun/recording_transport_client_test.go
  • internal/dryrun/trace.go
  • internal/dryrun/trace_test.go
  • internal/executor/README.md
  • internal/executor/executor_test.go
  • internal/executor/post_action_executor.go
  • internal/executor/precondition_executor.go
  • internal/executor/resource_executor.go
  • internal/executor/resource_executor_test.go
  • internal/executor/types.go
  • internal/executor/utils.go
  • internal/k8sclient/apply.go
  • internal/k8sclient/client.go
  • internal/k8sclient/interface.go
  • internal/k8sclient/mock.go
  • internal/maestroclient/client.go
  • internal/manifest/generation.go
  • internal/transportclient/interface.go
  • internal/transportclient/types.go
  • test/integration/executor/setup_test.go
  • test/integration/k8sclient/client_integration_test.go
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-api-responses.json
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-discovery.json
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh
  • test/testdata/dryrun/dryrun-delete-api-responses.json
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-adatepr-task-config-invalid.yaml
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-discovery.json
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh
  • test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml
  • test/testdata/dryrun/maestro/dryrun-maestro-discovery.json
  • test/testdata/dryrun/maestro/dryrun-maestro.sh
💤 Files with no reviewable changes (1)
  • internal/k8sclient/interface.go

Comment thread charts/examples/kubernetes/adapter-task-config.yaml
Comment thread internal/dryrun/recording_transport_client.go
Comment thread internal/dryrun/trace_test.go
Comment thread internal/executor/resource_executor.go Outdated
Comment thread internal/executor/resource_executor.go
Comment thread test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
internal/executor/resource_executor.go (1)

487-494: ⚠️ Potential issue | 🟠 Major

Pre-discovery swallows non-NotFound errors, potentially causing incorrect deletion.

When pre-discovery fails due to RBAC, network, or API errors (not NotFound), the resource is left absent from execCtx.Resources. This makes resources.?X.hasValue() return false, which could incorrectly unlock dependent deletes or allow Finalized=True to be reported when the actual state is unknown.

Consider returning an error from preDiscoverAll when any non-NotFound error occurs, or at minimum logging at WARN level instead of DEBUG.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/executor/resource_executor.go` around lines 487 - 494, The
pre-discovery loop in preDiscoverAll currently swallows non-NotFound errors
returned by re.discoverResource (checked via apierrors.IsNotFound) and leaves
resources absent from execCtx.Resources, which can incorrectly allow deletions
or report Finalized=True; modify preDiscoverAll to propagate any non-NotFound
error returned by re.discoverResource up to the caller (i.e., return the error
instead of continue), or at minimum change the log level to WARN and record the
failure in execCtx so resources.?X.hasValue() reflects the unknown state; update
references to discoverResource, preDiscoverAll, execCtx.Resources, and the
apierrors.IsNotFound check accordingly so callers can handle transient
RBAC/network/API errors rather than assuming absence.
🧹 Nitpick comments (1)
charts/examples/maestro-two-resources/README.md (1)

26-37: Add language specifier to fenced code block.

The reconciliation sequence diagram should have a language specifier for consistent rendering. Static analysis flagged this.

📝 Proposed fix
-```
+```text
 Reconciliation 1 (is_deleting=true):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@charts/examples/maestro-two-resources/README.md` around lines 26 - 37, The
fenced code block showing the reconciliation sequence lacks a language
specifier; update the opening fence to include "text" (i.e., change the triple
backticks preceding "Reconciliation 1 (is_deleting=true):" to ```text) so the
block renders consistently and satisfies static analysis; leave the rest of the
block content unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@internal/executor/resource_executor.go`:
- Around line 487-494: The pre-discovery loop in preDiscoverAll currently
swallows non-NotFound errors returned by re.discoverResource (checked via
apierrors.IsNotFound) and leaves resources absent from execCtx.Resources, which
can incorrectly allow deletions or report Finalized=True; modify preDiscoverAll
to propagate any non-NotFound error returned by re.discoverResource up to the
caller (i.e., return the error instead of continue), or at minimum change the
log level to WARN and record the failure in execCtx so resources.?X.hasValue()
reflects the unknown state; update references to discoverResource,
preDiscoverAll, execCtx.Resources, and the apierrors.IsNotFound check
accordingly so callers can handle transient RBAC/network/API errors rather than
assuming absence.

---

Nitpick comments:
In `@charts/examples/maestro-two-resources/README.md`:
- Around line 26-37: The fenced code block showing the reconciliation sequence
lacks a language specifier; update the opening fence to include "text" (i.e.,
change the triple backticks preceding "Reconciliation 1 (is_deleting=true):" to
```text) so the block renders consistently and satisfies static analysis; leave
the rest of the block content unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 0c179216-f30a-4d62-bbd3-d5271df27f76

📥 Commits

Reviewing files that changed from the base of the PR and between 2b61ef5 and 4032a9a.

📒 Files selected for processing (60)
  • charts/README.md
  • charts/examples/kubernetes/adapter-task-config.yaml
  • charts/examples/kubernetes/values.yaml
  • charts/examples/maestro-two-resources/README.md
  • charts/examples/maestro-two-resources/adapter-config.yaml
  • charts/examples/maestro-two-resources/adapter-task-config.yaml
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yaml
  • charts/examples/maestro-two-resources/values.yaml
  • charts/examples/maestro/adapter-task-config.yaml
  • charts/examples/maestro/values.yaml
  • configs/adapter-task-config-template.yaml
  • docs/adapter-authoring-guide.md
  • internal/configloader/constants.go
  • internal/configloader/types.go
  • internal/configloader/validator.go
  • internal/configloader/validator_test.go
  • internal/criteria/evaluator.go
  • internal/criteria/evaluator_test.go
  • internal/dryrun/recording_transport_client.go
  • internal/dryrun/recording_transport_client_test.go
  • internal/dryrun/trace.go
  • internal/dryrun/trace_test.go
  • internal/executor/README.md
  • internal/executor/executor_test.go
  • internal/executor/post_action_executor.go
  • internal/executor/precondition_executor.go
  • internal/executor/resource_executor.go
  • internal/executor/resource_executor_test.go
  • internal/executor/types.go
  • internal/executor/utils.go
  • internal/executor/utils_test.go
  • internal/k8sclient/apply.go
  • internal/k8sclient/client.go
  • internal/k8sclient/interface.go
  • internal/k8sclient/mock.go
  • internal/maestroclient/client.go
  • internal/manifest/generation.go
  • internal/transportclient/interface.go
  • internal/transportclient/types.go
  • test/integration/executor/setup_test.go
  • test/integration/k8sclient/client_integration_test.go
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-api-responses.json
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-discovery.json
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh
  • test/testdata/dryrun/dryrun-delete-api-responses.json
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-adatepr-task-config-invalid.yaml
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-discovery.json
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh
  • test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml
  • test/testdata/dryrun/maestro/dryrun-maestro-discovery.json
  • test/testdata/dryrun/maestro/dryrun-maestro.sh
💤 Files with no reviewable changes (2)
  • internal/executor/utils_test.go
  • internal/k8sclient/interface.go
✅ Files skipped from review due to trivial changes (20)
  • charts/examples/maestro/values.yaml
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml
  • test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml
  • internal/manifest/generation.go
  • test/testdata/dryrun/dryrun-delete-api-responses.json
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json
  • internal/configloader/constants.go
  • charts/examples/maestro-two-resources/values.yaml
  • charts/examples/maestro-two-resources/adapter-config.yaml
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yaml
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml
  • internal/k8sclient/mock.go
  • charts/examples/maestro-two-resources/adapter-task-config.yaml
  • internal/executor/types.go
🚧 Files skipped from review as they are similar to previous changes (23)
  • test/integration/executor/setup_test.go
  • internal/executor/README.md
  • internal/k8sclient/apply.go
  • internal/transportclient/types.go
  • internal/criteria/evaluator_test.go
  • internal/executor/utils.go
  • test/integration/k8sclient/client_integration_test.go
  • test/testdata/dryrun/maestro/dryrun-maestro.sh
  • internal/configloader/validator.go
  • internal/transportclient/interface.go
  • internal/dryrun/trace_test.go
  • charts/examples/kubernetes/values.yaml
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh
  • charts/README.md
  • internal/maestroclient/client.go
  • charts/examples/kubernetes/adapter-task-config.yaml
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml
  • internal/executor/executor_test.go
  • internal/dryrun/recording_transport_client.go
  • internal/dryrun/recording_transport_client_test.go
  • internal/executor/resource_executor_test.go
  • internal/configloader/types.go
  • docs/adapter-authoring-guide.md

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
charts/examples/kubernetes/adapter-task-config.yaml (1)

259-290: ⚠️ Potential issue | 🟡 Minor

Guard Deleted condition with is_deleting to prevent false positives.

The Deleted condition reports True whenever all resources are absent, but this can occur on first reconciliation before any resources are created, or when preconditions skip the resource phase. This would incorrectly signal Deleted=True when deletion was never requested.

The Finalized condition correctly guards with is_deleting (line 302), but Deleted does not.

Suggested fix
           - type: "Deleted"
             status:
               expression: |
-                !resources.?clusterNamespace.hasValue()
+                is_deleting
+                  && !resources.?clusterNamespace.hasValue()
                   && !resources.?jobServiceAccount.hasValue()

Apply the same is_deleting && prefix to the reason and message expressions, similar to how Finalized handles it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@charts/examples/kubernetes/adapter-task-config.yaml` around lines 259 - 290,
The Deleted condition currently marks resources as deleted whenever all resource
references are absent, which can be a false positive; update the Deleted
condition expressions (status, reason, message) to require the deletion phase by
prefixing each boolean expression with is_deleting && (same guard used by the
Finalized condition) so the Deleted condition only becomes
True/ResourcesDeleted/"All cluster resources successfully deleted" when
is_deleting is true and all resource values are absent.
internal/executor/resource_executor.go (1)

156-170: ⚠️ Potential issue | 🟡 Minor

Populate adapter.resourceErrors for apply failures too.

recordResourceError is only called from delete paths (lines 590, 625), but apply failures at lines 161-165 and 186-190 only set ExecutionError without populating ResourceErrors. This means adapter.?resourceErrors.?<name> in post-action CEL will be empty for the most common failure mode (apply errors).

For consistency with the documented behavior in the adapter authoring guide, apply failures should also call recordResourceError:

Suggested fix
 	if err != nil {
 		result.Status = StatusFailed
 		result.Error = err
-		execCtx.Adapter.ExecutionError = &ExecutionError{
-			Phase:   string(PhaseResources),
-			Step:    resource.Name,
-			Message: err.Error(),
-		}
+		re.recordResourceError(execCtx, resource, err)
 		errCtx := logger.WithK8sResult(ctx, "FAILED")

Apply the same change to the post-apply discovery failure path at lines 186-190.

Also applies to: 182-196, 663-679

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/executor/resource_executor.go` around lines 156 - 170, The apply
failure paths that call transportClient.ApplyResource (and the post-apply
discovery failure path) currently set execCtx.Adapter.ExecutionError but do not
populate adapter.resourceErrors; update those failure blocks to also call
recordResourceError(re.execCtx.Adapter, resource.Name, err) (or the existing
recordResourceError helper signature used in delete paths) whenever an apply or
post-apply discovery error occurs so adapter.resourceErrors.<resourceName> is
populated for CEL; keep the existing setting of result.Status/ result.Error and
ExecutionError, and ensure you add the recordResourceError call in the same
branches that now log via re.log.Errorf.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@charts/examples/maestro-two-resources/README.md`:
- Around line 26-37: Update the fenced code block that starts with triple
backticks and the "Reconciliation 1 (is_deleting=true):" sequence to include a
language specifier (e.g., change the opening ``` to ```text) so the block is
marked as plain text; locate the triple-backtick fence surrounding the
reconciliation sequence and add "text" (or "plaintext") immediately after the
opening backticks.

In `@test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml`:
- Around line 16-18: Update the commented example command paths so they point to
files inside the maestro-delete subfolder; specifically change the values for
the flags --task-config, --dry-run-api-responses, and --dry-run-discovery in the
commented block to use the corresponding files under
test/testdata/dryrun/maestro-delete (so the example becomes runnable when
copy-pasted).

---

Duplicate comments:
In `@charts/examples/kubernetes/adapter-task-config.yaml`:
- Around line 259-290: The Deleted condition currently marks resources as
deleted whenever all resource references are absent, which can be a false
positive; update the Deleted condition expressions (status, reason, message) to
require the deletion phase by prefixing each boolean expression with is_deleting
&& (same guard used by the Finalized condition) so the Deleted condition only
becomes True/ResourcesDeleted/"All cluster resources successfully deleted" when
is_deleting is true and all resource values are absent.

In `@internal/executor/resource_executor.go`:
- Around line 156-170: The apply failure paths that call
transportClient.ApplyResource (and the post-apply discovery failure path)
currently set execCtx.Adapter.ExecutionError but do not populate
adapter.resourceErrors; update those failure blocks to also call
recordResourceError(re.execCtx.Adapter, resource.Name, err) (or the existing
recordResourceError helper signature used in delete paths) whenever an apply or
post-apply discovery error occurs so adapter.resourceErrors.<resourceName> is
populated for CEL; keep the existing setting of result.Status/ result.Error and
ExecutionError, and ensure you add the recordResourceError call in the same
branches that now log via re.log.Errorf.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 97ea9965-aa68-4a72-bd70-d537bf606479

📥 Commits

Reviewing files that changed from the base of the PR and between 4032a9a and e0279ae.

📒 Files selected for processing (60)
  • charts/README.md
  • charts/examples/kubernetes/adapter-task-config.yaml
  • charts/examples/kubernetes/values.yaml
  • charts/examples/maestro-two-resources/README.md
  • charts/examples/maestro-two-resources/adapter-config.yaml
  • charts/examples/maestro-two-resources/adapter-task-config.yaml
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yaml
  • charts/examples/maestro-two-resources/values.yaml
  • charts/examples/maestro/adapter-task-config.yaml
  • charts/examples/maestro/values.yaml
  • configs/adapter-task-config-template.yaml
  • docs/adapter-authoring-guide.md
  • internal/configloader/constants.go
  • internal/configloader/types.go
  • internal/configloader/validator.go
  • internal/configloader/validator_test.go
  • internal/criteria/evaluator.go
  • internal/criteria/evaluator_test.go
  • internal/dryrun/recording_transport_client.go
  • internal/dryrun/recording_transport_client_test.go
  • internal/dryrun/trace.go
  • internal/dryrun/trace_test.go
  • internal/executor/README.md
  • internal/executor/executor_test.go
  • internal/executor/post_action_executor.go
  • internal/executor/precondition_executor.go
  • internal/executor/resource_executor.go
  • internal/executor/resource_executor_test.go
  • internal/executor/types.go
  • internal/executor/utils.go
  • internal/executor/utils_test.go
  • internal/k8sclient/apply.go
  • internal/k8sclient/client.go
  • internal/k8sclient/interface.go
  • internal/k8sclient/mock.go
  • internal/maestroclient/client.go
  • internal/manifest/generation.go
  • internal/transportclient/interface.go
  • internal/transportclient/types.go
  • test/integration/executor/setup_test.go
  • test/integration/k8sclient/client_integration_test.go
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-api-responses.json
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-discovery.json
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh
  • test/testdata/dryrun/dryrun-delete-api-responses.json
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-adatepr-task-config-invalid.yaml
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-discovery.json
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh
  • test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml
  • test/testdata/dryrun/maestro/dryrun-maestro-discovery.json
  • test/testdata/dryrun/maestro/dryrun-maestro.sh
💤 Files with no reviewable changes (2)
  • internal/executor/utils_test.go
  • internal/k8sclient/interface.go
✅ Files skipped from review due to trivial changes (22)
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml
  • internal/executor/README.md
  • internal/criteria/evaluator.go
  • internal/dryrun/trace_test.go
  • test/testdata/dryrun/dryrun-delete-api-responses.json
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh
  • charts/README.md
  • charts/examples/kubernetes/values.yaml
  • internal/configloader/constants.go
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh
  • test/testdata/dryrun/maestro/dryrun-maestro.sh
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh
  • charts/examples/maestro-two-resources/adapter-config.yaml
  • internal/executor/executor_test.go
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml
  • charts/examples/maestro-two-resources/values.yaml
  • internal/executor/resource_executor_test.go
  • charts/examples/maestro-two-resources/adapter-task-config.yaml
🚧 Files skipped from review as they are similar to previous changes (16)
  • charts/examples/maestro/values.yaml
  • internal/executor/utils.go
  • internal/criteria/evaluator_test.go
  • internal/transportclient/types.go
  • test/integration/executor/setup_test.go
  • internal/executor/post_action_executor.go
  • test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml
  • test/integration/k8sclient/client_integration_test.go
  • internal/maestroclient/client.go
  • internal/k8sclient/mock.go
  • internal/configloader/validator.go
  • internal/k8sclient/apply.go
  • internal/configloader/validator_test.go
  • internal/k8sclient/client.go
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yaml

Comment thread charts/examples/maestro-two-resources/README.md Outdated
Comment thread test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml Outdated
@rh-amarin rh-amarin force-pushed the HYPERFLEET-849 branch 2 times, most recently from a678f72 to b470fc5 Compare April 21, 2026 09:19
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
charts/examples/maestro-two-resources/README.md (1)

26-37: ⚠️ Potential issue | 🟡 Minor

Add a language identifier to the fenced block (MD040).

Line 26 opens a fenced code block without a language specifier; markdownlint flags this and CI/docs lint may fail.

Proposed fix
-```
+```text
 Reconciliation 1 (is_deleting=true):
   → Delete configMapManifestWork         ✓ (condition met)
   → Delete namespaceManifestWork         ✗ (configMapManifestWork still present)
@@
 Reconciliation 3 (both gone):
   → Finalized = True                     → API hard-delete triggered
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @charts/examples/maestro-two-resources/README.md around lines 26 - 37, The
fenced code block showing the "Reconciliation 1 (is_deleting=true)" example in
README.md lacks a language identifier which triggers markdownlint MD040; update
the opening fence to include a language (e.g., change totext) so the
block becomes a proper fenced code block, locating the block by the line that
starts with "Reconciliation 1 (is_deleting=true)" and the surrounding
triple-backtick fences.


</details>

</blockquote></details>
<details>
<summary>internal/k8sclient/apply.go (1)</summary><blockquote>

`130-132`: _⚠️ Potential issue_ | _🟠 Major_

**Fail fast if a delete reaches `ApplyManifest`.**

This still returns success without deleting anything, so a routing bug leaves stale resources behind and hides the real failure.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@internal/k8sclient/apply.go` around lines 130 - 132, The ApplyManifest code
currently ignores manifest.OperationDelete and returns success; change the
manifest.OperationDelete case in ApplyManifest to fail fast by returning a clear
error (e.g., fmt.Errorf or wrapped errors) stating deletes must be handled via
DeleteResource so callers can detect the routing bug; reference the
manifest.OperationDelete constant, the ApplyManifest function, and the
DeleteResource path when composing the error message to make debugging
straightforward.
```

</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🧹 Nitpick comments (1)</summary><blockquote>

<details>
<summary>internal/k8sclient/client.go (1)</summary><blockquote>

`268-290`: **Consolidate duplicated delete flow to reduce drift risk.**

`deleteResource` and `DeleteResource` duplicate object setup + NotFound/error wrapping. Consider a shared internal delete primitive (with optional propagation argument) so semantics stay aligned.



As per coding guidelines, `-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.`


Also applies to: 302-325

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@internal/k8sclient/client.go` around lines 268 - 290, The delete logic in
deleteResource and DeleteResource is duplicated; extract a single internal
helper (e.g., func (c *Client) doDelete(ctx context.Context, gvk
schema.GroupVersionKind, namespace, name string, propagation
*metav1.DeletePropagation) error) that centralizes creating the unstructured
object, setting GroupVersionKind/Namespace/Name, calling c.client.Delete with an
optional propagation policy, and normalizing apierrors.IsNotFound to nil or
wrapping other errors into apperrors.K8sOperationError; then have both
deleteResource and DeleteResource call this helper (pass nil or a specific
propagation parameter as needed) to keep semantics aligned and avoid drift.
```

</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @internal/executor/precondition_executor.go:

  • Around line 140-145: Update the struct tag on ActionBase.Name to enforce the
    existing resourceNamePattern by adding validate:"required,resourcename" so
    action names are valid CEL identifiers; this prevents silent failures when
    buildCELOptions registers ActionBase.Name as a CEL variable (see
    buildCELOptions) and avoids capture being skipped (see
    precondition_executor.captureCtx handling). Locate the ActionBase struct
    definition and replace the current validate:"required" tag on the Name field
    with validate:"required,resourcename", then run/adjust existing unit/validation
    tests that exercise ActionBase.Name to ensure the new constraint is enforced.

In @internal/k8sclient/client.go:

  • Around line 307-310: Validate the user-supplied PropagationPolicy in
    DeleteResource before casting: in DeleteResource, check the incoming opts != nil
    and opts.PropagationPolicy string against the allowed metav1.DeletionPropagation
    values ("Foreground","Background","Orphan") (or use a switch/lookup) and return
    a clear error if it is invalid; only then convert with
    metav1.DeletionPropagation(opts.PropagationPolicy) and proceed. Ensure the
    validation targets the DeleteOptions.PropagationPolicy field and returns an
    explicit error rather than letting the API server reject the request.

Duplicate comments:
In @charts/examples/maestro-two-resources/README.md:

  • Around line 26-37: The fenced code block showing the "Reconciliation 1
    (is_deleting=true)" example in README.md lacks a language identifier which
    triggers markdownlint MD040; update the opening fence to include a language
    (e.g., change totext) so the block becomes a proper fenced code block,
    locating the block by the line that starts with "Reconciliation 1
    (is_deleting=true)" and the surrounding triple-backtick fences.

In @internal/k8sclient/apply.go:

  • Around line 130-132: The ApplyManifest code currently ignores
    manifest.OperationDelete and returns success; change the
    manifest.OperationDelete case in ApplyManifest to fail fast by returning a clear
    error (e.g., fmt.Errorf or wrapped errors) stating deletes must be handled via
    DeleteResource so callers can detect the routing bug; reference the
    manifest.OperationDelete constant, the ApplyManifest function, and the
    DeleteResource path when composing the error message to make debugging
    straightforward.

Nitpick comments:
In @internal/k8sclient/client.go:

  • Around line 268-290: The delete logic in deleteResource and DeleteResource is
    duplicated; extract a single internal helper (e.g., func (c *Client)
    doDelete(ctx context.Context, gvk schema.GroupVersionKind, namespace, name
    string, propagation *metav1.DeletePropagation) error) that centralizes creating
    the unstructured object, setting GroupVersionKind/Namespace/Name, calling
    c.client.Delete with an optional propagation policy, and normalizing
    apierrors.IsNotFound to nil or wrapping other errors into
    apperrors.K8sOperationError; then have both deleteResource and DeleteResource
    call this helper (pass nil or a specific propagation parameter as needed) to
    keep semantics aligned and avoid drift.

</details>

<details>
<summary>🪄 Autofix (Beta)</summary>

Fix all unresolved CodeRabbit comments on this PR:

- [ ] <!-- {"checkboxId": "4b0d0e0a-96d7-4f10-b296-3a18ea78f0b9"} --> Push a commit to this branch (recommended)
- [ ] <!-- {"checkboxId": "ff5b1114-7d8c-49e6-8ac1-43f82af23a33"} --> Create a new PR with the fixes

</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: Organization UI

**Review profile**: CHILL

**Plan**: Pro Plus

**Run ID**: `5bf8e9fb-71c6-407a-b82f-313105d93a7d`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between e0279aeabb4143e585716250cef39b96bf2dfb1b and a678f72f665d3051fbc97048a7d5a0aacd3968b7.

</details>

<details>
<summary>📒 Files selected for processing (60)</summary>

* `charts/README.md`
* `charts/examples/kubernetes/adapter-task-config.yaml`
* `charts/examples/kubernetes/values.yaml`
* `charts/examples/maestro-two-resources/README.md`
* `charts/examples/maestro-two-resources/adapter-config.yaml`
* `charts/examples/maestro-two-resources/adapter-task-config.yaml`
* `charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml`
* `charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yaml`
* `charts/examples/maestro-two-resources/values.yaml`
* `charts/examples/maestro/adapter-task-config.yaml`
* `charts/examples/maestro/values.yaml`
* `configs/adapter-task-config-template.yaml`
* `docs/adapter-authoring-guide.md`
* `internal/configloader/constants.go`
* `internal/configloader/types.go`
* `internal/configloader/validator.go`
* `internal/configloader/validator_test.go`
* `internal/criteria/evaluator.go`
* `internal/criteria/evaluator_test.go`
* `internal/dryrun/recording_transport_client.go`
* `internal/dryrun/recording_transport_client_test.go`
* `internal/dryrun/trace.go`
* `internal/dryrun/trace_test.go`
* `internal/executor/README.md`
* `internal/executor/executor_test.go`
* `internal/executor/post_action_executor.go`
* `internal/executor/precondition_executor.go`
* `internal/executor/resource_executor.go`
* `internal/executor/resource_executor_test.go`
* `internal/executor/types.go`
* `internal/executor/utils.go`
* `internal/executor/utils_test.go`
* `internal/k8sclient/apply.go`
* `internal/k8sclient/client.go`
* `internal/k8sclient/interface.go`
* `internal/k8sclient/mock.go`
* `internal/maestroclient/client.go`
* `internal/manifest/generation.go`
* `internal/transportclient/interface.go`
* `internal/transportclient/types.go`
* `test/integration/executor/setup_test.go`
* `test/integration/k8sclient/client_integration_test.go`
* `test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-api-responses.json`
* `test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-discovery.json`
* `test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml`
* `test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh`
* `test/testdata/dryrun/dryrun-delete-api-responses.json`
* `test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json`
* `test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml`
* `test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh`
* `test/testdata/dryrun/kubernetes/dryrun-kubernetes-adatepr-task-config-invalid.yaml`
* `test/testdata/dryrun/kubernetes/dryrun-kubernetes-discovery.json`
* `test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml`
* `test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh`
* `test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json`
* `test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml`
* `test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh`
* `test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml`
* `test/testdata/dryrun/maestro/dryrun-maestro-discovery.json`
* `test/testdata/dryrun/maestro/dryrun-maestro.sh`

</details>

<details>
<summary>💤 Files with no reviewable changes (2)</summary>

* internal/executor/utils_test.go
* internal/k8sclient/interface.go

</details>

<details>
<summary>✅ Files skipped from review due to trivial changes (21)</summary>

* internal/dryrun/trace_test.go
* test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml
* test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh
* internal/executor/README.md
* test/testdata/dryrun/maestro/dryrun-maestro.sh
* internal/transportclient/types.go
* test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh
* internal/configloader/constants.go
* test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json
* test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh
* test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml
* test/testdata/dryrun/dryrun-delete-api-responses.json
* charts/examples/maestro-two-resources/adapter-config.yaml
* internal/executor/executor_test.go
* charts/examples/kubernetes/values.yaml
* charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml
* charts/examples/maestro-two-resources/values.yaml
* test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json
* charts/examples/maestro-two-resources/adapter-task-config.yaml
* test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh
* test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml

</details>

<details>
<summary>🚧 Files skipped from review as they are similar to previous changes (11)</summary>

* test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml
* charts/examples/maestro/values.yaml
* test/integration/executor/setup_test.go
* internal/transportclient/interface.go
* internal/criteria/evaluator.go
* internal/configloader/validator_test.go
* internal/maestroclient/client.go
* internal/k8sclient/mock.go
* charts/examples/kubernetes/adapter-task-config.yaml
* internal/executor/resource_executor.go
* internal/executor/resource_executor_test.go

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment thread internal/executor/precondition_executor.go
Comment thread internal/k8sclient/client.go
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (6)
internal/dryrun/recording_transport_client.go (1)

186-216: ⚠️ Potential issue | 🟡 Minor

Foreground propagation policy is not honored in dry-run deletes.

The previous review noted that this implementation ignores opts and treats K8s deletes as immediate removal regardless of PropagationPolicy. For Foreground deletes, Kubernetes blocks until dependents are removed, which should be simulated by keeping the resource with deletionTimestamp set (similar to Maestro async behavior).

Current code only distinguishes K8s vs Maestro via target, but K8s Foreground should behave like Maestro (async).

Suggested fix
 func (c *DryrunTransportClient) DeleteResource(
 	...
 ) error {
 	c.mu.Lock()
 	defer c.mu.Unlock()

 	key := resourceKey(gvk, namespace, name)
-	if target != nil {
+	asyncDelete := target != nil || (opts != nil && opts.PropagationPolicy == "Foreground")
+	if asyncDelete {
 		// Maestro transport: async deletion — mark with deletionTimestamp, keep in store.
 		if obj, exists := c.resources[key]; exists {
 			now := metav1.NewTime(time.Now())
 			obj.SetDeletionTimestamp(&now)
 		}
 	} else {
 		// K8s transport: synchronous deletion — remove from store immediately.
 		delete(c.resources, key)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/dryrun/recording_transport_client.go` around lines 186 - 216,
DeleteResource in DryrunTransportClient currently ignores opts and always
removes K8s-targeted resources immediately; update it to honor
transportclient.DeleteOptions. Specifically, inspect opts.PropagationPolicy (or
opts if nil defaulting to Background) and if it's Foreground treat K8s deletes
like Maestro async: if the resource exists in c.resources set a
deletionTimestamp via obj.SetDeletionTimestamp(&now) instead of deleting from
c.resources; only remove the entry for non-Foreground policies. Keep recording
the TransportRecord (operationDelete, GVK, Namespace, Name) as before.
internal/k8sclient/client.go (1)

307-310: ⚠️ Potential issue | 🟡 Minor

Validate PropagationPolicy at the public API boundary.

The previous review flagged that DeleteResource is a public API and should validate opts.PropagationPolicy before casting to metav1.DeletionPropagation. While the config loader validates these values during configuration, direct callers of this method can construct DeleteOptions with arbitrary strings, deferring failure to the API server with a less clear error.

Suggested validation
 propagationPolicy := metav1.DeletePropagationBackground
 if opts != nil && opts.PropagationPolicy != "" {
+	switch opts.PropagationPolicy {
+	case "Background", "Foreground", "Orphan":
+		propagationPolicy = metav1.DeletionPropagation(opts.PropagationPolicy)
+	default:
+		return apperrors.KubernetesError("invalid propagationPolicy %q: must be Background, Foreground, or Orphan", opts.PropagationPolicy)
+	}
-	propagationPolicy = metav1.DeletionPropagation(opts.PropagationPolicy)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/k8sclient/client.go` around lines 307 - 310, DeleteResource
currently casts opts.PropagationPolicy directly to metav1.DeletionPropagation
which allows callers to pass arbitrary strings; update DeleteResource to
validate opts.PropagationPolicy at the public API boundary by checking (when
opts != nil && opts.PropagationPolicy != "") that the value equals one of the
known metav1.DeletionPropagation constants ("Background","Foreground","Orphan")
before calling metav1.DeletionPropagation; if it’s not one of those, return a
clear error from DeleteResource indicating the invalid PropagationPolicy value
so invalid inputs fail fast instead of delegating to the API server.
charts/examples/kubernetes/adapter-task-config.yaml (1)

259-290: ⚠️ Potential issue | 🟠 Major

Guard Deleted with is_deleting.

This still reports Deleted=True whenever all managed resources are absent, including first-run and skipped reconciliations where no delete was requested.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@charts/examples/kubernetes/adapter-task-config.yaml` around lines 259 - 290,
The Deleted condition currently evaluates to True whenever all managed resources
are absent; update the three expressions under the Deleted block
(status.expression, reason.expression, message.expression) to also require the
deletion intent flag (is_deleting) so they only return
True/"ResourcesDeleted"/success message when resources are absent AND
is_deleting is true; locate the Deleted block (type: "Deleted") and add the
is_deleting check to the combined boolean expression used in status, reason, and
message so that absence alone on first-run or skipped reconciliations won't
report Deleted.
internal/executor/resource_executor.go (3)

47-56: ⚠️ Potential issue | 🟠 Major

Avoid pre-discovery on apply-only reconciliations.

This adds an extra discovery roundtrip for every resource on every reconciliation, even when no resource has lifecycle.delete. On larger tasks that doubles remote read traffic on the normal apply path.

Suggested guard
-	if err := re.preDiscoverAll(ctx, resources, execCtx); err != nil {
-		return nil, err
-	}
+	needsPreDiscovery := false
+	for _, resource := range resources {
+		if resource.Lifecycle != nil && resource.Lifecycle.Delete != nil {
+			needsPreDiscovery = true
+			break
+		}
+	}
+	if needsPreDiscovery {
+		if err := re.preDiscoverAll(ctx, resources, execCtx); err != nil {
+			return nil, err
+		}
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/executor/resource_executor.go` around lines 47 - 56, Pre-discovery
is unconditionally invoked via re.preDiscoverAll(...) which doubles remote reads
even when no resource uses lifecycle.delete; change the logic in
resource_executor.go to first scan the incoming resources slice for any resource
with a lifecycle.delete (or a lifecycle.when expression that can evaluate
delete) or use execCtx flag indicating delete-capable reconciliation, and only
call re.preDiscoverAll(ctx, resources, execCtx) when such delete semantics are
present; otherwise skip the pre-discovery step to avoid the extra roundtrip.

157-169: ⚠️ Potential issue | 🟠 Major

Populate adapter.resourceErrors for apply-path failures too.

recordResourceError() is only used on delete paths right now. Plain apply failures and post-apply discovery failures still leave adapter.resourceErrors empty, so post-action CEL misses the common resource-phase errors.

Suggested fix
 	applyResult, err := transportClient.ApplyResource(ctx, renderedBytes, applyOpts, transportTarget)
 	if err != nil {
 		result.Status = StatusFailed
 		result.Error = err
-		execCtx.Adapter.ExecutionError = &ExecutionError{
-			Phase:   string(PhaseResources),
-			Step:    resource.Name,
-			Message: err.Error(),
-		}
+		re.recordResourceError(execCtx, resource, err)
 		errCtx := logger.WithK8sResult(ctx, "FAILED")
 		errCtx = logger.WithErrorField(errCtx, err)
 		re.log.Errorf(errCtx, "Resource[%s] processed: FAILED", resource.Name)
 		return result, NewExecutorError(PhaseResources, resource.Name, "failed to apply resource", err)
 	}
@@
 	if resource.Discovery != nil {
 		discovered, discoverErr := re.discoverResource(ctx, resource, execCtx, transportTarget)
 		if discoverErr != nil {
 			result.Status = StatusFailed
 			result.Error = discoverErr
-			execCtx.Adapter.ExecutionError = &ExecutionError{
-				Phase:   string(PhaseResources),
-				Step:    resource.Name,
-				Message: discoverErr.Error(),
-			}
+			re.recordResourceError(execCtx, resource, discoverErr)
 			errCtx := logger.WithK8sResult(ctx, "FAILED")
 			errCtx = logger.WithErrorField(errCtx, discoverErr)
 			re.log.Errorf(errCtx, "Resource[%s] discovery after apply failed: %v", resource.Name, discoverErr)
 			return result, NewExecutorError(
 				PhaseResources, resource.Name, "failed to discover resource after apply", discoverErr)
 		}

Also applies to: 180-195

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/executor/resource_executor.go` around lines 157 - 169, The
apply-path error handling must also register the failure into the adapter's
resourceErrors so post-action CEL can see resource-phase errors; in the
transportClient.ApplyResource error branch (where result.Status is set to
StatusFailed and execCtx.Adapter.ExecutionError is assigned) call
recordResourceError(ctx, execCtx.Adapter, resource.Name, err) or otherwise
append a ResourceError entry to execCtx.Adapter.resourceErrors with the same
details; do the same in the post-apply discovery failure branch (the code
handling discovery after apply around the block after applyResult) so both plain
apply failures and post-apply discovery failures populate adapter.resourceErrors
in addition to setting ExecutionError/Status.

485-492: ⚠️ Potential issue | 🟠 Major

Fail closed when Maestro pre-discovery cannot resolve targetCluster.

Skipping here leaves the resource absent from execCtx.Resources, so sibling !resources.?name.hasValue() gates can evaluate against unknown state and unlock deletes or finalization incorrectly.

Suggested fix
 		if resource.IsMaestroTransport() && resource.Transport.Maestro != nil {
 			targetCluster, err := utils.RenderTemplate(resource.Transport.Maestro.TargetCluster, execCtx.Params)
 			if err != nil {
-				re.log.Debugf(ctx, "Resource[%s] pre-discovery: failed to render targetCluster (skipping): %v",
-					resource.Name, err)
-				continue
+				return NewExecutorError(
+					PhaseResources,
+					resource.Name,
+					"pre-discovery failed to render Maestro targetCluster",
+					err,
+				)
 			}
 			transportTarget = &maestroclient.TransportContext{ConsumerName: targetCluster}
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/executor/resource_executor.go` around lines 485 - 492, The current
Maestro pre-discovery branch logs a RenderTemplate error for
resource.Transport.Maestro.TargetCluster and continues, which silently drops the
resource from execCtx.Resources and can allow incorrect gate evaluations;
instead, in the block guarded by resource.IsMaestroTransport() (the code using
transportTarget, resource.Transport.Maestro.TargetCluster, execCtx.Params and
re.log.Debugf), stop processing and return an error (or propagate a wrapped
error) when utils.RenderTemplate fails rather than calling continue — update the
handler to return the error up the call stack so the executor fails closed and
the resource is not silently omitted.
🧹 Nitpick comments (1)
charts/examples/kubernetes/adapter-task-config.yaml (1)

42-48: Short-circuit validationCheck when deletion is requested.

is_deleting is captured here, but validationCheck still ignores it. If a cluster is marked for deletion before clusterReadyTTL becomes true, the whole resources phase is skipped and none of the new lifecycle.delete rules run.

Suggested update
-      clusterNotReady || clusterReadyTTL
+      is_deleting || clusterNotReady || clusterReadyTTL
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@charts/examples/kubernetes/adapter-task-config.yaml` around lines 42 - 48,
Update the validationCheck expression to short-circuit true when is_deleting is
set so the resources phase still runs to execute lifecycle.delete rules during
deletion; specifically, amend the validationCheck (where clusterReadyTTL is
currently used) to OR in the is_deleting symbol (e.g., "is_deleting || <existing
validation expression>") so deletion requests bypass the normal readiness gate
and allow lifecycle.delete handlers to run.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@charts/examples/maestro/adapter-task-config.yaml`:
- Around line 27-33: The current readiness precondition is preventing
lifecycle.delete from running on a Ready cluster because it only allows
progression when readyConditionStatus == "False"; modify the precondition logic
to short-circuit when is_deleting is true so deletes are allowed, e.g., update
the precondition(s) that reference readyConditionStatus to include is_deleting
(use a logical OR: is_deleting OR readyConditionStatus == "False"), or add a
separate precondition that passes when is_deleting is true; target the
precondition block(s) that gate the resources/lifecycle.delete flow (references:
is_deleting, readyConditionStatus, lifecycle.delete).

In `@internal/executor/resource_executor_test.go`:
- Around line 1233-1258: The test
TestResourceExecutor_LifecycleDelete_CELUndeclaredVariable assumes undeclared
CEL variables evaluate to null, but cel-go treats them as evaluation errors;
update the test to reflect that EvaluateCEL/EvaluateSafe will set
CELResult.Error and the execution should fail. Specifically, keep using
newResourceExecutor and ExecuteAll but change assertions to require an error
(require.Error/require.NoNil on the CEL error or
results[0].Error/CELResult.Error), assert results[0].Status is a failure state
(e.g., StatusFailed), assert the operation is not treated as OperationCreate,
and still assert mock.DeleteCalled is false; alternatively remove the test if
your design intends to reject undefined variables altogether.

---

Duplicate comments:
In `@charts/examples/kubernetes/adapter-task-config.yaml`:
- Around line 259-290: The Deleted condition currently evaluates to True
whenever all managed resources are absent; update the three expressions under
the Deleted block (status.expression, reason.expression, message.expression) to
also require the deletion intent flag (is_deleting) so they only return
True/"ResourcesDeleted"/success message when resources are absent AND
is_deleting is true; locate the Deleted block (type: "Deleted") and add the
is_deleting check to the combined boolean expression used in status, reason, and
message so that absence alone on first-run or skipped reconciliations won't
report Deleted.

In `@internal/dryrun/recording_transport_client.go`:
- Around line 186-216: DeleteResource in DryrunTransportClient currently ignores
opts and always removes K8s-targeted resources immediately; update it to honor
transportclient.DeleteOptions. Specifically, inspect opts.PropagationPolicy (or
opts if nil defaulting to Background) and if it's Foreground treat K8s deletes
like Maestro async: if the resource exists in c.resources set a
deletionTimestamp via obj.SetDeletionTimestamp(&now) instead of deleting from
c.resources; only remove the entry for non-Foreground policies. Keep recording
the TransportRecord (operationDelete, GVK, Namespace, Name) as before.

In `@internal/executor/resource_executor.go`:
- Around line 47-56: Pre-discovery is unconditionally invoked via
re.preDiscoverAll(...) which doubles remote reads even when no resource uses
lifecycle.delete; change the logic in resource_executor.go to first scan the
incoming resources slice for any resource with a lifecycle.delete (or a
lifecycle.when expression that can evaluate delete) or use execCtx flag
indicating delete-capable reconciliation, and only call re.preDiscoverAll(ctx,
resources, execCtx) when such delete semantics are present; otherwise skip the
pre-discovery step to avoid the extra roundtrip.
- Around line 157-169: The apply-path error handling must also register the
failure into the adapter's resourceErrors so post-action CEL can see
resource-phase errors; in the transportClient.ApplyResource error branch (where
result.Status is set to StatusFailed and execCtx.Adapter.ExecutionError is
assigned) call recordResourceError(ctx, execCtx.Adapter, resource.Name, err) or
otherwise append a ResourceError entry to execCtx.Adapter.resourceErrors with
the same details; do the same in the post-apply discovery failure branch (the
code handling discovery after apply around the block after applyResult) so both
plain apply failures and post-apply discovery failures populate
adapter.resourceErrors in addition to setting ExecutionError/Status.
- Around line 485-492: The current Maestro pre-discovery branch logs a
RenderTemplate error for resource.Transport.Maestro.TargetCluster and continues,
which silently drops the resource from execCtx.Resources and can allow incorrect
gate evaluations; instead, in the block guarded by resource.IsMaestroTransport()
(the code using transportTarget, resource.Transport.Maestro.TargetCluster,
execCtx.Params and re.log.Debugf), stop processing and return an error (or
propagate a wrapped error) when utils.RenderTemplate fails rather than calling
continue — update the handler to return the error up the call stack so the
executor fails closed and the resource is not silently omitted.

In `@internal/k8sclient/client.go`:
- Around line 307-310: DeleteResource currently casts opts.PropagationPolicy
directly to metav1.DeletionPropagation which allows callers to pass arbitrary
strings; update DeleteResource to validate opts.PropagationPolicy at the public
API boundary by checking (when opts != nil && opts.PropagationPolicy != "") that
the value equals one of the known metav1.DeletionPropagation constants
("Background","Foreground","Orphan") before calling metav1.DeletionPropagation;
if it’s not one of those, return a clear error from DeleteResource indicating
the invalid PropagationPolicy value so invalid inputs fail fast instead of
delegating to the API server.

---

Nitpick comments:
In `@charts/examples/kubernetes/adapter-task-config.yaml`:
- Around line 42-48: Update the validationCheck expression to short-circuit true
when is_deleting is set so the resources phase still runs to execute
lifecycle.delete rules during deletion; specifically, amend the validationCheck
(where clusterReadyTTL is currently used) to OR in the is_deleting symbol (e.g.,
"is_deleting || <existing validation expression>") so deletion requests bypass
the normal readiness gate and allow lifecycle.delete handlers to run.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: eef87831-a2c8-4972-8c51-755bad61acdf

📥 Commits

Reviewing files that changed from the base of the PR and between a678f72 and b470fc5.

📒 Files selected for processing (60)
  • charts/README.md
  • charts/examples/kubernetes/adapter-task-config.yaml
  • charts/examples/kubernetes/values.yaml
  • charts/examples/maestro-two-resources/README.md
  • charts/examples/maestro-two-resources/adapter-config.yaml
  • charts/examples/maestro-two-resources/adapter-task-config.yaml
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yaml
  • charts/examples/maestro-two-resources/values.yaml
  • charts/examples/maestro/adapter-task-config.yaml
  • charts/examples/maestro/values.yaml
  • configs/adapter-task-config-template.yaml
  • docs/adapter-authoring-guide.md
  • internal/configloader/constants.go
  • internal/configloader/types.go
  • internal/configloader/validator.go
  • internal/configloader/validator_test.go
  • internal/criteria/evaluator.go
  • internal/criteria/evaluator_test.go
  • internal/dryrun/recording_transport_client.go
  • internal/dryrun/recording_transport_client_test.go
  • internal/dryrun/trace.go
  • internal/dryrun/trace_test.go
  • internal/executor/README.md
  • internal/executor/executor_test.go
  • internal/executor/post_action_executor.go
  • internal/executor/precondition_executor.go
  • internal/executor/resource_executor.go
  • internal/executor/resource_executor_test.go
  • internal/executor/types.go
  • internal/executor/utils.go
  • internal/executor/utils_test.go
  • internal/k8sclient/apply.go
  • internal/k8sclient/client.go
  • internal/k8sclient/interface.go
  • internal/k8sclient/mock.go
  • internal/maestroclient/client.go
  • internal/manifest/generation.go
  • internal/transportclient/interface.go
  • internal/transportclient/types.go
  • test/integration/executor/setup_test.go
  • test/integration/k8sclient/client_integration_test.go
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-api-responses.json
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-discovery.json
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh
  • test/testdata/dryrun/dryrun-delete-api-responses.json
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-adatepr-task-config-invalid.yaml
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-discovery.json
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh
  • test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml
  • test/testdata/dryrun/maestro/dryrun-maestro-discovery.json
  • test/testdata/dryrun/maestro/dryrun-maestro.sh
💤 Files with no reviewable changes (2)
  • internal/executor/utils_test.go
  • internal/k8sclient/interface.go
✅ Files skipped from review due to trivial changes (23)
  • test/testdata/dryrun/maestro/dryrun-maestro-adapter-task-config.yaml
  • internal/transportclient/types.go
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes.sh
  • internal/dryrun/trace_test.go
  • internal/executor/README.md
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase.sh
  • test/testdata/dryrun/maestro/dryrun-maestro.sh
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete.sh
  • charts/README.md
  • internal/configloader/constants.go
  • internal/k8sclient/apply.go
  • charts/examples/kubernetes/values.yaml
  • internal/configloader/validator_test.go
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete.sh
  • test/testdata/dryrun/dryrun-delete-api-responses.json
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-discovery.json
  • charts/examples/maestro-two-resources/README.md
  • charts/examples/maestro-two-resources/adapter-config.yaml
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-discovery.json
  • charts/examples/maestro-two-resources/values.yaml
  • test/testdata/dryrun/kubernetes-delete/dryrun-kubernetes-delete-task-config.yaml
  • docs/adapter-authoring-guide.md
  • charts/examples/maestro-two-resources/adapter-task-config.yaml
🚧 Files skipped from review as they are similar to previous changes (16)
  • test/testdata/dryrun/kubernetes/dryrun-kubernetes-task-config.yaml
  • internal/criteria/evaluator.go
  • charts/examples/maestro/values.yaml
  • internal/criteria/evaluator_test.go
  • internal/manifest/generation.go
  • test/testdata/dryrun/cel-showcase/dryrun-cel-showcase-task-config.yaml
  • internal/maestroclient/client.go
  • test/integration/k8sclient/client_integration_test.go
  • internal/transportclient/interface.go
  • internal/configloader/types.go
  • internal/executor/precondition_executor.go
  • internal/dryrun/trace.go
  • internal/dryrun/recording_transport_client_test.go
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-namespace.yaml
  • charts/examples/maestro-two-resources/adapter-task-resource-manifestwork-configmap.yaml
  • test/testdata/dryrun/maestro-delete/dryrun-maestro-delete-task-config.yaml

Comment thread charts/examples/maestro/adapter-task-config.yaml
Comment thread internal/executor/resource_executor_test.go
Comment thread internal/executor/resource_executor.go Outdated
if err != nil {
re.log.Debugf(ctx, "Resource[%s] pre-discovery: failed to render targetCluster (skipping): %v",
resource.Name, err)
continue
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning

Blocking

Category: Bug

preDiscoverAll silently skips Maestro resources when RenderTemplate for targetCluster fails (line 489–491), leaving the resource absent from execCtx.Resources. If a downstream lifecycle.delete.when expression checks !resources.?X.hasValue(), this "absent because we couldn't render the template" state is indistinguishable from "absent because the resource doesn't exist" — potentially triggering an unwanted deletion.

The discovery error path already does the right thing (lines 497–506: propagate transient errors). The template rendering path should follow the same pattern:

targetCluster, err := utils.RenderTemplate(resource.Transport.Maestro.TargetCluster, execCtx.Params)
if err != nil {
    re.log.Warnf(ctx, "Resource[%s] pre-discovery: failed to render targetCluster: %v",
        resource.Name, err)
    return NewExecutorError(PhaseResources, resource.Name, "pre-discovery: failed to render targetCluster", err)
}

This way a broken template surfaces as a reconciliation error instead of silently becoming a deletion trigger.

@rafabene
Copy link
Copy Markdown
Contributor

Tip

nit — non-blocking suggestion

File: configs/adapter-task-config-template.yaml:98

Category: Inconsistency

The template example uses deleted_time != null but the authoring guide recommends has(preconditions.myPrecondition.deleted_time). These behave differently in CEL — != null can error if the field isn't in the context, while has() safely checks presence.

Consider aligning the template with the documented best practice:

# before
expression: "preconditions.myPrecondition.deleted_time != null"

# after
expression: "has(preconditions.myPrecondition.deleted_time)"

// guarantees that the initial cluster state is visible to when expressions.
func (re *ResourceExecutor) preDiscoverAll(
ctx context.Context,
resources []configloader.Resource,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip

nit — non-blocking suggestion

Category: Improvement

preDiscoverAll runs on every reconciliation even when no resource has lifecycle.delete. For adapters that don't use this feature, it adds unnecessary discovery API calls per resource.

A guard at the call site would skip the pass entirely when not needed:

if re.hasLifecycleDelete(resources) {
    if err := re.preDiscoverAll(ctx, resources, execCtx); err != nil {
        return result, err
    }
}

Where hasLifecycleDelete is a simple loop checking if any resource has resource.Lifecycle != nil && resource.Lifecycle.Delete != nil.

Comment thread internal/dryrun/trace.go
fmt.Fprintf(&b, " Kind: %-12s Namespace: %-12s Name: %s\n", rr.Kind, rr.Namespace, rr.ResourceName)

if rr.DiscoveredState != nil {
if stateBytes, err := json.Marshal(rr.DiscoveredState.Object); err == nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip

nit — non-blocking suggestion

Category: Bug

Lines 191 and 329 check rr.DiscoveredState != nil but don't guard against DiscoveredState.Object being nil (an *unstructured.Unstructured can have a nil inner map). Currently executeResourceDelete always populates .Object, but the trace shouldn't depend on that:

if rr.DiscoveredState != nil && rr.DiscoveredState.Object != nil {

Same pattern applies at line 329.

delete(c.resources, key)
}

c.Records = append(c.Records, TransportRecord{
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip

nit — non-blocking suggestion

Category: Improvement

DeleteResource unconditionally appends a TransportRecord even when the resource doesn't exist in the dry-run store (line 208–213). For K8s transport, the real API would return NotFound; here the delete silently succeeds and logs a record, which can make dry-run traces misleading.

Consider guarding the record or adding a status field:

if _, exists := c.resources[key]; !exists && target == nil {
    // K8s transport: real API would return NotFound
    return apierrors.NewNotFound(schema.GroupResource{Group: gvk.Group, Resource: gvk.Kind}, name)
}

Alternatively, keep the current behavior but add an Exists field to TransportRecord so the trace can distinguish actual deletes from no-ops.

Copy link
Copy Markdown
Contributor

@rafabene rafabene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

1 blocking issue, 4 nits — overall a solid and well-tested implementation of lifecycle delete. The architecture is clean (pre-discovery pass, CEL evaluation, per-resource error tracking) and the test coverage is extensive.

Blocking

  • preDiscoverAll silently swallows Maestro template rendering failures (resource_executor.go:489): When RenderTemplate fails for targetCluster, the resource is silently skipped via continue, leaving it absent from execCtx.Resources. A downstream !resources.?X.hasValue() expression would then interpret this as "resource doesn't exist" and could trigger an unwanted deletion. The discovery error path (lines 497–506) already propagates transient errors correctly — the template rendering path should do the same.

Nits

  • Template vs authoring guide CEL pattern mismatch (adapter-task-config-template.yaml:98): Template suggests deleted_time != null but the authoring guide recommends has(preconditions.myPrecondition.deleted_time) — semantically different in CEL.
  • preDiscoverAll runs unconditionally (resource_executor.go:475): Adds unnecessary discovery API calls for adapters that don't use lifecycle.delete. A simple guard would skip the pass entirely.
  • DiscoveredState.Object nil guard missing (trace.go:191,329): Only the pointer is nil-checked, but the inner map[string]interface{} could also be nil.
  • Dry-run records delete for non-existent resources (recording_transport_client.go:208): Real K8s API would return NotFound; dry-run silently succeeds and logs a misleading record.

See inline comments for details and suggested fixes.

// instead of applying it. This enables dependency-ordered deletion driven by CEL expressions.
if resource.Lifecycle != nil && resource.Lifecycle.Delete != nil {
shouldDelete, delErr := re.evaluateLifecycleDeleteWhen(ctx, resource, execCtx)
if delErr != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning

Blocking

Category: Bug

When evaluateLifecycleDeleteWhen returns an error (e.g., typo in CEL expression), executeResource returns with result.Operation still at zero value "". In ExecuteAll (lines 65-74), the check result.Operation == manifest.OperationDelete is false, so the error triggers fail-fast instead of continue:

if result.Operation == manifest.OperationDelete {  // "" != "delete" → false
    deleteErrs = append(deleteErrs, err)
    continue  // never reached
}
return results, errors.Join(...)  // fail-fast instead of continue

This contradicts HYPERFLEET-849's requirement to "continue with the rest of resources deletion." A CEL evaluation error on resourceA's lifecycle.delete.when skips all subsequent resources.

Fix: set result.Operation before returning the error:

if delErr != nil {
    result.Status = StatusFailed
    result.Error = delErr
    result.Operation = manifest.OperationDelete  // ← add this
    re.recordResourceError(execCtx, resource, delErr)
    return result, NewExecutorError(...)
}

}
// Transient error (RBAC, network, API server): propagate so the reconciliation
// fails visibly rather than treating the resource as absent.
re.log.Warnf(ctx, "Resource[%s] pre-discovery failed: %v", resource.Name, err)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip

nit — non-blocking suggestion

Category: Bug

Upgrading my earlier comment about preDiscoverAll running unconditionally — the impact is a behavioral regression, not just wasted API calls.

preDiscoverAll runs before any apply, for ALL resources with discovery configured (not just those with lifecycle.delete). If a resource uses a CRD that doesn't exist yet (first reconciliation), the API server returns "no matches for kind 'X' in version 'v1alpha1'" — this is NOT apierrors.IsNotFound, so line 504 propagates it as a transient error, failing the entire reconciliation before any work is done.

Before this PR, the apply-first-then-discover flow would succeed: a prior resource in the list installs the CRD, then this resource is applied and discovered. Now, pre-discovery fails first.

The guard I suggested earlier (if re.hasLifecycleDelete(resources)) would also fix this regression — only pre-discover when lifecycle.delete is actually in use.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this is an awesome finding I never though of

@rafabene
Copy link
Copy Markdown
Contributor

Warning

Blocking

File: internal/executor/executor.go:166

Category: Bug

recordResourceError (resource_executor.go:672-673) correctly sets ExecutionError with Phase: "resources" and Step: "<resource-name>", using a first-error-wins guard (if execCtx.Adapter.ExecutionError == nil). But then executor.go:166 calls SetError("ResourceFailed", ...) which unconditionally overwrites ExecutionError:

// SetError at types.go:334 — no guard, unconditional overwrite
ec.Adapter.ExecutionError = &ExecutionError{
    Phase:   reason,  // "ResourceFailed" — not a valid phase constant
    Message: message,
}
// Step field is lost (zero value "")

This corrupts the error diagnostics. Post-action CEL expressions like the Health condition in the example configs:

"Adapter failed at phase [" + adapter.?executionError.?phase.orValue("unknown") + "] step [" + adapter.?executionError.?step.orValue("unknown") + "]"

produce "Adapter failed at phase [ResourceFailed] step []" instead of "Adapter failed at phase [resources] step [clusterNamespace]".

Fix options:

  1. Add a guard to SetError: if ec.Adapter.ExecutionError != nil { return }
  2. Or skip the SetError call in executor.go:166 when ExecutionError is already populated by recordResourceError

Add lifecycle.delete DSL block to per-resource config, enabling
adapters to conditionally delete managed resources based on CEL
expressions (e.g., deleted_at != null) with configurable K8s
propagation policies.

Key design decisions:
- Post-delete rediscovery determines actual resource state after
  DeleteResource: NotFound → store nil (same-reconciliation cascade);
  still present (finalizers/Maestro async) → store object (next cycle)
- TransportClient.DeleteResource added; K8s uses propagationPolicy,
  Maestro ignores it (ManifestWork handles cleanup semantics)
- Nil entries in CEL resources map enable resources.X == null ordering
  expressions for dependency-ordered deletion

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

HYPERFLEET-849 - feat: chart examples for lifecycle delete

Add and update chart examples demonstrating the lifecycle.delete DSL:
- kubernetes: add adapter-task-config.yaml with delete example
- maestro: add adapter-task-config.yaml with delete example
- maestro-two-resources: new example showing dependency-ordered deletion
  of two resources (namespace + configmap) with Maestro ManifestWork
- Update README and values files accordingly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

HYPERFLEET-849 - feat: dry-run delete support and testdata reorganization

- DryrunTransportClient.DeleteResource removes entries from in-memory
  store, accurately simulating instant deletion for single-pass testing
- NewDryrunTransportClientWithOverrides pre-populates the store so
  overridden resources are discoverable before any ApplyResource call,
  enabling delete dry-runs where resources pre-exist on the cluster
- GetResource returns apierrors.NewNotFound so post-delete rediscovery
  correctly detects NotFound via apierrors.IsNotFound
- ResourceResult.DiscoveredState captures the pre-delete object; shown
  in text trace as "Pre-delete state:" block and in JSON as
  "discoveredState" field
- Reorganize testdata into named scenario subdirectories with shell
  runner scripts; add kubernetes-delete and maestro-delete scenarios

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@ciaranRoche ciaranRoche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Apr 23, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ciaranRoche

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot Bot merged commit 95b2701 into openshift-hyperfleet:main Apr 23, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants