feat(iac): JIT secret resolution + ProviderID propagation (W-5 of 12)#531
Merged
Conversation
…olvedConfigHash + DriftEntry type
…der with preservation sentinel
…ComputeDrift sentinel-honoring
T2.1 — bounded-concurrency Refresh(ctx, provider, states, opts) that calls ResourceDriver.Read per resource and returns a copy of the state slice with Outputs reconciled to the live values. Default concurrency 8 when Options.Concurrency < 1; otherwise honor the caller's value. On any Read or driver-resolution failure, returns (nil, err) so callers don't half-persist a refresh. Foundation for wfctl infra refresh-outputs (T2.2) and the opt-in apply pre-step (T2.3). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T2.2 — `wfctl infra refresh-outputs [-c CONFIG] [--env ENV] [--concurrency N]`
reads live Outputs for each resource already in state and persists any
field-level changes back to the state backend. Read-only at the cloud
level — never invokes Update or Replace.
Discovers iac.provider modules in the config (with per-env resolution),
groups state entries by their owning iac.provider module (ProviderRef-first,
falling back to provider type when exactly one module of that type exists),
loads each provider once, calls iac/refreshoutputs.Refresh per group, and
SaveResource()s any state whose Outputs map changed.
When the resolved config has no usable iac.provider module for the
requested env, emits the literal error
refresh-outputs: provider not configured for env "<env>"
verbatim per `fmt.Errorf("refresh-outputs: provider not configured for
env %q", env)`. T2.7's runtime-launch-validation asserts against this
exact line.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ESH_OUTPUTS) T2.3 — wires iac/refreshoutputs.Refresh into runInfraApply as a pre-plan read-only state reconciliation. Default OFF: operators get pre-W-2 behavior unless they explicitly opt in. Activation rules: - WFCTL_REFRESH_OUTPUTS unset, empty, or unrecognised → no-op (default). - WFCTL_REFRESH_OUTPUTS="1"/"true"/"t" (strconv.ParseBool truthy) → run pre-step. - WFCTL_REFRESH_OUTPUTS="0"/"false"/"f" (strconv.ParseBool falsey) → no-op. Operators who use the "0"/"false" convention to disable a feature get the expected behaviour rather than a presence-only foot-gun. - --skip-refresh → suppress pre-step regardless of env var (for CI environments that force the env var on globally). Behavior: after the existing --refresh drift/prune phase and before the plan/apply dispatch, discovers iac.provider modules with per-env resolution, loads current state, and calls refreshOutputsAcrossProviders to read live Outputs and persist any field-level changes. On any Read or driver-resolution failure, apply aborts with the wrapped error from T2.1's helper (no half-persisted refresh, no plan computed against stale state). Only fires for infra.* configs (legacy platform.* path is silently skipped). Rollback: unset WFCTL_REFRESH_OUTPUTS, pass --skip-refresh, or revert this commit. Reverting removes the pre-step entirely (helper file plus the gated block in infra.go). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T2.5 — pure-package stress test in iac/refreshoutputs/. Drives Refresh
with 100 fake resources at Concurrency=8 and asserts:
1. No deadlock (10s watchdog around the call).
2. Read called exactly once per ProviderID (atomic per-ID counter).
3. Every refreshed state carries the live Outputs map — no
write-into-wrong-slot bug under concurrency.
4. Concurrent in-flight peak between 2 and the requested cap, proving
both that parallelism happened AND that the semaphore enforced
its limit.
The countingDriver introduces a 5ms sleep per Read so the bounded pool
actually queues at the cap (5ms × 100 / 8 ≈ 63ms total at peak; well
under the 10s watchdog). Test runs ~1.5s wall.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T2.6 — adds the infra refresh-outputs section to docs/WFCTL.md:
- New row in the Command Tree mermaid graph.
- New row in the infra Action table.
- Dedicated #### subsection with usage, flag table, behavior summary,
literal-error contract (load-bearing per T2.7), apply-time pre-step
semantics (WFCTL_REFRESH_OUTPUTS, --skip-refresh), and three
representative examples.
See also: docs/adr/006-wfctl-refresh-outputs-env-var-parsebool.md
records the T2.3 plan-deviation (ParseBool vs plan-literal presence
check) that the docs in this commit accurately reflect.
Verification — plan §T2.6 line 1090 invocation `mdformat --check
docs/WFCTL.md && find docs -name "*.md" -exec markdown-link-check {} +`
ran with locally-installed mdformat 1.0.0 (pip) and markdown-link-check
3.14.2 (npm):
$ mdformat --check docs/WFCTL.md
Error: File "docs/WFCTL.md" is not formatted.
exit=1
This failure is PRE-EXISTING. Verified by checking out the file at
the W-2 T2.2 tip (181e579) before any T2.6 edits and rerunning
mdformat against it: identical error. docs/WFCTL.md has never been
mdformat-formatted in this repo. Reformatting the entire file is
out of scope for T2.6 (would introduce a multi-thousand-line
unrelated diff). T2.6's own additions follow the existing in-file
conventions exactly.
$ markdown-link-check docs/WFCTL.md
FILE: docs/WFCTL.md
[✓] https://github.com/GoCodeAlone/workflow
[✓] #build-ui
[✓] mcp.md
3 links checked.
exit=0
docs/WFCTL.md has zero broken links — including the new
refresh-outputs section. The directory-wide scan reports 7 broken
links in unrelated files (self-improvement-tutorial.md,
getting-started.md, etc.); all are pre-existing and out of scope.
T2.7 runtime-launch-validation transcript (folded into this commit
body per the "Files: none new" plan note for T2.7):
$ GOWORK=off go build -o /tmp/wfctl ./cmd/wfctl
exit=0
$ /tmp/wfctl infra refresh-outputs --help
Usage of infra refresh-outputs:
-c string
Config file (short for --config)
-concurrency int
Maximum concurrent Read calls (default 8)
-config string
Config file
-e string
Environment name (short for --env)
-env string
Environment name (resolves per-module overrides)
exit=0
$ cat /tmp/t27-fake.yaml
modules:
- name: state-store
type: iac.state
config:
backend: filesystem
directory: /tmp/t27-fake-state
$ /tmp/wfctl infra refresh-outputs -c /tmp/t27-fake.yaml --env staging
error: refresh-outputs: provider not configured for env "staging"
exit=1
No panic, no stack trace. Stderr line is the verbatim literal pinned
by T2.7 (plan line 1098), produced by T2.2's
fmt.Errorf("refresh-outputs: provider not configured for env %q",
env) at cmd/wfctl/infra_refresh_outputs.go:49.
PR W-2 mandate (plan line 1101):
$ GOWORK=off go test ./iac/refreshoutputs/... ./cmd/wfctl/... -count=1 -race
ok github.com/GoCodeAlone/workflow/iac/refreshoutputs 1.405s
ok github.com/GoCodeAlone/workflow/cmd/wfctl 10.485s
Manual smoke against staging-PG: not run — no staging-PG available
in this worktree environment. Plan line 1102 marks this "if
available", so deferring to the operator landing the PR.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… from plan §T2.3
ADR 006 — formalises the spec-vs-quality-review trade-off recorded
during W-2 T2.3 review:
- Plan §T2.3 line 1061 specified `os.Getenv("WFCTL_REFRESH_OUTPUTS") != ""`.
- Code-reviewer flagged this as a foot-gun (=0 mis-enables).
- Implementation at cmd/wfctl/infra_apply_refresh_pre.go (bfd1bbe) uses
strconv.ParseBool so falsey values explicitly disable.
- Spec-reviewer accepted post-hoc and requested this ADR per
superpowers:recording-decisions.
- Team-lead approved option-1 (approve-as-is + follow-up ADR) over a
plan revert; provenance recorded in the ADR itself.
Captures the rejected alternative, the rationale, references back to
the plan spec, the implementation site, the pinning test, and the
operator-facing docs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rovider schema Addresses code-reviewer findings on commit 695a070: - Important: race on lazy compiledSchema cache. Wrap with sync.Once; capture both *jsonschema.Schema and the compile error so concurrent callers observe a single deterministic outcome. Adds a 32-goroutine ParseManifest stress test that fires under -race to lock in the invariant going forward. - Minor: ManifestSchemaJSON() now returns bytes.Clone(...) so callers cannot mutate the //go:embed slice (defense-in-depth; embed slices are technically writable). New test verifies the copy semantics. - Minor: iacProvider sub-object gains additionalProperties:false so a typo like "computeplanversion" or an unknown key is rejected at parse time instead of silently defaulting to v1 dispatch. The root object stays permissive — existing plugin.json files carry version/author/dependencies/etc. and the SDK manifest is a strict subset by design. New test covers both the typo-rejection and the root-permissivity contracts.
T2.1 — bounded-concurrency Refresh(ctx, provider, states, opts) that calls ResourceDriver.Read per resource and returns a copy of the state slice with Outputs reconciled to the live values. Default concurrency 8 when Options.Concurrency < 1; otherwise honor the caller's value. On any Read or driver-resolution failure, returns (nil, err) so callers don't half-persist a refresh. Foundation for wfctl infra refresh-outputs (T2.2) and the opt-in apply pre-step (T2.3). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T2.2 — `wfctl infra refresh-outputs [-c CONFIG] [--env ENV] [--concurrency N]`
reads live Outputs for each resource already in state and persists any
field-level changes back to the state backend. Read-only at the cloud
level — never invokes Update or Replace.
Discovers iac.provider modules in the config (with per-env resolution),
groups state entries by their owning iac.provider module (ProviderRef-first,
falling back to provider type when exactly one module of that type exists),
loads each provider once, calls iac/refreshoutputs.Refresh per group, and
SaveResource()s any state whose Outputs map changed.
When the resolved config has no usable iac.provider module for the
requested env, emits the literal error
refresh-outputs: provider not configured for env "<env>"
verbatim per `fmt.Errorf("refresh-outputs: provider not configured for
env %q", env)`. T2.7's runtime-launch-validation asserts against this
exact line.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ESH_OUTPUTS) T2.3 — wires iac/refreshoutputs.Refresh into runInfraApply as a pre-plan read-only state reconciliation. Default OFF: operators get pre-W-2 behavior unless they explicitly opt in. Activation rules: - WFCTL_REFRESH_OUTPUTS unset, empty, or unrecognised → no-op (default). - WFCTL_REFRESH_OUTPUTS="1"/"true"/"t" (strconv.ParseBool truthy) → run pre-step. - WFCTL_REFRESH_OUTPUTS="0"/"false"/"f" (strconv.ParseBool falsey) → no-op. Operators who use the "0"/"false" convention to disable a feature get the expected behaviour rather than a presence-only foot-gun. - --skip-refresh → suppress pre-step regardless of env var (for CI environments that force the env var on globally). Behavior: after the existing --refresh drift/prune phase and before the plan/apply dispatch, discovers iac.provider modules with per-env resolution, loads current state, and calls refreshOutputsAcrossProviders to read live Outputs and persist any field-level changes. On any Read or driver-resolution failure, apply aborts with the wrapped error from T2.1's helper (no half-persisted refresh, no plan computed against stale state). Only fires for infra.* configs (legacy platform.* path is silently skipped). Rollback: unset WFCTL_REFRESH_OUTPUTS, pass --skip-refresh, or revert this commit. Reverting removes the pre-step entirely (helper file plus the gated block in infra.go). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T2.5 — pure-package stress test in iac/refreshoutputs/. Drives Refresh
with 100 fake resources at Concurrency=8 and asserts:
1. No deadlock (10s watchdog around the call).
2. Read called exactly once per ProviderID (atomic per-ID counter).
3. Every refreshed state carries the live Outputs map — no
write-into-wrong-slot bug under concurrency.
4. Concurrent in-flight peak between 2 and the requested cap, proving
both that parallelism happened AND that the semaphore enforced
its limit.
The countingDriver introduces a 5ms sleep per Read so the bounded pool
actually queues at the cap (5ms × 100 / 8 ≈ 63ms total at peak; well
under the 10s watchdog). Test runs ~1.5s wall.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T2.6 — adds the infra refresh-outputs section to docs/WFCTL.md:
- New row in the Command Tree mermaid graph.
- New row in the infra Action table.
- Dedicated #### subsection with usage, flag table, behavior summary,
literal-error contract (load-bearing per T2.7), apply-time pre-step
semantics (WFCTL_REFRESH_OUTPUTS, --skip-refresh), and three
representative examples.
See also: docs/adr/006-wfctl-refresh-outputs-env-var-parsebool.md
records the T2.3 plan-deviation (ParseBool vs plan-literal presence
check) that the docs in this commit accurately reflect.
Verification — plan §T2.6 line 1090 invocation `mdformat --check
docs/WFCTL.md && find docs -name "*.md" -exec markdown-link-check {} +`
ran with locally-installed mdformat 1.0.0 (pip) and markdown-link-check
3.14.2 (npm):
$ mdformat --check docs/WFCTL.md
Error: File "docs/WFCTL.md" is not formatted.
exit=1
This failure is PRE-EXISTING. Verified by checking out the file at
the W-2 T2.2 tip (181e579) before any T2.6 edits and rerunning
mdformat against it: identical error. docs/WFCTL.md has never been
mdformat-formatted in this repo. Reformatting the entire file is
out of scope for T2.6 (would introduce a multi-thousand-line
unrelated diff). T2.6's own additions follow the existing in-file
conventions exactly.
$ markdown-link-check docs/WFCTL.md
FILE: docs/WFCTL.md
[✓] https://github.com/GoCodeAlone/workflow
[✓] #build-ui
[✓] mcp.md
3 links checked.
exit=0
docs/WFCTL.md has zero broken links — including the new
refresh-outputs section. The directory-wide scan reports 7 broken
links in unrelated files (self-improvement-tutorial.md,
getting-started.md, etc.); all are pre-existing and out of scope.
T2.7 runtime-launch-validation transcript (folded into this commit
body per the "Files: none new" plan note for T2.7):
$ GOWORK=off go build -o /tmp/wfctl ./cmd/wfctl
exit=0
$ /tmp/wfctl infra refresh-outputs --help
Usage of infra refresh-outputs:
-c string
Config file (short for --config)
-concurrency int
Maximum concurrent Read calls (default 8)
-config string
Config file
-e string
Environment name (short for --env)
-env string
Environment name (resolves per-module overrides)
exit=0
$ cat /tmp/t27-fake.yaml
modules:
- name: state-store
type: iac.state
config:
backend: filesystem
directory: /tmp/t27-fake-state
$ /tmp/wfctl infra refresh-outputs -c /tmp/t27-fake.yaml --env staging
error: refresh-outputs: provider not configured for env "staging"
exit=1
No panic, no stack trace. Stderr line is the verbatim literal pinned
by T2.7 (plan line 1098), produced by T2.2's
fmt.Errorf("refresh-outputs: provider not configured for env %q",
env) at cmd/wfctl/infra_refresh_outputs.go:49.
PR W-2 mandate (plan line 1101):
$ GOWORK=off go test ./iac/refreshoutputs/... ./cmd/wfctl/... -count=1 -race
ok github.com/GoCodeAlone/workflow/iac/refreshoutputs 1.405s
ok github.com/GoCodeAlone/workflow/cmd/wfctl 10.485s
Manual smoke against staging-PG: not run — no staging-PG available
in this worktree environment. Plan line 1102 marks this "if
available", so deferring to the operator landing the PR.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… from plan §T2.3
ADR 006 — formalises the spec-vs-quality-review trade-off recorded
during W-2 T2.3 review:
- Plan §T2.3 line 1061 specified `os.Getenv("WFCTL_REFRESH_OUTPUTS") != ""`.
- Code-reviewer flagged this as a foot-gun (=0 mis-enables).
- Implementation at cmd/wfctl/infra_apply_refresh_pre.go (bfd1bbe) uses
strconv.ParseBool so falsey values explicitly disable.
- Spec-reviewer accepted post-hoc and requested this ADR per
superpowers:recording-decisions.
- Team-lead approved option-1 (approve-as-is + follow-up ADR) over a
plan revert; provenance recorded in the ADR itself.
Captures the rejected alternative, the rationale, references back to
the plan spec, the implementation site, the pinning test, and the
operator-facing docs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ReplaceIDMap fields
…mitempty contract Addresses code-reviewer findings on commit 13a6fad: - Important: ReplaceIDMap godoc said "Keyed by the dependent resource Name" but the populating site (T3.4 plan §1625) sets result.ReplaceIDMap[action.Resource.Name] where action.Resource is the REPLACED resource. The roundtrip fixture {"vpc":"new-uuid"} confirms this. Re-worded to "Keyed by the *replaced* resource's Name" with an explicit reference to action.Resource.Name + a sentence on how W-5 JIT substitution will use the map (lookup by replaced-resource name to obtain the new ProviderID for dependent configs). Locks the contract before the field has any consumers. - Minor: cross-referenced the InputDriftReport sort-stability guarantee to its enforcing test (TestComputeDrift_ResultIsSortedByName in iac/inputsnapshot/compute_drift_test.go) so the contract is no longer free-floating on the field godoc. - Minor: added TestApplyResult_OmitEmptyContract — table-driven across nil and empty-but-non-nil values for all three new fields, asserting the JSON keys are absent from the encoded form. Locks the omitempty tag behavior so a future refactor cannot silently regress to emitting "initial_input_snapshot": {} / "input_drift_report": [] / "replace_id_map": {}.
…iver-resolve test Addresses code-reviewer findings on commit 8416498: - Important 1 (weak Replace assertion): converted fakeDriver from boolean call recorders to integer counters. The 4-action plan [create, update, replace, delete] now asserts Create==2, Update==1, Delete==2. If "case replace" were silently dropped from dispatchAction the counts would shift to 1/1/1 and the test would fail. Added TestApplyPlan_ReplaceDispatchesViaDeleteThenCreate that isolates Replace via a single-action plan: 1 Delete + 1 Create + 0 Update. Removes the calledReplace() proxy entirely. - Important 2 (resolve-driver-error path uncovered): added TestApplyPlan_ResolveDriverErrorRecordsActionError which exercises fakeProvider.driverErr, asserts the canonical "resolve driver:" prefix, and verifies the loop continues past action[0] to action[1] (best-effort contract). Folded the loop-continues-after-failure coverage into a separate TestApplyPlan_LoopContinuesAfterPerActionFailure using a selectiveFakeProvider that errors on one type only — proves one action's failure does not block another's success. - Minor 1 (wasted %w): switched fmt.Errorf(...).Error() to fmt.Sprintf("resolve driver: %v", err) since the destination is a string field and the wrapping chain dies at the field boundary. - Minor 3 (ctx.Done not checked): added ctx.Err() check at the loop iteration boundary; on cancel, returns the result accumulated so far + the ctx error as top-level. Added TestApplyPlan_CtxCancellationStopsLoop covering pre-call cancel: driver receives zero invocations, top-level error is context.Canceled. - Minor 5 (refFromAction defensive note): added a godoc paragraph documenting the same-name-same-type invariant for Replace plans. Documenting rather than enforcing — ComputePlan upstream is the contract owner. Minor 2 (uniform error prefixing across sub-functions) intentionally deferred to T3.2/T3.3/T3.4 per reviewer guidance — those tasks own the final sub-function bodies and can pick the convention once.
…_plan_test Imports were left orphaned by W-1 PR #523 (commit 48f7a0c) when fingerprintForTest was switched to delegate to inputsnapshot.Compute instead of computing sha256 inline. cmd/wfctl test build was broken on HEAD because of the unused imports — surfaced while landing T3.1.5, which adds a new test file in the same package. Pure-mechanical cleanup. No behavior change.
…safe + tolerant of mid-apply env unset)
W-3b T3.6d. Updates the 4 cross-package ComputePlan call sites in module/infra_module_integration_test.go to the new (ctx, provider, …) signature. Lifts the no-op fake into a small public test helper at iac/iactest/fakeprovider.go so the same shape no longer needs to be re-declared every time a new package wants to satisfy the interface. Folds in the T3.6c review's IMPORTANT follow-up: cmd/wfctl's computePlanForInfraSpecs now dispatches via the same computeInfraPlan seam the apply path uses (no parallel seam variable; one override point serves both call sites). Plan-loop body is wrapped in an IIFE so each provider's closer fires after its group is computed instead of deferring to function exit (multi-provider plan no longer holds N gRPC connections open at once). Drops the duplicated planNoopProvider and applyV2RecordingProvider no-op implementations in cmd/wfctl tests in favor of the shared iactest.NoopProvider. Three structurally-identical 14-method shells become one. Atomic counters carried forward where used. Doc updates: - godoc on computePlanForInfraSpecs corrected: groups are concatenated in first-reference-in-`desired` order, not iac.provider declaration order (matches actual code). - CHANGELOG entry calls out the empty-desired alignment with apply (loop over groupOrder is empty when no specs reference any provider; use `wfctl infra destroy --dry-run` to preview teardown).
…tion when ForceNew or NeedsReplace
W-3b T3.6e — the binding TDD red→green commit for the v2 IaC contract
(rev3 fix for the cycle-2 self-contradiction: test + impl ship in the
same SHA, no t.Skip placeholder).
ComputePlan now classifies each existing resource via
p.ResourceDriver(spec.Type).Diff(ctx, spec, currentOut), running the
per-resource Diff calls in parallel under errgroup with a bounded
worker pool (default 8; WFCTL_PLAN_DIFF_CONCURRENCY env var override
clamped 1..32). Action emission:
- replace, when DiffResult.NeedsReplace OR any FieldChange.ForceNew
is true (the latter closes design issue C — pre-W-3b ForceNew was
silently downgraded to update);
- update, when DiffResult.NeedsUpdate is true and replace did not
fire;
- skip, when neither flag is set.
Net-new resources still emit create without dispatching Diff;
resources removed from desired still emit delete in reverse-dep order.
Nil-tolerance contract preserved: if p is nil, or if
p.ResourceDriver(typ) returns (nil, nil) for a resource type,
ComputePlan falls back to the legacy ConfigHash compare for the
affected resources. Replace cannot be expressed via the legacy path —
callers needing Replace must supply a provider whose drivers implement
Diff. Per-resource driver.Diff errors propagate via errgroup so
operators see the underlying cause (rate limit, network, etc.).
Test surface (platform/differ_replace_test.go, NEW; ships in this
commit per the rev3 atomicity rule):
- TestComputePlan_NeedsReplaceEmitsReplaceAction
- TestComputePlan_ForceNewWithoutNeedsReplace_StillEmitsReplace
- TestComputePlan_NeedsUpdateWithoutForceNew_EmitsUpdate
- TestComputePlan_DiffReturnsNoChanges_EmitsNothing
- TestComputePlan_NilProvider_FallsBackToConfigHash
- TestComputePlan_NilDriver_FallsBackToConfigHash
- TestComputePlan_DriverDiffError_PropagatesAsError
platform/fake_provider_test.go extended with newFakeProviderWithDiff
helper; in-package no-op fakeProvider/fakeDriver kept (cannot collapse
to iac/iactest until cache_test in T3.6f also depends on the helper —
deferred to keep T3.6e's diff bounded).
Carry-forward notes addressed:
- T3.6a note 1: dropped unused *testing.T param from newFakeProvider().
- T3.6a note 2: added compile-time interface conformance asserts on
fakeProvider and fakeDriver.
- T3.6a note 3: nil-provider AND nil-driver guards baked in; covered
by two explicit tests.
- T3.6a note 4: rewrote fake_provider_test.go godoc to behavior-based
phrasing.
cmd/wfctl test fakes updated to match the new dispatch model:
- readDriver.Diff now returns NeedsUpdate=true (the adoption tests
rely on the post-adopt ComputePlan emitting update; pre-W-3b that
was the ConfigHash compare's job).
- refreshOutputsCmdFakeDriver.Diff now returns (nil, nil) instead of
panicking — the refresh-outputs test fixture only exercises Read.
W-3b T3.6f. Wires the iac/diffcache package (W-3a/T3.5) into classifyModification: cache.Get is consulted before each ResourceDriver.Diff dispatch under the (PluginVersion, Type, ProviderID, SHAConfig, SHAOutputs) tuple; on hit, the cached DiffResult is used directly; on miss, the freshly-computed result is Put into the cache. Apply-time correctness does not depend on cache hits — fresh CI runners always miss and re-Diff (the cache is purely an amortization optimization for repeated `wfctl infra plan` against the same checkout). Cache backend selection follows iac/diffcache's WFCTL_DIFFCACHE env var contract: unset → filesystem (~/.cache/wfctl/diff/); ":memory:" → in-memory; "disabled" → noop. The package-level cache instance is lazy-initialised on first ComputePlan call and shared across subsequent calls; tests in the same package may swap it via the internal-package setDiffCacheForTest helper. platform/main_test.go (NEW) sets WFCTL_DIFFCACHE=disabled at TestMain so the platform test suite never reads/writes the developer's filesystem cache and so cache state cannot leak across tests with incidentally-aligned cache keys (caught during integration: T3.6e's Replace-emission test was Putting a result that polluted later update/no-op tests). Folds in the T3.6e code-review IMPORTANT carry-forwards (since both fixes touch platform/): - Note 1 (env-clamping testability): extract parseConcurrencyEnv as a pure function; new TestParseConcurrencyEnv table-driven test covers empty, non-numeric, "0", "1", "8", "32", "33", "100", "-5". - Note 2 (parallel-dispatch correctness): new TestComputePlan_ParallelDispatch_AllCandidatesObserveDiff exercises N=5 modification candidates, asserts driver.diffCount.Load() == 5 and the resulting plan has 5 actions. - Note 3 (driver returns nil DiffResult): explicit test TestComputePlan_DriverReturnsNilDiff_EmitsNothing. And T3.6e adversarial-review minor cleanups: - Note 4 (i := i shadowing redundant in Go 1.22+): dropped. - Note 5 (errSentinel uses custom errFromTest): replaced with errors.New. - Note 7 (concurrency contract on ComputePlan godoc): added — p and the ResourceDriver instances it returns MUST be safe for concurrent use. New tests (3 cache-behaviour scenarios in differ_cache_test.go): - TestComputePlan_CacheHitSkipsDiff (second call against unchanged inputs hits cache; diffCount stays at 1) - TestComputePlan_CacheMissesOnDifferentInputs (varying SHAConfig forces re-dispatch) - TestComputePlan_NoopCacheNeverHits (disabled backend always re-dispatches)
…est (Copilot review) Strengthens the count-only TestComputePlan_ParallelDispatch_AllCandidatesObserveDiff (landed in T3.6f) per team-lead's explicit request: a regression that accidentally serialized Diff dispatch (e.g., g.SetLimit(1)) would still pass the count-only assertion as long as every candidate eventually got dispatched. The new TestComputePlan_ParallelDiffDispatch_InFlightGoroutinesObserved uses a channel-gated driver to prove ≥2 Diff goroutines are simultaneously in-flight before any returns: regression to serial dispatch would hang on the second `<-entered` and time out at 5s. Pure addition (no production-code change). cacheTestProvider.driver loosened from *cacheTestDriver to interfaces.ResourceDriver so the new channelGatedDriver shares the provider shell.
…parator (Copilot review)
Code-reviewer flagged the T3.6f cache PluginVersion key as fragile:
composing via `p.Name() + "@" + p.Version()` would let two
genuinely-different providers — `("foo", "bar@1.0")` vs
`("foo@bar", "1.0")` — collide on the literal string `"foo@bar@1.0"`
and serve each other's cached DiffResults. Today's registered
providers (digitalocean, dockercompose, mock) don't carry `@` in
either field so no observed bug, but there's no compile-time guard
against a future provider declaring `do@enterprise` or similar.
Replace with sha256(name + "\x00" + version) — fixed-length, NUL is
invalid in both fields by Unicode convention, ambiguity-free.
Matches how configHash already keys per-config inputs.
Three regression tests pin the fix:
- TestPluginVersionKey_NoCollisionOnAtSeparator (the actual bug)
- TestPluginVersionKey_NilProvider (defensive — empty key, no panic)
- TestPluginVersionKey_Stable (deterministic across calls)
Pure additive — no change to any existing test outcome. The cache
re-keys against the new digest, which means any DiffResults persisted
under the old `name@version` keys will miss on the next plan and
re-Diff naturally (cache misses are correct by design).
…tePlanVersion W-3b T3.7. Routes apply through wfctlhelpers.ApplyPlan when the loaded plugin's plugin.json declares iacProvider.computePlanVersion: v2 (read at provider load time and surfaced via the optional ComputePlanVersionDeclarer interface). Providers that don't declare the field, or declare anything other than "v2", take the legacy provider.Apply path. rev2/rev3-locked: NO env-var, NO operator-flippable gate. The v1/v2 routing is plugin-author-controlled via plugin.json from day 1 — there is no transitional WFCTL_USE_V2_APPLY flag to misuse. Wires the printDriftReportIfAny helper (added unwired in W-3a/T3.1.5 as foundation only). The v2 dispatch path is the production caller that surfaces the InputDriftReport to stderr after a successful ApplyPlan return; v1 path remains untouched per the W-3a "zero runtime change for v1 plugins" invariant. New plumbing: - iac/wfctlhelpers/dispatch.go (NEW): ComputePlanVersionDeclarer interface + DispatchVersionV2 const + DispatchVersionFor helper. Single override point for the dispatch decision. - iac/iactest/fakeprovider.go: NoopProvider gains DispatchVersion + ProviderVersion fields and ComputePlanVersion() method so tests drive both v1 (default empty) and v2 paths through the shared fake. - cmd/wfctl/deploy_providers.go: iacPluginManifest reads top-level iacProvider.computePlanVersion alongside existing capabilities.iacProvider.name; findIaCPluginDir returns the version; readIaCPluginComputePlanVersion is the load-time helper; remoteIaCProvider stores the value and exposes it via ComputePlanVersion() to satisfy the optional interface. (Re-reads plugin.json once per provider load rather than threading through loadIaCPlugin's 4-tuple var-seam — keeps the seam signature stable for the existing test override; cost is one tiny os.ReadFile vs the gRPC start.) - cmd/wfctl/infra_apply.go: applyV2ApplyPlanFn = wfctlhelpers.ApplyPlan test seam + dispatch branch in applyWithProviderAndStore. Drift report printed to writer on success (no-op when empty). - cmd/wfctl/infra_apply_v2_test.go: 3 new tests cover TestApplyWithProviderAndStore_V2RoutesThroughWfctlhelpers (v2 routes), TestApplyWithProviderAndStore_V1FallsThroughToProviderApply (v1/un-declared routes legacy), TestApplyWithProviderAndStore_V2 PrintsDriftReport (drift wiring asserted via writer-buffer substring). v1 fixture v1RecordingProvider intentionally does NOT implement ComputePlanVersionDeclarer to prove the dispatcher's "default to v1 when un-declared" branch.
…rage (Copilot review)
Code-reviewer flagged 3 IMPORTANT items in T3.7:
1. Comment/code mismatch on drift-report timing. The comment promised
"Run on success or partial failure" but the code gated on
`err == nil` (success only). The contract the comment described
is the more useful behavior — operators most need the
stale-input diagnostic when an apply fails ("which input went
stale during the failed apply?"). Without it, the failure error
and the "what changed" context are disconnected.
Fix: gate on `result != nil` instead of `err == nil`.
printDriftReportIfAny already no-ops on empty/nil reports so
unconditional-on-result-non-nil is safe.
2. No test for the drift-on-partial-failure path. Added
TestApplyWithProviderAndStore_V2PrintsDriftReportOnPartialFailure
which has applyV2ApplyPlanFn return (resultWithDrift, applyErr)
and asserts both: (a) the err propagates, AND (b) the drift
report still reaches the writer.
3. Optional-interface coverage gap. Two semantically-different "v1"
paths exist:
- Path A: provider doesn't implement ComputePlanVersionDeclarer
at all → type-assert fails → legacy. Covered by
v1RecordingProvider.
- Path B: provider implements interface but ComputePlanVersion()
returns "" (the realistic mid-transition state for v1 plugins
after the SDK update lands but before they migrate) → type-
assert succeeds, DispatchVersionFor returns "v1" → legacy.
Was untested.
Added TestApplyWithProviderAndStore_V1Path_DeclarerReturnsEmpty
using iactest.NoopProvider{DispatchVersion: ""}, which always
implements the interface (the method exists on the type). Pins
Path B specifically.
Pure correctness fixes — no signature change, no behavior change for
the success-only or v1-RecordingProvider paths.
…onversion
cmd/wfctl/deploy_providers.go remoteResourceDriver.Diff was passing
current.Sensitive (map[string]bool) directly into the args map.
structpb.NewStruct rejects map[string]bool — it accepts map[string]any
only — and the upstream plugin/external/convert.go::mapToStruct
returns &structpb.Struct{} on err rather than surfacing the typing
failure. Result: every Diff dispatch over gRPC for any provider whose
ResourceOutput.Sensitive map was non-nil (or even an empty
map[string]bool{}) silently observed args=map[] on the plugin side.
v1 plugins never tripped this because v1 dispatches IaCProvider.Plan
server-side (no ResourceDriver.Diff over gRPC). v2 (W-3b T3.7's
manifest-driven dispatch) surfaces it immediately on the first
existing-resource Diff call.
Fix: convert via sensitiveToAny() to the map[string]any shape
NewStruct accepts. Returns nil for empty/nil input so the wire stays
trim-friendly. Bug discovered during W-3b T3.9 runtime-launch
validation against an out-of-band gRPC stub plugin; the canonical
T3.9 in-tree test ships separately as a loader-seam Go integration
test (per team-lead direction + plan precedent at plugin/sdk/iaclint/).
Will surface in T3.10's PR description as a third
incidentally-fixed-by-W-3b bug.
W-3b T3.9. Exercises the full v2 dispatch chain — config parse →
state load → provider load (via the resolveIaCProvider seam from
T3.6c) → ComputePlan Diff dispatch (T3.6e/f) →
wfctlhelpers.ApplyPlan (T3.7's manifest-driven branch) → Replace
decomposition into Delete + Create → printDriftReportIfAny — by
injecting a Go in-process v2-declaring provider through the package-
level seam. No out-of-process gRPC binary or plugin.json under
internal/testdata/.
# ADR 007 — non-trivial deviation from plan-literal
Plan §T3.9 specified "Build a real gRPC-loaded stub provider plugin
in internal/testdata/stub-provider/." Team-lead authorized switching
to in-tree loader-seam validation per:
1. Plan precedent cite (plugin/sdk/iaclint/) is itself a Go
test-helper package, not a runnable binary.
2. Real-gRPC runtime validation lands in P-DO when DO sets
computePlanVersion: v2 in its plugin.json.
3. Hours-of-stub-plumbing cost doesn't earn proportional coverage
vs. T3.6e/f + T3.7 unit tests + this loader-seam end-to-end.
4. W-7 conformance suite is the recurring cross-PR gRPC harness.
Full reasoning + considered alternatives in
docs/adr/007-t3-9-runtime-validation-via-loader-seam.md.
# Tests
- TestApply_V2_LoaderSeamDispatch_EndToEnd:
- Writes a real config + filesystem state seeded with vpc
region=nyc3 (under iacStateRecord shape).
- Sets desired region=nyc1.
- Substitutes the resolveIaCProvider seam to return a Go provider
that declares v2 + has a driver returning NeedsReplace=true.
- Calls applyInfraModules (the production runInfraApply
entrypoint) and asserts driver.diffCount == 1, deleteCount ==
1, createCount == 1, plus exact identity of the deleted
ProviderID and the created Config["region"].
- TestApply_V2_LoaderSeam_DriftReportPrinted:
- Same loader-seam setup + applyV2ApplyPlanFn substitution
returning InputDriftReport with one entry.
- Captures os.Stderr and asserts the FormatStaleError block
reaches the operator (drift-report wiring T3.7 added is
end-to-end alive in the v2 loader path).
# Test infrastructure
- cmd/wfctl/main_test.go: NEW TestMain forces
WFCTL_DIFFCACHE=disabled so the platform diffcache (process-
scoped via getDiffCache lazy init) doesn't observe stale entries
from a developer's local ~/.cache/wfctl/diff/ as false-positive
cache hits skipping driver Diff dispatch. Same pattern as
platform/main_test.go from T3.6f. Caught during dev when the
end-to-end test failed in the full cmd/wfctl test run but passed
in isolation.
# Bug-class context
The Option-A draft (real gRPC binary; not retained on this branch
per the ADR) surfaced a real wfctl bug fixed in commit 40e07a1
(remoteResourceDriver.Diff sensitiveToAny conversion). The bug
exists independent of which T3.9 option ships; the fix is in tree
and surfaces in T3.10's PR description as the third W-3b
incidentally-fixed bug.
W-3b T3.10. Stages the W-3b PR body text in docs/prs/w3b-pr-body.md as a stable artifact the team-lead can copy-paste at PR-open time. Pure-additive doc; no code changes. Captures all three incidentally-fixed bugs surfaced during W-3b's binding dispatch wiring: 1. Delete-via-Apply state leakage (T3.3 doDelete + T3.7 dispatch) 2. ForceNew silently downgraded to Update (T3.6e replace emission) 3. map[string]bool drops gRPC args silently — sensitiveToAny converter (commit 40e07a1; surfaced during T3.9 runtime validation; v1 plugins never tripped it) Includes summary, BREAKING-change call-out, ADR reference, rollout notes, and test plan.
Per spec-reviewer's adversarial review of the prior keeps-grpc-stub variant: the durability invariant for recording-decisions requires preserving ALL transitions of a deliberation, not just the final landing. The original ADR (loader-seam variant) recorded only one team-lead direction; the keeps-grpc-stub variant (since superseded) recorded only one reversal. Neither captured the full B → A → B → A → B oscillation that played out during T3.9 execution. This commit: - Status header updated to "Accepted (with extensive deliberation history — see Decision history section)". - Context section adjusted to preface the deliberation history rather than imply a single-direction trajectory. - New Decision history section lists all 5 transitions with verbatim team-lead quotes + per-transition implementer action. - Final paragraph captures the meta-lesson: when team-lead path- flips mid-execution, reviewer + implementer should refuse to proceed and force explicit disambiguation. Both reviewers endorsed this hold during transition 4; the strict-interpretation invariant from using-superpowers was the operative rule. Pure ADR amendment; no code changes. Branch state (c9101ba T3.9 loader-seam + d2e50d4 T3.10 PR body) unaffected. Closes spec-reviewer's Issue 1 from c9101ba pre-review: "ADR-history erasure: cherry-picking 92f060e onto 40e07a1 erased the durable record of team-lead's 'Path #1 — keep A' reversal. Future branch-readers will see no record of why Option A was considered + rejected."
T5.1 — new package iac/jitsubst hosts ResolveSpec, the apply-time helper
that resolves ${VAR}, ${MODULE.field}, and ${MODULE.id} references in a
ResourceSpec.Config tree. Strict semantics: every reference MUST resolve
or the helper returns an error and the input spec unchanged. ${MODULE.id}
prefers the in-apply replaceIDMap (W-3b/T3.4) over syncedOutputs so
cascade-replace ProviderID propagation is authoritative over potentially
stale state outputs.
Used by W-5 T5.2 (wire into wfctlhelpers.ApplyPlan) and T5.3 (wire into
doReplace). No behavior change yet — helper has no in-tree caller.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T5.2 — wfctlhelpers.ApplyPlan now invokes jitsubst.ResolveSpec on every
action.Resource before dispatch. The substitution sees:
- result.ReplaceIDMap (this-apply Replace ProviderIDs from doReplace)
- syncedOutputs (state-side outputs from action.Current entries +
this-apply outputs from successful prior dispatches in the same loop)
- os.LookupEnv (production env source)
syncedOutputs is pre-populated from every action.Current at start-of-apply
so a NEW action can reference an in-state sibling module's outputs from
action zero. After each successful dispatch (when result.Resources grows),
the new entry is folded into syncedOutputs via flattenOutputs — flat-copy
of Outputs with the canonical 'id' key shadowed by ProviderID so
${MODULE.id} resolves predictably across new and existing modules.
JIT failure surfaces as a per-action ActionError with the canonical
'jit substitution:' prefix; the offending action SKIPS dispatch
(unresolved spec must not reach the driver). The loop continues to the
next action — best-effort apply contract preserved.
Tests in apply_jit_test.go cover: 2-create plan with B referencing
${A.id}, pre-syncing from action.Current, unresolved-ref skipping
dispatch with canonical prefix, no-refs passthrough, and loop-continues-
after-per-action-JIT-error. T5.3 wires Replace cascade.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T5.3 — locks the Replace-cascade contract via apply_replace_cascade_test.go
and updates doReplace godoc to document the cascade hookup explicitly.
Two scenarios:
- ReplaceCascade_DependentCreateGetsNewParentID: [Replace parent, Create
dependent] where dependent's Config has ${parent.id}; dependent's
Create receives the new ProviderID.
- ReplaceCascade_DependentReplaceGetsNewParentID: extends to Replace-on-
Replace shape; dependent's post-Delete Create still sees the resolved
parent.id, while its own Delete continues to target the OLD ProviderID
via action.Current (JIT does not alter action.Current).
The behavior was already operational after T5.2's loop-level
jitsubst.ResolveSpec call: doReplace populates result.ReplaceIDMap
inside iteration N, and the loop's pre-dispatch substitution at
iteration N+1 sees the fresh entry. T5.3 adds the assertion + doc
that locks this ordering as a contract; future refactors that move
substitution out of the loop OR delay ReplaceIDMap population will
break these tests loudly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T5.4 — runInfraPlan now stamps plan.SchemaVersion conditionally:
- V1 (1) baseline when no plan action's resolved Resource.Config
carries a JIT-style ${MODULE.field} or ${MODULE.id} reference.
- V2 (2) when any action does — older wfctl binaries reading the
persisted plan reject with the existing 'newer than supported'
diagnostic at runInfraApply.
Detection is centralized in jitsubst.HasModuleRefs (recursive walk over
map[string]any / []any / string), gated by a simple regex that requires
non-empty segments on both sides of the dot — plain ${VAR} env-var
refs (no dot) do NOT trigger the bump, so the common operator
secret-via-env workflow stays at V1.
cmd/wfctl/infra.go gains:
- infraPlanSchemaVersionV1 (=1) and infraPlanSchemaVersionJIT (=2)
constants alongside the existing infraPlanSchemaVersion (=2, max
readable). The 'max readable' constant ticks up with every schema
bump; V1/JIT name the per-plan choice runInfraPlan makes.
- planRequiresJITSubstitution(plan) helper that walks plan.Actions
once via jitsubst.HasModuleRefs.
Tests:
- iac/jitsubst/jitsubst_test.go — 8 new HasModuleRefs cases (env-var
is false, .field/.id are true, nested map/slice, nil-safe,
malformed refs are false, mixed-string is true).
- cmd/wfctl/infra_plan_schema_test.go — V1 baseline (env-var only),
V2 for both .field and .id, V1 negative for env-var-only, and
persisted-plan SchemaVersion=2 end-to-end (where T5.5's rejection
has not yet landed).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…without-plan)
T5.5 — runInfraPlan now refuses to write a plan.json via -o when the
plan is JIT-style (SchemaVersion = infraPlanSchemaVersionJIT). The exact
operator-facing error string is contract-stable:
error: plan -o requires JIT-free config; this plan references
${MODULE.field} which only resolves at apply time. Use
'wfctl infra apply' (without --plan) for JIT-aware applies.
Stdout-only emission (no -o) of a JIT-style plan is permitted — it's a
preview, not a contract. The guard fires AFTER plan computation so the
operator sees the plan table on stdout before the rejection at the
persistence step.
Tests in cmd/wfctl/infra_plan_jit_reject_test.go (4 cases):
- exact-string match (the strict contract)
- stdout-only JIT plan permitted (negative-control on the guard scope)
- persisted non-JIT plan permitted (V1 happy path unchanged)
- canonical-keyword substring match (operator-search-engine safety net)
Removed T5.4's now-redundant TestInfraPlan_SchemaVersionV2_PersistedToFile-
Matches — its happy path has been replaced by T5.5's strict rejection
contract; SchemaVersion stamping correctness is still locked by the
helper-direct tests in the same file.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
W-5 Task T5.7: per the plan's 'Files: none' instruction, this is a
documentation-only commit recording the runtime-launch-validation
transcript against the built wfctl binary.
# Step 1: Build
$ GOWORK=off go build -o /tmp/wfctl-jit-validation ./cmd/wfctl
(no output, exit 0)
# Step 3: T5.5 persisted-JIT-plan rejection (build-binary verification)
Fixture (infra.yaml):
modules:
- name: app
type: infra.container_service
config:
env_vars:
VPC_UUID: "${vpc.id}"
DB_HOST: "${pg.private_ip}"
$ wfctl infra plan -o /tmp/jit-validation/plan.json --config infra.yaml
Infrastructure Plan — infra.yaml
+ create app (infra.container_service)
Plan: 1 to create, 0 to update, 0 to destroy.
error: error: plan -o requires JIT-free config; this plan references
${MODULE.field} which only resolves at apply time. Use 'wfctl infra
apply' (without --plan) for JIT-aware applies.
EXIT=1
The doubled 'error: error:' prefix is because cmd/wfctl/main.go's
top-level error reporter prepends 'error: ' to every command failure
(line 211: `fmt.Fprintf(os.Stderr, "error: %v\n", rootErr)`), AND
the team-lead-specified literal also begins with 'error: '. Per
implementer brief: 'Match exactly.' Flagging here for visibility — a
follow-up could either drop the prefix from the literal or special-case
main.go's wrapping. Not addressing in W-5.
# T5.5 inverse: stdout-only JIT plan permitted (no rejection)
$ wfctl infra plan --config infra.yaml
Infrastructure Plan — infra.yaml
+ create app (infra.container_service)
Plan: 1 to create, 0 to update, 0 to destroy.
EXIT=0
# T5.4 V1 baseline: non-JIT config persisted to disk still works
Fixture (infra-novars.yaml):
modules:
- name: app
type: infra.container_service
config:
cidr: "10.0.0.0/16"
$ wfctl infra plan -o plan-novars.json --config infra-novars.yaml
Plan: 1 to create, 0 to update, 0 to destroy.
Plan saved to /tmp/jit-validation/plan-novars.json
EXIT=0
$ jq .schema_version plan-novars.json
1 ← V1 (T5.4 stamp logic working)
# Step 2: apply with ${A.id} reference — covered by in-tree tests
T5.7 plan §Step 2 specifies running 'apply against fixture with ${A.id}
reference' against the built binary. wfctl infra apply requires a fully-
configured iac.provider plugin (manifest, plugin.json, gRPC binary), so
running this end-to-end against an ad-hoc fixture is non-trivial without
W-7's conformance harness. The same code path is fully covered by:
- iac/wfctlhelpers/apply_jit_test.go::TestApplyPlan_JIT_TwoCreate_BSpec-
ResolvesAID (T5.2 — basic create+create cascade)
- iac/wfctlhelpers/apply_replace_cascade_test.go::TestApplyPlan_Replace-
Cascade_DependentCreateGetsNewParentID (T5.3 — replace+create cascade)
- iac/wfctlhelpers/apply_replace_cascade_test.go::TestApplyPlan_Replace-
Cascade_DependentReplaceGetsNewParentID (T5.3 — replace+replace cascade)
- iac/wfctlhelpers/apply_jit_test.go::TestApplyPlan_JIT_UnresolvedRef_-
RecordsActionErrorAndSkipsDispatch (T5.2 — failure path)
These exercise the SAME wfctlhelpers.ApplyPlan code path the binary
invokes; the unit-test fake driver is functionally equivalent to a v2
plugin from ApplyPlan's perspective. A binary-level apply smoke test is
deferred to W-7's conformance gate (which adds the DO smoke test against
real-cloud fixtures).
# Verification
Tests pass:
GOWORK=off go test -race -count=1 ./interfaces/... ./iac/... ./platform/... ./cmd/wfctl/... ./module/...
→ all packages OK.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Spec-reviewer caught that the shipped error string in cmd/wfctl/infra.go diverged from the plan literal at docs/plans/2026-05-03-iac-conformance- and-replace.md §T5.5 line 2104. The kickoff brief I worked from substituted a wordier alternate string; team-lead confirmed the plan literal is the correct contract. Three fixes: 1. cmd/wfctl/infra.go:297 — replace fmt.Errorf literal with errors.New(<plan literal>). No leading 'error:' prefix — that's prepended by cmd/wfctl/main.go's top-level error wrapper, so the doubled 'error: error:' artifact in T5.7's runtime transcript is resolved as a side benefit. Switched to errors.New per spec-reviewer suggestion: avoids govet's no-format-verbs noise on the no-substitution case and is the canonical Go pattern for fixed-string sentinels. 2. cmd/wfctl/infra_plan_jit_reject_test.go:16 — expectedJITRejectError constant updated to the plan literal. Comment block expanded to document the literal's source + the leading-error-prefix nuance for future readers. 3. cmd/wfctl/infra_plan_jit_reject_test.go:125 — substring keyword list in TestInfraPlan_RejectionErrorContainsCanonicalKeywords updated to keys actually present in the new literal: 'JIT resolution', 'persisted plan.json', 'wfctl infra apply', '-o/--plan'. The exact-match test above is the strict contract; this one stays as the operator-search-engine safety net. Verified end-to-end via rebuilt wfctl binary against the same fixture from T5.7's transcript: $ wfctl infra plan -o plan.json --config infra.yaml Infrastructure Plan — infra.yaml + create app (infra.container_service) Plan: 1 to create, 0 to update, 0 to destroy. error: this plan requires JIT resolution; persisted plan.json is not supported. Run 'wfctl infra apply' directly without -o/--plan. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Records the architectural choice resolved during T5.3: jitsubst.ResolveSpec runs once at the wfctlhelpers.ApplyPlan dispatch loop (immediately before each dispatchAction call), NOT inside per-action helpers. doReplace populates result.ReplaceIDMap; the next iteration's pre-dispatch ResolveSpec consumes it. This honors the Replace-cascade contract via loop-ordering invariant rather than via an explicit substitution call inside doReplace. Plan §T5.3 specified inner-resolve in doReplace; T5.2's loop-level call already covered the cascade case. Threading syncedOutputs through dispatchAction → doReplace would have made the helper boundary leaky for one call site. Option 1 (test-only T5.3 + this ADR) chosen by team-lead over option 2 (inner-resolve rework) on 2026-05-04 after spec-reviewer escalation. Cascade contract is locked by apply_replace_cascade_test.go's two scenarios; this ADR ensures future refactors that move substitution out of the loop OR delay ReplaceIDMap population see the trade-off rather than rediscovering it via git blame. Same Nygard-3 pattern as ADR 006 / ADR 007. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ected transcript
Supersedes `c52aa88`'s deferral-only T5.7 — implements Step 2 via the
T3.9-precedent loader seam (resolveIaCProvider) instead of deferring to
unit tests + W-7. spec-reviewer's adversarial pass rejected the
deferral as a meaningful spec divergence; this commit closes it.
# What landed
cmd/wfctl/infra_apply_jit_loader_test.go (new):
TestApply_V2_LoaderSeam_JITResolution_EndToEnd — exercises the full
v2 dispatch chain (config parse → state load → provider load →
ComputePlan → wfctlhelpers.ApplyPlan → jitsubst.ResolveSpec →
driver.Create) for a 2-create plan where the dependent's env_vars
carry ${parent.id}. Asserts the dependent's recorded Create Config
equals the parent's freshly-minted ProviderID — proves end-to-end
JIT substitution through the binary code path. Same loader-seam
pattern as TestApply_V2_LoaderSeamDispatch_EndToEnd (T3.9), with a
JIT-flavored stub provider/driver.
# Step 3: T5.5 rejection — corrected transcript (post-fix literal)
Re-captured against the post-517aa58 binary. The plan-literal error
string is now correctly emitted (no doubled 'error: error:' artifact):
Fixture:
modules:
- name: app
type: infra.container_service
config:
env_vars:
VPC_UUID: "${vpc.id}"
DB_HOST: "${pg.private_ip}"
$ wfctl infra plan -o plan.json --config infra.yaml
Infrastructure Plan — infra.yaml
+ create app (infra.container_service)
Plan: 1 to create, 0 to update, 0 to destroy.
error: this plan requires JIT resolution; persisted plan.json is not
supported. Run 'wfctl infra apply' directly without -o/--plan.
EXIT_REJECT=1 ← T5.5 rejection fires
# T5.5 inverse: stdout-only JIT plan permitted
$ wfctl infra plan --config infra.yaml
Infrastructure Plan — infra.yaml
+ create app (infra.container_service)
Plan: 1 to create, 0 to update, 0 to destroy.
EXIT_STDOUT=0 ← preview allowed; rejection scoped to -o
# T5.4 V1 baseline: non-JIT persisted plan still works
$ wfctl infra plan -o plan-novars.json --config infra-novars.yaml
Plan saved to /tmp/jit-validation/plan-novars.json
EXIT_PERSIST_V1=0
$ jq .schema_version plan-novars.json
1 ← V1 stamp logic working
# Why loader seam (not real gRPC stub)
Same precedent as ADR 007 (T3.9): a stub-provider gRPC plugin would
add an out-of-process build dependency for one assertion. The loader
seam exercises every in-process code path between cmd/wfctl and the
driver — the only mocked boundary is the cloud API the driver would
have called. Operational rationale identical to T3.9; if a future
revision wants a real out-of-process stub, that's a straightforward
extension of the same fixture.
# Test results
GOWORK=off go test -race -count=1 ./interfaces/... ./iac/... ./platform/... ./cmd/wfctl/... ./module/...
→ all packages OK.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Adds apply-time “just-in-time” (JIT) substitution for IaC action specs so resource configs can reference env vars and other modules’ outputs/IDs at dispatch time, and updates wfctl planning semantics to recognize and restrict persisted JIT-style plans.
Changes:
- Introduces
iac/jitsubst.ResolveSpec+HasModuleRefsfor strict${VAR}and${MODULE.field|id}substitution with deterministic erroring. - Wires per-action JIT resolution into
wfctlhelpers.ApplyPlan, including state-seeded + in-apply synced outputs and ReplaceIDMap propagation. - Bumps plan schema handling to V2 when module refs are present and rejects persisting such plans via
wfctl infra plan -o ..., with tests + ADR coverage.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
iac/wfctlhelpers/apply.go |
Adds per-action JIT resolution and syncedOutputs propagation in the v2 apply dispatch loop; clarifies Replace cascade contract. |
iac/wfctlhelpers/apply_jit_test.go |
Unit tests for ApplyPlan JIT behavior (success, failure, best-effort loop continuation). |
iac/wfctlhelpers/apply_replace_cascade_test.go |
Verifies ReplaceIDMap + loop-level JIT ordering supports Replace cascades. |
iac/jitsubst/jitsubst.go |
New JIT substitution implementation and module-ref detection helper. |
iac/jitsubst/jitsubst_test.go |
Comprehensive unit tests for ResolveSpec/HasModuleRefs semantics and error handling. |
cmd/wfctl/infra.go |
SchemaVersion stamping based on module refs; rejects persisting JIT-style plans with a fixed error string. |
cmd/wfctl/infra_plan_schema_test.go |
Tests schema version stamping / detection logic. |
cmd/wfctl/infra_plan_jit_reject_test.go |
Tests exact persisted-plan rejection error and ensures no plan file is written. |
cmd/wfctl/infra_apply_jit_loader_test.go |
Loader-seam end-to-end test proving Apply → ComputePlan → ApplyPlan → JIT → driver.Create sees substituted config. |
docs/adr/008-jit-substitution-at-dispatch-loop.md |
ADR documenting loop-level JIT substitution design decision (vs inner per-helper wiring). |
⏱ Benchmark Results✅ No significant performance regressions detected. benchstat comparison (baseline → PR)
|
…ead test scaffolding (Copilot review) - jitsubst.go package doc T5.3 bullet now reflects the actual W-5 design per ADR 008: T5.3 verifies cascade behavior (doReplace populates ReplaceIDMap; the existing T5.2 ResolveSpec call site consumes it on subsequent PlanActions). T5.3 does NOT add a second ResolveSpec call inside doReplace. - TestInfraPlan_SchemaVersionV2_WhenJITModuleFieldRef no longer drives runInfraPlan with an unused planFile path — the SchemaVersion bump isn't observable without -o, and -o is rejected for V2 plans by T5.5's contract. Reduced to the direct planRequiresJITSubstitution helper assertion the test was already doing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
W-5 of the 12-PR IaC plan series. Adds per-module JIT secret resolution at apply-time:
${VAR}(env),${MODULE.field}(syncedOutputs cumulative across actions),${MODULE.id}(ReplaceIDMap from W-3b/T3.4 doReplace). Bumps plan SchemaVersion to 2 when JIT references present + rejects persisted JIT-style plans (canonical path is apply-without---plan).Plan reference:
docs/plans/2026-05-03-iac-conformance-and-replace.mdrev10.Base:
feat/iac-replace-dispatch(W-3b PR #528). Auto-retargets to main when W-3b merges.What ships (9 commits)
iac/jitsubst.ResolveSpecper-module deferred substitution helperwfctlhelpers.ApplyPlanresolves JIT per action (loop-level dispatch)doReplace— see ADR 008)ADRs
docs/adr/008-jit-substitution-at-dispatch-loop.md): records the architectural decision that JIT lives at the dispatch loop (single source) rather than per-action helpers. Plan rev10 §T5.3 specified inner-resolve in doReplace; T5.2's loop-level call covers the cascade case via in-iteration ReplaceIDMap propagation. Single-source design + cascade test (apply_replace_cascade_test.go) lock the contract.Plan-spec defects flagged (W-5 incidentals)
this plan requires JIT resolution; persisted plan.json is not supported. Run 'wfctl infra apply' directly without -o/--plan.).Test plan
GOWORK=off go test -race -count=1 ./interfaces/... ./iac/jitsubst/... ./iac/wfctlhelpers/... ./iac/... ./platform/... ./cmd/wfctl/... ./module/...PASSwfctl infra plan -o plan.jsonagainst JIT config emits literal error per plan §T5.5🤖 Generated with Claude Code