diff --git a/CHANGELOG.md b/CHANGELOG.md index e02d6327..83510c28 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,29 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Added +- **Engine-side sensitive-output routing** (v0.27.0): `ResourceDriver` outputs flagged with + `Sensitive: {key: true}` on Create/Update are routed through the configured `secrets.Provider` + and replaced in state with `secret_ref://_` placeholders. Plugins remain + platform-agnostic: a plugin compiled into a wfctl-from-CI run, a wfctl-from-CLI run, or a + workflow-cloud server transparently gets sensitive-output handling without each host writing + its own routing. Read paths (adoption, refresh) are sanitize-only — no `provider.Set` calls + from a Read path (prevents cache pollution). On `SaveResource` failure after `provider.Set` + succeeded, the engine compensates with `driver.Delete` + `provider.Delete` to prevent orphan + cloud resources + routed secrets. See `docs/plans/2026-05-09-engine-sensitive-output-routing-design.md`. +- **`iac/sensitive` package**: `Route(ctx, provider, resourceName, *out) (sanitized, hydrated, error)`, + `Revoke(ctx, provider, resourceName, mergedKeys) error`, `IsPlaceholder(v any) bool`, + `MaskSensitiveForDiff(driverKeys, desired, current) (map, map)`, `Placeholder(resource, key) string`, + `PlaceholderPrefix string` (`"secret_ref://"`), `SecretKey(resource, key) string`. Routing + trigger is exclusively per-call `out.Sensitive[k]==true`; `ResourceDriver.SensitiveKeys()` + remains a display-masking-only signal. Limitation: only string-typed sensitive output values + are supported in v0.27.0. +- **`wfctl infra audit-state-secrets`** (with `--prune`): walks state.Outputs vs. + `secrets.Provider` to detect orphans, missing routed values, legacy plaintext, and mistaken + `secret://...` config-references in state. Distinct from `audit-secrets` which audits the + `secrets.generate` config block. Exit codes: 0 = no findings, 1 = findings, 2 = audit error. + For write-only providers (GitHub Actions Get returns ErrUnsupported), emits structured + ADVISORY lines for each placeholder it cannot verify, but does not exit non-zero on those alone. + - **`interfaces.ResourceReplacer`**: optional driver interface for resources requiring orchestration beyond naive Delete-then-Create (e.g., Droplets with attached Block Storage Volumes). Drivers implementing `Replace(ctx, oldRef, spec) → (*ResourceOutput, error)` take @@ -53,6 +76,26 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Changed +- **`applyWithProviderAndStore` and `applyPrecomputedPlanWithStore` signatures**: each gains + trailing `cfgFile string` and `hydratedOut map[string]string` parameters. Existing tests can + pass empty string and nil. The cfgFile is used to load the configured `secrets.Provider` + for sensitive-output routing; the hydratedOut map is filled by the helper so post-apply + consumers can rehydrate `secret_ref://` placeholders without going through `provider.Get` + (works on write-only providers like GitHub Actions). +- **`applyInfraModules` return signature**: now returns `(map[string]string, error)` where the + map is the hydrated routed-secret values. Callers update from `if err := applyInfraModules(...)` + to `if _, err := applyInfraModules(...)` (or capture the map for hand-off to + `syncInfraOutputSecrets`). +- **`syncInfraOutputSecrets` and `resolveInfraOutput` signatures**: each gains a trailing + `hydrated map[string]string` parameter for in-process routed-secret hand-off. Callers without + same-process apply context pass `nil`; sensitive placeholders then surface a documented + cold-start constraint error explaining the fallback path (`secret://_` direct + reference). +- **`adoptExistingResources` and `runInfraRefreshOutputs`**: now route state writes through + `persistResourceWithSecretRouting` in read-mode (sanitize-only). Pre-existing + `secret_ref://...` placeholders are preserved across re-applies; newly-declared sensitive + keys on Read paths are dropped (not routed) to prevent cache pollution. + - **BREAKING (`wfctl infra plan`)**: configs declaring at least one `iac.provider` module now require the plugin process to load successfully — `plan` invokes the same loader that `apply` uses so `platform.ComputePlan` can dispatch `ResourceDriver.Diff` for honest Replace-action @@ -77,6 +120,44 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 tags (additive — previously these fields serialised as `null` / `[]` in JSON; they are now omitted entirely when empty, which is what most consumers expect). +### Migration (engine-side sensitive-output routing, v0.27.0) + +- **Greenfield envs**: no action required. Plugins that opt into routing add + `Sensitive: {key: true}` to their `ResourceOutput` returns on Create/Update. Operators add a + `secrets:` block to their workflow config (recommend `provider: env` with a prefix for + local runs). +- **Pre-existing state with plaintext secrets**: run `wfctl infra audit-state-secrets` to + enumerate. Rotate via `wfctl infra bootstrap --force-rotate ` running v0.27.0 (the + rotation regenerates with engine-side routing). +- **Apply runs against plugins that newly emit `Sensitive` outputs without a `secrets:` block**: + the engine now hard-fails BEFORE the per-resource persistence loop with a named-resource + diagnostic. Add `secrets:` to your config to proceed. + +### Rollback (engine-side sensitive-output routing, v0.27.0) + +**Validation status:** mechanical analysis only. v0.26.0 source (verified at +git tag v0.26.0) has no awareness of `secret_ref://` prefix; the legacy +`Outputs: r.Outputs` path writes whatever the driver returned and reads +state values verbatim. So a v0.26.0 binary running against state written +by v0.27.0 will treat `secret_ref://...` strings as literal values +(documented in step 3 below). Live runtime smoke against a v0.26.0 +binary is deferred to first-rollback-event; the test path is well-defined +and a stop-the-line gate. + +To pin to v0.26.x: + +1. Pin `setup-wfctl@v0.26.x` and rebuild. +2. State records written under v0.27.0 contain `secret_ref://...` placeholders. v0.26.x + consumers do not understand these and treat them as literal strings (e.g., + `infra_output` generators copy the literal placeholder into a downstream secret). +3. Recovery: rotate the affected secrets via `wfctl infra bootstrap --force-rotate ` + running v0.27.0 first to regenerate plaintext state, OR manually edit the state record + (filesystem JSON) to inline the value from `secrets.Provider`. +4. The new package `iac/sensitive`, helper `persistResourceWithSecretRouting`, and + `audit-state-secrets` command are additive; reverting the call sites to v0.26.x literal + `Outputs: r.Outputs` shape is a one-commit revert of `infra_apply.go` + + `infra_refresh_outputs.go`. + ## [0.18.11.1] - 2026-04-25 ### Fixed diff --git a/DOCUMENTATION.md b/DOCUMENTATION.md index 77538755..b67613f0 100644 --- a/DOCUMENTATION.md +++ b/DOCUMENTATION.md @@ -1830,6 +1830,77 @@ The plugin is auto-registered via `init()` in `plugin/admincore/plugin.go`. No Y --- +## Sensitive output routing (v0.27.0+) + +When a `ResourceDriver.Create` or `Update` returns a `*ResourceOutput` with +`Sensitive: {key: true}`, the engine routes those output values through the +configured `secrets.Provider` instead of writing them plaintext to state. +Plugins remain platform-agnostic: a plugin compiled into a wfctl-from-CI +run, a wfctl-from-CLI run, or a workflow-cloud server transparently gets +sensitive-output handling without each host writing its own routing. + +**Routed secret naming:** `_` (e.g., a resource +named `coredump-deploy-key` with `secret_key` output is stored under +`coredump-deploy-key_secret_key`). + +**State placeholder:** Routed fields appear in `state.Outputs` as +`secret_ref://_` strings. Distinct from the user-supplied +`secret://` config-reference convention. + +**Required configuration:** A `secrets:` block in your workflow config. +For local/ad-hoc runs, environment variables are the simplest option: + +```yaml +secrets: + provider: env + config: + prefix: WORKFLOW_ +``` + +If a plugin emits sensitive outputs but no `secrets.Provider` is configured, +apply hard-fails BEFORE the per-resource persistence loop with a named +resource and remediation. This prevents partial-apply mid-stream. + +**Failure modes:** + +- `secrets.Provider.Set` fails: apply errors; rerun is idempotent. +- `state.SaveResource` fails after `Set` succeeded: the engine compensates + by calling `driver.Delete` + `provider.Delete` (with a fresh 30s + context, so compensation runs even on Ctrl-C cancel) and surfaces a + combined error naming the compensation outcome. + +**Read paths (refresh, adoption):** never call `provider.Set`. Placeholders +are inherited from prior state to avoid cache pollution. Newly-declared +sensitive keys on Read paths are dropped. + +**Drift detection:** `iac/sensitive.MaskSensitiveForDiff` masks sensitive +keys from both desired and current sides before `driver.Diff`, preventing +false-positive drift on keys that the cloud refuses to re-emit (e.g., DO +Spaces `secret_key`). As of v0.27.0, no in-tree call site dispatches +`driver.Diff` against state.Outputs that may contain placeholders; the +helper is exported for future consumers. + +**Cold-start consumers:** `secret_ref://` placeholders cannot be rehydrated +cross-process on write-only providers (GitHub Actions). Same-process +consumers (e.g., `infra_output:` generators in the same `wfctl infra apply` +invocation) get the routed value via in-memory hand-off. Cross-process +consumers reference the routed secret by name via `secret://_` +directly. + +**Limitation (v0.27.0):** only string-typed sensitive output values are +supported. Non-string sensitive outputs (e.g., `[]byte`, `int`) yield an +error from `Route`. Future expansion via a `MarshalSensitive` interface +is out of scope. + +**Recovery:** `wfctl infra audit-state-secrets` audits state for orphan +secrets, missing routed values, legacy plaintext fields, and mistaken +config-references in state. Distinct from `audit-secrets` which audits +the `secrets.generate` config block. Run both as part of regular hygiene: + +```sh +wfctl infra audit-state-secrets --config infra.yaml [--prune] +``` + ## IaC Provider Plugin Interfaces Cloud-provider plugins implement the core `interfaces.IaCProvider` Go diff --git a/cmd/wfctl/infra.go b/cmd/wfctl/infra.go index 445ec8ee..f8607d56 100644 --- a/cmd/wfctl/infra.go +++ b/cmd/wfctl/infra.go @@ -96,6 +96,11 @@ func runInfra(args []string) error { return runInfraPruneCmd(args[1:]) case "rotate-and-prune": return runInfraRotateAndPruneCmd(args[1:]) + case "audit-state-secrets": + if rc := runInfraAuditStateSecrets(args[1:], os.Stdout); rc != 0 { + return fmt.Errorf("audit-state-secrets exited with code %d", rc) + } + return nil default: return infraUsage() } @@ -123,6 +128,7 @@ Actions: audit-keys List cloud-side resources of --type via the provider's EnumeratorAll prune Destructively delete cloud resources by --created-before / --exclude-access-key (two-key opt-in) rotate-and-prune All-in-one: rotate canonical credential, then prune older keys with the new key as exclusion target + audit-state-secrets Audit state.Outputs vs. secrets.Provider for orphans, legacy, missing Options: --config Config file (default: infra.yaml or config/infra.yaml) @@ -1119,6 +1125,13 @@ func platformNow() time.Time { } func runInfraApply(args []string) error { + // runHydrated carries routed-secret values from the same-process apply + // (sensitive.Route's hydrated map) to syncInfraOutputSecrets below. + // Empty when no driver emitted sensitive outputs; nil for the + // precomputed-plan branch unless threaded through. Required for + // rehydration on write-only providers (GitHub Actions secrets are + // write-only after Set). + var runHydrated map[string]string fs := flag.NewFlagSet("infra apply", flag.ContinueOnError) var configFlag string fs.StringVar(&configFlag, "config", "", "Config file") @@ -1388,9 +1401,11 @@ func runInfraApply(args []string) error { if plan.DesiredHash != currentHash { return fmt.Errorf("plan stale: config hash mismatch (run wfctl infra plan again)") } - if err := applyFromPrecomputedPlan(ctx, plan, cfgFile, envName); err != nil { + h, err := applyFromPrecomputedPlan(ctx, plan, cfgFile, envName) + if err != nil { return err } + runHydrated = h // Fall through to post-apply infra_output secrets sync below — // same as the live-diff path so STAGING_DATABASE_URL and similar // infra_output secrets are always refreshed after a successful apply. @@ -1407,9 +1422,11 @@ func runInfraApply(args []string) error { ) } if hasInfraModules(cfgFile) { - if err := applyInfraModules(ctx, cfgFile, envName); err != nil { + h, err := applyInfraModules(ctx, cfgFile, envName) + if err != nil { return err } + runHydrated = h } else { pipelineCfg := cfgFile if envName != "" { @@ -1456,7 +1473,7 @@ func runInfraApply(args []string) error { } } } - return syncInfraOutputSecrets(ctx, secretsCfg, secretsProvider, states, wfCfg, envName) + return syncInfraOutputSecrets(ctx, secretsCfg, secretsProvider, states, wfCfg, envName, runHydrated) } func runInfraStatus(args []string) error { diff --git a/cmd/wfctl/infra_apply.go b/cmd/wfctl/infra_apply.go index fa73aaea..0b4cb2d2 100644 --- a/cmd/wfctl/infra_apply.go +++ b/cmd/wfctl/infra_apply.go @@ -15,9 +15,11 @@ import ( "github.com/GoCodeAlone/workflow/config" "github.com/GoCodeAlone/workflow/iac/inputsnapshot" + "github.com/GoCodeAlone/workflow/iac/sensitive" "github.com/GoCodeAlone/workflow/iac/wfctlhelpers" "github.com/GoCodeAlone/workflow/interfaces" "github.com/GoCodeAlone/workflow/platform" + "github.com/GoCodeAlone/workflow/secrets" ) // printDriftReportIfAny writes the canonical FormatStaleError output to w @@ -148,11 +150,17 @@ func hasPlatformModules(cfgFile string) bool { // // This is the new dispatch path used when the config contains infra.* modules // instead of the legacy platform.* + pipelines.apply pipeline path. -func applyInfraModules(ctx context.Context, cfgFile, envName string) error { //nolint:cyclop +// applyInfraModules applies infra.* modules and returns the hydrated +// routed-secret map for in-process hand-off to post-apply consumers +// (syncInfraOutputSecrets). The map may be empty when no driver emitted +// sensitive outputs in this run, or when no secrets.Provider is +// configured. Returning a separate map (rather than threading via a +// callback) keeps the caller's flow explicit: apply → consume hydrated. +func applyInfraModules(ctx context.Context, cfgFile, envName string) (map[string]string, error) { //nolint:cyclop // Resolve specs (env overrides applied when envName is set). specs, err := parseInfraResourceSpecsForEnv(cfgFile, envName) if err != nil { - return fmt.Errorf("parse infra resource specs: %w", err) + return nil, fmt.Errorf("parse infra resource specs: %w", err) } // Keep only infra.* specs for the direct path. @@ -164,10 +172,10 @@ func applyInfraModules(ctx context.Context, cfgFile, envName string) error { //n } if len(infraSpecs) == 0 { fmt.Println("No infra.* modules to apply.") - return nil + return nil, nil } if err := validateUniqueInfraResourceNames(infraSpecs); err != nil { - return err + return nil, err } // --include: apply scope filter. Resolve the include set from the package- @@ -180,14 +188,14 @@ func applyInfraModules(ctx context.Context, cfgFile, envName string) error { //n // Load current state once; nil on first run is valid (no prior state). current, err := loadCurrentState(cfgFile, envName) if err != nil { - return fmt.Errorf("load current state: %w", err) + return nil, fmt.Errorf("load current state: %w", err) } // Validate and apply the include filter to both specs and state before // grouping by provider so that out-of-scope resources are never passed // down to any provider. if err := validateIncludeSet(includeSet, infraSpecs, current); err != nil { - return err + return nil, err } infraSpecs = filterSpecsByInclude(infraSpecs, includeSet) current = filterStatesByInclude(current, includeSet) @@ -195,7 +203,7 @@ func applyInfraModules(ctx context.Context, cfgFile, envName string) error { //n // Load full config to resolve iac.provider module definitions. cfg, err := config.LoadFromFile(cfgFile) if err != nil { - return fmt.Errorf("load config: %w", err) + return nil, fmt.Errorf("load config: %w", err) } // Plan-time JIT resolution (PR-1): substitute ${MODULE.field} and @@ -204,7 +212,7 @@ func applyInfraModules(ctx context.Context, cfgFile, envName string) error { //n // diagnostics — they're plan-output sugar only. infraSpecs, _, err = resolveSpecsAgainstState(infraSpecs, current, cfg, envName) if err != nil { - return fmt.Errorf("resolve specs against state: %w", err) + return nil, fmt.Errorf("resolve specs against state: %w", err) } // Build a lookup table of iac.provider module name → (providerType, providerCfg). @@ -219,7 +227,7 @@ func applyInfraModules(ctx context.Context, cfgFile, envName string) error { //n // in-scope resources. groupOrder, groups, err := groupSpecsByProviderRef(infraSpecs, providerDefs, disabledProviders, envName) if err != nil { - return err + return nil, err } // Supplement with state-only groups when infraSpecs is empty after include @@ -243,9 +251,15 @@ func applyInfraModules(ctx context.Context, cfgFile, envName string) error { //n // ownership metadata. store, storeErr := resolveStateStore(cfgFile, envName) if storeErr != nil { - return fmt.Errorf("open state store: %w", storeErr) + return nil, fmt.Errorf("open state store: %w", storeErr) } + // runHydrated accumulates routed-secret values from this apply across all + // provider groups. Passed to syncInfraOutputSecrets after the loop so + // post-apply consumers can read just-routed values via in-memory hand-off + // (works on write-only providers like GitHub Actions). + runHydrated := make(map[string]string) + // Apply each provider group in first-reference order (the order in which // each group's first spec appeared in infraSpecs, not iac.provider declaration order). applyGroup := func(moduleRef string, g *specGroup) error { @@ -264,14 +278,14 @@ func applyInfraModules(ctx context.Context, cfgFile, envName string) error { //n } allowProviderTypeFallback := providerTypeCounts[g.provType] == 1 scopedCurrent := filterCurrentStateForProvider(current, g.provType, moduleRef, g.specs, allowProviderTypeFallback) - return applyWithProviderAndStore(ctx, provider, g.provType, g.specs, scopedCurrent, store, os.Stderr, envName) + return applyWithProviderAndStore(ctx, provider, g.provType, g.specs, scopedCurrent, store, os.Stderr, envName, cfgFile, runHydrated) } for _, moduleRef := range groupOrder { if err := applyGroup(moduleRef, groups[moduleRef]); err != nil { - return fmt.Errorf("provider %q: %w", moduleRef, err) + return runHydrated, fmt.Errorf("provider %q: %w", moduleRef, err) } } - return nil + return runHydrated, nil } func filterCurrentStateForProvider(current []interfaces.ResourceState, providerType, moduleRef string, specs []interfaces.ResourceSpec, allowProviderTypeFallback bool) []interfaces.ResourceState { @@ -357,10 +371,14 @@ func resourceSpecProviderRef(spec interfaces.ResourceSpec) string { // envName labels the failure step-summary output (e.g. "staging", "prod"); // pass empty string when not running in a CI context or when env metadata is // unavailable. -func applyWithProviderAndStore(ctx context.Context, provider interfaces.IaCProvider, providerType string, specs []interfaces.ResourceSpec, current []interfaces.ResourceState, store infraStateStore, w io.Writer, envName string) error { +func applyWithProviderAndStore(ctx context.Context, provider interfaces.IaCProvider, providerType string, specs []interfaces.ResourceSpec, current []interfaces.ResourceState, store infraStateStore, w io.Writer, envName string, cfgFile string, hydratedOut map[string]string) error { if store == nil { store = &noopStateStore{} } + secretsProvider, secretsErr := loadSecretsProviderForRouting(cfgFile) + if secretsErr != nil { + return secretsErr + } // Resolve abstract sizing tiers into concrete provider-specific values // (e.g. Size: "m" → instance_type: "s-1vcpu-2gb") for each spec that @@ -513,6 +531,12 @@ func applyWithProviderAndStore(ctx context.Context, provider interfaces.IaCProvi return fmt.Errorf("apply: %w", err) } if result != nil { + // Pre-flight: if any driver emitted sensitive outputs but no + // secrets.Provider is configured, fail fast with a complete + // diagnostic before per-resource persistence kicks off. + if err := requireSecretsProviderForSensitiveOutputs(secretsProvider, result); err != nil { + return fmt.Errorf("state write rejected: %w", err) + } // Persist state for every successfully provisioned resource. for _, r := range result.Resources { fmt.Printf(" ✓ %s (%s)\n", r.Name, r.Type) @@ -547,13 +571,20 @@ func applyWithProviderAndStore(ctx context.Context, provider interfaces.IaCProvi ConfigHash: configHashMap(appliedCfg), AppliedConfig: appliedCfg, AppliedConfigSource: "apply", - Outputs: r.Outputs, - Dependencies: dependencies, - CreatedAt: now, - UpdatedAt: now, + // Outputs is set by persistResourceWithSecretRouting after Route. + Dependencies: dependencies, + CreatedAt: now, + UpdatedAt: now, } - if saveErr := store.SaveResource(ctx, rs); saveErr != nil { - return fmt.Errorf("%s/%s: persist state after apply: %w", r.Type, r.Name, saveErr) + driver, _ := provider.ResourceDriver(r.Type) // best-effort for compensating Delete; nil-safe + hyd, persistErr := persistResourceWithSecretRouting(ctx, store, secretsProvider, driver, rs, r, persistModeApply) + if persistErr != nil { + return persistErr + } + if hydratedOut != nil { + for k, v := range hyd { + hydratedOut[k] = v + } } } @@ -589,6 +620,20 @@ func adoptExistingResources(ctx context.Context, provider interfaces.IaCProvider currentByName[current[i].Name] = struct{}{} } + // Precompute prior-state lookup once per adoption pass. persistReadMode + // inherits routed-secret placeholders from prior state for sensitive + // keys. Without the cache, each Read would re-ListResources + linear + // scan — O(n²) on filesystem-backed stores. priorByName remains stable + // for the duration of this loop (newly-adopted records are appended to + // store and to current[] but the placeholder-inheritance lookup only + // needs the pre-loop snapshot). + priorByName := make(map[string]*interfaces.ResourceState) + if all, listErr := store.ListResources(ctx); listErr == nil { + for i := range all { + priorByName[all[i].Name] = &all[i] + } + } + drivers := make(map[string]interfaces.ResourceDriver) for _, spec := range specs { if _, exists := currentByName[spec.Name]; exists { @@ -634,8 +679,14 @@ func adoptExistingResources(ctx context.Context, provider interfaces.IaCProvider if err := validateStateProviderID(provider, providerType, state); err != nil { return nil, err } - if saveErr := store.SaveResource(ctx, state); saveErr != nil { - return nil, fmt.Errorf("%s/%s: persist adopted state: %w", spec.Type, spec.Name, saveErr) + // Sanitize-only via persistResourceWithSecretRouting in read mode: + // drivers may declare Sensitive on Read but we MUST NOT call + // provider.Set from a Read path (cache-pollution risk per design §4.4). + // Provider passed nil; helper is nil-safe in read mode. + // Cached-prior variant: avoids per-resource ListResources scans + // when adoption pass touches many resources. + if _, err := persistResourceWithSecretRoutingCachedPrior(ctx, store, nil, driver, state, *live, persistModeRead, priorByName); err != nil { + return nil, fmt.Errorf("%s/%s: persist adopted state: %w", spec.Type, spec.Name, err) } fmt.Printf(" Adopted existing %s %q (id=%s)\n", spec.Type, spec.Name, state.ProviderID) current = append(current, state) @@ -744,6 +795,261 @@ func cloneMap(in map[string]any) map[string]any { return out } +// persistMode controls how persistResourceWithSecretRouting handles a +// driver output: persistModeApply routes sensitive fields through the +// provider; persistModeRead is sanitize-only (no provider writes — used +// by adoption / refresh paths to avoid cache pollution). +type persistMode int + +const ( + persistModeApply persistMode = iota + persistModeRead +) + +// persistResourceWithSecretRouting builds rs.Outputs from out (routing +// sensitive fields through provider in apply mode), calls +// store.SaveResource, and returns the hydrated routed-secret map for +// in-process hand-off. On SaveResource failure after provider.Set +// succeeded (apply mode only), invokes driver.Delete + provider.Delete +// to compensate the partial cloud-resource creation. Returns a wrapped +// error naming both the original SaveResource failure and the +// compensating-Delete outcome. +// +// In read mode, the helper does NOT call provider.Set; instead it consults +// the prior state via store.GetResource and inherits any +// PlaceholderPrefix entries; new sensitive keys (declared by the driver +// at Read time but not previously routed) are dropped from sanitized. +// +// Returns nil hydrated map in read mode (consumers don't need post-apply +// hand-off for a Read). +// +// store is the wfctl-internal infraStateStore interface, not +// interfaces.IaCStateStore. The helper lives in cmd/wfctl; using the +// package-private interface keeps the boundary clean and matches existing +// callers (applyWithProviderAndStore, adoptExistingResources). +func persistResourceWithSecretRouting( + ctx context.Context, + store infraStateStore, + provider secrets.Provider, + driver interfaces.ResourceDriver, + rs interfaces.ResourceState, + out interfaces.ResourceOutput, + mode persistMode, +) (map[string]string, error) { + switch mode { + case persistModeApply: + return persistApplyMode(ctx, store, provider, driver, rs, out) + case persistModeRead: + return nil, persistReadMode(ctx, store, rs, out, nil) + default: + return nil, fmt.Errorf("persistResourceWithSecretRouting: unknown mode %d", mode) + } +} + +// persistResourceWithSecretRoutingCachedPrior is the loop-friendly variant for +// callers that drive multiple Reads in a tight loop (refresh, adoption). It +// accepts a precomputed prior-state lookup map so persistReadMode does not +// have to ListResources + linear-scan per call (O(n²) on filesystem-backed +// stores). When mode != persistModeRead the priorByName map is ignored. +// +// Pass priorByName=nil to fall back to per-call ListResources behaviour +// (equivalent to persistResourceWithSecretRouting). +func persistResourceWithSecretRoutingCachedPrior( + ctx context.Context, + store infraStateStore, + provider secrets.Provider, + driver interfaces.ResourceDriver, + rs interfaces.ResourceState, + out interfaces.ResourceOutput, + mode persistMode, + priorByName map[string]*interfaces.ResourceState, +) (map[string]string, error) { + switch mode { + case persistModeApply: + return persistApplyMode(ctx, store, provider, driver, rs, out) + case persistModeRead: + return nil, persistReadMode(ctx, store, rs, out, priorByName) + default: + return nil, fmt.Errorf("persistResourceWithSecretRoutingCachedPrior: unknown mode %d", mode) + } +} + +func persistApplyMode( + ctx context.Context, + store infraStateStore, + provider secrets.Provider, + driver interfaces.ResourceDriver, + rs interfaces.ResourceState, + out interfaces.ResourceOutput, +) (map[string]string, error) { + sanitized, hydrated, err := sensitive.Route(ctx, provider, rs.Name, &out) + if err != nil { + return nil, fmt.Errorf("%s/%s: route sensitive outputs: %w", rs.Type, rs.Name, err) + } + rs.Outputs = sanitized + if saveErr := store.SaveResource(ctx, rs); saveErr != nil { + // Compensating Delete: if we routed secrets, the matching cloud + // resource is real but the state record didn't land. Roll back so + // a re-Apply doesn't double-create. + if len(hydrated) > 0 { + compErr := compensateAfterSaveFailure(provider, driver, rs, hydrated) + if compErr != nil { + return nil, fmt.Errorf("%s/%s: persist state after apply: %w (compensating delete failed: %v)", rs.Type, rs.Name, saveErr, compErr) + } + return nil, fmt.Errorf("%s/%s: persist state after apply: %w (compensating delete succeeded)", rs.Type, rs.Name, saveErr) + } + return nil, fmt.Errorf("%s/%s: persist state after apply: %w", rs.Type, rs.Name, saveErr) + } + return hydrated, nil +} + +func persistReadMode( + ctx context.Context, + store infraStateStore, + rs interfaces.ResourceState, + out interfaces.ResourceOutput, + priorByName map[string]*interfaces.ResourceState, +) error { + // Sanitize-only: inherit placeholders from prior state for sensitive + // keys; drop newly-declared sensitive keys. Do NOT call provider.Set. + // + // Prior state lookup: prefer the caller-supplied map (precomputed once + // per refresh/adoption pass) to avoid O(n²) ListResources scans. Fall + // back to ListResources + filter when the cache is nil — keeps the + // helper safe for ad-hoc one-shot callers (infraStateStore has no + // GetResource). + var prior *interfaces.ResourceState + if priorByName != nil { + prior = priorByName[rs.Name] + } else if all, listErr := store.ListResources(ctx); listErr == nil { + for i := range all { + if all[i].Name == rs.Name { + prior = &all[i] + break + } + } + } + sanitized := make(map[string]any, len(out.Outputs)) + for k, v := range out.Outputs { + sanitized[k] = v + } + for k, flag := range out.Sensitive { + if !flag { + continue + } + // If prior state has a placeholder for this key, preserve it. + if prior != nil { + if pv, ok := prior.Outputs[k]; ok && sensitive.IsPlaceholder(pv) { + sanitized[k] = pv + continue + } + } + // Otherwise drop the field from sanitized — we don't have a + // previously-routed secret and we won't pollute the provider + // from a Read. + delete(sanitized, k) + } + rs.Outputs = sanitized + if err := store.SaveResource(ctx, rs); err != nil { + return fmt.Errorf("%s/%s: persist state after read: %w", rs.Type, rs.Name, err) + } + return nil +} + +// compensateAfterSaveFailure rolls back routed secrets and the underlying +// cloud resource when SaveResource fails after provider.Set succeeded. +// Uses a fresh 30-second context: the apply context may already be +// canceled (operator Ctrl-C) but compensation must proceed to avoid +// orphaning cloud resources + routed secrets. +func compensateAfterSaveFailure( + provider secrets.Provider, + driver interfaces.ResourceDriver, + rs interfaces.ResourceState, + hydrated map[string]string, +) error { + ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second) + defer cancel() + var errs []error + if driver != nil { + ref := interfaces.ResourceRef{Name: rs.Name, Type: rs.Type, ProviderID: rs.ProviderID} + if delErr := driver.Delete(ctx, ref); delErr != nil { + errs = append(errs, fmt.Errorf("driver.Delete: %w", delErr)) + } + } + if provider != nil { + // Reverse-engineer the original output keys from the secret names. + // Format: "_"; strip the prefix. + for secretName := range hydrated { + if delErr := provider.Delete(ctx, secretName); delErr != nil && !errors.Is(delErr, secrets.ErrNotFound) { + errs = append(errs, fmt.Errorf("provider.Delete(%s): %w", secretName, delErr)) + } + } + } + if len(errs) > 0 { + return errors.Join(errs...) + } + return nil +} + +// loadSecretsProviderForRouting returns the configured secrets.Provider +// for this apply run, or (nil, nil) when secretsCfg is absent (the +// caller's downstream sensitive.Route will hard-fail if any driver +// emits sensitive outputs). +func loadSecretsProviderForRouting(cfgFile string) (secrets.Provider, error) { + if cfgFile == "" { + return nil, nil + } + cfg, err := parseSecretsConfig(cfgFile) + if err != nil { + return nil, fmt.Errorf("parse secrets config for sensitive routing: %w", err) + } + if cfg == nil { + return nil, nil + } + return resolveSecretsProvider(cfg) +} + +// hasSensitiveOutputs reports whether r has any Sensitive[k]==true entry. +func hasSensitiveOutputs(r *interfaces.ResourceOutput) bool { + for _, v := range r.Sensitive { + if v { + return true + } + } + return false +} + +// sensitiveKeysFor returns the sorted list of keys with Sensitive[k]==true. +func sensitiveKeysFor(r *interfaces.ResourceOutput) []string { + var keys []string + for k, v := range r.Sensitive { + if v { + keys = append(keys, k) + } + } + sort.Strings(keys) + return keys +} + +// requireSecretsProviderForSensitiveOutputs returns an error if any of the +// resources in result has sensitive outputs but provider is nil. Surfaced +// before the per-resource persistence loop so an apply with sensitive +// drivers and no provider fails fast with a complete diagnostic instead +// of partial state writes. +func requireSecretsProviderForSensitiveOutputs(provider secrets.Provider, result *interfaces.ApplyResult) error { + if provider != nil || result == nil { + return nil + } + for i := range result.Resources { + if hasSensitiveOutputs(&result.Resources[i]) { + return fmt.Errorf( + "secrets.Provider not configured but driver emitted sensitive outputs (resource %q has Sensitive keys %v); add `secrets:` block to your config or use `secrets: { provider: env }`", + result.Resources[i].Name, sensitiveKeysFor(&result.Resources[i])) + } + } + return nil +} + // isAbstractSize reports whether s is one of the canonical abstract size tiers // (xs/s/m/l/xl). Provider-specific slugs such as "db-s-1vcpu-1gb" return false // so that ResolveSizing is not called for already-concrete values. @@ -798,11 +1104,24 @@ func loadPlanFromFile(path string) (interfaces.IaCPlan, error) { // ComputePlan. It loads IaCProvider plugins from cfgFile, groups plan actions // by iac.provider module, and calls provider.Apply for each group. State is // persisted exactly as in the live-diff path. -func applyFromPrecomputedPlan(ctx context.Context, plan interfaces.IaCPlan, cfgFile, envName string) error { //nolint:cyclop +// +// The same-process hydrated routed-secret map (sensitive.Route output) is +// accumulated across provider groups and returned to the caller so the +// post-apply syncInfraOutputSecrets step can rehydrate placeholders without +// having to re-Get from a write-only secrets provider (e.g. GitHub Actions +// secrets, where Set succeeds but Get returns ErrUnsupported). Returns an +// empty (non-nil) map when no driver emitted sensitive outputs. +func applyFromPrecomputedPlan(ctx context.Context, plan interfaces.IaCPlan, cfgFile, envName string) (map[string]string, error) { //nolint:cyclop + // runHydrated accumulates routed-secret values across all provider groups + // in this plan. Returned to the caller so the post-apply + // syncInfraOutputSecrets step can rehydrate placeholders without re-Get + // from write-only secrets providers. Mirrors applyInfraModules's pattern. + runHydrated := make(map[string]string) + // Load full config to resolve iac.provider module definitions. cfg, err := config.LoadFromFile(cfgFile) if err != nil { - return fmt.Errorf("load config: %w", err) + return runHydrated, fmt.Errorf("load config: %w", err) } // Build provider lookup (mirrors applyInfraModules). @@ -833,7 +1152,7 @@ func applyFromPrecomputedPlan(ctx context.Context, plan interfaces.IaCPlan, cfgF // Resolve state store (noop when no iac.state module is configured). store, storeErr := resolveStateStore(cfgFile, envName) if storeErr != nil { - return fmt.Errorf("open state store: %w", storeErr) + return runHydrated, fmt.Errorf("open state store: %w", storeErr) } // Group plan actions by iac.provider module, preserving action order. @@ -857,12 +1176,12 @@ func applyFromPrecomputedPlan(ctx context.Context, plan interfaces.IaCPlan, cfgF } } if moduleRef == "" { - return fmt.Errorf("plan action for %q: missing 'provider' field in resource config (delete actions require a current state record)", action.Resource.Name) + return runHydrated, fmt.Errorf("plan action for %q: missing 'provider' field in resource config (delete actions require a current state record)", action.Resource.Name) } if _, exists := groups[moduleRef]; !exists { def, ok := providerDefs[moduleRef] if !ok { - return fmt.Errorf("plan action for %q references provider %q which is not declared as an iac.provider module", action.Resource.Name, moduleRef) + return runHydrated, fmt.Errorf("plan action for %q references provider %q which is not declared as an iac.provider module", action.Resource.Name, moduleRef) } groups[moduleRef] = &actionGroup{provType: def.provType, provCfg: def.provCfg} groupOrder = append(groupOrder, moduleRef) @@ -881,16 +1200,16 @@ func applyFromPrecomputedPlan(ctx context.Context, plan interfaces.IaCPlan, cfgF fmt.Printf("Applying %d resource(s) via provider %q (%s) from plan...\n", len(g.actions), moduleRef, g.provType) provider, closer, err := resolveIaCProvider(ctx, g.provType, g.provCfg) if err != nil { - return fmt.Errorf("provider %q (%s): load provider: %w", moduleRef, g.provType, err) + return runHydrated, fmt.Errorf("provider %q (%s): load provider: %w", moduleRef, g.provType, err) } - applyErr := applyPrecomputedPlanWithStore(ctx, groupPlan, provider, g.provType, store, os.Stderr, envName) + applyErr := applyPrecomputedPlanWithStore(ctx, groupPlan, provider, g.provType, store, os.Stderr, envName, cfgFile, runHydrated) if closer != nil { if cerr := closer.Close(); cerr != nil { fmt.Fprintf(os.Stderr, "warning: provider %q shutdown: %v\n", g.provType, cerr) } } if applyErr != nil { - return fmt.Errorf("provider %q: %w", moduleRef, applyErr) + return runHydrated, fmt.Errorf("provider %q: %w", moduleRef, applyErr) } } @@ -900,17 +1219,21 @@ func applyFromPrecomputedPlan(ctx context.Context, plan interfaces.IaCPlan, cfgF } else { fmt.Printf("applied %d actions from plan\n", n) } - return nil + return runHydrated, nil } // applyPrecomputedPlanWithStore executes a pre-computed plan group via // provider.Apply and persists state for each provisioned resource. It is the // precomputed-plan counterpart of applyWithProviderAndStore, skipping // ComputePlan / adoptExistingResources entirely. -func applyPrecomputedPlanWithStore(ctx context.Context, plan interfaces.IaCPlan, provider interfaces.IaCProvider, providerType string, store infraStateStore, w io.Writer, envName string) error { +func applyPrecomputedPlanWithStore(ctx context.Context, plan interfaces.IaCPlan, provider interfaces.IaCProvider, providerType string, store infraStateStore, w io.Writer, envName string, cfgFile string, hydratedOut map[string]string) error { if store == nil { store = &noopStateStore{} } + secretsProvider, secretsErr := loadSecretsProviderForRouting(cfgFile) + if secretsErr != nil { + return secretsErr + } if len(plan.Actions) == 0 { fmt.Println(" No changes — infrastructure is up-to-date.") return nil @@ -998,6 +1321,10 @@ func applyPrecomputedPlanWithStore(ctx context.Context, plan interfaces.IaCPlan, } if result != nil { + // Pre-flight: same hard-fail as live-diff path. + if err := requireSecretsProviderForSensitiveOutputs(secretsProvider, result); err != nil { + return fmt.Errorf("state write rejected: %w", err) + } for _, r := range result.Resources { fmt.Printf(" ✓ %s (%s)\n", r.Name, r.Type) @@ -1029,13 +1356,20 @@ func applyPrecomputedPlanWithStore(ctx context.Context, plan interfaces.IaCPlan, ConfigHash: configHashMap(appliedCfg), AppliedConfig: appliedCfg, AppliedConfigSource: "apply", - Outputs: r.Outputs, - Dependencies: dependencies, - CreatedAt: now, - UpdatedAt: now, + // Outputs is set by persistResourceWithSecretRouting after Route. + Dependencies: dependencies, + CreatedAt: now, + UpdatedAt: now, } - if saveErr := store.SaveResource(ctx, rs); saveErr != nil { - return fmt.Errorf("%s/%s: persist state after apply: %w", r.Type, r.Name, saveErr) + driver, _ := provider.ResourceDriver(r.Type) + hyd, persistErr := persistResourceWithSecretRouting(ctx, store, secretsProvider, driver, rs, r, persistModeApply) + if persistErr != nil { + return persistErr + } + if hydratedOut != nil { + for k, v := range hyd { + hydratedOut[k] = v + } } } diff --git a/cmd/wfctl/infra_apply_allow_replace_test.go b/cmd/wfctl/infra_apply_allow_replace_test.go index 443c5d3f..46ff8135 100644 --- a/cmd/wfctl/infra_apply_allow_replace_test.go +++ b/cmd/wfctl/infra_apply_allow_replace_test.go @@ -220,7 +220,7 @@ func TestApplyWithProviderAndStore_ProtectedReplace_WithoutAllowReplace_Errors(t } var w bytes.Buffer - err := applyWithProviderAndStore(context.Background(), provider, "stub", specs, nil, nil, &w, "test") + err := applyWithProviderAndStore(context.Background(), provider, "stub", specs, nil, nil, &w, "test", "", nil) if err == nil { t.Fatal("expected gate error before dispatch") } @@ -264,7 +264,7 @@ func TestApplyWithProviderAndStore_ProtectedReplace_WithAllowReplace_Proceeds(t } var w bytes.Buffer - err := applyWithProviderAndStore(context.Background(), provider, "stub", specs, nil, nil, &w, "test") + err := applyWithProviderAndStore(context.Background(), provider, "stub", specs, nil, nil, &w, "test", "", nil) if err == nil || !errors.Is(err, dispatched) { t.Fatalf("expected gate to allow apply through to dispatch (sentinel %v); got %v", dispatched, err) } @@ -295,7 +295,7 @@ func TestApplyPrecomputedPlanWithStore_ProtectedReplace_WithoutAllowReplace_Erro } var w bytes.Buffer - err := applyPrecomputedPlanWithStore(context.Background(), plan, provider, "stub", nil, &w, "test") + err := applyPrecomputedPlanWithStore(context.Background(), plan, provider, "stub", nil, &w, "test", "", nil) if err == nil { t.Fatal("expected gate error before dispatch via precomputed-plan path") } diff --git a/cmd/wfctl/infra_apply_jit_loader_test.go b/cmd/wfctl/infra_apply_jit_loader_test.go index 15ca9036..976a17bb 100644 --- a/cmd/wfctl/infra_apply_jit_loader_test.go +++ b/cmd/wfctl/infra_apply_jit_loader_test.go @@ -95,7 +95,7 @@ modules: } t.Cleanup(func() { resolveIaCProvider = origLoader }) - if err := applyInfraModules(t.Context(), cfgPath, ""); err != nil { + if _, err := applyInfraModules(t.Context(), cfgPath, ""); err != nil { t.Fatalf("applyInfraModules: %v", err) } diff --git a/cmd/wfctl/infra_apply_plan_test.go b/cmd/wfctl/infra_apply_plan_test.go index da950464..c1a7de4a 100644 --- a/cmd/wfctl/infra_apply_plan_test.go +++ b/cmd/wfctl/infra_apply_plan_test.go @@ -337,7 +337,7 @@ func TestInfraApplyPrecomputedPlan_PersistsState(t *testing.T) { }, } - err := applyPrecomputedPlanWithStore(context.Background(), plan, provider, "fake-cloud", store, io.Discard, "") + err := applyPrecomputedPlanWithStore(context.Background(), plan, provider, "fake-cloud", store, io.Discard, "", "", nil) if err != nil { t.Fatalf("applyPrecomputedPlanWithStore: %v", err) } @@ -448,7 +448,7 @@ modules: // sides hash nil/empty spec slices the same way. // The key assertion: applyFromPrecomputedPlan must NOT error on the delete action. _ = specs - err = applyFromPrecomputedPlan(context.Background(), plan, cfgPath, "") + _, err = applyFromPrecomputedPlan(context.Background(), plan, cfgPath, "") // The apply itself won't error even if the config has my-db (hash mismatch // would catch that) — we just want to confirm no "missing provider" error. // With the delete action resolved via Current.ProviderRef, provider.Apply is called. diff --git a/cmd/wfctl/infra_apply_refresh.go b/cmd/wfctl/infra_apply_refresh.go index ba7a7425..8b29dcea 100644 --- a/cmd/wfctl/infra_apply_refresh.go +++ b/cmd/wfctl/infra_apply_refresh.go @@ -1,5 +1,20 @@ package main +// Note: as of v0.27.0 (engine-side sensitive-output routing), no in-tree +// call site dispatches driver.Diff against state.Outputs that may contain +// sensitive.PlaceholderPrefix entries. Per-provider Diff implementations +// receive desired/current via gRPC and are out of scope for engine-side +// masking. iac/sensitive.MaskSensitiveForDiff is exported as a helper +// for future in-tree consumers (e.g., a future engine-side drift command +// that compares state-Outputs to live ResourceOutput before calling +// Diff). When such a call site lands, wire MaskSensitiveForDiff( +// driver.SensitiveKeys(), desired.Config, current.Outputs) before the +// Diff dispatch to prevent placeholder-vs-plaintext false-positives. +// +// Verified at v0.27.0 by `grep -rn "driver\.Diff(\|d\.Diff(" cmd/wfctl/ +// iac/`: only iac/conformance/scenario_grpc_roundtrip.go matches, and +// it's a conformance-test tool that synthesizes its own desired/current. + import ( "context" "fmt" diff --git a/cmd/wfctl/infra_apply_sensitive_routing_test.go b/cmd/wfctl/infra_apply_sensitive_routing_test.go new file mode 100644 index 00000000..ef3ab6d1 --- /dev/null +++ b/cmd/wfctl/infra_apply_sensitive_routing_test.go @@ -0,0 +1,318 @@ +package main + +import ( + "context" + "errors" + "fmt" + "strings" + "testing" + "time" + + "github.com/GoCodeAlone/workflow/interfaces" + "github.com/GoCodeAlone/workflow/secrets" +) + +// envTestProvider is a tiny in-memory secrets.Provider for test assertions. +type envTestProvider struct { + values map[string]string + failSet bool +} + +func newEnvTestProvider() *envTestProvider { return &envTestProvider{values: map[string]string{}} } +func (p *envTestProvider) Name() string { return "env-test" } +func (p *envTestProvider) Get(_ context.Context, k string) (string, error) { + v, ok := p.values[k] + if !ok { + return "", secrets.ErrNotFound + } + return v, nil +} +func (p *envTestProvider) Set(_ context.Context, k, v string) error { + if p.failSet { + return fmt.Errorf("set rejected") + } + p.values[k] = v + return nil +} +func (p *envTestProvider) Delete(_ context.Context, k string) error { + delete(p.values, k) + return nil +} +func (p *envTestProvider) List(_ context.Context) ([]string, error) { + out := make([]string, 0, len(p.values)) + for k := range p.values { + out = append(out, k) + } + return out, nil +} + +// stubInfraStore captures SaveResource calls; implements infraStateStore. +type stubInfraStore struct { + saved []interfaces.ResourceState + saveErr error + deleted []string +} + +func (s *stubInfraStore) ListResources(_ context.Context) ([]interfaces.ResourceState, error) { + return s.saved, nil +} +func (s *stubInfraStore) SaveResource(_ context.Context, st interfaces.ResourceState) error { + if s.saveErr != nil { + return s.saveErr + } + s.saved = append(s.saved, st) + return nil +} +func (s *stubInfraStore) DeleteResource(_ context.Context, n string) error { + s.deleted = append(s.deleted, n) + return nil +} + +// stubSensitiveDriver records Delete calls (for compensating-Delete tests). +// Implements interfaces.ResourceDriver. +type stubSensitiveDriver struct { + deleteCalls []interfaces.ResourceRef + deleteErr error +} + +func (d *stubSensitiveDriver) Create(_ context.Context, _ interfaces.ResourceSpec) (*interfaces.ResourceOutput, error) { + return nil, nil +} +func (d *stubSensitiveDriver) Read(_ context.Context, _ interfaces.ResourceRef) (*interfaces.ResourceOutput, error) { + return nil, nil +} +func (d *stubSensitiveDriver) Update(_ context.Context, _ interfaces.ResourceRef, _ interfaces.ResourceSpec) (*interfaces.ResourceOutput, error) { + return nil, nil +} +func (d *stubSensitiveDriver) Delete(_ context.Context, ref interfaces.ResourceRef) error { + d.deleteCalls = append(d.deleteCalls, ref) + return d.deleteErr +} +func (d *stubSensitiveDriver) Diff(_ context.Context, _ interfaces.ResourceSpec, _ *interfaces.ResourceOutput) (*interfaces.DiffResult, error) { + return nil, nil +} +func (d *stubSensitiveDriver) HealthCheck(_ context.Context, _ interfaces.ResourceRef) (*interfaces.HealthResult, error) { + return nil, nil +} +func (d *stubSensitiveDriver) Scale(_ context.Context, _ interfaces.ResourceRef, _ int) (*interfaces.ResourceOutput, error) { + return nil, nil +} +func (d *stubSensitiveDriver) SensitiveKeys() []string { return nil } + +func TestPersistResourceWithSecretRouting_RoutesSensitiveAndSanitizesState(t *testing.T) { + prov := newEnvTestProvider() + store := &stubInfraStore{} + drv := &stubSensitiveDriver{} + out := interfaces.ResourceOutput{ + Name: "myres", Type: "infra.spaces_key", ProviderID: "AKIA", + Outputs: map[string]any{"access_key": "AK", "secret_key": "SK", "bucket": "b"}, + Sensitive: map[string]bool{"access_key": true, "secret_key": true}, + } + rs := interfaces.ResourceState{ + ID: "myres", Name: "myres", Type: "infra.spaces_key", + Provider: "digitalocean", ProviderID: "AKIA", + CreatedAt: time.Now(), UpdatedAt: time.Now(), + } + hydrated, err := persistResourceWithSecretRouting(context.Background(), store, prov, drv, rs, out, persistModeApply) + if err != nil { + t.Fatalf("persist: %v", err) + } + if len(store.saved) != 1 { + t.Fatalf("expected 1 saved, got %d", len(store.saved)) + } + state := store.saved[0] + if state.Outputs["secret_key"] != "secret_ref://myres_secret_key" { + t.Errorf("state secret_key not sanitized: %v", state.Outputs["secret_key"]) + } + if state.Outputs["access_key"] != "secret_ref://myres_access_key" { + t.Errorf("state access_key not sanitized: %v", state.Outputs["access_key"]) + } + if state.Outputs["bucket"] != "b" { + t.Errorf("state bucket lost: %v", state.Outputs["bucket"]) + } + if prov.values["myres_secret_key"] != "SK" { + t.Errorf("provider missing secret_key value") + } + if hydrated["myres_secret_key"] != "SK" { + t.Errorf("hydrated missing secret_key: %v", hydrated) + } +} + +func TestPersistResourceWithSecretRouting_NoProviderHardFails(t *testing.T) { + store := &stubInfraStore{} + drv := &stubSensitiveDriver{} + out := interfaces.ResourceOutput{ + Outputs: map[string]any{"secret_key": "SK"}, + Sensitive: map[string]bool{"secret_key": true}, + } + rs := interfaces.ResourceState{Name: "myres"} + _, err := persistResourceWithSecretRouting(context.Background(), store, nil, drv, rs, out, persistModeApply) + if err == nil { + t.Fatal("expected error when provider nil and sensitive non-empty") + } + if !strings.Contains(err.Error(), "myres") { + t.Errorf("error should name resource, got %q", err.Error()) + } + if len(store.saved) != 0 { + t.Error("state should NOT be saved when routing fails") + } +} + +func TestPersistResourceWithSecretRouting_SaveFailureCompensatesWithDelete(t *testing.T) { + prov := newEnvTestProvider() + store := &stubInfraStore{saveErr: errors.New("disk full")} + drv := &stubSensitiveDriver{} + out := interfaces.ResourceOutput{ + Name: "myres", ProviderID: "AKIA", + Outputs: map[string]any{"secret_key": "SK"}, + Sensitive: map[string]bool{"secret_key": true}, + } + rs := interfaces.ResourceState{Name: "myres", Type: "infra.spaces_key", ProviderID: "AKIA"} + _, err := persistResourceWithSecretRouting(context.Background(), store, prov, drv, rs, out, persistModeApply) + if err == nil { + t.Fatal("expected error from SaveResource") + } + if !strings.Contains(err.Error(), "disk full") { + t.Errorf("error should wrap original SaveResource err, got %q", err.Error()) + } + if len(drv.deleteCalls) != 1 { + t.Errorf("expected 1 compensating Delete call, got %d", len(drv.deleteCalls)) + } + if drv.deleteCalls[0].ProviderID != "AKIA" { + t.Errorf("compensating Delete used wrong ProviderID: %v", drv.deleteCalls[0]) + } + if _, ok := prov.values["myres_secret_key"]; ok { + t.Errorf("compensating Delete should have removed routed secret; got %v", prov.values) + } +} + +func TestPersistResourceWithSecretRouting_NoSensitivePassesThrough(t *testing.T) { + store := &stubInfraStore{} + drv := &stubSensitiveDriver{} + out := interfaces.ResourceOutput{ + Outputs: map[string]any{"bucket": "b"}, + } + rs := interfaces.ResourceState{Name: "myres"} + hydrated, err := persistResourceWithSecretRouting(context.Background(), store, nil, drv, rs, out, persistModeApply) + if err != nil { + t.Fatalf("persist: %v", err) + } + if len(hydrated) != 0 { + t.Errorf("hydrated should be empty: %v", hydrated) + } + if len(store.saved) != 1 { + t.Fatalf("expected 1 saved, got %d", len(store.saved)) + } + if store.saved[0].Outputs["bucket"] != "b" { + t.Errorf("non-sensitive output corrupted: %v", store.saved[0].Outputs) + } +} + +func TestPersistResourceWithSecretRouting_Idempotent(t *testing.T) { + prov := newEnvTestProvider() + store := &stubInfraStore{} + drv := &stubSensitiveDriver{} + out := interfaces.ResourceOutput{ + Outputs: map[string]any{"secret_key": "SK"}, + Sensitive: map[string]bool{"secret_key": true}, + } + rs := interfaces.ResourceState{Name: "myres"} + for i := 0; i < 2; i++ { + _, err := persistResourceWithSecretRouting(context.Background(), store, prov, drv, rs, out, persistModeApply) + if err != nil { + t.Fatalf("persist iter %d: %v", i, err) + } + } + if prov.values["myres_secret_key"] != "SK" { + t.Errorf("provider value lost on re-Apply: %v", prov.values) + } + if len(store.saved) != 2 { + t.Errorf("expected 2 saved, got %d", len(store.saved)) + } +} + +func TestRequireSecretsProviderForSensitiveOutputs_PreflightDetect(t *testing.T) { + result := &interfaces.ApplyResult{ + Resources: []interfaces.ResourceOutput{ + {Name: "ok", Outputs: map[string]any{"bucket": "b"}}, + {Name: "needs_routing", Outputs: map[string]any{"secret_key": "SK"}, Sensitive: map[string]bool{"secret_key": true}}, + }, + } + err := requireSecretsProviderForSensitiveOutputs(nil, result) + if err == nil { + t.Fatal("expected pre-flight to reject") + } + if !strings.Contains(err.Error(), "needs_routing") { + t.Errorf("error should name the offending resource, got %q", err.Error()) + } +} + +func TestRequireSecretsProviderForSensitiveOutputs_ProviderConfigured_PassesThrough(t *testing.T) { + prov := newEnvTestProvider() + result := &interfaces.ApplyResult{ + Resources: []interfaces.ResourceOutput{ + {Name: "ok", Outputs: map[string]any{"secret_key": "SK"}, Sensitive: map[string]bool{"secret_key": true}}, + }, + } + if err := requireSecretsProviderForSensitiveOutputs(prov, result); err != nil { + t.Fatalf("provider configured should pass; got %v", err) + } +} + +func TestPersistResourceWithSecretRouting_ReadModeSanitizeOnly_PreservesPriorPlaceholder(t *testing.T) { + prov := newEnvTestProvider() // should not be touched + // Pre-existing state has a placeholder + store := &stubInfraStore{ + saved: []interfaces.ResourceState{ + {Name: "myres", Outputs: map[string]any{"secret_key": "secret_ref://myres_secret_key", "bucket": "b"}}, + }, + } + out := interfaces.ResourceOutput{ + Outputs: map[string]any{"bucket": "b"}, // Read can't re-emit secret_key + Sensitive: map[string]bool{"secret_key": true}, + } + rs := interfaces.ResourceState{Name: "myres"} + _, err := persistResourceWithSecretRouting(context.Background(), store, prov, &stubSensitiveDriver{}, rs, out, persistModeRead) + if err != nil { + t.Fatalf("persist read-mode: %v", err) + } + if len(prov.values) != 0 { + t.Errorf("Read mode must NOT call provider.Set; got %v", prov.values) + } + if len(store.saved) != 2 { + t.Fatalf("expected 2 saves (initial + this), got %d", len(store.saved)) + } + latest := store.saved[1] + if latest.Outputs["secret_key"] != "secret_ref://myres_secret_key" { + t.Errorf("Read mode lost prior placeholder: %v", latest.Outputs["secret_key"]) + } + if latest.Outputs["bucket"] != "b" { + t.Errorf("Read mode lost bucket: %v", latest.Outputs["bucket"]) + } +} + +func TestPersistResourceWithSecretRouting_ReadModeNewSensitiveKey_Dropped(t *testing.T) { + prov := newEnvTestProvider() + store := &stubInfraStore{} + out := interfaces.ResourceOutput{ + Outputs: map[string]any{"secret_key": "FRESH-FROM-CLOUD-CACHE", "bucket": "b"}, + Sensitive: map[string]bool{"secret_key": true}, + } + rs := interfaces.ResourceState{Name: "myres"} + _, err := persistResourceWithSecretRouting(context.Background(), store, prov, &stubSensitiveDriver{}, rs, out, persistModeRead) + if err != nil { + t.Fatalf("persist read-mode: %v", err) + } + if len(prov.values) != 0 { + t.Errorf("Read mode must NOT call provider.Set even for newly-declared sensitive; got %v", prov.values) + } + if len(store.saved) != 1 { + t.Fatalf("expected 1 save, got %d", len(store.saved)) + } + if _, ok := store.saved[0].Outputs["secret_key"]; ok { + t.Errorf("newly-declared sensitive (no prior placeholder) should be dropped; got %v", store.saved[0].Outputs["secret_key"]) + } + if store.saved[0].Outputs["bucket"] != "b" { + t.Errorf("non-sensitive bucket lost: %v", store.saved[0].Outputs["bucket"]) + } +} diff --git a/cmd/wfctl/infra_apply_test.go b/cmd/wfctl/infra_apply_test.go index d5156c17..fc71e398 100644 --- a/cmd/wfctl/infra_apply_test.go +++ b/cmd/wfctl/infra_apply_test.go @@ -202,7 +202,7 @@ modules: } defer func() { resolveIaCProvider = orig }() - if err := applyInfraModules(context.Background(), cfgPath, ""); err != nil { + if _, err := applyInfraModules(context.Background(), cfgPath, ""); err != nil { t.Fatalf("applyInfraModules: %v", err) } @@ -309,7 +309,7 @@ modules: } t.Cleanup(func() { resolveIaCProvider = orig }) - if err := applyInfraModules(context.Background(), cfgPath, ""); err != nil { + if _, err := applyInfraModules(context.Background(), cfgPath, ""); err != nil { t.Fatalf("applyInfraModules: %v", err) } @@ -401,7 +401,7 @@ modules: } t.Cleanup(func() { resolveIaCProvider = orig }) - if err := applyInfraModules(context.Background(), cfgPath, ""); err != nil { + if _, err := applyInfraModules(context.Background(), cfgPath, ""); err != nil { t.Fatalf("applyInfraModules: %v", err) } @@ -446,7 +446,7 @@ modules: t.Fatalf("write config: %v", err) } - err := applyInfraModules(context.Background(), cfgPath, "") + _, err := applyInfraModules(context.Background(), cfgPath, "") if err == nil { t.Fatal("expected duplicate resource name error, got nil") } @@ -510,7 +510,7 @@ modules: } t.Cleanup(func() { resolveIaCProvider = orig }) - if err := applyInfraModules(context.Background(), cfgPath, ""); err != nil { + if _, err := applyInfraModules(context.Background(), cfgPath, ""); err != nil { t.Fatalf("applyInfraModules: %v", err) } @@ -662,7 +662,7 @@ func TestApplyWithProvider_NoChanges(t *testing.T) { }} fake := &applyCapture{} - if err := applyWithProviderAndStore(context.Background(), fake, "fake-cloud", []interfaces.ResourceSpec{spec}, current, nil, io.Discard, ""); err != nil { + if err := applyWithProviderAndStore(context.Background(), fake, "fake-cloud", []interfaces.ResourceSpec{spec}, current, nil, io.Discard, "", "", nil); err != nil { t.Fatalf("applyWithProviderAndStore: %v", err) } @@ -704,7 +704,7 @@ func TestApplyWithProvider_AdoptsExistingDNSBeforeComputePlan(t *testing.T) { provider := &readBackedProvider{driver: driver} store := &fakeStateStore{} - if err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, ""); err != nil { + if err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "", "", nil); err != nil { t.Fatalf("applyWithProviderAndStore: %v", err) } @@ -794,7 +794,7 @@ func TestApplyWithProvider_DNSAdoptionFailedUpdateKeepsLiveAppliedConfig(t *test } store := &fakeStateStore{} - err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "") + err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "", "", nil) if err == nil { t.Fatal("expected apply failure, got nil") } @@ -833,7 +833,7 @@ func TestApplyWithProvider_DNSAdoptionFallsBackToNameWhenDomainOmitted(t *testin } provider := &readBackedProvider{driver: driver} - if err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, &fakeStateStore{}, io.Discard, ""); err != nil { + if err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, &fakeStateStore{}, io.Discard, "", "", nil); err != nil { t.Fatalf("applyWithProviderAndStore: %v", err) } if len(driver.reads) != 1 { @@ -862,7 +862,7 @@ func TestApplyWithProvider_DNSAdoptionSaveFailureFailsBeforeApply(t *testing.T) provider := &readBackedProvider{driver: driver} store := &fakeStateStore{saveErr: errors.New("disk full")} - err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "") + err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "", "", nil) if err == nil { t.Fatal("expected adoption save failure, got nil") } @@ -894,7 +894,7 @@ func TestApplyWithProvider_DNSAdoptionRequiresWritableStateStore(t *testing.T) { } provider := &readBackedProvider{driver: driver} - err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, nil, io.Discard, "") + err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, nil, io.Discard, "", "", nil) if err == nil { t.Fatal("expected adoption to fail with noop state store") } @@ -927,7 +927,7 @@ func TestApplyWithProvider_AdoptsResourceThroughDriverLocator(t *testing.T) { provider := &readBackedProvider{driver: driver} store := &fakeStateStore{} - err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "") + err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "", "", nil) if err != nil { t.Fatalf("applyWithProviderAndStore: %v", err) } @@ -963,7 +963,7 @@ func TestApplyWithProvider_SkipsAdoptionWhenAppDriverHasNoLocator(t *testing.T) } provider := &noDriverApplyProvider{} - err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, &fakeStateStore{}, io.Discard, "") + err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, &fakeStateStore{}, io.Discard, "", "", nil) if err != nil { t.Fatalf("applyWithProviderAndStore: %v", err) } @@ -996,7 +996,7 @@ func TestApplyWithProvider_DNSAdoptionRejectsMalformedProviderID(t *testing.T) { provider := &readBackedProvider{driver: driver} store := &fakeStateStore{} - err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "") + err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "", "", nil) if err == nil { t.Fatal("expected malformed ProviderID error") } @@ -1026,7 +1026,7 @@ func TestApplyWithProvider_DNSReadNotFoundKeepsCreateBehavior(t *testing.T) { provider := &readBackedProvider{driver: driver} store := &fakeStateStore{} - if err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, ""); err != nil { + if err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "", "", nil); err != nil { t.Fatalf("applyWithProviderAndStore: %v", err) } @@ -1055,7 +1055,7 @@ func TestApplyWithProvider_DNSReadErrorFailsBeforeApply(t *testing.T) { driver := &readDriver{readErr: errors.New("provider API unavailable")} provider := &readBackedProvider{driver: driver} - err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, nil, io.Discard, "") + err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, nil, io.Discard, "", "", nil) if err == nil { t.Fatal("expected read error, got nil") } @@ -1078,7 +1078,7 @@ func TestApplyWithProvider_DNSReadNilLiveOutputFailsBeforeApply(t *testing.T) { driver := &readDriver{} provider := &readBackedProvider{driver: driver} - err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, &fakeStateStore{}, io.Discard, "") + err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, &fakeStateStore{}, io.Discard, "", "", nil) if err == nil { t.Fatal("expected nil live output adoption error") } @@ -1108,7 +1108,7 @@ func TestApplyWithProvider_DNSReadEmptyProviderIDFailsBeforeApply(t *testing.T) provider := &readBackedProvider{driver: driver} store := &fakeStateStore{} - err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "") + err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "", "", nil) if err == nil { t.Fatal("expected error when adopted live output has empty ProviderID") } @@ -1146,7 +1146,7 @@ func TestApplyWithProvider_DeletesRemovedResource(t *testing.T) { fake := &applyCapture{} store := &fakeStateStore{} - if err := applyWithProviderAndStore(context.Background(), fake, "fake-cloud", specs, current, store, io.Discard, ""); err != nil { + if err := applyWithProviderAndStore(context.Background(), fake, "fake-cloud", specs, current, store, io.Discard, "", "", nil); err != nil { t.Fatalf("applyWithProviderAndStore: %v", err) } @@ -1246,7 +1246,7 @@ func TestApplyWithProvider_SavesStateForSuccessfulResources(t *testing.T) { } store := &fakeStateStore{} - if err := applyWithProviderAndStore(t.Context(), fake, "fake-cloud", specs, nil, store, io.Discard, ""); err != nil { + if err := applyWithProviderAndStore(t.Context(), fake, "fake-cloud", specs, nil, store, io.Discard, "", "", nil); err != nil { t.Fatalf("apply: %v", err) } @@ -1302,7 +1302,7 @@ func TestApplyWithProvider_SavesStateOnPartialFailure(t *testing.T) { } store := &fakeStateStore{} - err := applyWithProviderAndStore(t.Context(), fake, "fake-cloud", specs, nil, store, io.Discard, "") + err := applyWithProviderAndStore(t.Context(), fake, "fake-cloud", specs, nil, store, io.Discard, "", "", nil) if err == nil { t.Fatal("expected error on partial failure, got nil") } @@ -1328,7 +1328,7 @@ func TestApplyWithProvider_StoreSaveFailureFails(t *testing.T) { } store := &fakeStateStore{saveErr: fmt.Errorf("disk full")} - err := applyWithProviderAndStore(t.Context(), fake, "fake-cloud", specs, nil, store, io.Discard, "") + err := applyWithProviderAndStore(t.Context(), fake, "fake-cloud", specs, nil, store, io.Discard, "", "", nil) if err == nil { t.Fatal("expected save failure after apply, got nil") } @@ -1362,7 +1362,7 @@ modules: t.Fatalf("write config: %v", err) } - err := applyInfraModules(context.Background(), cfgPath, "staging") + _, err := applyInfraModules(context.Background(), cfgPath, "staging") if err == nil { t.Fatal("expected error for disabled provider, got nil") } @@ -1388,7 +1388,7 @@ modules: t.Fatalf("write config: %v", err) } - err := applyInfraModules(context.Background(), cfgPath, "") + _, err := applyInfraModules(context.Background(), cfgPath, "") if err == nil { t.Fatal("expected error for missing provider, got nil") } @@ -1448,7 +1448,7 @@ func TestApplyInfraModules_CallsResolveSizing_ForEachSpec(t *testing.T) { }, } - if err := applyWithProviderAndStore(t.Context(), fake, "fake-cloud", specs, nil, nil, io.Discard, ""); err != nil { + if err := applyWithProviderAndStore(t.Context(), fake, "fake-cloud", specs, nil, nil, io.Discard, "", "", nil); err != nil { t.Fatalf("applyWithProviderAndStore: %v", err) } @@ -1571,7 +1571,7 @@ modules: _ = r.Close() }) - err := applyInfraModules(context.Background(), cfgPath, "") + _, err := applyInfraModules(context.Background(), cfgPath, "") w.Close() os.Stderr = oldStderr @@ -1612,7 +1612,7 @@ func TestApply_StateRecordsAppliedConfigSourceApply(t *testing.T) { } store := &fakeStateStore{} - if err := applyWithProviderAndStore(t.Context(), fake, "test", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, ""); err != nil { + if err := applyWithProviderAndStore(t.Context(), fake, "test", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "", "", nil); err != nil { t.Fatalf("apply: %v", err) } @@ -1647,7 +1647,7 @@ func TestAdoption_StateRecordsAppliedConfigSourceAdoption(t *testing.T) { provider := &readBackedProvider{driver: driver} store := &fakeStateStore{} - if err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, ""); err != nil { + if err := applyWithProviderAndStore(t.Context(), provider, "digitalocean", []interfaces.ResourceSpec{spec}, nil, store, io.Discard, "", "", nil); err != nil { t.Fatalf("adopt: %v", err) } diff --git a/cmd/wfctl/infra_apply_troubleshoot_test.go b/cmd/wfctl/infra_apply_troubleshoot_test.go index 2bb3407f..228c368b 100644 --- a/cmd/wfctl/infra_apply_troubleshoot_test.go +++ b/cmd/wfctl/infra_apply_troubleshoot_test.go @@ -83,7 +83,7 @@ func TestInfraApply_EmitsDiagnosticsOnFailure(t *testing.T) { // spec.Type must be non-empty so ref.Type is set and ResourceDriver is called. specs := []interfaces.ResourceSpec{{Name: "bmw-staging", Type: "app_platform"}} var diagBuf bytes.Buffer - err := applyWithProviderAndStore(context.Background(), provider, "digitalocean", specs, nil, nil, &diagBuf, "") + err := applyWithProviderAndStore(context.Background(), provider, "digitalocean", specs, nil, nil, &diagBuf, "", "", nil) if err == nil { t.Fatal("expected error from failing apply") } @@ -107,7 +107,7 @@ func TestInfraApply_NonTroubleshooterNocrash(t *testing.T) { provider := &plainFailProvider{applyErr: errors.New("boom")} var diagBuf bytes.Buffer specs := []interfaces.ResourceSpec{{Name: "x", Type: "app_platform"}} - err := applyWithProviderAndStore(context.Background(), provider, "digitalocean", specs, nil, nil, &diagBuf, "") + err := applyWithProviderAndStore(context.Background(), provider, "digitalocean", specs, nil, nil, &diagBuf, "", "", nil) if err == nil { t.Fatal("expected error") } @@ -135,7 +135,7 @@ func TestInfraApply_WritesStepSummaryOnFailure(t *testing.T) { specs := []interfaces.ResourceSpec{{Name: "bmw-staging", Type: "app_platform"}} var diagBuf bytes.Buffer - _ = applyWithProviderAndStore(context.Background(), provider, "digitalocean", specs, nil, nil, &diagBuf, "staging") + _ = applyWithProviderAndStore(context.Background(), provider, "digitalocean", specs, nil, nil, &diagBuf, "staging", "", nil) data, err := os.ReadFile(summaryPath) if err != nil { diff --git a/cmd/wfctl/infra_apply_v2_loader_test.go b/cmd/wfctl/infra_apply_v2_loader_test.go index b09985b4..22769963 100644 --- a/cmd/wfctl/infra_apply_v2_loader_test.go +++ b/cmd/wfctl/infra_apply_v2_loader_test.go @@ -119,7 +119,7 @@ modules: // observe NeedsReplace=true, emit "replace" action. // 4. wfctlhelpers.ApplyPlan decomposes replace into // driver.Delete + driver.Create, populating ReplaceIDMap. - if err := applyInfraModules(t.Context(), cfgPath, ""); err != nil { + if _, err := applyInfraModules(t.Context(), cfgPath, ""); err != nil { t.Fatalf("applyInfraModules: %v", err) } @@ -213,7 +213,8 @@ modules: // Reuses the package-level captureStderr helper from // infra_outputs_test.go (panic-safe stderr capture). out, fnErr := captureStderr(t, func() error { - return applyInfraModules(t.Context(), cfgPath, "") + _, err := applyInfraModules(t.Context(), cfgPath, "") + return err }) if fnErr != nil { t.Fatalf("applyInfraModules: %v", fnErr) diff --git a/cmd/wfctl/infra_apply_v2_test.go b/cmd/wfctl/infra_apply_v2_test.go index 3aea251b..914c43ca 100644 --- a/cmd/wfctl/infra_apply_v2_test.go +++ b/cmd/wfctl/infra_apply_v2_test.go @@ -42,7 +42,7 @@ func TestApplyWithProviderAndStore_PassesLiveProviderToComputePlan(t *testing.T) } var w bytes.Buffer - err := applyWithProviderAndStore(context.Background(), fake, "stub", specs, nil, nil, &w, "test") + err := applyWithProviderAndStore(context.Background(), fake, "stub", specs, nil, nil, &w, "test", "", nil) if !errors.Is(err, stop) { t.Fatalf("expected sentinel error %v, got %v", stop, err) } @@ -98,7 +98,7 @@ func TestApplyWithProviderAndStore_V2RoutesThroughWfctlhelpers(t *testing.T) { } var w bytes.Buffer - err := applyWithProviderAndStore(context.Background(), v2Provider, "stub", specs, nil, nil, &w, "test") + err := applyWithProviderAndStore(context.Background(), v2Provider, "stub", specs, nil, nil, &w, "test", "", nil) if !errors.Is(err, v2Stop) { t.Fatalf("expected v2 sentinel error %v, got %v", v2Stop, err) } @@ -135,7 +135,7 @@ func TestApplyWithProviderAndStore_V1FallsThroughToProviderApply(t *testing.T) { specs := []interfaces.ResourceSpec{{Name: "vpc", Type: "infra.vpc"}} var w bytes.Buffer - if err := applyWithProviderAndStore(context.Background(), v1Provider, "stub", specs, nil, nil, &w, "test"); err != nil { + if err := applyWithProviderAndStore(context.Background(), v1Provider, "stub", specs, nil, nil, &w, "test", "", nil); err != nil { t.Fatalf("applyWithProviderAndStore: %v", err) } if !v1Provider.applyCalled.Load() { @@ -180,7 +180,7 @@ func TestApplyWithProviderAndStore_V2PrintsDriftReport(t *testing.T) { specs := []interfaces.ResourceSpec{{Name: "vpc", Type: "infra.vpc"}} var w bytes.Buffer - if err := applyWithProviderAndStore(context.Background(), v2Provider, "stub", specs, nil, nil, &w, "test"); err != nil { + if err := applyWithProviderAndStore(context.Background(), v2Provider, "stub", specs, nil, nil, &w, "test", "", nil); err != nil { t.Fatalf("applyWithProviderAndStore: %v", err) } if !strings.Contains(w.String(), "BUCKET_NAME") { @@ -225,7 +225,7 @@ func TestApplyWithProviderAndStore_V2PrintsDriftReportOnPartialFailure(t *testin specs := []interfaces.ResourceSpec{{Name: "vpc", Type: "infra.vpc"}} var w bytes.Buffer - err := applyWithProviderAndStore(context.Background(), v2Provider, "stub", specs, nil, nil, &w, "test") + err := applyWithProviderAndStore(context.Background(), v2Provider, "stub", specs, nil, nil, &w, "test", "", nil) if err == nil { t.Fatal("expected error from ApplyPlan to propagate, got nil") } @@ -270,7 +270,7 @@ func TestApplyWithProviderAndStore_V1Path_DeclarerReturnsEmpty(t *testing.T) { specs := []interfaces.ResourceSpec{{Name: "vpc", Type: "infra.vpc"}} var w bytes.Buffer - if err := applyWithProviderAndStore(context.Background(), v1Provider, "stub", specs, nil, nil, &w, "test"); err != nil { + if err := applyWithProviderAndStore(context.Background(), v1Provider, "stub", specs, nil, nil, &w, "test", "", nil); err != nil { t.Fatalf("applyWithProviderAndStore: %v", err) } if v2Called.Load() { diff --git a/cmd/wfctl/infra_audit_state_secrets.go b/cmd/wfctl/infra_audit_state_secrets.go new file mode 100644 index 00000000..7d1b9da5 --- /dev/null +++ b/cmd/wfctl/infra_audit_state_secrets.go @@ -0,0 +1,249 @@ +package main + +import ( + "context" + "errors" + "flag" + "fmt" + "io" + "sort" + "strings" + + "github.com/GoCodeAlone/workflow/iac/sensitive" + "github.com/GoCodeAlone/workflow/secrets" +) + +// runInfraAuditStateSecrets is the CLI entry point for +// `wfctl infra audit-state-secrets`. +// +// Walks every entry in IaCStateStore. For each Outputs[k] that is: +// - a "secret_ref://" placeholder → confirm secrets.Provider has . +// - a plaintext value matching secrets.DefaultSensitiveKeys() → flag legacy. +// - a "secret://" string → flag mistaken config-reference in state. +// +// Then walks secrets.Provider.List() (when supported) for any +// "_" name whose is NOT in IaCStateStore → +// orphan, candidate for prune. +// +// Exit codes: +// +// 0 no findings +// 1 findings (legacy plaintext, missing routed values, orphan secrets, +// mistaken config-references) +// 2 audit error (cannot read state, parse error, etc.) +func runInfraAuditStateSecrets(args []string, w io.Writer) int { + fs := flag.NewFlagSet("infra audit-state-secrets", flag.ContinueOnError) + fs.SetOutput(w) + var configFile string + fs.StringVar(&configFile, "c", "infra.yaml", "Config file") + fs.StringVar(&configFile, "config", "infra.yaml", "Config file") + var prune bool + fs.BoolVar(&prune, "prune", false, "Delete confirmed orphan secrets from secrets.Provider") + if err := fs.Parse(args); err != nil { + return 2 + } + + cfg, err := parseSecretsConfig(configFile) + if err != nil { + fmt.Fprintf(w, "audit-state-secrets: parse %q: %v\n", configFile, err) + return 2 + } + if cfg == nil { + fmt.Fprintln(w, "audit-state-secrets: no secrets config; nothing to audit") + return 0 + } + prov, err := resolveSecretsProvider(cfg) + if err != nil { + fmt.Fprintf(w, "audit-state-secrets: resolve provider: %v\n", err) + return 2 + } + + store, err := resolveStateStore(configFile, "") + if err != nil { + fmt.Fprintf(w, "audit-state-secrets: open state store: %v\n", err) + return 2 + } + // noop store contributes no state; allow read-only audit to proceed + // (provider may still surface orphan findings) but REFUSE --prune: + // without an authoritative iac.state backend, every listed secret + // would appear orphan and be eligible for deletion. That risks + // catastrophic data loss when iac.state config is missing/incorrect. + if prune && isNoopStateStore(store) { + fmt.Fprintln(w, "audit-state-secrets: --prune requires an iac.state backend; "+ + "no orphans can be authoritatively determined without one. "+ + "Configure an iac.state module (filesystem, spaces, or postgres) and rerun.") + return 2 + } + + return runAuditStateSecretsWithPrune(context.Background(), w, store, prov, prune) +} + +// runAuditStateSecrets is the testable entry point (no flag parsing). +func runAuditStateSecrets(ctx context.Context, w io.Writer, store infraStateStore, prov secrets.Provider) int { + return runAuditStateSecretsWithPrune(ctx, w, store, prov, false) +} + +// runAuditStateSecretsWithPrune is the testable entry point with --prune +// behaviour parameterised. Returns exit-code per the contract above. +func runAuditStateSecretsWithPrune(ctx context.Context, w io.Writer, store infraStateStore, prov secrets.Provider, prune bool) int { + states, err := store.ListResources(ctx) + if err != nil { + fmt.Fprintf(w, "audit-state-secrets: list state: %v\n", err) + return 2 + } + + findings := 0 + stateNames := map[string]struct{}{} + for i := range states { + stateNames[states[i].Name] = struct{}{} + } + + defaultSensitive := map[string]struct{}{} + for _, k := range secrets.DefaultSensitiveKeys() { + defaultSensitive[k] = struct{}{} + } + + // Collect the union of sensitive-output key names actually observed in + // state records (any Outputs[k] whose value is a secret_ref:// + // placeholder). Drivers may declare sensitive keys that aren't in + // secrets.DefaultSensitiveKeys() — without harvesting these, orphan + // detection's stripKnownSensitiveSuffix would fail to recover the + // resource-name prefix and misflag valid routed secrets as orphans + // (and --prune would delete them). Defaults remain in the suffix set + // as a fallback for first-run scenarios where state hasn't recorded + // any placeholders yet. + observedSensitive := map[string]struct{}{} + for k := range defaultSensitive { + observedSensitive[k] = struct{}{} + } + for i := range states { + for k, v := range states[i].Outputs { + if sensitive.IsPlaceholder(v) { + observedSensitive[k] = struct{}{} + } + } + } + + // Walk state for placeholder/plaintext/config-ref findings. + // Sort states by name for stable output. + sort.SliceStable(states, func(i, j int) bool { return states[i].Name < states[j].Name }) + + for i := range states { + st := &states[i] + keys := make([]string, 0, len(st.Outputs)) + for k := range st.Outputs { + keys = append(keys, k) + } + sort.Strings(keys) + + for _, k := range keys { + v := st.Outputs[k] + s, isStr := v.(string) + if !isStr { + continue + } + switch { + case sensitive.IsPlaceholder(v): + secretName := strings.TrimPrefix(s, sensitive.PlaceholderPrefix) + _, getErr := prov.Get(ctx, secretName) + switch { + case getErr == nil: + continue + case errors.Is(getErr, secrets.ErrUnsupported): + fmt.Fprintf(w, "ADVISORY (Get unsupported): cannot verify routed value for %s/%s -> %q on this provider\n", st.Name, k, secretName) + continue + case errors.Is(getErr, secrets.ErrNotFound): + fmt.Fprintf(w, "FINDING (missing routed value): %s/%s expects routed secret %q but provider does not have it\n", st.Name, k, secretName) + findings++ + default: + // Provider faulted (network, auth, transient). Don't + // silently swallow — surface as a finding so the audit + // exits non-zero and operator gets actionable signal, + // rather than a false "no findings". + fmt.Fprintf(w, "FINDING (provider error): %s/%s -> %q: %v\n", st.Name, k, secretName, getErr) + findings++ + } + case strings.HasPrefix(s, secrets.SecretPrefix): + fmt.Fprintf(w, "FINDING (config-reference in state): %s/%s contains user-config-style %q (expected resolved value or %s placeholder)\n", st.Name, k, s, sensitive.PlaceholderPrefix) + findings++ + default: + if _, isSensName := defaultSensitive[k]; isSensName && s != "" { + fmt.Fprintf(w, "FINDING (legacy plaintext): %s/%s = ; rotate via wfctl infra bootstrap --force-rotate or re-apply\n", st.Name, k) + findings++ + } + } + } + } + + // Walk provider for orphan secrets. + names, err := prov.List(ctx) + switch { + case err == nil: + sort.Strings(names) + for _, name := range names { + res := stripSensitiveSuffix(name, observedSensitive) + if _, ok := stateNames[res]; ok { + continue + } + if prune { + if delErr := prov.Delete(ctx, name); delErr != nil { + fmt.Fprintf(w, "PRUNE FAILED: %q: %v\n", name, delErr) + findings++ + } else { + fmt.Fprintf(w, "pruned orphan secret %q\n", name) + } + continue + } + fmt.Fprintf(w, "FINDING (orphan secret): %q has no matching state resource; rerun with --prune to delete\n", name) + findings++ + } + case errors.Is(err, secrets.ErrUnsupported): + fmt.Fprintln(w, "ADVISORY (list unsupported): provider does not support List(); orphan-secret detection skipped on this host") + default: + fmt.Fprintf(w, "audit-state-secrets: list provider secrets: %v\n", err) + return 2 + } + + if findings > 0 { + fmt.Fprintf(w, "\naudit-state-secrets: %d finding(s)\n", findings) + return 1 + } + fmt.Fprintln(w, "audit-state-secrets: no findings") + return 0 +} + +// stripSensitiveSuffix returns the resource-name prefix of a routed-secret +// name. The caller supplies the suffix universe (typically the union of +// secrets.DefaultSensitiveKeys() and any sensitive keys actually observed +// in state-record placeholders, so driver-declared non-default sensitive +// keys are not misclassified as orphans). +// +// When the suffix set is nil/empty, falls back to defaults so the helper +// remains safe for ad-hoc callers. +// +// Prefers the longest matching suffix to avoid mis-stripping nested +// patterns (e.g. a name ending in "_admin_password" should strip the +// 17-char "_admin_password" not the 9-char "_password" when both are +// declared sensitive). +func stripSensitiveSuffix(name string, suffixes map[string]struct{}) string { + if len(suffixes) == 0 { + for _, k := range secrets.DefaultSensitiveKeys() { + suf := "_" + k + if strings.HasSuffix(name, suf) { + return name[:len(name)-len(suf)] + } + } + return name + } + bestLen := 0 + for k := range suffixes { + suf := "_" + k + if strings.HasSuffix(name, suf) && len(suf) > bestLen { + bestLen = len(suf) + } + } + if bestLen > 0 { + return name[:len(name)-bestLen] + } + return name +} diff --git a/cmd/wfctl/infra_audit_state_secrets_test.go b/cmd/wfctl/infra_audit_state_secrets_test.go new file mode 100644 index 00000000..3afa34d5 --- /dev/null +++ b/cmd/wfctl/infra_audit_state_secrets_test.go @@ -0,0 +1,161 @@ +package main + +import ( + "bytes" + "context" + "strings" + "testing" + + "github.com/GoCodeAlone/workflow/interfaces" + "github.com/GoCodeAlone/workflow/secrets" +) + +// writeOnlyAuditProvider mimics GitHubSecretsProvider Get/List ErrUnsupported. +type writeOnlyAuditProvider struct { + envTestProvider // embed for Set/Delete pass-through; values map shared +} + +func (p *writeOnlyAuditProvider) Get(_ context.Context, _ string) (string, error) { + return "", secrets.ErrUnsupported +} +func (p *writeOnlyAuditProvider) List(_ context.Context) ([]string, error) { + return nil, secrets.ErrUnsupported +} + +func TestAuditStateSecrets_NoFindings_ExitZero(t *testing.T) { + store := &stubInfraStore{ + saved: []interfaces.ResourceState{ + {Name: "ok", Outputs: map[string]any{"bucket": "b", "region": "nyc3"}}, + }, + } + prov := newEnvTestProvider() + w := &bytes.Buffer{} + rc := runAuditStateSecrets(context.Background(), w, store, prov) + if rc != 0 { + t.Errorf("rc = %d, want 0; output:\n%s", rc, w.String()) + } + if !strings.Contains(w.String(), "no findings") { + t.Errorf("expected 'no findings', got: %s", w.String()) + } +} + +func TestAuditStateSecrets_OrphanInProvider(t *testing.T) { + // Provider has a routed-secret "ghost_secret_key" but state has no ghost resource. + store := &stubInfraStore{ + saved: []interfaces.ResourceState{{Name: "live", Outputs: map[string]any{"bucket": "b"}}}, + } + prov := newEnvTestProvider() + prov.values["ghost_secret_key"] = "ORPHAN" + w := &bytes.Buffer{} + rc := runAuditStateSecrets(context.Background(), w, store, prov) + if rc != 1 { + t.Errorf("rc = %d, want 1", rc) + } + if !strings.Contains(w.String(), "orphan") || !strings.Contains(w.String(), "ghost_secret_key") { + t.Errorf("expected orphan finding for ghost_secret_key; got:\n%s", w.String()) + } +} + +func TestAuditStateSecrets_LegacyPlaintext(t *testing.T) { + store := &stubInfraStore{ + saved: []interfaces.ResourceState{ + {Name: "legacy", Outputs: map[string]any{"secret_key": "PLAINTEXT-SECRET"}}, + }, + } + prov := newEnvTestProvider() + w := &bytes.Buffer{} + rc := runAuditStateSecrets(context.Background(), w, store, prov) + if rc != 1 { + t.Errorf("rc = %d, want 1", rc) + } + if !strings.Contains(w.String(), "legacy plaintext") || !strings.Contains(w.String(), "legacy") { + t.Errorf("expected legacy plaintext finding; got:\n%s", w.String()) + } +} + +func TestAuditStateSecrets_PlaceholderMissingValue(t *testing.T) { + // State has a placeholder but provider doesn't have the secret. + store := &stubInfraStore{ + saved: []interfaces.ResourceState{ + {Name: "broken", Outputs: map[string]any{"secret_key": "secret_ref://broken_secret_key"}}, + }, + } + prov := newEnvTestProvider() + w := &bytes.Buffer{} + rc := runAuditStateSecrets(context.Background(), w, store, prov) + if rc != 1 { + t.Errorf("rc = %d, want 1", rc) + } + if !strings.Contains(w.String(), "missing routed value") || !strings.Contains(w.String(), "broken_secret_key") { + t.Errorf("expected missing-routed-value finding; got:\n%s", w.String()) + } +} + +func TestAuditStateSecrets_MistakenSecretConfigRefInState(t *testing.T) { + // State contains a "secret://" string (user-config syntax leaked into state). + store := &stubInfraStore{ + saved: []interfaces.ResourceState{ + {Name: "weird", Outputs: map[string]any{"token": "secret://my_token"}}, + }, + } + prov := newEnvTestProvider() + w := &bytes.Buffer{} + rc := runAuditStateSecrets(context.Background(), w, store, prov) + if rc != 1 { + t.Errorf("rc = %d, want 1", rc) + } + if !strings.Contains(w.String(), "config-reference in state") { + t.Errorf("expected config-reference-in-state finding; got:\n%s", w.String()) + } +} + +func TestAuditStateSecrets_Prune_DeletesOrphans(t *testing.T) { + store := &stubInfraStore{ + saved: []interfaces.ResourceState{{Name: "live", Outputs: map[string]any{"bucket": "b"}}}, + } + prov := newEnvTestProvider() + prov.values["ghost_secret_key"] = "ORPHAN" + w := &bytes.Buffer{} + rc := runAuditStateSecretsWithPrune(context.Background(), w, store, prov, true) + if rc != 0 { + t.Errorf("rc = %d, want 0 after prune; output:\n%s", rc, w.String()) + } + if _, ok := prov.values["ghost_secret_key"]; ok { + t.Errorf("orphan secret not pruned: %v", prov.values) + } + if !strings.Contains(w.String(), "pruned") { + t.Errorf("expected 'pruned' line in output, got: %s", w.String()) + } +} + +func TestAuditStateSecrets_ListUnsupported_ReportsAdvisory(t *testing.T) { + store := &stubInfraStore{ + saved: []interfaces.ResourceState{ + {Name: "ok", Outputs: map[string]any{"secret_key": "secret_ref://ok_secret_key"}}, + }, + } + prov := &writeOnlyAuditProvider{envTestProvider: envTestProvider{values: map[string]string{}}} + w := &bytes.Buffer{} + rc := runAuditStateSecrets(context.Background(), w, store, prov) + if rc == 2 { + t.Errorf("rc=2 reserved for hard audit errors; should not fire on write-only providers; output:\n%s", w.String()) + } + if !strings.Contains(w.String(), "unsupported") { + t.Errorf("expected write-only provider advisory; got:\n%s", w.String()) + } +} + +func TestAuditStateSecrets_ValidPlaceholderWithMatchingProvider_NoFinding(t *testing.T) { + store := &stubInfraStore{ + saved: []interfaces.ResourceState{ + {Name: "ok", Outputs: map[string]any{"secret_key": "secret_ref://ok_secret_key"}}, + }, + } + prov := newEnvTestProvider() + prov.values["ok_secret_key"] = "VALID" + w := &bytes.Buffer{} + rc := runAuditStateSecrets(context.Background(), w, store, prov) + if rc != 0 { + t.Errorf("rc = %d, want 0 (placeholder + matching provider value); output:\n%s", rc, w.String()) + } +} diff --git a/cmd/wfctl/infra_output_secrets.go b/cmd/wfctl/infra_output_secrets.go index 80697c74..8e1ab65f 100644 --- a/cmd/wfctl/infra_output_secrets.go +++ b/cmd/wfctl/infra_output_secrets.go @@ -8,6 +8,7 @@ import ( "strings" "github.com/GoCodeAlone/workflow/config" + "github.com/GoCodeAlone/workflow/iac/sensitive" "github.com/GoCodeAlone/workflow/interfaces" "github.com/GoCodeAlone/workflow/secrets" ) @@ -34,7 +35,14 @@ func buildStateOutputsMap(states []interfaces.ResourceState) map[string]map[stri // wfCfg may be nil (e.g. tests that only care about base-name resolution). // When envName is empty no resolution is performed and the source module name // is used verbatim. -func resolveInfraOutput(wfCfg *config.WorkflowConfig, source, envName string, stateOutputs map[string]map[string]any) (string, error) { +// +// hydrated is the in-memory routed-secret map from the same-process apply +// (may be nil). When the state field is a sensitive.PlaceholderPrefix +// string, resolveInfraOutput rehydrates from hydrated FIRST; if the +// secret is not in hydrated, it returns a documented error explaining +// the cold-start constraint (write-only providers cannot rehydrate +// without same-process hand-off). +func resolveInfraOutput(wfCfg *config.WorkflowConfig, source, envName string, stateOutputs map[string]map[string]any, hydrated map[string]string) (string, error) { if source == "" { return "", fmt.Errorf("infra_output: source is required (format: \"module.field\")") } @@ -74,6 +82,18 @@ func resolveInfraOutput(wfCfg *config.WorkflowConfig, source, envName string, st if !ok { return "", fmt.Errorf("infra_output: field %q not found in outputs of module %q", field, moduleName) } + // If state has a routed-secret placeholder, prefer the hydrated map. + if sensitive.IsPlaceholder(val) { + secretName := strings.TrimPrefix(val.(string), sensitive.PlaceholderPrefix) + if hv, hok := hydrated[secretName]; hok { + return hv, nil + } + return "", fmt.Errorf( + "infra_output: field %q of module %q is a routed-secret placeholder %q; not in same-process hydrated map "+ + "(write-only providers like GitHub Actions cannot rehydrate cold; rerun apply or reference the secret directly via secret://%s)", + field, moduleName, val, secretName, + ) + } s, ok := val.(string) if !ok { return "", fmt.Errorf("infra_output: output field %q of module %q is %T, expected string", field, moduleName, val) @@ -97,7 +117,15 @@ func stateKeys(m map[string]map[string]any) []string { // wfCfg and envName are used to resolve source module names through per-env // overrides so that "bmw-database.uri" finds "bmw-staging-db" in state when // --env staging renames the module. -func syncInfraOutputSecrets(ctx context.Context, secretsCfg *SecretsConfig, provider secrets.Provider, states []interfaces.ResourceState, wfCfg *config.WorkflowConfig, envName string) error { +// +// hydrated is the in-memory map of routed-secret values from the +// just-completed apply (populated by sensitive.Route). When a state output +// field is a sensitive.PlaceholderPrefix string, resolveInfraOutput +// rehydrates from this map so the generator can read the real value +// without going through provider.Get (which is unsupported on write-only +// providers like GitHub Actions). May be nil for callers that don't +// have a same-process apply hand-off (e.g., wfctl infra outputs CLI). +func syncInfraOutputSecrets(ctx context.Context, secretsCfg *SecretsConfig, provider secrets.Provider, states []interfaces.ResourceState, wfCfg *config.WorkflowConfig, envName string, hydrated map[string]string) error { if secretsCfg == nil { return nil } @@ -160,7 +188,7 @@ func syncInfraOutputSecrets(ctx context.Context, secretsCfg *SecretsConfig, prov continue } - value, err := resolveInfraOutput(wfCfg, gen.Source, envName, stateOutputs) + value, err := resolveInfraOutput(wfCfg, gen.Source, envName, stateOutputs, hydrated) if err != nil { return fmt.Errorf("generate infra_output secret %q: %w", gen.Key, err) } diff --git a/cmd/wfctl/infra_output_secrets_test.go b/cmd/wfctl/infra_output_secrets_test.go index da301db3..f6124c91 100644 --- a/cmd/wfctl/infra_output_secrets_test.go +++ b/cmd/wfctl/infra_output_secrets_test.go @@ -109,7 +109,7 @@ func TestBuildStateOutputsMap_Empty(t *testing.T) { func TestSyncInfraOutputSecrets_NilConfig(t *testing.T) { p := newSimpleProvider() - err := syncInfraOutputSecrets(context.Background(), nil, p, sampleStates(), nil, "") + err := syncInfraOutputSecrets(context.Background(), nil, p, sampleStates(), nil, "", nil) if err != nil { t.Fatalf("nil config should be no-op: %v", err) } @@ -125,7 +125,7 @@ func TestSyncInfraOutputSecrets_NoInfraOutputGens(t *testing.T) { {Key: "JWT_SECRET", Type: "random_hex", Length: 32}, }, } - err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "") + err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "", nil) if err != nil { t.Fatalf("no infra_output generators should be no-op: %v", err) } @@ -141,7 +141,7 @@ func TestSyncInfraOutputSecrets_WritesSecret(t *testing.T) { {Key: "STAGING_DATABASE_URL", Type: "infra_output", Source: "bmw-database.uri"}, }, } - err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "") + err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "", nil) if err != nil { t.Fatalf("syncInfraOutputSecrets: %v", err) } @@ -158,7 +158,7 @@ func TestSyncInfraOutputSecrets_WritesMultiple(t *testing.T) { {Key: "REDIS_URL", Type: "infra_output", Source: "bmw-cache.url"}, }, } - err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "") + err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "", nil) if err != nil { t.Fatalf("syncInfraOutputSecrets: %v", err) } @@ -178,7 +178,7 @@ func TestSyncInfraOutputSecrets_SkipsExisting(t *testing.T) { {Key: "DATABASE_URL", Type: "infra_output", Source: "bmw-database.uri"}, }, } - err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "") + err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "", nil) if err != nil { t.Fatalf("syncInfraOutputSecrets: %v", err) } @@ -198,7 +198,7 @@ func TestSyncInfraOutputSecrets_WriteOnlyProviderSkips(t *testing.T) { {Key: "DATABASE_URL", Type: "infra_output", Source: "bmw-database.uri"}, }, } - err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "") + err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "", nil) if err != nil { t.Fatalf("syncInfraOutputSecrets: %v", err) } @@ -217,7 +217,7 @@ func TestSyncInfraOutputSecrets_WriteOnlyProviderWrites(t *testing.T) { {Key: "DATABASE_URL", Type: "infra_output", Source: "bmw-database.uri"}, }, } - err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "") + err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "", nil) if err != nil { t.Fatalf("syncInfraOutputSecrets: %v", err) } @@ -233,7 +233,7 @@ func TestSyncInfraOutputSecrets_MissingModule(t *testing.T) { {Key: "X", Type: "infra_output", Source: "nonexistent.uri"}, }, } - err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "") + err := syncInfraOutputSecrets(context.Background(), cfg, p, sampleStates(), nil, "", nil) if err == nil { t.Fatal("expected error for missing module in state") } @@ -246,7 +246,7 @@ func TestSyncInfraOutputSecrets_EmptyStates(t *testing.T) { {Key: "X", Type: "infra_output", Source: "bmw-database.uri"}, }, } - err := syncInfraOutputSecrets(context.Background(), cfg, p, nil, nil, "") + err := syncInfraOutputSecrets(context.Background(), cfg, p, nil, nil, "", nil) if err == nil { t.Fatal("expected error when state has no matching module") } diff --git a/cmd/wfctl/infra_plan_apply_equivalence_test.go b/cmd/wfctl/infra_plan_apply_equivalence_test.go index ab1e36d1..7330bb0c 100644 --- a/cmd/wfctl/infra_plan_apply_equivalence_test.go +++ b/cmd/wfctl/infra_plan_apply_equivalence_test.go @@ -118,7 +118,7 @@ modules: } t.Cleanup(func() { resolveIaCProvider = orig }) - if err := applyInfraModules(context.Background(), cfgPath, "staging"); err != nil { + if _, err := applyInfraModules(context.Background(), cfgPath, "staging"); err != nil { t.Fatalf("applyInfraModules: %v", err) } diff --git a/cmd/wfctl/infra_provider_dispatch_test.go b/cmd/wfctl/infra_provider_dispatch_test.go index 0593e9c9..d08d72ce 100644 --- a/cmd/wfctl/infra_provider_dispatch_test.go +++ b/cmd/wfctl/infra_provider_dispatch_test.go @@ -268,7 +268,7 @@ modules: return interfaces.IaCPlan{}, nil // empty plan → applyWithProviderAndStore returns immediately } - if err := applyInfraModules(context.Background(), cfgPath, ""); err != nil { + if _, err := applyInfraModules(context.Background(), cfgPath, ""); err != nil { t.Fatalf("applyInfraModules: %v", err) } diff --git a/cmd/wfctl/infra_refresh_outputs.go b/cmd/wfctl/infra_refresh_outputs.go index cb3ad605..c8763240 100644 --- a/cmd/wfctl/infra_refresh_outputs.go +++ b/cmd/wfctl/infra_refresh_outputs.go @@ -241,7 +241,26 @@ func refreshOneProviderGroup( continue } states[idx] = fresh - if err := store.SaveResource(ctx, fresh); err != nil { + // Build a ResourceOutput so persistResourceWithSecretRouting (read mode) + // can sanitize sensitive keys from the refreshed cloud outputs. Use + // driver.SensitiveKeys() as the per-call Sensitive map source for + // refresh: refresh paths don't have a per-call Sensitive declaration + // from a driver method, but SensitiveKeys() is the static driver + // declaration that approximates the same intent. + ro := interfaces.ResourceOutput{ + Name: fresh.Name, Type: fresh.Type, ProviderID: fresh.ProviderID, + Outputs: fresh.Outputs, + } + if drv, derr := provider.ResourceDriver(fresh.Type); derr == nil && drv != nil { + sk := drv.SensitiveKeys() + if len(sk) > 0 { + ro.Sensitive = make(map[string]bool, len(sk)) + for _, k := range sk { + ro.Sensitive[k] = true + } + } + } + if _, err := persistResourceWithSecretRouting(ctx, store, nil, nil, fresh, ro, persistModeRead); err != nil { return fmt.Errorf("provider %q: persist refreshed %q: %w", def.moduleName, fresh.Name, err) } *updated++ diff --git a/cmd/wfctl/infra_resolve_state.go b/cmd/wfctl/infra_resolve_state.go index 2e446f29..26c6d79c 100644 --- a/cmd/wfctl/infra_resolve_state.go +++ b/cmd/wfctl/infra_resolve_state.go @@ -120,7 +120,10 @@ func buildResolvedSecretsFromState( if gen.Type != "infra_output" { continue } - val, err := resolveInfraOutput(cfg, gen.Source, envName, stateOutputs) + // nil hydrated: this caller is offline drift-resolution / preview; + // no same-process apply hand-off available. Sensitive placeholders + // will fail-and-skip (the loop continues). + val, err := resolveInfraOutput(cfg, gen.Source, envName, stateOutputs, nil) if err != nil { continue } diff --git a/cmd/wfctl/infra_rotate_and_prune.go b/cmd/wfctl/infra_rotate_and_prune.go index c7539fe9..f70fe3ea 100644 --- a/cmd/wfctl/infra_rotate_and_prune.go +++ b/cmd/wfctl/infra_rotate_and_prune.go @@ -118,7 +118,13 @@ func writeRecoveryRecord(rec recoveryRecord) error { if err := os.MkdirAll(dir, 0700); err != nil { return fmt.Errorf("rotate-and-prune: create state dir %s: %w", dir, err) } - data, err := json.MarshalIndent(rec, "", " ") + // gosec G117 false-positive: AccessKey here is the public DO Spaces + // access-key identifier (analogous to an AWS access key ID), NOT the + // credential secret-key, which is stored separately per ADR 0017 + // split-storage. The recovery file persists the access_key intentionally + // so operators can recover from partial-failure prune via the + // --recovery-from-last-rotation flag. File perms are 0600 + parent 0700. + data, err := json.MarshalIndent(rec, "", " ") //nolint:gosec // G117: access_key is a public identifier; secret_key is stored separately (ADR 0017) if err != nil { return fmt.Errorf("rotate-and-prune: marshal recovery record: %w", err) } diff --git a/cmd/wfctl/infra_secrets_env_test.go b/cmd/wfctl/infra_secrets_env_test.go index 5daf3193..cace8f7d 100644 --- a/cmd/wfctl/infra_secrets_env_test.go +++ b/cmd/wfctl/infra_secrets_env_test.go @@ -32,7 +32,7 @@ func TestInfraOutput_EnvResolvesModuleSource(t *testing.T) { // Function under test: resolve infra_output source with envName. // Must transform "bmw-database.uri" → lookup "bmw-staging-db" → "uri" field. - val, err := resolveInfraOutput(wfCfg, "bmw-database.uri", "staging", fakeState) + val, err := resolveInfraOutput(wfCfg, "bmw-database.uri", "staging", fakeState, nil) if err != nil { t.Fatalf("resolveInfraOutput: %v", err) } @@ -56,7 +56,7 @@ func TestInfraOutput_NoEnvUsesBaseName(t *testing.T) { "bmw-database": {"uri": "postgresql://base"}, } - val, err := resolveInfraOutput(wfCfg, "bmw-database.uri", "", fakeState) + val, err := resolveInfraOutput(wfCfg, "bmw-database.uri", "", fakeState, nil) if err != nil { t.Fatalf("resolveInfraOutput: %v", err) } @@ -86,7 +86,7 @@ func TestInfraOutput_ExplicitlyDisabledModuleErrors(t *testing.T) { "bmw-database": {"uri": "postgresql://base"}, } - _, err := resolveInfraOutput(wfCfg, "bmw-database.uri", "staging", fakeState) + _, err := resolveInfraOutput(wfCfg, "bmw-database.uri", "staging", fakeState, nil) if err == nil { t.Fatal("expected error when module is explicitly disabled for env") } diff --git a/cmd/wfctl/infra_state_store_test.go b/cmd/wfctl/infra_state_store_test.go index 9fe70450..565933e5 100644 --- a/cmd/wfctl/infra_state_store_test.go +++ b/cmd/wfctl/infra_state_store_test.go @@ -211,7 +211,7 @@ modules: t.Fatalf("write config: %v", err) } - err := applyInfraModules(context.Background(), cfgPath, "") + _, err := applyInfraModules(context.Background(), cfgPath, "") if err == nil { t.Fatal("expected corrupt state error, got nil") } @@ -266,7 +266,7 @@ modules: } t.Cleanup(func() { resolveIaCProvider = orig }) - if err := applyInfraModules(context.Background(), cfgPath, ""); err != nil { + if _, err := applyInfraModules(context.Background(), cfgPath, ""); err != nil { t.Fatalf("applyInfraModules: %v", err) } diff --git a/docs/WFCTL.md b/docs/WFCTL.md index 74575672..26b6cf20 100644 --- a/docs/WFCTL.md +++ b/docs/WFCTL.md @@ -97,6 +97,8 @@ graph TD infra --> infra-state["state"] infra --> infra-outputs["outputs"] infra --> infra-refresh-outputs["refresh-outputs"] + infra --> infra-audit-secrets["audit-secrets"] + infra --> infra-audit-state-secrets["audit-state-secrets"] infra-state --> infra-state-list["list"] infra-state --> infra-state-export["export"] @@ -2616,3 +2618,39 @@ env: ``` Files: `.github/workflows/ci.yml`, `benchmark.yml`, `pre-release.yml`, `release.yml`, `dependency-update.yml`. New workflow files that invoke `go test` or `wfctl` should add the same env var. + +## `infra audit-state-secrets` + +Audit `state.Outputs` against the configured `secrets.Provider` for orphan +secrets, missing routed values, legacy plaintext, and mistaken +config-references in state. + +``` +wfctl infra audit-state-secrets [--config infra.yaml] [--prune] +``` + +**Findings:** + +- **orphan secret** — provider has `<resource>_<key>` but no state + resource named `<resource>` exists. +- **missing routed value** — state has `secret_ref://...` placeholder + but provider does not have the secret. +- **legacy plaintext** — state has plaintext value at a key matching + `secrets.DefaultSensitiveKeys()` (e.g., `secret_key`, `password`). +- **config-reference in state** — state contains `secret://...` (user + config syntax leaked into a persisted state field). + +**Exit codes:** 0 = no findings; 1 = findings; 2 = audit error. + +**`--prune`:** delete confirmed orphan secrets from the provider. +Idempotent; safe to rerun. + +**Write-only providers** (e.g., GitHub Actions, where `Get` returns +`ErrUnsupported`): emits structured ADVISORY lines for placeholders it +cannot verify, but does not exit non-zero on those alone. Orphan-secret +detection is also skipped on write-only providers (List unsupported). + +Distinct from `audit-secrets` which audits the `secrets.generate` config +block for anti-patterns. Run both as part of regular hygiene. See +`DOCUMENTATION.md#sensitive-output-routing-v0270` for the full +sensitive-output routing model. diff --git a/docs/plans/2026-05-09-engine-sensitive-output-routing-design.md b/docs/plans/2026-05-09-engine-sensitive-output-routing-design.md new file mode 100644 index 00000000..6ba0f4c8 --- /dev/null +++ b/docs/plans/2026-05-09-engine-sensitive-output-routing-design.md @@ -0,0 +1,396 @@ +# Engine-side sensitive-output routing through `secrets.Provider` + +**Status:** design — revised after adversarial-design-review cycle 1 +**Target tag:** workflow `v0.27.0` +**Branch:** `design/engine-sensitive-output-routing` (worktree: `_worktrees/engine-sensitive-output-routing`) +**Author:** brainstorming pipeline (autonomous lead) +**Related:** +- ADR 0015 — Approach B Hybrid split-storage for spaces-key (state ID + secret_key separate) +- workflow PR #581 — `wfctl infra audit-secrets` (config anti-pattern audit; **distinct** from the new state-vs-provider audit added by this PR) +- prior pattern — `cmd/wfctl/infra_output_secrets.go` (`syncInfraOutputSecrets`) + +## Revision history + +- **rev1 (2026-05-09):** addressed adversarial-design-review findings: dropped `Router.Restore` (replaced with hold-and-hand-off in-memory map); restricted routing trigger to per-call `ResourceOutput.Sensitive` only (NOT `SensitiveKeys()`); added new `wfctl infra audit-state-secrets` task to scope; enumerated all five state-write call sites; clarified resource-name source as explicit parameter; added drift-detection masking task; addressed orphan-cloud-resource-on-Save-failure with compensating-Delete; restricted `Route` invocation to Create/Update only. + +## 1. Problem + +The DO plugin's `SpacesKeyDriver.Create` returns a freshly-minted access-key + secret_key pair. The DO API emits `secret_key` exactly once on creation; on subsequent `Read` calls the API does **not** re-emit it. The driver therefore MUST surface `secret_key` in `*ResourceOutput.Outputs` on Create — otherwise the value is lost forever. + +Today the engine (call sites in `cmd/wfctl/infra_apply.go:516-557` and `:1000-1040`) takes that `*ResourceOutput.Outputs` map and writes it verbatim into `interfaces.ResourceState.Outputs`, which the configured `IaCStateStore` (filesystem JSON, GCS blob, S3, Postgres, …) persists. **`secret_key` ends up plaintext in the state record.** That violates the user mandate ("plugin shouldn't have an expectation of GH secrets being available; engine should route") and ADR 0015's split-storage contract. + +The user mandate further requires that the routing layer live in the **engine**, not in any plugin. A plugin compiled into a wfctl-from-CI run, a wfctl-from-CLI run, a workflow-cloud server run, or a third-party host must all transparently get sensitive-output handling without each host writing its own routing. + +## 2. Goals + +1. **Engine routes per-call `Sensitive`-flagged outputs to `secrets.Provider`** before the state-write call (Create + Update only; not Read/Adoption/Refresh — see §4.4). +2. **State backend never sees a sensitive value** — `ResourceState.Outputs` carries only non-sensitive fields plus a `secret_ref://<resource>_<key>` placeholder string for each routed field. +3. **Plugin stays platform-agnostic** — `ResourceDriver` API is unchanged; the plugin only declares per-call `Sensitive: {k: true}` for outputs that need routing. +4. **Symmetric Delete cleanup** — when a resource is deleted, the engine deletes the routed secrets from `secrets.Provider`. +5. **Backward-compatible** — existing plugins (no `Sensitive` flag set) continue to work as today; existing state records (plaintext sensitive values) are not corrupted. +6. **Plugin-agnostic configuration** — the same routing layer works whether the configured `secrets.Provider` is GitHub Actions, Vault, env, file, AWS Secrets Manager, etc., subject to the **write-only-provider constraint** (§4.6). +7. **In-process hand-off, not state-cold-rehydrate** — post-apply consumers receive routed values via an engine-held in-memory map. State-cold rehydrate is **not supported** in v0.27.0; consumers needing cold rehydrate use the existing `secret://<key>` resolver against `secrets.Provider` directly with documented secret names. + +## 3. Non-goals + +- **No new secret-storage backends.** Reuse `secrets.Provider` implementations as-is. +- **No encryption-at-rest of state.** Out of scope; that's a state-backend concern. +- **No retroactive scrubbing** of existing state records that contain plaintext secrets. The new `wfctl infra audit-state-secrets` (§4.7) reports them; operators rotate via the standard rotate flow. +- **No new gRPC plugin contract.** `ResourceDriver` interface is unchanged. +- **No multi-secret-provider per state record.** One `secrets.Provider` per apply run, configured via the same `secretsCfg` already used by `bootstrapSecrets` and `syncInfraOutputSecrets`. +- **No per-output secret-name templating.** Naming is mechanical: `<resource_name>_<output_key>`. +- **No state-cold rehydrate API** (was `Router.Restore`; removed per adversarial-review). Consumers needing routed secrets at cold-start use `secret://<key>` resolver directly. +- **No reuse of `SensitiveKeys()` for routing trigger** (was `OR`'d with `Sensitive`; dropped per adversarial-review). `SensitiveKeys()` continues to drive **display masking only**. + +## 4. Approach + +### 4.1 New surface — free functions in `iac/sensitive` package + +```go +// Package sensitive routes sensitive driver outputs through secrets.Provider. +package sensitive + +// Route routes sensitive fields from out through provider, keyed under +// "<resourceName>_<outputKey>". Returns: +// - sanitized: a copy of out.Outputs with sensitive values replaced by +// "secret_ref://<resourceName>_<key>" placeholders. Suitable for state. +// - hydrated: a flat map[secret_name]value of routed secrets, suitable for +// in-process hand-off to post-apply consumers (syncInfraOutputSecrets, +// pipeline-run, etc). Includes ONLY routed fields — non-sensitive fields +// stay in sanitized. +// +// Routing trigger is exclusively out.Sensitive[k]==true (per-call dynamic). +// SensitiveKeys() is NOT consulted (that's display-masking-only). +// +// Sensitive keys whose value is absent from out.Outputs are SKIPPED — no +// provider.Set call, no placeholder inserted (the engine has no value to +// route; existing routed-secret in provider stays as-is). This is the +// idempotent re-Apply contract. +// +// On any provider.Set failure, returns the error WITHOUT having written +// state. Caller treats as apply failure for that resource. The caller's +// recovery flow (§4.5) is responsible for compensating the partial-Set. +// +// resourceName MUST be the canonical state key (rs.Name at the call site), +// NOT out.Name — adoption/refresh paths may have empty out.Name. +func Route( + ctx context.Context, + provider secrets.Provider, + resourceName string, + out *interfaces.ResourceOutput, +) (sanitized map[string]any, hydrated map[string]string, err error) + +// Revoke deletes routed secrets for resourceName. mergedKeys is the union of: +// - keys with "secret_ref://" placeholders in pre-delete state.Outputs +// - secrets.DefaultSensitiveKeys() (legacy-state best-effort: pre-rev1 state +// records have plaintext values, no placeholders, so we use the heuristic +// name match) +// Errors are aggregated and returned but DO NOT block the caller's +// state-delete (the caller logs and proceeds; orphan secrets are reported +// by `wfctl infra audit-state-secrets`). +func Revoke( + ctx context.Context, + provider secrets.Provider, + resourceName string, + mergedKeys []string, +) error + +// IsPlaceholder reports whether v is a "secret_ref://..." placeholder string. +// Used by drift comparison and display paths. +func IsPlaceholder(v any) bool +``` + +The placeholder format `secret_ref://<resource>_<key>` is **distinct** from the existing `secret://<key>` convention in `secrets/secrets.go:13`. Justification: `secret://` is **user-supplied** in config (resolved by `Resolver.Resolve`); `secret_ref://` is **engine-generated** in state (never user-typed). The two-namespace split lets `audit-state-secrets` (§4.7) detect a state record where someone hand-typed `secret://X` (probably a bug — they should be storing the resolved value, not a config reference). Verified `grep -rn "secret_ref://"` returns zero hits in workflow + workflow-plugin-* repos as of `8de95b4f`. + +### 4.2 Wiring at the state-write boundary + +There are **five** state-write call sites (enumerated by `grep -n "store.SaveResource\|Outputs:.*r.Outputs\|Outputs:.*live.Outputs" cmd/wfctl/*.go`): + +| # | File:Line | Context | Routing | +|---|---|---|---| +| 1 | `infra_apply.go:550-557` | `applyWithProviderAndStore` post-Apply | **Route** (Create/Update outputs) | +| 2 | `infra_apply.go:1032-1040` | In-process apply path post-Apply | **Route** (Create/Update outputs) | +| 3 | `infra_apply.go:637` | `adoptExistingResources` (post-Read) | **Sanitize-only** (§4.4) | +| 4 | `infra_apply.go:705` | `resourceStateFromLiveOutput` builder | **Sanitize-only** (§4.4) | +| 5 | `infra_refresh_outputs.go:244` | `runInfraRefreshOutputs` post-Read | **Sanitize-only** (§4.4) | + +"**Route**" = `sensitive.Route(ctx, provider, rs.Name, out)`; sanitized goes to state, hydrated returned to caller for in-process hand-off. + +"**Sanitize-only**" = if an output value matches a routed-secret name pattern (i.e., a sensitive key is present), DO NOT call `provider.Set` (Read paths must not pollute the secret store with potentially-stale cloud values, per cache-pollution adversarial finding); instead, replace with placeholder if a routed secret already exists in provider OR drop the field if no routed secret is registered. The latter case is the "Read can't re-emit" path (DO `secret_key`). + +A new shared helper centralises the call-site logic: + +```go +// persistResourceWithSecretRouting builds rs from out, routes (or sanitizes, +// per mode) sensitive fields, calls store.SaveResource, and returns the +// hydrated routed-secret map for in-memory hand-off to post-apply consumers +// (mode=apply only; mode=read returns nil hydrated map). +// +// On store.SaveResource failure AFTER provider.Set succeeded (mode=apply), +// the helper invokes driver.Delete to compensate the partial cloud-resource +// creation, then returns a wrapped error naming both the original +// SaveResource failure and the compensating-Delete outcome. Operators reading +// the error can distinguish "save failed, cloud cleaned up" from "save failed, +// cloud orphan persists" — the latter requires manual intervention. +func persistResourceWithSecretRouting( + ctx context.Context, + store interfaces.IaCStateStore, + provider secrets.Provider, // may be nil if no driver emits sensitive + driver interfaces.ResourceDriver, + rs interfaces.ResourceState, + out interfaces.ResourceOutput, + mode persistMode, // apply | read +) (hydrated map[string]string, err error) +``` + +`ApplyPlan` itself stays unchanged. + +### 4.3 Router activation policy + +The `provider` is constructed at apply entry IFF `secretsCfg != nil` via the existing `resolveSecretsProvider`. If `secretsCfg == nil` AND any driver returns a `*ResourceOutput` with non-empty `Sensitive` map, the helper returns: + +``` +state write rejected: resource "coredump-deploy-key" has sensitive output keys [secret_key] +but no secrets.Provider is configured. Configure secrets.* in your workflow config or +add `secrets: { provider: env }` for ad-hoc local runs. +``` + +**Default for ad-hoc/local runs**: the helper documentation recommends `secrets: { provider: env }` (uses `EnvProvider` from `secrets/secrets.go:51`) as the no-config-friendly fallback so the bar to entry is low. This is documented in `DOCUMENTATION.md` as part of the §11 doc-update task. + +### 4.4 Read / adoption / refresh behaviour (Sanitize-only) + +`adoptExistingResources` and `runInfraRefreshOutputs` call `driver.Read` and persist live `ResourceOutput`. For these paths: + +- **`Route` is NOT called** — Read may return cached/stale values; writing them to `secrets.Provider` would risk overwriting a still-valid routed secret with a stale clone (cache-pollution adversarial finding). +- **Sanitize-only**: when an output's key matches a known routed-secret pattern (i.e., the same resource has had a previous Apply that routed this key), insert the `secret_ref://<resource>_<key>` placeholder. Otherwise (no prior routed secret), pass through. For DO-style "Read can't re-emit secret_key", this means: state record on subsequent refreshes still has the placeholder from the original Apply; live Read has no `secret_key`; sanitized state preserves the placeholder. +- **Discovery of "known routed-secret pattern"**: the helper consults the **pre-existing** state record (`store.GetResource(ctx, rs.Name)` if available) and inherits any `secret_ref://` placeholders present. New keys that the driver newly marks `Sensitive` during Read are **dropped** (not routed) — explicit conservative bias. + +This is the **idempotent re-Read invariant**: refresh/adoption never loses or corrupts a previously-routed secret, and never writes anything new to `secrets.Provider`. + +### 4.5 Compensating-Delete on partial failure + +Per adversarial finding #6: if `provider.Set` succeeds and `store.SaveResource` fails, naive rerun of Apply will mint a NEW cloud resource (e.g., new DO Spaces access key), leaking the old one. The helper avoids this with a **compensating Delete**: + +1. `driver.Create` (or Update) succeeds → `out` produced. +2. `provider.Set(<resource>_<key>, value)` for each sensitive field — if any fails, return error WITHOUT calling `SaveResource` and WITHOUT rerunning Set for already-set fields. Operator reruns; idempotent overwrite is fine for fields not yet attempted; partially-Set fields are overwritten with the same value on rerun. +3. `store.SaveResource(ctx, rs)` — if this fails AND we already called Set: + 1. Call `driver.Delete(ctx, refFromRS(rs))` to clean up the cloud resource. + 2. Call `provider.Delete(ctx, <resource>_<key>)` for each Set we made, to clean up the orphan secret. + 3. Return a wrapped error: `state save failed (compensating delete: <ok|err>): <orig save err>`. + +If the compensating Delete also fails, the error names what's still leaked — operators see exactly what to clean up manually. This is the "engine never silently leaks" contract. + +### 4.6 Write-only-provider handling (`secrets.Provider.Get == ErrUnsupported`) + +`GitHubSecretsProvider.Get` returns `ErrUnsupported` (`secrets/github_provider.go:52`). This is by GitHub Actions API design — secrets are write-only after Set. Implications: + +- **Route** does not call `Get`; only `Set` and `Delete`. Compatible with write-only providers. +- **Hydrated in-memory hand-off** (the replacement for the rejected `Restore` API): the engine holds the just-Set value in `hydrated map[string]string` at apply time. Post-apply consumers in the **same process** receive the map and use it directly. State-cold-rehydrate is **not supported** — operators on GitHub-only hosts who want post-apply consumers to access routed secrets must run those consumers within the same `wfctl infra apply` invocation (already the case for `syncInfraOutputSecrets`). +- **Cold-start consumers** (e.g., a workflow-cloud server reading state weeks later) consume routed secrets via the existing `secret://<key>` resolver directly. Documented secret names are stable: `<resource_name>_<output_key>`. On a write-only provider, cold-start consumers cannot read routed secrets; they must re-apply or accept the constraint. This is a **fundamental property of the write-only provider** (GitHub Actions design), not a workflow limitation. + +### 4.7 New audit surface — `wfctl infra audit-state-secrets` + +Adversarial-review Critical finding: the original design cited `wfctl infra audit-secrets` as recovery, but that command audits CONFIG (`secrets.generate` block), not state-vs-provider. Adding a new command: + +``` +wfctl infra audit-state-secrets [--config infra.yaml] [--prune] + Walks every entry in IaCStateStore. For each Outputs[k] that is: + - a "secret_ref://<name>" placeholder → confirm secrets.Provider has <name> via List() or Get() + - a plaintext value matching secrets.DefaultSensitiveKeys() → flag legacy plaintext (manual rotation needed) + - a "secret://<key>" string (user-typed bug) → flag mistaken config reference in state + Then walks secrets.Provider.List() (where supported) for any "<resource>_<key>" name whose <resource> is NOT in IaCStateStore → orphan, candidate for prune. + + --prune: deletes confirmed orphan secrets from secrets.Provider (gated on --yes for non-interactive). + +Exit codes: + 0 no findings + 1 findings (legacy plaintext, missing routed values, orphan secrets) + 2 audit error (cannot read state, provider unsupported, etc.) +``` + +This is the recovery surface for: (a) failure-window orphans (§4.5 on success-after-Set-but-before-Save), (b) Revoke failures (§4.5 of original design — moved to compensating Delete in rev1), (c) legacy plaintext-state migration triage. Distinct from `audit-secrets` (config audit); operators run both. + +### 4.8 Drift-detection compatibility + +After this PR, `state.Outputs` has `secret_ref://...` placeholders; live `ResourceOutput.Outputs` from `Read` has either no value (DO `secret_key`) or plaintext. Naive `driver.Diff(desired, current)` would report drift on every refresh. + +**Plan task**: in every Diff/drift call site, mask sensitive keys before comparison. Specifically: +- `cmd/wfctl/infra_apply_refresh.go` — refresh loop's Diff comparison +- `iac/wfctlhelpers` — none currently call Diff; future-proof +- Any provider's `DetectDrift` — out of scope for this PR (per-provider; documented as follow-up; existing default behaviour already returns nil drift on most types since drift detection is opt-in via `DriftConfigDetector` per `interfaces/iac_provider.go:119`) + +The masking helper: + +```go +// MaskSensitiveForDiff returns copies of desired and current with sensitive keys +// (any key in current matching "secret_ref://*" prefix, OR named in +// driver.SensitiveKeys()) elided from both, so drift comparison sees a +// consistent view. Idempotent for non-routed outputs. +func MaskSensitiveForDiff(driverKeys []string, desired, current map[string]any) (map[string]any, map[string]any) +``` + +### 4.9 In-memory hand-off to post-apply consumers + +`syncInfraOutputSecrets` (currently at `cmd/wfctl/infra.go:1450`) and the post-apply pipeline-run consumer both consume `state.Outputs`. After this PR they receive sanitized state with placeholders. The new `hydrated` map flows through: + +```go +// runInfraApply (cmd/wfctl/infra.go) flow: +// 1. apply with persistResourceWithSecretRouting → returns hydrated map per resource +// 2. accumulate per-resource hydrated maps into runHydrated map[resourceName]map[secretName]value +// 3. pass runHydrated into post-apply consumers +hydratedAll := map[string]map[string]string{} // resource -> secret_name -> value +// ... applyWithProviderAndStore returns hydratedAll +syncInfraOutputSecrets(ctx, secretsCfg, secretsProvider, states, wfCfg, envName, hydratedAll) +``` + +`syncInfraOutputSecrets` checks `hydratedAll[moduleName][outputKey]` BEFORE `state.Outputs[outputKey]`. If the placeholder is in state and hydrated map has the value, use the hydrated value. If the placeholder is in state and hydrated map is empty (not this-process's apply), the consumer either fetches from `secrets.Provider.Get` (if supported) or skips with a documented warning. + +## 5. Data flow + +``` + ┌─────────────────────────────────────────────────┐ + │ Driver.Create returns *ResourceOutput │ + │ Outputs: {access_key, secret_key, bucket, …} │ + │ Sensitive: {secret_key: true, access_key: true}│ + └────────────────────┬────────────────────────────┘ + │ + ▼ + ┌─────────────────────────────────────────────────┐ + │ wfctlhelpers.ApplyPlan — UNCHANGED │ + │ result.Resources = [output, …] │ + └────────────────────┬────────────────────────────┘ + │ + ▼ + ┌─────────────────────────────────────────────────┐ + │ infra_apply.persistResourceWithSecretRouting │ + │ 1. sanitized, hydrated, err = │ + │ sensitive.Route(ctx, provider, rs.Name, out) │ + │ → for each k in out.Sensitive where │ + │ out.Outputs[k] is present: │ + │ secrets.Provider.Set("<resource>_<k>",│ + │ out.Outputs[k]) │ + │ sanitized[k] = "secret_ref://<resource>_<k>" │ + │ hydrated["<resource>_<k>"] = value │ + │ → other keys pass through to sanitized │ + │ 2. rs.Outputs = sanitized │ + │ 3. err := store.SaveResource(ctx, rs) │ + │ on err: compensating Delete (§4.5) │ + │ 4. return hydrated to caller for hand-off │ + └────────────────────┬────────────────────────────┘ + │ hydrated map + ▼ + ┌─────────────────────────────────────────────────┐ + │ syncInfraOutputSecrets / pipeline-run │ + │ reads (resource, key) from hydrated FIRST, │ + │ falls back to provider.Get if absent │ + └─────────────────────────────────────────────────┘ +``` + +## 6. Failure modes + +| Failure | Behaviour | +|---|---| +| `secrets.Provider.Set` fails for the first routed key | Return error from `Route`; no state written; no further Set calls. Operator reruns; idempotent overwrite is fine. | +| `secrets.Provider.Set` fails for a later key (after some succeeded) | Return error from `Route`; no state written. Earlier-Set values are NOT cleaned up automatically (operator reruns; idempotent overwrite is fine). | +| `store.SaveResource` fails after Set succeeded | Compensating `driver.Delete` + `provider.Delete` for each routed key; if compensation succeeds, error names "save failed, cloud cleaned up"; if compensation fails, error names exact orphans. Operator runs `wfctl infra audit-state-secrets --prune`. | +| `secrets.Provider` not configured AND driver emits sensitive outputs | Hard fail at the persist boundary with named resource + remediation pointing to `secrets: { provider: env }` for local runs. | +| `provider.Get` fails / unsupported (write-only host) for in-memory hand-off path | Engine uses hydrated in-memory map (populated by the same-process Apply). Cold-start cross-process consumers fall back to provider.Get directly via `secret://<key>` resolver (with a clear "write-only provider, cannot rehydrate" message if Get unsupported). | +| `Revoke` (compensating-delete branch) fails | Aggregated error returned but state-delete proceeds. `audit-state-secrets --prune` is recovery. | +| Plugin returns `Sensitive: {k: true}` but `Outputs[k]` absent | Route silently skips that key; no Set call; **no placeholder inserted in sanitized either** (idempotent re-Read; pre-existing placeholder from prior Apply is preserved by Sanitize-only Read paths in §4.4). | +| Plugin emits plaintext sensitive output without setting `Sensitive` flag | Plaintext leaks to state. **Mitigation**: `wfctl infra audit-state-secrets` detects via `DefaultSensitiveKeys()` heuristic match. Plugin reviewer is the long-term fix. Not a regression. | +| Backward-compat: existing state record has plaintext `secret_key` | Sanitize-only Read paths leave plaintext alone; subsequent Apply (Route) replaces with placeholder. Operator-driven rotation cleans up legacy state via the standard rotate flow. | +| Drift detection: state has placeholder, live has no value (DO secret_key) | `MaskSensitiveForDiff` elides sensitive keys from both sides before Diff. No false-positive drift. | +| Drift detection: state has placeholder, live has plaintext (e.g., re-emitting connection_string) | Same masking applies; drift on these fields is suppressed. Drift in **non-sensitive** fields is still detected normally. | +| Cache-pollution: Read returns stale-cached secret_key | Sanitize-only Read paths NEVER write to provider. The routed value in provider is whatever the most recent successful Apply Set; never overwritten by Read. | + +## 7. Testing + +### Unit (`iac/sensitive/route_test.go`) + +- `Route` with no Sensitive map → outputs verbatim; hydrated empty. +- `Route` with `Sensitive: {k:true}` and `Outputs[k] = "secret"` → `provider.Set("<res>_<k>", "secret")` called once + sanitized has `secret_ref://<res>_<k>` + hydrated has `<res>_<k>: secret`. +- `Route` with sensitive key absent from `Outputs` → no `Set` call; no placeholder; hydrated does not include that key. +- `Route` with `provider.Set` error → returns error; no further Set calls. +- `Route` with `provider == nil` AND non-empty Sensitive map → returns error naming the resource + keys. +- `Route` with empty resourceName → returns error (defensive — must be canonical name). +- `Revoke` calls `provider.Delete` for each key; aggregates errors (does not return on first error). +- `IsPlaceholder("secret_ref://x")` → true; non-prefix string → false; non-string → false. + +### Integration (`cmd/wfctl/infra_apply_sensitive_routing_test.go`) + +- Apply with stub driver returning sensitive outputs + env-provider → state record has placeholders; env vars contain values; hydrated map returned. +- Apply with sensitive driver but no `secretsCfg` → returns hard-fail error naming resource + sensitive keys. +- Re-apply (idempotent) → second Set overwrites; placeholder unchanged; hydrated map identical. +- `store.SaveResource` failure injected → compensating Delete observed (driver Delete + provider Delete called); error names compensation outcome. +- Adoption path with stub driver returning Sensitive map → Sanitize-only behaviour: no Set call; pre-existing state placeholder preserved; new sensitive key dropped. +- Refresh path with same → identical Sanitize-only behaviour. +- Delete path → Revoke calls observed for both placeholder-derived keys AND legacy `DefaultSensitiveKeys` heuristics. +- `syncInfraOutputSecrets` with hydrated map for current-apply secrets → uses hydrated value; without hydrated → falls back to provider.Get. +- Drift detection on placeholder-state vs. live-without-secret-key → no drift reported (`MaskSensitiveForDiff` working). + +### Local CI gates (per `feedback_local_ci_validation_for_ci_touching_tasks`) + +Before requesting review, implementer runs: + +- `go test ./iac/sensitive/... ./cmd/wfctl/... ./secrets/...` +- `go test -race ./iac/sensitive/... ./cmd/wfctl/...` +- `go vet ./...` +- `golangci-lint run ./iac/sensitive/... ./cmd/wfctl/...` + +## 8. Migration & rollback + +### Migration + +- Plugins: no code change required to keep working as-is. Plugins that want engine-side routing add `Sensitive: {k: true}` to their `ResourceOutput` returns on Create/Update. +- DO plugin: separate PR; not in this scope. Adds `Sensitive: {"secret_key": true}` to `SpacesKeyDriver.Create`. +- Operators: **no action required for greenfield envs**. Operators with pre-existing state records run `wfctl infra audit-state-secrets` (new in this PR) to enumerate plaintext-sensitive keys, then rotate via the standard `wfctl infra bootstrap --force-rotate <name>` flow (which under v0.27.0 will route to provider via the new path). + +### Rollback + +This PR affects runtime apply behaviour (state-write). Rollback path: + +1. **Pin to v0.26.x** at `setup-wfctl@vN`. The engine reverts to plaintext-state behaviour. +2. **State records written under v0.27.0** still contain `secret_ref://...` placeholders. v0.26.x consumers do NOT understand this; they will treat the placeholder as the literal value (e.g., `infra_output` generator copies "secret_ref://..." into a downstream secret). Recovery: rotate the affected secrets via `wfctl infra bootstrap --force-rotate <name>` running v0.27.0 first to generate plaintext state, OR manually edit the state record (filesystem JSON) to inline the value from `secrets.Provider`. Documented runbook in §11 doc-update task. +3. The new package `iac/sensitive`, helper `persistResourceWithSecretRouting`, and `audit-state-secrets` command are additive; reverting the call sites to v0.26.x literal `Outputs: r.Outputs` shape is a one-commit revert of `infra_apply.go` + `infra_refresh_outputs.go`. + +## 9. Assumptions + +The following assumptions are load-bearing. + +1. **Single configured `secrets.Provider` per apply run.** `resolveSecretsProvider(secretsCfg)` returns one provider. +2. **`secrets.Provider.Set` is idempotent overwrite** for production providers. Verified: `EnvProvider.Set` → `os.Setenv`; `FileProvider.Set` → `os.WriteFile`; `GitHubSecretsProvider.Set` → `PUT /secrets/<key>` (RFC: PUT is idempotent); Vault Transit / AWS Secrets Manager — overwrites on `Set` per provider docs. +3. **Plugin authors will populate `ResourceOutput.Sensitive` on Create/Update for sensitive outputs.** Without it, plaintext leaks to state. `audit-state-secrets` is the detection surface; workflow-plugin-reviewer skill update is the long-term fix. +4. **`secret_ref://` is a free namespace.** Verified: zero hits in workflow + workflow-plugin-* repos as of `8de95b4f`. +5. **State records on disk / in cloud-storage are NOT human-edited.** A plaintext value of literally `"secret_ref://x"` does not occur. Same trust assumption as `secret://...` config references. +6. **Same-process hand-off is sufficient for `syncInfraOutputSecrets` and pipeline-run.** Verified by reading `cmd/wfctl/infra.go:1414-1450`: both run within the same `runInfraApply` invocation. +7. **Cold-start consumers know secret names.** `<resource_name>_<output_key>` is stable and documented. Operators write `secret://coredump-deploy-key_secret_key` in dependent configs to pull from `secrets.Provider`. +8. **Read paths can detect known routed-secret keys via pre-existing state placeholders.** §4.4's Sanitize-only logic uses `store.GetResource` to read the prior state; `IaCStateStore.GetResource` is a stable interface method (`interfaces/iac_state.go:16`). + +## 10. Top doubts (after rev1) + +1. **Plugin discovery of the Sensitive contract** — plugins that don't set `Sensitive` silently leak plaintext to state. `audit-state-secrets` heuristic detection is best-effort. Long-term fix is workflow-plugin-reviewer skill check; out of scope for v0.27.0 but flagged as follow-up. +2. **Compensating Delete edge cases** — if `driver.Delete` requires a `ProviderID` that's only available in `out` (not yet persisted to state), the helper threads it correctly; but if Delete itself partially fails (e.g., DO API rate-limit during cleanup), the orphan persists. `audit-state-secrets` detects. Acceptable. +3. **Drift masking false-negatives** — `MaskSensitiveForDiff` masks sensitive keys from both sides. If an operator rotates a routed secret out-of-band (manually deletes from `secrets.Provider`), the state placeholder still exists but the value is gone. Drift detection won't catch this; `audit-state-secrets` will (as "missing routed value"). Acceptable; documented. + +## 11. Estimated scope (rev1) + +| Component | Files touched | LoC delta | +|---|---|---| +| New `iac/sensitive` package (Route, Revoke, IsPlaceholder, MaskSensitiveForDiff) | 2 (`route.go`, `route_test.go`) | +500 | +| State-write helper + 5 call-site refactors | 2 (`cmd/wfctl/infra_apply.go`, `infra_refresh_outputs.go`) | +120 / -40 | +| Hydrated-map plumbing through `runInfraApply` | 2 (`cmd/wfctl/infra.go`, `cmd/wfctl/infra_output_secrets.go`) | +60 | +| Compensating-Delete logic in helper | (within `infra_apply.go`) | +50 | +| New `wfctl infra audit-state-secrets` command | 2 (`cmd/wfctl/infra_audit_state_secrets.go`, `_test.go`) | +400 | +| Drift masking helper + wiring | 1 (`cmd/wfctl/infra_apply_refresh.go`) | +60 | +| Integration tests | 1 (`cmd/wfctl/infra_apply_sensitive_routing_test.go`) | +400 | +| Doc update — `DOCUMENTATION.md` + `docs/WFCTL.md` | 2 | +80 | +| **Total** | ~12 files | ~+1.6k net | + +## 12. Out-of-scope / follow-ups + +1. DO plugin update to set `Sensitive: {"secret_key": true}` on `SpacesKeyDriver.Create` — separate PR after v0.27.0 tag. +2. AWS, GCP, Azure plugins — same pattern, separate PRs each. +3. workflow-plugin-reviewer skill update — add "plugin-sensitive-keys-declared" check (lint surface). +4. State-record migration tool to retroactively route legacy plaintext secrets — separate work. +5. UI / wfctl display of `secret_ref://...` placeholders — should they render as `(routed: <key>)`? Cosmetic, follow-up. +6. Drift detection for **provider-side** `DetectDrift` paths (per-provider plugins) — out of scope; per-provider follow-up; existing default behaviour returns nil. diff --git a/docs/plans/2026-05-09-engine-sensitive-output-routing.md b/docs/plans/2026-05-09-engine-sensitive-output-routing.md new file mode 100644 index 00000000..ea6fedf1 --- /dev/null +++ b/docs/plans/2026-05-09-engine-sensitive-output-routing.md @@ -0,0 +1,2415 @@ +# Engine-side sensitive-output routing Implementation Plan + +> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. + +**Goal:** Engine routes per-call `Sensitive`-flagged outputs from `ResourceDriver` calls through the configured `secrets.Provider` before state persistence, keeping plugins platform-agnostic and never letting sensitive values reach the state backend. + +**Architecture:** New `iac/sensitive` package exposes free functions (`Route`, `Revoke`, `IsPlaceholder`, `MaskSensitiveForDiff`). A new helper `persistResourceWithSecretRouting` in `cmd/wfctl/infra_apply.go` is the single funnel for all five state-write call sites. `Route` (Create/Update only) sets sensitive values into `secrets.Provider` and returns sanitized `secret_ref://...` placeholders for state plus a hydrated in-memory map for same-process consumers. Read/Adoption/Refresh paths are sanitize-only (no provider writes — prevents cache pollution). On `SaveResource` failure after `Set` succeeded, the helper compensates with `driver.Delete` + `provider.Delete` to avoid orphan cloud resources. New `wfctl infra audit-state-secrets` command is the recovery surface for orphans, legacy plaintext, and missing routed values. + +**Tech Stack:** Go 1.22+, stdlib only (no new deps). Existing packages: `github.com/GoCodeAlone/workflow/interfaces`, `.../secrets`, `.../iac/wfctlhelpers`. New package: `.../iac/sensitive`. + +**Base branch:** `design/engine-sensitive-output-routing` (worktree `_worktrees/engine-sensitive-output-routing`) + +--- + +## Scope Manifest + +**PR Count:** 1 +**Tasks:** 6 +**Estimated Lines of Change:** ~1.7k (informational; not enforced) + +**Out of scope:** +- DO plugin update to set `Sensitive: {"secret_key": true}` on `SpacesKeyDriver.Create` — separate PR after v0.27.0 tag. +- AWS / GCP / Azure plugin updates — separate per-plugin PRs. +- workflow-plugin-reviewer skill update for "plugin-sensitive-keys-declared" lint. +- State-record migration tool to retroactively route legacy plaintext secrets. +- UI display of `secret_ref://...` placeholders. +- Per-provider `DetectDrift` masking updates (per-plugin follow-up). +- Schema change to `interfaces.ResourceState` to add explicit `RoutedSecrets` map (deferred per design §10 doubt 3). +- Wiring `MaskSensitiveForDiff` into engine-side Diff call sites (helper ships in Task 1; in-tree consumers ride a follow-up PR if needed — currently no `cmd/wfctl` site sees state.Outputs flow into `driver.Diff`; per-provider Diffs receive desired/current via gRPC and are out of scope). +- Import path (`infra.go:1101` `Outputs: cloneMap(imported.Outputs)`) — accepted as legacy-plaintext on import; operator runs `wfctl infra audit-state-secrets` post-import to triage. Routing imported state requires a behavioural decision (ad-hoc rotate vs. preserve-as-is) that is out of scope here. + +**PR Grouping:** + +| PR # | Title | Tasks | Branch | +|------|-------|-------|--------| +| 1 | feat(iac): engine-side sensitive-output routing through secrets.Provider | Task 1, Task 2, Task 3, Task 4, Task 5, Task 6 | `feat/engine-sensitive-output-routing` | + +**Status:** Locked 2026-05-09T08:11:07Z + +## Public API contract (locked at Task 1) + +After Task 1 lands, the `iac/sensitive` package exports the following names. Tasks 2-6 depend on these being **exported** (capital initial). Implementer MUST NOT lowercase any of these: + +- `Route(ctx, provider, resourceName, out) (sanitized, hydrated, error)` +- `Revoke(ctx, provider, resourceName, mergedKeys) error` +- `IsPlaceholder(v any) bool` +- `MaskSensitiveForDiff(driverKeys, desired, current) (map, map)` +- `Placeholder(resourceName, outputKey string) string` +- `PlaceholderPrefix string` (constant) +- `SecretKey(resourceName, outputKey string) string` + +--- + +## Pre-flight + +Before starting Task 1, the implementer: + +1. Confirms working directory: `/Users/jon/workspace/workflow/_worktrees/engine-sensitive-output-routing`. +2. Confirms branch: `git branch --show-current` → `design/engine-sensitive-output-routing`. +3. Creates the implementation branch: `git checkout -b feat/engine-sensitive-output-routing`. +4. Reads the design doc: `docs/plans/2026-05-09-engine-sensitive-output-routing-design.md`. +5. Reads the cited source files end-to-end: `interfaces/iac_resource_driver.go`, `interfaces/iac_state.go`, `secrets/secrets.go`, `secrets/github_provider.go`, `iac/wfctlhelpers/apply.go`, `cmd/wfctl/infra_apply.go`, `cmd/wfctl/infra_secrets.go`, `cmd/wfctl/infra_output_secrets.go`, `cmd/wfctl/infra_audit_secrets.go`. + +--- + +### Task 1: New `iac/sensitive` package — Route, Revoke, IsPlaceholder, MaskSensitiveForDiff + +**Files:** +- Create: `iac/sensitive/route.go` +- Create: `iac/sensitive/route_test.go` + +**Change class:** Internal logic (new package, free functions). Verification: unit tests. + +**Step 1.1: Write the failing tests for `Route`** + +Create `iac/sensitive/route_test.go`: + +```go +package sensitive + +import ( + "context" + "errors" + "testing" + + "github.com/GoCodeAlone/workflow/interfaces" + "github.com/GoCodeAlone/workflow/secrets" +) + +// fakeProvider records Set/Delete/Get/List calls for assertions. +type fakeProvider struct { + values map[string]string + setErr map[string]error // per-key Set override + delErr map[string]error + setLog []string + delLog []string +} + +func newFakeProvider() *fakeProvider { + return &fakeProvider{values: map[string]string{}, setErr: map[string]error{}, delErr: map[string]error{}} +} + +func (p *fakeProvider) Name() string { return "fake" } +func (p *fakeProvider) Get(_ context.Context, k string) (string, error) { + v, ok := p.values[k] + if !ok { + return "", secrets.ErrNotFound + } + return v, nil +} +func (p *fakeProvider) Set(_ context.Context, k, v string) error { + if err, ok := p.setErr[k]; ok { + return err + } + p.setLog = append(p.setLog, k) + p.values[k] = v + return nil +} +func (p *fakeProvider) Delete(_ context.Context, k string) error { + if err, ok := p.delErr[k]; ok { + return err + } + p.delLog = append(p.delLog, k) + delete(p.values, k) + return nil +} +func (p *fakeProvider) List(_ context.Context) ([]string, error) { + out := make([]string, 0, len(p.values)) + for k := range p.values { + out = append(out, k) + } + return out, nil +} + +func TestRoute_NoSensitive_PassesThrough(t *testing.T) { + p := newFakeProvider() + out := &interfaces.ResourceOutput{ + Name: "x", Type: "infra.spaces_key", + Outputs: map[string]any{"bucket": "b", "endpoint": "e"}, + } + sanitized, hydrated, err := Route(context.Background(), p, "myres", out) + if err != nil { + t.Fatalf("Route: %v", err) + } + if len(p.setLog) != 0 { + t.Errorf("expected no Set calls, got %v", p.setLog) + } + if sanitized["bucket"] != "b" || sanitized["endpoint"] != "e" { + t.Errorf("non-sensitive outputs corrupted: %v", sanitized) + } + if len(hydrated) != 0 { + t.Errorf("expected empty hydrated, got %v", hydrated) + } +} + +func TestRoute_SensitiveValuePresent_RoutesAndSanitizes(t *testing.T) { + p := newFakeProvider() + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"access_key": "AK", "secret_key": "SECRET", "bucket": "b"}, + Sensitive: map[string]bool{"secret_key": true, "access_key": true}, + } + sanitized, hydrated, err := Route(context.Background(), p, "myres", out) + if err != nil { + t.Fatalf("Route: %v", err) + } + if p.values["myres_secret_key"] != "SECRET" { + t.Errorf("provider did not receive secret_key; got %v", p.values) + } + if p.values["myres_access_key"] != "AK" { + t.Errorf("provider did not receive access_key; got %v", p.values) + } + if sanitized["secret_key"] != "secret_ref://myres_secret_key" { + t.Errorf("sanitized[secret_key] = %v, want placeholder", sanitized["secret_key"]) + } + if sanitized["access_key"] != "secret_ref://myres_access_key" { + t.Errorf("sanitized[access_key] = %v, want placeholder", sanitized["access_key"]) + } + if sanitized["bucket"] != "b" { + t.Errorf("non-sensitive bucket lost: %v", sanitized["bucket"]) + } + if hydrated["myres_secret_key"] != "SECRET" { + t.Errorf("hydrated missing secret_key: %v", hydrated) + } +} + +func TestRoute_SensitiveKeyAbsentFromOutputs_Skipped(t *testing.T) { + p := newFakeProvider() + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"access_key": "AK"}, + Sensitive: map[string]bool{"secret_key": true, "access_key": true}, + } + sanitized, hydrated, err := Route(context.Background(), p, "myres", out) + if err != nil { + t.Fatalf("Route: %v", err) + } + if _, ok := sanitized["secret_key"]; ok { + t.Errorf("absent sensitive key should NOT yield placeholder, got %v", sanitized["secret_key"]) + } + if _, ok := p.values["myres_secret_key"]; ok { + t.Errorf("provider should not have received secret_key (absent value)") + } + if hydrated["myres_secret_key"] != "" { + t.Errorf("hydrated should not contain absent key") + } + // access_key was present, should be routed + if sanitized["access_key"] != "secret_ref://myres_access_key" { + t.Errorf("access_key routing failed: %v", sanitized["access_key"]) + } +} + +func TestRoute_SensitiveFalseValue_NotRouted(t *testing.T) { + p := newFakeProvider() + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"bucket_uri": "https://b.example/"}, + Sensitive: map[string]bool{"bucket_uri": false}, + } + sanitized, _, err := Route(context.Background(), p, "myres", out) + if err != nil { + t.Fatalf("Route: %v", err) + } + if sanitized["bucket_uri"] != "https://b.example/" { + t.Errorf("Sensitive=false should not route, got %v", sanitized["bucket_uri"]) + } + if len(p.setLog) != 0 { + t.Errorf("Sensitive=false triggered Set: %v", p.setLog) + } +} + +func TestRoute_ProviderSetError_ReturnsError(t *testing.T) { + p := newFakeProvider() + p.setErr["myres_secret_key"] = errors.New("boom") + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"secret_key": "S"}, + Sensitive: map[string]bool{"secret_key": true}, + } + _, _, err := Route(context.Background(), p, "myres", out) + if err == nil { + t.Fatal("expected error from Set") + } +} + +func TestRoute_NilProviderWithSensitive_Errors(t *testing.T) { + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"secret_key": "S"}, + Sensitive: map[string]bool{"secret_key": true}, + } + _, _, err := Route(context.Background(), nil, "myres", out) + if err == nil { + t.Fatal("expected error when provider nil and Sensitive non-empty") + } +} + +func TestRoute_NilProviderWithoutSensitive_OK(t *testing.T) { + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"bucket": "b"}, + } + sanitized, hydrated, err := Route(context.Background(), nil, "myres", out) + if err != nil { + t.Fatalf("Route: %v", err) + } + if sanitized["bucket"] != "b" { + t.Errorf("nil-provider non-sensitive corrupted: %v", sanitized) + } + if len(hydrated) != 0 { + t.Errorf("expected empty hydrated, got %v", hydrated) + } +} + +func TestRoute_EmptyResourceName_Errors(t *testing.T) { + p := newFakeProvider() + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"secret_key": "S"}, + Sensitive: map[string]bool{"secret_key": true}, + } + _, _, err := Route(context.Background(), p, "", out) + if err == nil { + t.Fatal("expected error on empty resourceName") + } +} + +func TestRoute_DeterministicSetOrder(t *testing.T) { + // Multiple sensitive keys: ensure Set order is sorted by key (deterministic). + p := newFakeProvider() + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"a_key": "A", "b_key": "B", "c_key": "C"}, + Sensitive: map[string]bool{"a_key": true, "b_key": true, "c_key": true}, + } + _, _, err := Route(context.Background(), p, "r", out) + if err != nil { + t.Fatalf("Route: %v", err) + } + want := []string{"r_a_key", "r_b_key", "r_c_key"} + if len(p.setLog) != len(want) { + t.Fatalf("setLog len: got %v want %v", p.setLog, want) + } + for i, w := range want { + if p.setLog[i] != w { + t.Errorf("setLog[%d] = %v want %v", i, p.setLog[i], w) + } + } +} + +func TestRevoke_DeletesAllKeys(t *testing.T) { + p := newFakeProvider() + p.values["r_secret_key"] = "S" + p.values["r_access_key"] = "A" + if err := Revoke(context.Background(), p, "r", []string{"secret_key", "access_key"}); err != nil { + t.Fatalf("Revoke: %v", err) + } + if _, ok := p.values["r_secret_key"]; ok { + t.Errorf("secret_key not deleted") + } + if _, ok := p.values["r_access_key"]; ok { + t.Errorf("access_key not deleted") + } +} + +func TestRevoke_AggregatesErrors(t *testing.T) { + p := newFakeProvider() + p.values["r_secret_key"] = "S" + p.values["r_access_key"] = "A" + p.delErr["r_secret_key"] = errors.New("boom1") + p.delErr["r_access_key"] = errors.New("boom2") + err := Revoke(context.Background(), p, "r", []string{"secret_key", "access_key"}) + if err == nil { + t.Fatal("expected aggregated error") + } + // both should appear; order is sorted by key + msg := err.Error() + if !contains(msg, "boom1") || !contains(msg, "boom2") { + t.Errorf("aggregated error missing one or both: %q", msg) + } +} + +func TestRevoke_ContinuesAfterError(t *testing.T) { + // First key errors; second key should still be Deleted. + p := newFakeProvider() + p.values["r_secret_key"] = "S" + p.values["r_access_key"] = "A" + p.delErr["r_secret_key"] = errors.New("boom") + _ = Revoke(context.Background(), p, "r", []string{"secret_key", "access_key"}) + if _, ok := p.values["r_access_key"]; ok { + t.Error("access_key not deleted (Revoke should continue past first error)") + } +} + +func TestIsPlaceholder(t *testing.T) { + cases := map[string]bool{ + "secret_ref://x": true, + "secret_ref://r_key": true, + "secret_ref://": true, // edge: empty key after prefix; still has prefix + "secret://x": false, + "plain": false, + "": false, + } + for in, want := range cases { + if got := IsPlaceholder(in); got != want { + t.Errorf("IsPlaceholder(%q) = %v, want %v", in, got, want) + } + } + // non-string input + if IsPlaceholder(42) { + t.Error("IsPlaceholder(int) should be false") + } + if IsPlaceholder(nil) { + t.Error("IsPlaceholder(nil) should be false") + } +} + +func TestMaskSensitiveForDiff_MasksPlaceholdersAndDriverKeys(t *testing.T) { + desired := map[string]any{"region": "nyc3", "secret_key": "should-mask", "bucket": "b"} + current := map[string]any{"region": "nyc3", "secret_key": "secret_ref://r_secret_key", "bucket": "b"} + d2, c2 := MaskSensitiveForDiff([]string{"secret_key"}, desired, current) + if _, ok := d2["secret_key"]; ok { + t.Errorf("desired should have secret_key elided, got %v", d2["secret_key"]) + } + if _, ok := c2["secret_key"]; ok { + t.Errorf("current should have secret_key elided, got %v", c2["secret_key"]) + } + if d2["region"] != "nyc3" || c2["region"] != "nyc3" { + t.Errorf("non-sensitive keys must survive: d=%v c=%v", d2["region"], c2["region"]) + } +} + +func TestMaskSensitiveForDiff_PlaceholderInDesired(t *testing.T) { + // Edge: a desired map carrying a placeholder shouldn't leak it into Diff. + desired := map[string]any{"k": "secret_ref://r_k"} + current := map[string]any{"k": "secret_ref://r_k"} + d2, c2 := MaskSensitiveForDiff(nil, desired, current) + if _, ok := d2["k"]; ok { + t.Errorf("placeholder in desired should be elided") + } + if _, ok := c2["k"]; ok { + t.Errorf("placeholder in current should be elided") + } +} + +func contains(s, sub string) bool { + for i := 0; i+len(sub) <= len(s); i++ { + if s[i:i+len(sub)] == sub { + return true + } + } + return false +} +``` + +**Step 1.2: Run tests, confirm fail** + +Run: `cd iac/sensitive && go test ./...` +Expected: FAIL (package does not exist; `Route` / `Revoke` / `IsPlaceholder` / `MaskSensitiveForDiff` undefined). + +**Step 1.3: Implement `iac/sensitive/route.go`** + +```go +// Package sensitive routes ResourceOutput fields flagged as Sensitive +// through a secrets.Provider, returning sanitized placeholders for state +// persistence and a hydrated map for in-process consumers. +// +// Per the engine-sensitive-output-routing design (workflow v0.27.0): +// - Route is invoked on Create/Update only. Read/Adoption/Refresh paths +// use Sanitize-only logic (not in this package — see +// cmd/wfctl/infra_apply.go) to prevent cache pollution. +// - The placeholder format "secret_ref://<resource>_<key>" is distinct +// from the user-supplied "secret://<key>" config-reference convention. +// - Routing trigger is exclusively out.Sensitive[k]==true (per-call +// dynamic). ResourceDriver.SensitiveKeys() is NOT consulted here; +// it remains a display-masking signal. +package sensitive + +import ( + "context" + "errors" + "fmt" + "sort" + "strings" + + "github.com/GoCodeAlone/workflow/interfaces" + "github.com/GoCodeAlone/workflow/secrets" +) + +// PlaceholderPrefix is the URI scheme used in state.Outputs values to +// reference a routed secret stored in the configured secrets.Provider. +// Distinct from secrets.SecretPrefix ("secret://") which is for +// user-supplied config references. +const PlaceholderPrefix = "secret_ref://" + +// SecretKey returns the canonical secrets.Provider key for a resource's +// output: "<resourceName>_<outputKey>". Exported so audit-state-secrets +// and other consumers can reverse-engineer routed-secret names. +func SecretKey(resourceName, outputKey string) string { + return resourceName + "_" + outputKey +} + +// Placeholder returns the canonical "secret_ref://<resourceName>_<outputKey>" +// string that replaces a routed value in state.Outputs. +func Placeholder(resourceName, outputKey string) string { + return PlaceholderPrefix + SecretKey(resourceName, outputKey) +} + +// IsPlaceholder reports whether v is a string with the PlaceholderPrefix. +// Non-string values return false. +func IsPlaceholder(v any) bool { + s, ok := v.(string) + if !ok { + return false + } + return strings.HasPrefix(s, PlaceholderPrefix) +} + +// Route routes sensitive fields from out through provider, keying each +// secret as "<resourceName>_<outputKey>". Returns: +// +// - sanitized: a copy of out.Outputs with sensitive values replaced by +// PlaceholderPrefix + SecretKey(resourceName, k). Suitable for +// persistence to interfaces.IaCStateStore. +// - hydrated: a flat map keyed by SecretKey of values that were routed. +// Suitable for in-process hand-off to post-apply consumers in the +// same wfctl invocation. Empty when no fields were routed. +// +// Routing trigger is out.Sensitive[k] == true with out.Outputs[k] present +// (any value, including empty string). When the sensitive key's value is +// absent from out.Outputs the key is silently SKIPPED — neither +// provider.Set is called nor a placeholder inserted (the engine has no +// value to route; existing routed-secret in provider stays as-is). +// +// Errors: +// - resourceName == "" with non-empty Sensitive map → error (defensive; +// out.Name is intentionally NOT consulted, since Read/Adoption paths +// may have empty out.Name). +// - provider == nil with non-empty Sensitive map AND any sensitive key +// present in out.Outputs → error naming the resource and keys. +// - provider.Set returns an error → error wrapping the failed key. Set +// is invoked in sorted order by key for determinism; on first error +// the loop stops and previously-Set values are NOT cleaned up +// (idempotent overwrite on Apply rerun is the recovery path). +// +// Out is not mutated. +func Route( + ctx context.Context, + provider secrets.Provider, + resourceName string, + out *interfaces.ResourceOutput, +) (sanitized map[string]any, hydrated map[string]string, err error) { + if out == nil { + return nil, nil, fmt.Errorf("sensitive.Route: out is nil") + } + // Build sanitized as a copy of Outputs (or empty map). Hydrated is + // allocated lazily — kept nil-or-empty when no routing happens. + sanitized = make(map[string]any, len(out.Outputs)) + for k, v := range out.Outputs { + sanitized[k] = v + } + + // Collect the sensitive keys whose value is present in Outputs. + // Sort for deterministic Set order. + var routableKeys []string + for k, flag := range out.Sensitive { + if !flag { + continue + } + if _, present := out.Outputs[k]; !present { + continue + } + routableKeys = append(routableKeys, k) + } + sort.Strings(routableKeys) + + if len(routableKeys) == 0 { + return sanitized, nil, nil + } + if resourceName == "" { + return nil, nil, fmt.Errorf("sensitive.Route: resourceName is empty (sensitive keys: %v)", routableKeys) + } + if provider == nil { + return nil, nil, fmt.Errorf("sensitive.Route: no secrets.Provider configured but resource %q has sensitive output keys %v", resourceName, routableKeys) + } + + hydrated = make(map[string]string, len(routableKeys)) + for _, k := range routableKeys { + val, err := stringifyOutput(out.Outputs[k]) + if err != nil { + return nil, nil, fmt.Errorf("sensitive.Route: resource %q key %q: %w", resourceName, k, err) + } + secretName := SecretKey(resourceName, k) + if setErr := provider.Set(ctx, secretName, val); setErr != nil { + return nil, nil, fmt.Errorf("sensitive.Route: provider.Set(%q): %w", secretName, setErr) + } + sanitized[k] = Placeholder(resourceName, k) + hydrated[secretName] = val + } + return sanitized, hydrated, nil +} + +// stringifyOutput coerces an output value to string. The secrets.Provider +// API takes string values; non-string sensitive outputs are not supported +// in v0.27.0 (would need encoding decisions out of scope here). +func stringifyOutput(v any) (string, error) { + switch s := v.(type) { + case string: + return s, nil + default: + return "", fmt.Errorf("sensitive output value type %T not supported (must be string)", v) + } +} + +// Revoke deletes routed secrets for resourceName. mergedKeys is the union +// of placeholder-derived keys (caller extracts from pre-delete +// state.Outputs) and any legacy heuristic keys. Errors from +// provider.Delete are aggregated via errors.Join — Revoke does NOT stop +// on the first error so partial cleanup proceeds. Keys that were never +// stored (provider returns secrets.ErrNotFound) are silently treated as +// success. +func Revoke( + ctx context.Context, + provider secrets.Provider, + resourceName string, + mergedKeys []string, +) error { + if provider == nil { + return nil // no-op when no provider configured + } + if resourceName == "" { + return fmt.Errorf("sensitive.Revoke: resourceName is empty") + } + // Sort for determinism (test stability + log readability). + sorted := append([]string(nil), mergedKeys...) + sort.Strings(sorted) + + var errs []error + for _, k := range sorted { + secretName := SecretKey(resourceName, k) + if delErr := provider.Delete(ctx, secretName); delErr != nil { + if errors.Is(delErr, secrets.ErrNotFound) { + continue + } + errs = append(errs, fmt.Errorf("delete %q: %w", secretName, delErr)) + } + } + if len(errs) > 0 { + return errors.Join(errs...) + } + return nil +} + +// MaskSensitiveForDiff returns copies of desired and current with sensitive +// keys elided from BOTH sides. A key is considered sensitive when: +// +// - it is named in driverKeys (i.e., ResourceDriver.SensitiveKeys()), OR +// - its value in current matches the PlaceholderPrefix. +// +// Eliding from both sides ensures driver.Diff or other field-by-field +// comparators don't report drift when state has a placeholder and live +// has a different (or absent) value. Non-sensitive keys are passed +// through unchanged. +// +// Either input may be nil; the corresponding output is also nil. +func MaskSensitiveForDiff(driverKeys []string, desired, current map[string]any) (map[string]any, map[string]any) { + mask := make(map[string]struct{}, len(driverKeys)) + for _, k := range driverKeys { + mask[k] = struct{}{} + } + // Augment with placeholder-derived keys from current. + for k, v := range current { + if IsPlaceholder(v) { + mask[k] = struct{}{} + } + } + // Also augment from desired in case a desired-side placeholder leaked in + // (unusual but defensive). + for k, v := range desired { + if IsPlaceholder(v) { + mask[k] = struct{}{} + } + } + return copyExcept(desired, mask), copyExcept(current, mask) +} + +func copyExcept(in map[string]any, exclude map[string]struct{}) map[string]any { + if in == nil { + return nil + } + out := make(map[string]any, len(in)) + for k, v := range in { + if _, skip := exclude[k]; skip { + continue + } + out[k] = v + } + return out +} +``` + +**Step 1.4: Run tests, confirm pass** + +Run: `cd iac/sensitive && go test ./... -v` +Expected: All tests PASS. Specifically these names: `TestRoute_NoSensitive_PassesThrough`, `TestRoute_SensitiveValuePresent_RoutesAndSanitizes`, `TestRoute_SensitiveKeyAbsentFromOutputs_Skipped`, `TestRoute_SensitiveFalseValue_NotRouted`, `TestRoute_ProviderSetError_ReturnsError`, `TestRoute_NilProviderWithSensitive_Errors`, `TestRoute_NilProviderWithoutSensitive_OK`, `TestRoute_EmptyResourceName_Errors`, `TestRoute_DeterministicSetOrder`, `TestRevoke_DeletesAllKeys`, `TestRevoke_AggregatesErrors`, `TestRevoke_ContinuesAfterError`, `TestIsPlaceholder`, `TestMaskSensitiveForDiff_MasksPlaceholdersAndDriverKeys`, `TestMaskSensitiveForDiff_PlaceholderInDesired`. + +**Step 1.5: Run race + vet + lint locally** + +Run: +- `go test -race ./iac/sensitive/...` → PASS +- `go vet ./iac/sensitive/...` → no output +- `golangci-lint run ./iac/sensitive/...` → no findings + +**Step 1.6: Commit** + +```bash +git add iac/sensitive/route.go iac/sensitive/route_test.go +git commit -m "$(cat <<'EOF' +feat(iac/sensitive): Route/Revoke/IsPlaceholder/MaskSensitiveForDiff helpers + +New iac/sensitive package for engine-side routing of ResourceOutput +sensitive fields through secrets.Provider. Free functions, no struct. +Placeholder format "secret_ref://<resource>_<key>" is distinct from +the existing user-supplied "secret://<key>" config-reference convention. + +Routing trigger is exclusively out.Sensitive[k]==true; SensitiveKeys() +remains a display-masking-only signal per design rev1. + +Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> +EOF +)" +``` + +--- + +### Task 2: `persistResourceWithSecretRouting` helper + Apply call-site rewires (sites 1 & 2) + +**Files:** +- Modify: `cmd/wfctl/infra_apply.go` (add helper + rewire two state-write sites at :550-557 and :1032-1040) +- Create: `cmd/wfctl/infra_apply_sensitive_routing_test.go` + +**Change class:** Internal logic + apply path (runtime-affecting). Verification: unit + integration tests; **rollback note required**. + +**Rollback:** Revert this commit; the helper is additive and the call sites' literal `Outputs: r.Outputs` shape is one-revert away. + +**Step 2.1: Write the failing integration tests** + +Create `cmd/wfctl/infra_apply_sensitive_routing_test.go`: + +```go +package main + +import ( + "context" + "errors" + "fmt" + "strings" + "testing" + "time" + + "github.com/GoCodeAlone/workflow/iac/sensitive" + "github.com/GoCodeAlone/workflow/interfaces" + "github.com/GoCodeAlone/workflow/secrets" +) + +// envProvider is the test secret store: an in-memory env-like map. +type envProvider struct{ values map[string]string } + +func newEnvProvider() *envProvider { return &envProvider{values: map[string]string{}} } +func (p *envProvider) Name() string { return "env" } +func (p *envProvider) Get(_ context.Context, k string) (string, error) { + v, ok := p.values[k] + if !ok { + return "", secrets.ErrNotFound + } + return v, nil +} +func (p *envProvider) Set(_ context.Context, k, v string) error { p.values[k] = v; return nil } +func (p *envProvider) Delete(_ context.Context, k string) error { delete(p.values, k); return nil } +func (p *envProvider) List(_ context.Context) ([]string, error) { + out := make([]string, 0, len(p.values)) + for k := range p.values { + out = append(out, k) + } + return out, nil +} + +// stubStore captures SaveResource calls for assertions. +type stubStore struct { + saved []interfaces.ResourceState + saveErr error + deleted []string +} + +func (s *stubStore) SaveResource(_ context.Context, st interfaces.ResourceState) error { + if s.saveErr != nil { + return s.saveErr + } + s.saved = append(s.saved, st) + return nil +} +func (s *stubStore) GetResource(_ context.Context, name string) (*interfaces.ResourceState, error) { + for i := range s.saved { + if s.saved[i].Name == name { + return &s.saved[i], nil + } + } + return nil, fmt.Errorf("not found") +} +func (s *stubStore) ListResources(_ context.Context) ([]interfaces.ResourceState, error) { + return s.saved, nil +} +func (s *stubStore) DeleteResource(_ context.Context, n string) error { s.deleted = append(s.deleted, n); return nil } +func (s *stubStore) SavePlan(_ context.Context, _ interfaces.IaCPlan) error { return nil } +func (s *stubStore) GetPlan(_ context.Context, _ string) (*interfaces.IaCPlan, error) { return nil, nil } +func (s *stubStore) Lock(_ context.Context, _ string, _ time.Duration) (interfaces.IaCLockHandle, error) { return nil, nil } +func (s *stubStore) Close() error { return nil } + +// stubDriver captures Delete calls (for compensating-Delete tests). +type stubSensitiveDriver struct { + deleteCalls []interfaces.ResourceRef + deleteErr error +} + +func (d *stubSensitiveDriver) Create(_ context.Context, _ interfaces.ResourceSpec) (*interfaces.ResourceOutput, error) { return nil, nil } +func (d *stubSensitiveDriver) Read(_ context.Context, _ interfaces.ResourceRef) (*interfaces.ResourceOutput, error) { return nil, nil } +func (d *stubSensitiveDriver) Update(_ context.Context, _ interfaces.ResourceRef, _ interfaces.ResourceSpec) (*interfaces.ResourceOutput, error) { return nil, nil } +func (d *stubSensitiveDriver) Delete(_ context.Context, ref interfaces.ResourceRef) error { + d.deleteCalls = append(d.deleteCalls, ref) + return d.deleteErr +} +func (d *stubSensitiveDriver) Diff(_ context.Context, _ interfaces.ResourceSpec, _ *interfaces.ResourceOutput) (*interfaces.DiffResult, error) { return nil, nil } +func (d *stubSensitiveDriver) HealthCheck(_ context.Context, _ interfaces.ResourceRef) (*interfaces.HealthResult, error) { return nil, nil } +func (d *stubSensitiveDriver) Scale(_ context.Context, _ interfaces.ResourceRef, _ int) (*interfaces.ResourceOutput, error) { return nil, nil } +func (d *stubSensitiveDriver) SensitiveKeys() []string { return nil } + +func TestPersistResourceWithSecretRouting_RoutesSensitiveAndSanitizesState(t *testing.T) { + prov := newEnvProvider() + store := &stubStore{} + drv := &stubSensitiveDriver{} + out := interfaces.ResourceOutput{ + Name: "myres", Type: "infra.spaces_key", ProviderID: "AKIA", + Outputs: map[string]any{"access_key": "AK", "secret_key": "SK", "bucket": "b"}, + Sensitive: map[string]bool{"access_key": true, "secret_key": true}, + } + rs := interfaces.ResourceState{ + ID: "myres", Name: "myres", Type: "infra.spaces_key", + Provider: "digitalocean", ProviderID: "AKIA", + } + hydrated, err := persistResourceWithSecretRouting(context.Background(), store, prov, drv, rs, out, persistModeApply) + if err != nil { + t.Fatalf("persist: %v", err) + } + if len(store.saved) != 1 { + t.Fatalf("expected 1 saved, got %d", len(store.saved)) + } + state := store.saved[0] + if state.Outputs["secret_key"] != "secret_ref://myres_secret_key" { + t.Errorf("state secret_key not sanitized: %v", state.Outputs["secret_key"]) + } + if state.Outputs["access_key"] != "secret_ref://myres_access_key" { + t.Errorf("state access_key not sanitized: %v", state.Outputs["access_key"]) + } + if state.Outputs["bucket"] != "b" { + t.Errorf("state bucket lost: %v", state.Outputs["bucket"]) + } + if prov.values["myres_secret_key"] != "SK" { + t.Errorf("provider missing secret_key value") + } + if hydrated["myres_secret_key"] != "SK" { + t.Errorf("hydrated missing secret_key: %v", hydrated) + } +} + +func TestPersistResourceWithSecretRouting_NoProviderHardFails(t *testing.T) { + store := &stubStore{} + drv := &stubSensitiveDriver{} + out := interfaces.ResourceOutput{ + Outputs: map[string]any{"secret_key": "SK"}, + Sensitive: map[string]bool{"secret_key": true}, + } + rs := interfaces.ResourceState{Name: "myres"} + _, err := persistResourceWithSecretRouting(context.Background(), store, nil, drv, rs, out, persistModeApply) + if err == nil { + t.Fatal("expected error when provider nil and sensitive non-empty") + } + if !strings.Contains(err.Error(), "myres") { + t.Errorf("error should name resource, got %q", err.Error()) + } + if len(store.saved) != 0 { + t.Error("state should NOT be saved when routing fails") + } +} + +func TestPersistResourceWithSecretRouting_SaveFailureCompensatesWithDelete(t *testing.T) { + prov := newEnvProvider() + store := &stubStore{saveErr: errors.New("disk full")} + drv := &stubSensitiveDriver{} + out := interfaces.ResourceOutput{ + Name: "myres", ProviderID: "AKIA", + Outputs: map[string]any{"secret_key": "SK"}, + Sensitive: map[string]bool{"secret_key": true}, + } + rs := interfaces.ResourceState{Name: "myres", Type: "infra.spaces_key", ProviderID: "AKIA"} + _, err := persistResourceWithSecretRouting(context.Background(), store, prov, drv, rs, out, persistModeApply) + if err == nil { + t.Fatal("expected error from SaveResource") + } + if !strings.Contains(err.Error(), "disk full") { + t.Errorf("error should wrap original SaveResource err, got %q", err.Error()) + } + if len(drv.deleteCalls) != 1 { + t.Errorf("expected 1 compensating Delete call, got %d", len(drv.deleteCalls)) + } + if _, ok := prov.values["myres_secret_key"]; ok { + t.Errorf("compensating Delete should have removed routed secret; got %v", prov.values) + } +} + +func TestPersistResourceWithSecretRouting_NoSensitivePassesThrough(t *testing.T) { + store := &stubStore{} + drv := &stubSensitiveDriver{} + out := interfaces.ResourceOutput{ + Outputs: map[string]any{"bucket": "b"}, + } + rs := interfaces.ResourceState{Name: "myres"} + hydrated, err := persistResourceWithSecretRouting(context.Background(), store, nil, drv, rs, out, persistModeApply) + if err != nil { + t.Fatalf("persist: %v", err) + } + if len(hydrated) != 0 { + t.Errorf("hydrated should be empty: %v", hydrated) + } + if len(store.saved) != 1 { + t.Fatalf("expected 1 saved, got %d", len(store.saved)) + } + if store.saved[0].Outputs["bucket"] != "b" { + t.Errorf("non-sensitive output corrupted: %v", store.saved[0].Outputs) + } +} + +// Idempotent re-Apply: routing twice with same value is safe. +func TestPersistResourceWithSecretRouting_Idempotent(t *testing.T) { + prov := newEnvProvider() + store := &stubStore{} + drv := &stubSensitiveDriver{} + out := interfaces.ResourceOutput{ + Outputs: map[string]any{"secret_key": "SK"}, + Sensitive: map[string]bool{"secret_key": true}, + } + rs := interfaces.ResourceState{Name: "myres"} + for i := 0; i < 2; i++ { + _, err := persistResourceWithSecretRouting(context.Background(), store, prov, drv, rs, out, persistModeApply) + if err != nil { + t.Fatalf("persist iter %d: %v", i, err) + } + } + if prov.values["myres_secret_key"] != "SK" { + t.Errorf("provider value lost on re-Apply: %v", prov.values) + } + if len(store.saved) != 2 { + t.Errorf("expected 2 saved, got %d", len(store.saved)) + } +} +``` + +Note: `infraStateStore` interface in `cmd/wfctl/infra_state_store.go:22` matches the methods defined here on `stubStore`. The test file uses `interfaces.IaCStateStore` directly via the helper signature; the helper takes the broader `interfaces.IaCStateStore` so test-stub coupling is minimal. + +**Step 2.2: Run tests, confirm fail** + +Run: `cd cmd/wfctl && go test -run TestPersistResourceWithSecretRouting -v` +Expected: FAIL — `persistResourceWithSecretRouting` undefined; `persistModeApply` undefined. + +**Step 2.3: Implement the helper in `cmd/wfctl/infra_apply.go`** + +Add near the bottom of `cmd/wfctl/infra_apply.go` (after `cloneMap` at ~line 736): + +```go +// persistMode controls how persistResourceWithSecretRouting handles a +// driver output: persistModeApply routes sensitive fields through the +// provider; persistModeRead is sanitize-only (no provider writes — used +// by adoption / refresh paths to avoid cache pollution). +type persistMode int + +const ( + persistModeApply persistMode = iota + persistModeRead +) + +// persistResourceWithSecretRouting builds rs.Outputs from out (routing +// sensitive fields through provider in apply mode), calls +// store.SaveResource, and returns the hydrated routed-secret map for +// in-process hand-off. On SaveResource failure after provider.Set +// succeeded (apply mode only), invokes driver.Delete + provider.Delete +// to compensate the partial cloud-resource creation. Returns a wrapped +// error naming both the original SaveResource failure and the +// compensating-Delete outcome. +// +// In read mode, the helper does NOT call provider.Set; instead it consults +// the prior state via store.GetResource and inherits any +// PlaceholderPrefix entries; new sensitive keys (declared by the driver +// at Read time but not previously routed) are dropped from sanitized. +// +// Returns nil hydrated map in read mode (consumers don't need post-apply +// hand-off for a Read). +// +// Note: store is the wfctl-internal infraStateStore interface (defined in +// cmd/wfctl/infra_state_store.go) — NOT interfaces.IaCStateStore. The +// helper lives in the cmd/wfctl package; using the package-private +// interface keeps the boundary clean and matches existing callers +// (applyWithProviderAndStore, adoptExistingResources). The fakes in +// _test.go implement infraStateStore (a smaller surface than the engine +// IaCStateStore: SaveResource / GetResource / ListResources / +// DeleteResource / Close). +func persistResourceWithSecretRouting( + ctx context.Context, + store infraStateStore, + provider secrets.Provider, + driver interfaces.ResourceDriver, + rs interfaces.ResourceState, + out interfaces.ResourceOutput, + mode persistMode, +) (map[string]string, error) { + switch mode { + case persistModeApply: + return persistApplyMode(ctx, store, provider, driver, rs, out) + case persistModeRead: + return nil, persistReadMode(ctx, store, rs, out) + default: + return nil, fmt.Errorf("persistResourceWithSecretRouting: unknown mode %d", mode) + } +} + +func persistApplyMode( + ctx context.Context, + store infraStateStore, + provider secrets.Provider, + driver interfaces.ResourceDriver, + rs interfaces.ResourceState, + out interfaces.ResourceOutput, +) (map[string]string, error) { + sanitized, hydrated, err := sensitive.Route(ctx, provider, rs.Name, &out) + if err != nil { + return nil, fmt.Errorf("%s/%s: route sensitive outputs: %w", rs.Type, rs.Name, err) + } + rs.Outputs = sanitized + if saveErr := store.SaveResource(ctx, rs); saveErr != nil { + // Compensating Delete: if we routed secrets, the matching cloud + // resource is real but the state record didn't land. Roll back so + // a re-Apply doesn't double-create. + if len(hydrated) > 0 { + compErr := compensateAfterSaveFailure(ctx, provider, driver, rs, hydrated) + if compErr != nil { + return nil, fmt.Errorf("%s/%s: persist state after apply: %w (compensating delete failed: %v)", rs.Type, rs.Name, saveErr, compErr) + } + return nil, fmt.Errorf("%s/%s: persist state after apply: %w (compensating delete succeeded)", rs.Type, rs.Name, saveErr) + } + return nil, fmt.Errorf("%s/%s: persist state after apply: %w", rs.Type, rs.Name, saveErr) + } + return hydrated, nil +} + +func persistReadMode( + ctx context.Context, + store infraStateStore, + rs interfaces.ResourceState, + out interfaces.ResourceOutput, +) error { + // Sanitize-only: inherit placeholders from prior state for sensitive + // keys; drop newly-declared sensitive keys. Do NOT call provider.Set. + prior, _ := store.GetResource(ctx, rs.Name) // best-effort; nil-safe + sanitized := make(map[string]any, len(out.Outputs)) + for k, v := range out.Outputs { + sanitized[k] = v + } + for k, flag := range out.Sensitive { + if !flag { + continue + } + // If prior state has a placeholder for this key, preserve it. + if prior != nil { + if pv, ok := prior.Outputs[k]; ok && sensitive.IsPlaceholder(pv) { + sanitized[k] = pv + continue + } + } + // Otherwise drop the field from sanitized — we don't have a + // previously-routed secret and we won't pollute the provider + // from a Read. + delete(sanitized, k) + } + rs.Outputs = sanitized + if err := store.SaveResource(ctx, rs); err != nil { + return fmt.Errorf("%s/%s: persist state after read: %w", rs.Type, rs.Name, err) + } + return nil +} + +func compensateAfterSaveFailure( + ctx context.Context, + provider secrets.Provider, + driver interfaces.ResourceDriver, + rs interfaces.ResourceState, + hydrated map[string]string, +) error { + var errs []error + if driver != nil { + ref := interfaces.ResourceRef{Name: rs.Name, Type: rs.Type, ProviderID: rs.ProviderID} + if delErr := driver.Delete(ctx, ref); delErr != nil { + errs = append(errs, fmt.Errorf("driver.Delete: %w", delErr)) + } + } + if provider != nil { + // Reverse-engineer the original output keys from the secret names. + // Format: "<rs.Name>_<output_key>"; strip the prefix. + for secretName := range hydrated { + if delErr := provider.Delete(ctx, secretName); delErr != nil && !errors.Is(delErr, secrets.ErrNotFound) { + errs = append(errs, fmt.Errorf("provider.Delete(%s): %w", secretName, delErr)) + } + } + } + if len(errs) > 0 { + return errors.Join(errs...) + } + return nil +} +``` + +Add the imports if absent (Go fmt will hoist them; explicit list for clarity): + +```go +import ( + "errors" // already present + // ... existing imports ... + "github.com/GoCodeAlone/workflow/iac/sensitive" + "github.com/GoCodeAlone/workflow/secrets" +) +``` + +**Step 2.4: Rewire state-write call sites 1 & 2 in `cmd/wfctl/infra_apply.go`** + +Site 1 — `applyWithProviderAndStore` at lines 540-557. Replace the literal `ResourceState{...Outputs: r.Outputs ...}` block: + +Find the existing block: +```go + now := time.Now().UTC() + rs := interfaces.ResourceState{ + ID: r.Name, + Name: r.Name, + Type: r.Type, + Provider: providerType, + ProviderRef: providerRef, + ProviderID: r.ProviderID, + ConfigHash: configHashMap(appliedCfg), + AppliedConfig: appliedCfg, + AppliedConfigSource: "apply", + Outputs: r.Outputs, + Dependencies: dependencies, + CreatedAt: now, + UpdatedAt: now, + } + if saveErr := store.SaveResource(ctx, rs); saveErr != nil { + return fmt.Errorf("%s/%s: persist state after apply: %w", r.Type, r.Name, saveErr) + } +``` + +Replace with: +```go + now := time.Now().UTC() + rs := interfaces.ResourceState{ + ID: r.Name, + Name: r.Name, + Type: r.Type, + Provider: providerType, + ProviderRef: providerRef, + ProviderID: r.ProviderID, + ConfigHash: configHashMap(appliedCfg), + AppliedConfig: appliedCfg, + AppliedConfigSource: "apply", + // Outputs is set by persistResourceWithSecretRouting after Route. + Dependencies: dependencies, + CreatedAt: now, + UpdatedAt: now, + } + driver, _ := provider.ResourceDriver(r.Type) // best-effort for compensating Delete; nil-safe in helper + h, persistErr := persistResourceWithSecretRouting(ctx, store, secretsProvider, driver, rs, r, persistModeApply) + if persistErr != nil { + return persistErr + } + for k, v := range h { + hydratedAll[k] = v + } +``` + +Site 2 — In-process apply path at lines 1022-1040. Same replacement pattern; identical helper invocation. + +**Required signature changes** (apply both functions): + +1. `applyWithProviderAndStore` (`cmd/wfctl/infra_apply.go:360`) — add `cfgFile string` parameter, change return to `(map[string]string, error)`: + +```go +func applyWithProviderAndStore( + ctx context.Context, + provider interfaces.IaCProvider, + providerType string, + cfgFile string, // NEW + specs []interfaces.ResourceSpec, + current []interfaces.ResourceState, + store infraStateStore, + w io.Writer, + envName string, +) (map[string]string, error) { + hydratedAll := make(map[string]string) + secretsProvider, err := loadSecretsProviderForRouting(cfgFile) + if err != nil { + return nil, err + } + // ... existing body ... + return hydratedAll, nil // single trailing return at every existing return-nil site +} +``` + +2. `applyPrecomputedPlanWithStore` (`cmd/wfctl/infra_apply.go:910`) — same parameter + return changes: + +```go +func applyPrecomputedPlanWithStore( + ctx context.Context, + plan interfaces.IaCPlan, + provider interfaces.IaCProvider, + providerType string, + cfgFile string, // NEW + store infraStateStore, + w io.Writer, + envName string, +) (map[string]string, error) +``` + +Loader helper (in `infra_apply.go`, near the bottom): + +```go +// loadSecretsProviderForRouting returns the configured secrets.Provider for +// this apply run, or (nil, nil) when secretsCfg is absent. The caller's +// downstream sensitive.Route will hard-fail if any driver emits sensitive +// outputs and provider is nil. cfgFile is the same resolved infra.yaml path +// the rest of the apply pipeline uses. +func loadSecretsProviderForRouting(cfgFile string) (secrets.Provider, error) { + cfg, err := parseSecretsConfig(cfgFile) + if err != nil { + return nil, fmt.Errorf("parse secrets config for sensitive routing: %w", err) + } + if cfg == nil { + return nil, nil + } + return resolveSecretsProvider(cfg) +} +``` + +3. Update the **caller** at `cmd/wfctl/infra_apply.go:267` (and the precomputed-plan caller at `:886`) to pass `cfgFile`: + +```go +// Was: +return applyWithProviderAndStore(ctx, provider, g.provType, g.specs, scopedCurrent, store, os.Stderr, envName) +// Becomes: +hyd, err := applyWithProviderAndStore(ctx, provider, g.provType, cfgFile, g.specs, scopedCurrent, store, os.Stderr, envName) +if err != nil { + return err +} +for k, v := range hyd { + runHydrated[k] = v +} +return nil +``` + +`runHydrated` is a function-scope `map[string]string` declared at the top of `applyInfraModules` (or the equivalent caller). After the dispatch loop completes, `runHydrated` is passed into `syncInfraOutputSecrets` (Step 2.5). + +`cfgFile` is already in scope at `infra_apply.go:244` (passed as parameter to `applyInfraModules`). + +4. Update existing tests at `cmd/wfctl/infra_apply_allow_replace_test.go:223,267` (and any other test calling `applyWithProviderAndStore`) to: + - pass empty string for the new `cfgFile` parameter. + - update return-value handling: `_, err := applyWithProviderAndStore(...)`. + +Pre-flight grep to enumerate every call site that must update: + +``` +grep -rn "applyWithProviderAndStore(\|applyPrecomputedPlanWithStore(" cmd/wfctl/ +``` + +5. **Preserve existing pre-save validation.** The original code at `infra_apply.go:520-524` calls `validateOutputProviderID(provider, providerType, &r)` BEFORE state save. The replacement MUST keep this call unchanged, BEFORE invoking `persistResourceWithSecretRouting`: + +```go +// Hard-fail when the driver returns a malformed ProviderID for a strict format. +if err := validateOutputProviderID(provider, providerType, &r); err != nil { + return nil, fmt.Errorf("state write rejected: %w", err) +} +// THEN call persistResourceWithSecretRouting (replaces the old SaveResource call) +``` + +Same preservation requirement at the in-process apply path (`:1004` `validateOutputProviderID` call). + +6. **Pre-flight check: detect sensitive-emitting drivers before the persistence loop.** When `secretsProvider == nil` AND `result.Resources` contains any `*ResourceOutput` with non-empty `Sensitive` map, surface a single clear error BEFORE entering the per-resource persistence loop. This prevents partial apply (one resource erroring mid-loop while previous ones already persisted plaintext-but-sanitized state). + +```go +if secretsProvider == nil { + for i := range result.Resources { + if hasSensitiveOutputs(&result.Resources[i]) { + return nil, fmt.Errorf( + "secrets.Provider not configured but driver emitted sensitive outputs (resource %q has Sensitive map %v); add `secrets:` block to your config or use `secrets: { provider: env }`", + result.Resources[i].Name, sensitiveKeysFor(&result.Resources[i])) + } + } +} +// ... persistence loop unchanged ... + +func hasSensitiveOutputs(r *interfaces.ResourceOutput) bool { + for _, v := range r.Sensitive { + if v { + return true + } + } + return false +} + +func sensitiveKeysFor(r *interfaces.ResourceOutput) []string { + var keys []string + for k, v := range r.Sensitive { + if v { + keys = append(keys, k) + } + } + sort.Strings(keys) + return keys +} +``` + +Same pre-flight in `applyPrecomputedPlanWithStore`. + +7. **Compensation context isolation.** The `compensateAfterSaveFailure` helper inherits the apply ctx; if the apply was canceled, Delete may also fail. Use a fresh 30-second timeout context for compensation specifically: + +```go +func compensateAfterSaveFailure( + parentCtx context.Context, /* logging only */ + provider secrets.Provider, + driver interfaces.ResourceDriver, + rs interfaces.ResourceState, + hydrated map[string]string, +) error { + // Fresh context — compensation must run even on parent ctx cancel. + ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second) + defer cancel() + // ... existing body uses ctx, not parentCtx ... +} +``` + +8. **String-only sensitive values (v0.27.0 limitation).** `stringifyOutput` errors on non-string values. Document this as the v0.27.0 limitation in `iac/sensitive/route.go`'s package comment AND in DOCUMENTATION.md (Task 6 §6.1 "Sensitive output routing" section). Future expansion (e.g., `MarshalSensitive` interface) is out of scope. + +**Step 2.5: Rewire `runInfraApply` to thread `hydratedAll` to `syncInfraOutputSecrets`** + +In `cmd/wfctl/infra.go`, around line 1414-1450, modify `applyInfraModules` (or wherever `applyWithProviderAndStore` is invoked) to capture the returned `hydratedAll` map and pass it to `syncInfraOutputSecrets`. + +Update `syncInfraOutputSecrets` signature in `cmd/wfctl/infra_output_secrets.go:100`: + +```go +func syncInfraOutputSecrets( + ctx context.Context, + secretsCfg *SecretsConfig, + provider secrets.Provider, + states []interfaces.ResourceState, + wfCfg *config.WorkflowConfig, + envName string, + hydrated map[string]string, // NEW: same-process routed-secret hand-off +) error { + // ... existing body ... + // Modify resolveInfraOutput call site to consult hydrated FIRST. +} +``` + +Modify `resolveInfraOutput` (`cmd/wfctl/infra_output_secrets.go:37`): + +```go +func resolveInfraOutput(wfCfg *config.WorkflowConfig, source, envName string, stateOutputs map[string]map[string]any, hydrated map[string]string) (string, error) { + // ... existing module-name resolution ... + + val, ok := outputs[field] + if !ok { + return "", fmt.Errorf("infra_output: field %q not found in outputs of module %q", field, moduleName) + } + // If state has a placeholder, prefer the hydrated map. + if sensitive.IsPlaceholder(val) { + secretName := strings.TrimPrefix(val.(string), sensitive.PlaceholderPrefix) + if hv, hok := hydrated[secretName]; hok { + return hv, nil + } + // Fall back to provider.Get (nil-safe; tests cover write-only-host case). + // The provider is reachable through secretsCfg already loaded by caller. + return "", fmt.Errorf("infra_output: field %q is a routed-secret placeholder %q; not in same-process hydrated map (write-only providers cannot rehydrate cold)", field, val) + } + s, ok := val.(string) + if !ok { + return "", fmt.Errorf("infra_output: output field %q of module %q is %T, expected string", field, moduleName, val) + } + return s, nil +} +``` + +Update all call sites of `resolveInfraOutput` (currently at `infra_output_secrets.go:163` and `infra_resolve_state.go:123`) to pass the `hydrated` map. For the `infra_resolve_state.go` site (drift detection / state preview), pass `nil` — that path doesn't have a hydrated map. The error path on placeholder + nil hydrated is the documented constraint. + +**Step 2.5.1: Update existing tests for the new `hydrated` parameter** + +Update every test file that calls `syncInfraOutputSecrets` or `resolveInfraOutput` to pass `nil` (or appropriate map) for the new parameter: + +``` +grep -rn "syncInfraOutputSecrets(\|resolveInfraOutput(" cmd/wfctl/ +``` + +Expected hits include: +- `cmd/wfctl/infra_output_secrets_test.go:108-146` — multiple `syncInfraOutputSecrets` calls; add `nil` as the trailing argument. +- Any other test at the same call sites. + +For each existing call, append `nil` (or the appropriate map for tests that exercise the hydrated path). + +**Step 2.5.2: Add `nil`-hydrated-map fallback test in `infra_output_secrets_test.go`** + +Add a test that verifies: state has placeholder, hydrated is nil, provider has the value via Get → resolveInfraOutput succeeds. Plus the inverse: state has placeholder, hydrated is nil, provider returns ErrUnsupported → resolveInfraOutput returns the documented "write-only providers cannot rehydrate cold" error. + +**Step 2.6: Run tests, confirm pass** + +Run: `cd cmd/wfctl && go test -run TestPersistResourceWithSecretRouting -v` +Expected: All 5 TestPersistResource* tests PASS. + +Run: `go test ./cmd/wfctl/... ./iac/sensitive/... ./secrets/...` +Expected: All PASS — the rewired call sites do not break existing apply tests. + +Specifically check: `TestApplyPlan_*`, `TestRunInfra*` test names should still pass. If existing tests fail because they don't account for the new helper signature, update them to pass nil/empty for the new params. + +**Step 2.7: Run race + vet + lint** + +Run: +- `go test -race ./cmd/wfctl/... ./iac/sensitive/...` → PASS +- `go vet ./...` → no output +- `golangci-lint run ./cmd/wfctl/... ./iac/sensitive/...` → no findings + +**Step 2.7.bis: Runtime-launch-validation smoke** + +Task 2 modifies the apply runtime path. Per `superpowers:runtime-launch-validation`, build the artifact and exercise it under realistic conditions: + +```sh +go build -o wfctl ./cmd/wfctl +mkdir -p /tmp/wfctl-routing-smoke && cd /tmp/wfctl-routing-smoke +``` + +Create `infra-no-secrets.yaml` (no `secrets:` block) and `infra-with-env.yaml` (with `secrets: { provider: env, config: { prefix: WFCTL_TEST_ } }`). Each declares an iac.state file backend + at least one stub-provider resource. (Implementer borrows from `cmd/wfctl/testdata/` if a sample exists; otherwise constructs minimally.) + +Run two scenarios — capture transcripts in PR body: + +1. **No-secrets-cfg + sensitive-emitting driver** → expected: hard fail with the documented error message naming the resource and sensitive keys. +2. **env-provider configured + sensitive-emitting driver** → expected: apply succeeds; `env | grep WFCTL_TEST_` shows the routed value; `cat <state-file>.json` shows `secret_ref://...` placeholders, no plaintext. + +If a stub provider is not readily available, this validation runs against `mock` provider + an additional unit test that asserts both behaviours via the helper directly. Document the substitution in the PR body. + +**Rollback note for Task 2:** revert this commit; helper + call-site rewires are additive. Existing state files written under v0.27.0 retain `secret_ref://` placeholders; v0.26.x consumers see them as literal strings (rotate affected secrets via `wfctl infra bootstrap --force-rotate <name>` before downgrading). + +**Step 2.8: Commit** + +```bash +git add cmd/wfctl/infra_apply.go cmd/wfctl/infra_apply_sensitive_routing_test.go \ + cmd/wfctl/infra.go cmd/wfctl/infra_output_secrets.go cmd/wfctl/infra_resolve_state.go +git commit -m "$(cat <<'EOF' +feat(wfctl): persistResourceWithSecretRouting + Apply call-site routing + +Introduces persistResourceWithSecretRouting helper that funnels both +state-write call sites in applyWithProviderAndStore + in-process apply. +Routes per-call Sensitive ResourceOutput fields through secrets.Provider +into "secret_ref://" placeholders before SaveResource. + +On SaveResource failure post-Set, compensates with driver.Delete + +provider.Delete to prevent orphan cloud resources. + +In-memory hydrated map flows from apply through to syncInfraOutputSecrets +for same-process consumers (works on write-only GitHub provider). + +Rollback: revert this commit; the helper is additive and call sites +revert to the prior literal Outputs: r.Outputs shape in one diff. + +Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> +EOF +)" +``` + +--- + +### Task 3: Read/Adoption/Refresh sanitize-only at sites 3, 4, 5 + +**Files:** +- Modify: `cmd/wfctl/infra_apply.go` (`adoptExistingResources` site at :637, `resourceStateFromLiveOutput` builder at :689-705) +- Modify: `cmd/wfctl/infra_refresh_outputs.go` (site at :244) + +**Change class:** Internal logic + apply path (runtime-affecting). Verification: unit tests + integration tests. + +**Rollback:** Revert this commit; the sanitize-only logic is additive. + +**Step 3.1: Write failing tests** + +Append to `cmd/wfctl/infra_apply_sensitive_routing_test.go`: + +```go +func TestPersistResourceWithSecretRouting_ReadModeSanitizeOnly_PreservesPriorPlaceholder(t *testing.T) { + prov := newEnvProvider() // should not be touched + // Pre-existing state has a placeholder + store := &stubStore{ + saved: []interfaces.ResourceState{ + {Name: "myres", Outputs: map[string]any{"secret_key": "secret_ref://myres_secret_key", "bucket": "b"}}, + }, + } + out := interfaces.ResourceOutput{ + Outputs: map[string]any{"bucket": "b"}, // Read can't re-emit secret_key + Sensitive: map[string]bool{"secret_key": true}, + } + rs := interfaces.ResourceState{Name: "myres"} + _, err := persistResourceWithSecretRouting(context.Background(), store, prov, &stubSensitiveDriver{}, rs, out, persistModeRead) + if err != nil { + t.Fatalf("persist read-mode: %v", err) + } + if len(prov.values) != 0 { + t.Errorf("Read mode must NOT call provider.Set; got %v", prov.values) + } + if len(store.saved) != 2 { + t.Fatalf("expected 2 saves (initial + this), got %d", len(store.saved)) + } + latest := store.saved[1] + if latest.Outputs["secret_key"] != "secret_ref://myres_secret_key" { + t.Errorf("Read mode lost prior placeholder: %v", latest.Outputs["secret_key"]) + } + if latest.Outputs["bucket"] != "b" { + t.Errorf("Read mode lost bucket: %v", latest.Outputs["bucket"]) + } +} + +func TestPersistResourceWithSecretRouting_ReadModeNewSensitiveKey_Dropped(t *testing.T) { + // No prior state; driver newly declares sensitive on Read. + prov := newEnvProvider() + store := &stubStore{} + out := interfaces.ResourceOutput{ + Outputs: map[string]any{"secret_key": "FRESH-FROM-CLOUD-CACHE", "bucket": "b"}, + Sensitive: map[string]bool{"secret_key": true}, + } + rs := interfaces.ResourceState{Name: "myres"} + _, err := persistResourceWithSecretRouting(context.Background(), store, prov, &stubSensitiveDriver{}, rs, out, persistModeRead) + if err != nil { + t.Fatalf("persist read-mode: %v", err) + } + if len(prov.values) != 0 { + t.Errorf("Read mode must NOT call provider.Set even for newly-declared sensitive; got %v", prov.values) + } + if len(store.saved) != 1 { + t.Fatalf("expected 1 save, got %d", len(store.saved)) + } + if _, ok := store.saved[0].Outputs["secret_key"]; ok { + t.Errorf("newly-declared sensitive (no prior placeholder) should be dropped; got %v", store.saved[0].Outputs["secret_key"]) + } + if store.saved[0].Outputs["bucket"] != "b" { + t.Errorf("non-sensitive bucket lost: %v", store.saved[0].Outputs["bucket"]) + } +} +``` + +**Step 3.2: Run, confirm fail** + +Run: `cd cmd/wfctl && go test -run TestPersistResourceWithSecretRouting_ReadMode -v` +Expected: PASS already from Task 2's persistReadMode (the helper already supports read mode). If FAIL, fix the implementation per the test expectations. + +If they pass already, great — proceed to call-site rewires. + +**Step 3.3: Rewire site 3 (`adoptExistingResources` at :637)** + +In `cmd/wfctl/infra_apply.go`, the adopt path constructs state from `live *interfaces.ResourceOutput` via `resourceStateFromLiveOutput` and calls `store.SaveResource`. Replace with: + +```go + state, err := resourceStateFromLiveOutput(spec, providerType, live) + if err != nil { + return nil, err + } + if err := validateStateProviderID(provider, providerType, state); err != nil { + return nil, err + } + // Sanitize-only via persistResourceWithSecretRouting in read mode: + // driver may declare Sensitive on Read but we MUST NOT call + // provider.Set from a Read path (cache-pollution risk per design §4.4). + // Provider may be nil; helper is nil-safe in read mode. + if _, err := persistResourceWithSecretRouting(ctx, store, nil, driver, state, *live, persistModeRead); err != nil { + return nil, fmt.Errorf("%s/%s: persist adopted state: %w", spec.Type, spec.Name, err) + } +``` + +Remove the original `if saveErr := store.SaveResource(...)` call directly above (the helper now does it). + +**Step 3.4: Rewire site 4 (`resourceStateFromLiveOutput` at :689-705)** + +This is the builder, not a save site. The `Outputs: cloneMap(live.Outputs)` line is correct as-is for the builder; sanitization happens in step 3.3 via the helper. No change needed at site 4 itself. + +**Step 3.5: Rewire site 5 (`infra_refresh_outputs.go:244`)** + +```go + // Replace: + // if err := store.SaveResource(ctx, fresh); err != nil { + // ... + // } + // With: + ro := interfaces.ResourceOutput{ + Name: fresh.Name, Type: fresh.Type, ProviderID: fresh.ProviderID, + Outputs: fresh.Outputs, + } + // driver.Sensitive is per-call; reconstruct from driver.SensitiveKeys + // for backward-compat — refresh path doesn't have the per-call map. + if drv, derr := provider.ResourceDriver(fresh.Type); derr == nil && drv != nil { + sk := drv.SensitiveKeys() + if len(sk) > 0 { + ro.Sensitive = make(map[string]bool, len(sk)) + for _, k := range sk { + ro.Sensitive[k] = true + } + } + } + if _, err := persistResourceWithSecretRouting(ctx, store, nil, nil, fresh, ro, persistModeRead); err != nil { + return fmt.Errorf("refresh outputs %s: %w", fresh.Name, err) + } +``` + +The refresh path uses `SensitiveKeys()` (the static driver declaration) as the masking source since refresh works from the state record, not from a fresh per-call `ResourceOutput.Sensitive` map. This is the **only** consumer of `SensitiveKeys()` for sanitization (still distinct from routing — Read paths NEVER route). + +**Step 3.6: Run all tests** + +Run: `go test ./cmd/wfctl/... ./iac/sensitive/...` +Expected: PASS. Existing adoption + refresh tests must continue to pass; if they assert on `state.Outputs[<sensitive>]` value, update to match the sanitize-only behaviour (placeholder if prior state; absent if no prior). + +**Step 3.7: Run race + vet + lint** + +Run: +- `go test -race ./cmd/wfctl/... ./iac/sensitive/...` → PASS +- `go vet ./...` → no output +- `golangci-lint run ./cmd/wfctl/...` → no findings + +**Step 3.7.bis: Runtime-launch-validation smoke for refresh path** + +Task 3 modifies the refresh + adoption runtime paths. Build wfctl and exercise: + +```sh +go build -o wfctl ./cmd/wfctl +# Using the same fixture from Step 2.7.bis: +cd /tmp/wfctl-routing-smoke +# After Task 2's apply ran successfully (placeholders in state): +./wfctl infra refresh-outputs -c infra-with-env.yaml 2>&1 | tee refresh.log +``` + +Expected: refresh completes; state file's `secret_ref://...` placeholder for `secret_key` is preserved (verified via `cat <state-file>.json | grep secret_ref`); env vars (the routed-secret store) are NOT modified by the Read path (verify by snapshotting `env | grep WFCTL_TEST_` before and after — should be identical). + +**Rollback note for Task 3:** revert this commit; sites revert to direct `store.SaveResource(ctx, fresh)`. State written under Task 3 has the same shape as Task 2 — same downgrade considerations apply. + +**Step 3.8: Commit** + +```bash +git add cmd/wfctl/infra_apply.go cmd/wfctl/infra_refresh_outputs.go cmd/wfctl/infra_apply_sensitive_routing_test.go +git commit -m "$(cat <<'EOF' +feat(wfctl): sanitize-only routing for adoption + refresh paths + +Read paths (adoptExistingResources, runInfraRefreshOutputs) now route +state writes through persistResourceWithSecretRouting in persistModeRead. +The helper inherits placeholders from prior state and drops +newly-declared sensitive keys — never calls provider.Set from a Read +path (prevents cache pollution per design §4.4). + +Rollback: revert this commit; sites revert to direct store.SaveResource. + +Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> +EOF +)" +``` + +--- + +### Task 4: `wfctl infra audit-state-secrets` command + +**Files:** +- Create: `cmd/wfctl/infra_audit_state_secrets.go` +- Create: `cmd/wfctl/infra_audit_state_secrets_test.go` +- Modify: `cmd/wfctl/infra.go` (subcommand wiring) + +**Change class:** New CLI command. Verification: unit tests + `wfctl infra audit-state-secrets --help` + representative invocation (CLI command class). + +**Rollback:** Revert this commit; command is additive. + +**Step 4.1: Write failing tests for `audit-state-secrets`** + +Create `cmd/wfctl/infra_audit_state_secrets_test.go`: + +```go +package main + +import ( + "bytes" + "context" + "strings" + "testing" + + "github.com/GoCodeAlone/workflow/iac/sensitive" + "github.com/GoCodeAlone/workflow/interfaces" + "github.com/GoCodeAlone/workflow/secrets" +) + +func TestAuditStateSecrets_NoFindings_ExitZero(t *testing.T) { + store := &stubStore{ + saved: []interfaces.ResourceState{ + {Name: "ok", Outputs: map[string]any{"bucket": "b", "region": "nyc3"}}, + }, + } + prov := newEnvProvider() + w := &bytes.Buffer{} + rc := runAuditStateSecrets(context.Background(), w, store, prov) + if rc != 0 { + t.Errorf("rc = %d, want 0", rc) + } + if !strings.Contains(w.String(), "no findings") { + t.Errorf("expected 'no findings', got: %s", w.String()) + } +} + +func TestAuditStateSecrets_OrphanInProvider(t *testing.T) { + // Provider has a routed-secret "ghost_secret_key" but state has no ghost resource. + store := &stubStore{ + saved: []interfaces.ResourceState{{Name: "live", Outputs: map[string]any{"bucket": "b"}}}, + } + prov := newEnvProvider() + prov.values["ghost_secret_key"] = "ORPHAN" + w := &bytes.Buffer{} + rc := runAuditStateSecrets(context.Background(), w, store, prov) + if rc != 1 { + t.Errorf("rc = %d, want 1", rc) + } + if !strings.Contains(w.String(), "orphan") || !strings.Contains(w.String(), "ghost_secret_key") { + t.Errorf("expected orphan finding for ghost_secret_key; got: %s", w.String()) + } +} + +func TestAuditStateSecrets_LegacyPlaintext(t *testing.T) { + store := &stubStore{ + saved: []interfaces.ResourceState{ + {Name: "legacy", Outputs: map[string]any{"secret_key": "PLAINTEXT-SECRET"}}, + }, + } + prov := newEnvProvider() + w := &bytes.Buffer{} + rc := runAuditStateSecrets(context.Background(), w, store, prov) + if rc != 1 { + t.Errorf("rc = %d, want 1", rc) + } + if !strings.Contains(w.String(), "legacy plaintext") || !strings.Contains(w.String(), "legacy") { + t.Errorf("expected legacy plaintext finding; got: %s", w.String()) + } +} + +func TestAuditStateSecrets_PlaceholderMissingValue(t *testing.T) { + // State has a placeholder but provider doesn't have the secret. + store := &stubStore{ + saved: []interfaces.ResourceState{ + {Name: "broken", Outputs: map[string]any{"secret_key": "secret_ref://broken_secret_key"}}, + }, + } + prov := newEnvProvider() + w := &bytes.Buffer{} + rc := runAuditStateSecrets(context.Background(), w, store, prov) + if rc != 1 { + t.Errorf("rc = %d, want 1", rc) + } + if !strings.Contains(w.String(), "missing routed value") || !strings.Contains(w.String(), "broken_secret_key") { + t.Errorf("expected missing-routed-value finding; got: %s", w.String()) + } +} + +func TestAuditStateSecrets_MistakenSecretConfigRefInState(t *testing.T) { + // State contains a "secret://" string (user-config syntax leaked into state). + store := &stubStore{ + saved: []interfaces.ResourceState{ + {Name: "weird", Outputs: map[string]any{"token": "secret://my_token"}}, + }, + } + prov := newEnvProvider() + w := &bytes.Buffer{} + rc := runAuditStateSecrets(context.Background(), w, store, prov) + if rc != 1 { + t.Errorf("rc = %d, want 1", rc) + } + if !strings.Contains(w.String(), "config-reference in state") { + t.Errorf("expected config-reference-in-state finding; got: %s", w.String()) + } +} + +func TestAuditStateSecrets_Prune_DeletesOrphans(t *testing.T) { + store := &stubStore{ + saved: []interfaces.ResourceState{{Name: "live", Outputs: map[string]any{"bucket": "b"}}}, + } + prov := newEnvProvider() + prov.values["ghost_secret_key"] = "ORPHAN" + w := &bytes.Buffer{} + rc := runAuditStateSecretsWithPrune(context.Background(), w, store, prov, true) + if rc != 0 { // 0 because pruning resolves the issue + t.Errorf("rc = %d, want 0 after prune", rc) + } + if _, ok := prov.values["ghost_secret_key"]; ok { + t.Errorf("orphan secret not pruned: %v", prov.values) + } +} + +func TestAuditStateSecrets_ListUnsupported_ReportsAdvisory(t *testing.T) { + // GitHub-style provider returning ErrUnsupported for Get; we should still + // audit state-side findings and emit a structured "list unsupported" advisory. + store := &stubStore{ + saved: []interfaces.ResourceState{ + {Name: "ok", Outputs: map[string]any{"secret_key": "secret_ref://ok_secret_key"}}, + }, + } + prov := &writeOnlyProvider{} + w := &bytes.Buffer{} + rc := runAuditStateSecrets(context.Background(), w, store, prov) + // rc 1 because we can't verify the routed value; OR rc 0 with advisory only. + // Design choice: rc 1 (cannot verify), but the advisory is informational. + if rc == 2 { + t.Errorf("rc=2 reserved for hard audit errors; should not fire on write-only providers") + } + if !strings.Contains(w.String(), "list unsupported") && !strings.Contains(w.String(), "Get unsupported") { + t.Errorf("expected write-only provider advisory; got: %s", w.String()) + } +} + +// writeOnlyProvider mimics GitHubSecretsProvider Get/List ErrUnsupported. +type writeOnlyProvider struct{ envProvider } + +func newWriteOnlyProvider() *writeOnlyProvider { return &writeOnlyProvider{envProvider{values: map[string]string{}}} } +func (p *writeOnlyProvider) Get(_ context.Context, _ string) (string, error) { return "", secrets.ErrUnsupported } +func (p *writeOnlyProvider) List(_ context.Context) ([]string, error) { return nil, secrets.ErrUnsupported } +``` + +**Step 4.2: Run, confirm fail** + +Run: `cd cmd/wfctl && go test -run TestAuditStateSecrets -v` +Expected: FAIL — `runAuditStateSecrets` undefined. + +**Step 4.3: Implement `cmd/wfctl/infra_audit_state_secrets.go`** + +```go +package main + +import ( + "context" + "errors" + "flag" + "fmt" + "io" + "sort" + "strings" + + "github.com/GoCodeAlone/workflow/iac/sensitive" + "github.com/GoCodeAlone/workflow/interfaces" + "github.com/GoCodeAlone/workflow/secrets" +) + +// runInfraAuditStateSecrets is the CLI entry point for +// `wfctl infra audit-state-secrets`. +// +// Walks every entry in IaCStateStore. For each Outputs[k] that is: +// - a "secret_ref://<name>" placeholder → confirm secrets.Provider has <name>. +// - a plaintext value matching secrets.DefaultSensitiveKeys() → flag legacy. +// - a "secret://<key>" string → flag mistaken config-reference in state. +// Then walks secrets.Provider.List() (when supported) for any +// "<resource>_<key>" name whose <resource> is NOT in IaCStateStore → +// orphan, candidate for prune. +// +// Exit codes: +// 0 no findings +// 1 findings (legacy plaintext, missing routed values, orphan secrets, +// mistaken config-references) +// 2 audit error (cannot read state, etc.) +func runInfraAuditStateSecrets(args []string, w io.Writer) int { + fs := flag.NewFlagSet("infra audit-state-secrets", flag.ContinueOnError) + fs.SetOutput(w) + var configFile string + fs.StringVar(&configFile, "c", "infra.yaml", "Config file") + fs.StringVar(&configFile, "config", "infra.yaml", "Config file") + var prune bool + fs.BoolVar(&prune, "prune", false, "Delete confirmed orphan secrets from secrets.Provider") + if err := fs.Parse(args); err != nil { + return 2 + } + + cfg, err := parseSecretsConfig(configFile) + if err != nil { + fmt.Fprintf(w, "audit-state-secrets: parse %q: %v\n", configFile, err) + return 2 + } + if cfg == nil { + fmt.Fprintln(w, "audit-state-secrets: no secrets config; nothing to audit") + return 0 + } + prov, err := resolveSecretsProvider(cfg) + if err != nil { + fmt.Fprintf(w, "audit-state-secrets: resolve provider: %v\n", err) + return 2 + } + + store, err := openInfraStateStore(configFile) + if err != nil { + fmt.Fprintf(w, "audit-state-secrets: open state store: %v\n", err) + return 2 + } + defer store.Close() + + return runAuditStateSecretsWithPrune(context.Background(), w, store, prov, prune) +} + +// runAuditStateSecrets is the testable entry point (no flag parsing). +func runAuditStateSecrets(ctx context.Context, w io.Writer, store interfaces.IaCStateStore, prov secrets.Provider) int { + return runAuditStateSecretsWithPrune(ctx, w, store, prov, false) +} + +func runAuditStateSecretsWithPrune(ctx context.Context, w io.Writer, store interfaces.IaCStateStore, prov secrets.Provider, prune bool) int { + states, err := store.ListResources(ctx) + if err != nil { + fmt.Fprintf(w, "audit-state-secrets: list state: %v\n", err) + return 2 + } + + findings := 0 + stateNames := map[string]struct{}{} + for i := range states { + stateNames[states[i].Name] = struct{}{} + } + + defaultSensitive := map[string]struct{}{} + for _, k := range secrets.DefaultSensitiveKeys() { + defaultSensitive[k] = struct{}{} + } + + // Walk state for placeholder/plaintext/config-ref findings. + for i := range states { + st := &states[i] + // Stable iteration order for output. + keys := make([]string, 0, len(st.Outputs)) + for k := range st.Outputs { + keys = append(keys, k) + } + sort.Strings(keys) + + for _, k := range keys { + v := st.Outputs[k] + s, isStr := v.(string) + if !isStr { + continue + } + switch { + case sensitive.IsPlaceholder(v): + secretName := strings.TrimPrefix(s, sensitive.PlaceholderPrefix) + _, getErr := prov.Get(ctx, secretName) + if getErr == nil { + // All good. + continue + } + if errors.Is(getErr, secrets.ErrUnsupported) { + fmt.Fprintf(w, "ADVISORY (list unsupported / Get unsupported): cannot verify routed value for %s/%s -> %q on this provider\n", st.Name, k, secretName) + continue + } + if errors.Is(getErr, secrets.ErrNotFound) { + fmt.Fprintf(w, "FINDING (missing routed value): %s/%s expects routed secret %q but provider does not have it\n", st.Name, k, secretName) + findings++ + } + case strings.HasPrefix(s, secrets.SecretPrefix): + fmt.Fprintf(w, "FINDING (config-reference in state): %s/%s contains user-config-style %q (expected resolved value or %s placeholder)\n", st.Name, k, s, sensitive.PlaceholderPrefix) + findings++ + default: + if _, isSensName := defaultSensitive[k]; isSensName && s != "" { + fmt.Fprintf(w, "FINDING (legacy plaintext): %s/%s = <plaintext>; rotate via wfctl infra bootstrap --force-rotate or re-apply\n", st.Name, k) + findings++ + } + } + } + } + + // Walk provider for orphan secrets. + names, err := prov.List(ctx) + switch { + case err == nil: + sort.Strings(names) + for _, name := range names { + // Names follow "<resource>_<output_key>". Strip the suffix that matches + // any sensitive output key we know; if the prefix isn't a state name, + // it's an orphan. + res := stripKnownSuffix(name) + if _, ok := stateNames[res]; ok { + continue + } + if prune { + if delErr := prov.Delete(ctx, name); delErr != nil { + fmt.Fprintf(w, "PRUNE FAILED: %q: %v\n", name, delErr) + findings++ + } else { + fmt.Fprintf(w, "pruned orphan secret %q\n", name) + } + continue + } + fmt.Fprintf(w, "FINDING (orphan secret): %q has no matching state resource; rerun with --prune to delete\n", name) + findings++ + } + case errors.Is(err, secrets.ErrUnsupported): + fmt.Fprintln(w, "ADVISORY (list unsupported): provider does not support List(); orphan-secret detection skipped on this host") + default: + fmt.Fprintf(w, "audit-state-secrets: list provider secrets: %v\n", err) + return 2 + } + + if findings > 0 { + fmt.Fprintf(w, "\naudit-state-secrets: %d finding(s)\n", findings) + return 1 + } + fmt.Fprintln(w, "audit-state-secrets: no findings") + return 0 +} + +// stripKnownSuffix returns the resource-name prefix of a routed-secret name. +// Tries DefaultSensitiveKeys suffixes; falls back to the original name (which +// will then fail the state-name lookup and be flagged orphan). +func stripKnownSuffix(name string) string { + for _, k := range secrets.DefaultSensitiveKeys() { + suf := "_" + k + if strings.HasSuffix(name, suf) { + return name[:len(name)-len(suf)] + } + } + return name +} +``` + +Wire the subcommand in `cmd/wfctl/infra.go` (around the existing `case "audit-secrets":` dispatch at line 88): + +```go +case "audit-state-secrets": + if rc := runInfraAuditStateSecrets(args[1:], os.Stdout); rc != 0 { + return fmt.Errorf("audit-state-secrets exited with code %d", rc) + } + return nil +``` + +Update help text at line 116: + +``` + audit-state-secrets Audit state.Outputs vs. secrets.Provider for orphans, legacy, missing +``` + +**Step 4.5: Run tests, confirm pass** + +Run: `cd cmd/wfctl && go test -run TestAuditStateSecrets -v` +Expected: All 7 TestAuditStateSecrets* tests PASS. + +Run: `go test ./cmd/wfctl/...` +Expected: full suite PASS. + +**Step 4.6: Manual command verification** + +Run: +``` +cd /Users/jon/workspace/workflow/_worktrees/engine-sensitive-output-routing && go build -o wfctl ./cmd/wfctl +./wfctl infra audit-state-secrets --help +``` +Expected: usage text including `--config`, `--prune`. Exit 0. + +``` +./wfctl infra audit-state-secrets -c testdata/empty-secrets.yaml +``` +(implementer creates `testdata/empty-secrets.yaml` if needed for smoke; expected output: "no findings"; exit 0) + +**Step 4.7: Run race + vet + lint** + +Run: +- `go test -race ./cmd/wfctl/... ./iac/sensitive/...` → PASS +- `go vet ./...` → no output +- `golangci-lint run ./cmd/wfctl/... ./iac/sensitive/...` → no findings + +**Step 4.8: Commit** + +```bash +git add cmd/wfctl/infra_audit_state_secrets.go cmd/wfctl/infra_audit_state_secrets_test.go \ + cmd/wfctl/infra.go +git commit -m "$(cat <<'EOF' +feat(wfctl): infra audit-state-secrets command + +Adds wfctl infra audit-state-secrets command per design §4.7: + - placeholder-without-routed-value findings + - legacy plaintext-in-state findings + - mistaken secret://... config-reference findings + - orphan secret findings (provider has it; no matching state) + - --prune to delete confirmed orphans + +Distinct from existing audit-secrets which audits secrets.generate +config block for anti-patterns. + +Rollback: revert this commit; command is additive. + +Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> +EOF +)" +``` + +--- + +### Task 5: Drift masking enumeration + wiring (or documented no-op) + +**Files:** +- Modify: `cmd/wfctl/infra_apply_refresh.go` (if Diff call sites exist) +- Possibly modify: any other `cmd/wfctl/*.go` that calls `driver.Diff` against state-Outputs + +**Change class:** Internal logic; conditional on enumeration result. Verification: unit tests OR documented enumeration that no in-tree call sites exist. + +**Rollback:** Revert this commit (if any code change); enumeration is documentation-only and harmless to keep. + +**Step 5.1: Enumerate every in-tree Diff call site that sees state-Outputs** + +Run: +```sh +grep -rn "driver\.Diff(\|d\.Diff(\|\.Diff(ctx," --include="*.go" cmd/wfctl/ iac/ +``` + +For each match, classify: +- (a) Receives state.Outputs (potentially with placeholders) as `current` → MUST mask before Diff. +- (b) Receives only fresh cloud Outputs (live Read result) → no masking needed (Read paths are sanitize-only post-Task 3 anyway). +- (c) Test files → skip; tests pass synthetic data. + +**Step 5.2a: If (a)-class sites exist, wire `sensitive.MaskSensitiveForDiff`** + +For each (a)-class site: + +```go +import "github.com/GoCodeAlone/workflow/iac/sensitive" + +// Replace: +result, err := driver.Diff(ctx, desiredSpec, currentOutput) +// With: +maskedDesired, maskedCurrent := sensitive.MaskSensitiveForDiff(driver.SensitiveKeys(), desiredSpec.Config, currentOutput.Outputs) +maskedSpec := desiredSpec +maskedSpec.Config = maskedDesired +maskedOut := *currentOutput +maskedOut.Outputs = maskedCurrent +result, err := driver.Diff(ctx, maskedSpec, &maskedOut) +``` + +Add a unit test per call site that verifies the masking is in effect (state has placeholder; Diff receives a map without the sensitive key). + +**Step 5.2b: If NO (a)-class sites exist, document the no-op** + +Add a comment to `cmd/wfctl/infra_apply_refresh.go` (top-of-file or above the refresh logic): + +```go +// Note: as of v0.27.0, no in-tree call site dispatches driver.Diff against +// state.Outputs that may contain sensitive.PlaceholderPrefix entries. +// Per-provider Diff implementations receive desired/current via gRPC and +// are out of scope for engine-side masking. iac/sensitive.MaskSensitiveForDiff +// is exported for future in-tree consumers. +``` + +Add `iac/sensitive/route_test.go` test `TestMaskSensitiveForDiff_*` (already in Task 1) is the validation surface. + +**Step 5.3: Run tests + lint** + +Run: +- `go test ./cmd/wfctl/... ./iac/sensitive/...` → PASS +- `go vet ./...` → no output +- `golangci-lint run ./cmd/wfctl/...` → no findings + +**Step 5.4: Commit** + +```bash +# If 5.2a path: +git add cmd/wfctl/infra_apply_refresh.go cmd/wfctl/<other-modified>.go +# If 5.2b path: +git add cmd/wfctl/infra_apply_refresh.go + +git commit -m "$(cat <<'EOF' +feat(wfctl): drift-masking wiring for sensitive state outputs + +[For 5.2a: cite specific files] OR +[For 5.2b: documents that no in-tree Diff call site sees state.Outputs +with placeholders; iac/sensitive.MaskSensitiveForDiff stays exported +for future consumers.] + +Rollback: revert this commit; masking is additive. + +Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> +EOF +)" +``` + +--- + +### Task 6: Documentation update + DOCUMENTATION.md / WFCTL.md / CHANGELOG + +**Files:** +- Modify: `DOCUMENTATION.md` +- Modify: `docs/WFCTL.md` +- Modify: `CHANGELOG.md` + +**Change class:** Documentation. Verification: spell-check + render preview + rollback validation (Step 6.4). + +**Step 6.1: Add a section to `DOCUMENTATION.md` titled "Sensitive output routing"** + +Append (or insert in an appropriate place near existing IaC sections): + +```markdown +### Sensitive output routing (v0.27.0+) + +When a `ResourceDriver.Create` or `Update` returns a `*ResourceOutput` +with `Sensitive: {key: true}`, the engine routes those output values +through the configured `secrets.Provider` instead of writing them +plaintext to state. + +**Routed secret naming:** `<resource_name>_<output_key>` (e.g., a +resource named `coredump-deploy-key` with `secret_key` output is stored +under `coredump-deploy-key_secret_key`). + +**State placeholder:** Routed fields appear in `state.Outputs` as +`secret_ref://<resource>_<key>` strings. Distinct from the user-supplied +`secret://<key>` config-reference convention. + +**Required configuration:** A `secrets:` block in your workflow config +(see `secrets.* providers` below). For local/ad-hoc runs, the simplest +option is environment variables: + +```yaml +secrets: + provider: env + config: + prefix: WORKFLOW_ +``` + +**Failure modes:** + +- `secrets.Provider` not configured AND a driver emits sensitive outputs: + apply hard-fails with a named-resource error pointing to this section. +- `secrets.Provider.Set` fails: apply errors; rerun is idempotent. +- `state.SaveResource` fails after `Set` succeeded: the engine compensates + by calling `driver.Delete` + `provider.Delete`, then surfaces a + combined error. + +**Recovery:** `wfctl infra audit-state-secrets` audits state for orphan +secrets, missing routed values, and legacy plaintext fields. Run it +after a failed apply to triage: + +``` +wfctl infra audit-state-secrets --config infra.yaml [--prune] +``` + +**Read paths (refresh, adoption):** never call `provider.Set`; placeholders +are inherited from prior state to avoid cache pollution. + +**Drift detection:** sensitive keys are masked from both desired and +current sides before `driver.Diff` to avoid false-positive drift on +keys that the cloud refuses to re-emit (e.g., DO Spaces `secret_key`). + +**Cold-start consumers:** `secret_ref://` placeholders cannot be +rehydrated cross-process on write-only providers (GitHub Actions). +Same-process consumers (`infra_output:` generators in the same +`wfctl infra apply` invocation) get the routed value via in-memory +hand-off. Cross-process consumers reference the routed secret by name +via `secret://<resource>_<key>` directly. +``` + +**Step 6.2: Add `audit-state-secrets` to `docs/WFCTL.md`** + +In the `wfctl infra` section, add after `audit-secrets`: + +```markdown +#### infra audit-state-secrets + +Audit `state.Outputs` against the configured `secrets.Provider` for orphan +secrets, missing routed values, legacy plaintext, and mistaken +config-references in state. + +``` +wfctl infra audit-state-secrets [--config infra.yaml] [--prune] +``` + +**Findings:** + +- **orphan secret** — provider has `<resource>_<key>` but no state + resource named `<resource>` exists. +- **missing routed value** — state has `secret_ref://...` placeholder + but provider does not have the secret. +- **legacy plaintext** — state has plaintext value at a key matching + `secrets.DefaultSensitiveKeys()` (e.g., `secret_key`, `password`). +- **config-reference in state** — state contains `secret://...` (user + config syntax leaked into a persisted state field). + +**Exit codes:** 0 = no findings; 1 = findings; 2 = audit error. + +**`--prune`:** delete confirmed orphan secrets from the provider. +Idempotent; safe to rerun. + +Distinct from `audit-secrets` which audits the `secrets.generate` config +block for anti-patterns. Run both as part of regular hygiene. +``` + +**Step 6.3: Add CHANGELOG entry** + +In `CHANGELOG.md`, under the next-version section (v0.27.0): + +```markdown +## v0.27.0 (unreleased) + +### Added + +- **Engine-side sensitive-output routing**: `ResourceDriver` outputs + flagged with `Sensitive: {key: true}` on Create/Update are routed + through the configured `secrets.Provider` and replaced in state with + `secret_ref://...` placeholders. Plugins remain platform-agnostic. + See `DOCUMENTATION.md#sensitive-output-routing` and design doc + `docs/plans/2026-05-09-engine-sensitive-output-routing-design.md`. +- New `iac/sensitive` package with `Route`, `Revoke`, `IsPlaceholder`, + and `MaskSensitiveForDiff` free functions. +- New `wfctl infra audit-state-secrets` command (with `--prune`) to + audit state vs. secrets.Provider for orphans, legacy plaintext, + missing routed values, and mistaken config-references. +- Drift masking (`MaskSensitiveForDiff`) prevents false-positive + drift on sensitive keys where the cloud refuses to re-emit. + +### Changed + +- `applyWithProviderAndStore` and the in-process apply path now funnel + state writes through `persistResourceWithSecretRouting`. On + `SaveResource` failure after `provider.Set` succeeded, the engine + invokes a compensating `driver.Delete` + `provider.Delete` to prevent + orphan cloud resources. +- `adoptExistingResources` and `runInfraRefreshOutputs` now use + sanitize-only persistence (no `provider.Set` from Read paths) to + prevent cache pollution. +- `syncInfraOutputSecrets` accepts a `hydrated` map for same-process + routed-secret hand-off. + +### Migration + +- Existing plugins continue to work unchanged; sensitive-output routing + is opt-in via `ResourceOutput.Sensitive`. +- Operators with pre-existing state records containing plaintext + secrets: run `wfctl infra audit-state-secrets` to enumerate; rotate + via `wfctl infra bootstrap --force-rotate <name>`. +- Apply runs on plugins that newly emit `Sensitive` outputs require a + `secrets:` configuration block (recommend `provider: env` with a + prefix for local runs). + +### Rollback + +Pin `setup-wfctl@v0.26.x` and rebuild. State records written under +v0.27.0 contain `secret_ref://` placeholders that v0.26.x does not +understand; rotate affected secrets first or manually edit state to +inline the value from `secrets.Provider`. +``` + +**Step 6.4: Validate rollback claim** + +The CHANGELOG / design §8 claims rollback path: "pin `setup-wfctl@v0.26.x` and rebuild; rotate affected secrets first." Validate the claim manually: + +1. Build the v0.27.0 wfctl (current branch) and apply the env-provider fixture from Step 2.7.bis. Confirm state file contains `secret_ref://...` placeholder strings. +2. Inspect what a v0.26.x consumer would do with the placeholder. Build a v0.26.x wfctl from the prior tagged release: + +```sh +cd /tmp && git clone --branch v0.26.x https://github.com/GoCodeAlone/workflow.git wfctl-026 || true +cd wfctl-026 && go build -o /tmp/wfctl-026 ./cmd/wfctl +``` + +(If `v0.26.x` tag doesn't yet exist — workflow's most recent release is the prior tag — substitute the most recent stable release tag and document the substitution.) + +3. Run the v0.26.x wfctl against the v0.27.0-written state: + +```sh +cd /tmp/wfctl-routing-smoke +/tmp/wfctl-026 infra outputs -c infra-with-env.yaml 2>&1 | tee rollback.log +``` + +Expected: outputs literally include `secret_ref://...` strings (not crashes; not hangs). Document the actual behaviour in CHANGELOG §Rollback. If a downstream consumer (e.g., infra_output secret generator) processes the placeholder as a literal value, document the manual recovery step (rotate via `wfctl infra bootstrap --force-rotate <name>` running v0.27.0 first). + +If a v0.26.x build is not feasible (older Go module dependencies, tag missing), document this in CHANGELOG with a one-line note: "Rollback validation deferred — see PR <N> for explicit verification once a v0.26.x branch is available." + +**Step 6.5: Spell-check + verify cross-references** + +Run: +- `grep -n "secret_ref://" docs/ DOCUMENTATION.md CHANGELOG.md` → all references consistent. +- `grep -n "audit-state-secrets" docs/ DOCUMENTATION.md CHANGELOG.md cmd/wfctl/*.go` → wiring matches. + +Expected: cross-references resolve; no broken anchors. + +**Step 6.6: Commit** + +```bash +git add DOCUMENTATION.md docs/WFCTL.md CHANGELOG.md +git commit -m "$(cat <<'EOF' +docs: sensitive-output routing + audit-state-secrets command + +- DOCUMENTATION.md: new "Sensitive output routing" section explaining + Sensitive flag, secret_ref:// placeholder format, failure modes, + recovery, and read-path semantics. +- docs/WFCTL.md: new "infra audit-state-secrets" command reference. +- CHANGELOG.md: v0.27.0 entry covering Added / Changed / Migration / + Rollback sections. + +Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> +EOF +)" +``` + +--- + +## Final verification (before PR) + +After Task 6, the implementer runs the **full local CI gate**: + +```sh +cd /Users/jon/workspace/workflow/_worktrees/engine-sensitive-output-routing + +# 1. Full test suite +go test ./... + +# 2. Race detector +go test -race ./iac/sensitive/... ./cmd/wfctl/... + +# 3. Vet +go vet ./... + +# 4. Lint +golangci-lint run ./... + +# 5. Build wfctl +go build -o wfctl ./cmd/wfctl +./wfctl infra audit-state-secrets --help + +# 6. Verify no missed Outputs literal regressions (regression guard) +grep -rn "Outputs:.*r\.Outputs\|Outputs:.*live\.Outputs\|Outputs:.*imported\.Outputs" cmd/wfctl/*.go +# Expected output: +# infra_state_store.go:318+341 — display layer, OK +# infra.go:1101 — Import path, OUT OF SCOPE per Scope Manifest (operator runs +# audit-state-secrets post-import to triage; routing imported +# state requires a behavioural decision out of scope here) +# Anything else: must be triaged + either rewired through +# persistResourceWithSecretRouting or documented as out-of-scope. +``` + +All five steps must pass. If a regression appears, fix it before opening the PR. + +--- + +## PR checklist + +When opening the PR (`feat/engine-sensitive-output-routing` → `main`): + +- [ ] Title: `feat(iac): engine-side sensitive-output routing through secrets.Provider` +- [ ] Body links to design doc + plan doc. +- [ ] All 5 commits present. +- [ ] CI green (build, test, race, vet, lint, strict-contracts). +- [ ] Copilot review requested. +- [ ] CHANGELOG.md updated under v0.27.0. +- [ ] Rollback note in PR body matches §8 of design doc. diff --git a/docs/plans/2026-05-09-engine-sensitive-output-routing.md.scope-lock b/docs/plans/2026-05-09-engine-sensitive-output-routing.md.scope-lock new file mode 100644 index 00000000..9d97d08d --- /dev/null +++ b/docs/plans/2026-05-09-engine-sensitive-output-routing.md.scope-lock @@ -0,0 +1,5 @@ +f7a8e190ca8bc264252d2609ba856bdd979b3a5eaa3265d0d33f1dc85055c361 docs/plans/2026-05-09-engine-sensitive-output-routing.md (manifest section, lines 15-39) +locked_at: 2026-05-09T08:11:07Z +pr_count: 1 +tasks: 6 +branch: feat/engine-sensitive-output-routing diff --git a/iac/sensitive/route.go b/iac/sensitive/route.go new file mode 100644 index 00000000..7d5024f5 --- /dev/null +++ b/iac/sensitive/route.go @@ -0,0 +1,241 @@ +// Package sensitive routes ResourceOutput fields flagged as Sensitive +// through a secrets.Provider, returning sanitized placeholders for state +// persistence and a hydrated map for in-process consumers. +// +// Per the engine-sensitive-output-routing design (workflow v0.27.0): +// - Route is invoked on Create/Update only. Read/Adoption/Refresh paths +// use Sanitize-only logic (not in this package — see +// cmd/wfctl/infra_apply.go) to prevent cache pollution. +// - The placeholder format "secret_ref://<resource>_<key>" is distinct +// from the user-supplied "secret://<key>" config-reference convention. +// - Routing trigger is exclusively out.Sensitive[k]==true (per-call +// dynamic). ResourceDriver.SensitiveKeys() is NOT consulted here; +// it remains a display-masking signal. +// +// Limitation (v0.27.0): only string-typed sensitive output values are +// supported. Non-string sensitive outputs (e.g., []byte, int) yield an +// error from Route. Future expansion via a MarshalSensitive interface +// is out of scope. +package sensitive + +import ( + "context" + "errors" + "fmt" + "sort" + "strings" + + "github.com/GoCodeAlone/workflow/interfaces" + "github.com/GoCodeAlone/workflow/secrets" +) + +// PlaceholderPrefix is the URI scheme used in state.Outputs values to +// reference a routed secret stored in the configured secrets.Provider. +// Distinct from secrets.SecretPrefix ("secret://") which is for +// user-supplied config references. +const PlaceholderPrefix = "secret_ref://" + +// SecretKey returns the canonical secrets.Provider key for a resource's +// output: "<resourceName>_<outputKey>". Exported so audit-state-secrets +// and other consumers can reverse-engineer routed-secret names. +func SecretKey(resourceName, outputKey string) string { + return resourceName + "_" + outputKey +} + +// Placeholder returns the canonical "secret_ref://<resourceName>_<outputKey>" +// string that replaces a routed value in state.Outputs. +func Placeholder(resourceName, outputKey string) string { + return PlaceholderPrefix + SecretKey(resourceName, outputKey) +} + +// IsPlaceholder reports whether v is a string with the PlaceholderPrefix. +// Non-string values return false. +func IsPlaceholder(v any) bool { + s, ok := v.(string) + if !ok { + return false + } + return strings.HasPrefix(s, PlaceholderPrefix) +} + +// Route routes sensitive fields from out through provider, keying each +// secret as "<resourceName>_<outputKey>". Returns: +// +// - sanitized: a copy of out.Outputs with sensitive values replaced by +// PlaceholderPrefix + SecretKey(resourceName, k). Suitable for +// persistence to interfaces.IaCStateStore. +// - hydrated: a flat map keyed by SecretKey of values that were routed. +// Suitable for in-process hand-off to post-apply consumers in the +// same wfctl invocation. Empty when no fields were routed. +// +// Routing trigger is out.Sensitive[k] == true with out.Outputs[k] present +// (any value, including empty string). When the sensitive key's value is +// absent from out.Outputs the key is silently SKIPPED — neither +// provider.Set is called nor a placeholder inserted (the engine has no +// value to route; existing routed-secret in provider stays as-is). +// +// Errors: +// - resourceName == "" with non-empty Sensitive map → error (defensive; +// out.Name is intentionally NOT consulted, since Read/Adoption paths +// may have empty out.Name). +// - provider == nil with non-empty Sensitive map AND any sensitive key +// present in out.Outputs → error naming the resource and keys. +// - provider.Set returns an error → error wrapping the failed key. Set +// is invoked in sorted order by key for determinism; on first error +// the loop stops and previously-Set values are NOT cleaned up +// (idempotent overwrite on Apply rerun is the recovery path). +// +// Out is not mutated. +func Route( + ctx context.Context, + provider secrets.Provider, + resourceName string, + out *interfaces.ResourceOutput, +) (sanitized map[string]any, hydrated map[string]string, err error) { + if out == nil { + return nil, nil, fmt.Errorf("sensitive.Route: out is nil") + } + // Build sanitized as a copy of Outputs (or empty map). Hydrated is + // allocated lazily — kept nil-or-empty when no routing happens. + sanitized = make(map[string]any, len(out.Outputs)) + for k, v := range out.Outputs { + sanitized[k] = v + } + + // Collect the sensitive keys whose value is present in Outputs. + // Sort for deterministic Set order. + var routableKeys []string + for k, flag := range out.Sensitive { + if !flag { + continue + } + if _, present := out.Outputs[k]; !present { + continue + } + routableKeys = append(routableKeys, k) + } + sort.Strings(routableKeys) + + if len(routableKeys) == 0 { + return sanitized, nil, nil + } + if resourceName == "" { + return nil, nil, fmt.Errorf("sensitive.Route: resourceName is empty (sensitive keys: %v)", routableKeys) + } + if provider == nil { + return nil, nil, fmt.Errorf("sensitive.Route: no secrets.Provider configured but resource %q has sensitive output keys %v", resourceName, routableKeys) + } + + hydrated = make(map[string]string, len(routableKeys)) + for _, k := range routableKeys { + val, sErr := stringifyOutput(out.Outputs[k]) + if sErr != nil { + return nil, nil, fmt.Errorf("sensitive.Route: resource %q key %q: %w", resourceName, k, sErr) + } + secretName := SecretKey(resourceName, k) + if setErr := provider.Set(ctx, secretName, val); setErr != nil { + return nil, nil, fmt.Errorf("sensitive.Route: provider.Set(%q): %w", secretName, setErr) + } + sanitized[k] = Placeholder(resourceName, k) + hydrated[secretName] = val + } + return sanitized, hydrated, nil +} + +// stringifyOutput coerces an output value to string. The secrets.Provider +// API takes string values; non-string sensitive outputs are not supported +// in v0.27.0 (would need encoding decisions out of scope here). +func stringifyOutput(v any) (string, error) { + switch s := v.(type) { + case string: + return s, nil + default: + return "", fmt.Errorf("sensitive output value type %T not supported (must be string)", v) + } +} + +// Revoke deletes routed secrets for resourceName. mergedKeys is the union +// of placeholder-derived keys (caller extracts from pre-delete +// state.Outputs) and any legacy heuristic keys. Errors from +// provider.Delete are aggregated via errors.Join — Revoke does NOT stop +// on the first error so partial cleanup proceeds. Keys that were never +// stored (provider returns secrets.ErrNotFound) are silently treated as +// success. +func Revoke( + ctx context.Context, + provider secrets.Provider, + resourceName string, + mergedKeys []string, +) error { + if provider == nil { + return nil // no-op when no provider configured + } + if resourceName == "" { + return fmt.Errorf("sensitive.Revoke: resourceName is empty") + } + // Sort for determinism (test stability + log readability). + sorted := append([]string(nil), mergedKeys...) + sort.Strings(sorted) + + var errs []error + for _, k := range sorted { + secretName := SecretKey(resourceName, k) + if delErr := provider.Delete(ctx, secretName); delErr != nil { + if errors.Is(delErr, secrets.ErrNotFound) { + continue + } + errs = append(errs, fmt.Errorf("delete %q: %w", secretName, delErr)) + } + } + if len(errs) > 0 { + return errors.Join(errs...) + } + return nil +} + +// MaskSensitiveForDiff returns copies of desired and current with sensitive +// keys elided from BOTH sides. A key is considered sensitive when: +// +// - it is named in driverKeys (i.e., ResourceDriver.SensitiveKeys()), OR +// - its value in current matches the PlaceholderPrefix. +// +// Eliding from both sides ensures driver.Diff or other field-by-field +// comparators don't report drift when state has a placeholder and live +// has a different (or absent) value. Non-sensitive keys are passed +// through unchanged. +// +// Either input may be nil; the corresponding output is also nil. +func MaskSensitiveForDiff(driverKeys []string, desired, current map[string]any) (map[string]any, map[string]any) { + mask := make(map[string]struct{}, len(driverKeys)) + for _, k := range driverKeys { + mask[k] = struct{}{} + } + // Augment with placeholder-derived keys from current. + for k, v := range current { + if IsPlaceholder(v) { + mask[k] = struct{}{} + } + } + // Also augment from desired in case a desired-side placeholder leaked in + // (unusual but defensive). + for k, v := range desired { + if IsPlaceholder(v) { + mask[k] = struct{}{} + } + } + return copyExcept(desired, mask), copyExcept(current, mask) +} + +func copyExcept(in map[string]any, exclude map[string]struct{}) map[string]any { + if in == nil { + return nil + } + out := make(map[string]any, len(in)) + for k, v := range in { + if _, skip := exclude[k]; skip { + continue + } + out[k] = v + } + return out +} diff --git a/iac/sensitive/route_test.go b/iac/sensitive/route_test.go new file mode 100644 index 00000000..5c7ca09b --- /dev/null +++ b/iac/sensitive/route_test.go @@ -0,0 +1,388 @@ +package sensitive + +import ( + "context" + "errors" + "strings" + "testing" + + "github.com/GoCodeAlone/workflow/interfaces" + "github.com/GoCodeAlone/workflow/secrets" +) + +// fakeProvider records Set/Delete/Get/List calls for assertions. +type fakeProvider struct { + values map[string]string + setErr map[string]error // per-key Set override + delErr map[string]error + setLog []string + delLog []string +} + +func newFakeProvider() *fakeProvider { + return &fakeProvider{values: map[string]string{}, setErr: map[string]error{}, delErr: map[string]error{}} +} + +func (p *fakeProvider) Name() string { return "fake" } +func (p *fakeProvider) Get(_ context.Context, k string) (string, error) { + v, ok := p.values[k] + if !ok { + return "", secrets.ErrNotFound + } + return v, nil +} +func (p *fakeProvider) Set(_ context.Context, k, v string) error { + if err, ok := p.setErr[k]; ok { + return err + } + p.setLog = append(p.setLog, k) + p.values[k] = v + return nil +} +func (p *fakeProvider) Delete(_ context.Context, k string) error { + if err, ok := p.delErr[k]; ok { + return err + } + p.delLog = append(p.delLog, k) + delete(p.values, k) + return nil +} +func (p *fakeProvider) List(_ context.Context) ([]string, error) { + out := make([]string, 0, len(p.values)) + for k := range p.values { + out = append(out, k) + } + return out, nil +} + +func TestRoute_NoSensitive_PassesThrough(t *testing.T) { + p := newFakeProvider() + out := &interfaces.ResourceOutput{ + Name: "x", Type: "infra.spaces_key", + Outputs: map[string]any{"bucket": "b", "endpoint": "e"}, + } + sanitized, hydrated, err := Route(context.Background(), p, "myres", out) + if err != nil { + t.Fatalf("Route: %v", err) + } + if len(p.setLog) != 0 { + t.Errorf("expected no Set calls, got %v", p.setLog) + } + if sanitized["bucket"] != "b" || sanitized["endpoint"] != "e" { + t.Errorf("non-sensitive outputs corrupted: %v", sanitized) + } + if len(hydrated) != 0 { + t.Errorf("expected empty hydrated, got %v", hydrated) + } +} + +func TestRoute_SensitiveValuePresent_RoutesAndSanitizes(t *testing.T) { + p := newFakeProvider() + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"access_key": "AK", "secret_key": "SECRET", "bucket": "b"}, + Sensitive: map[string]bool{"secret_key": true, "access_key": true}, + } + sanitized, hydrated, err := Route(context.Background(), p, "myres", out) + if err != nil { + t.Fatalf("Route: %v", err) + } + if p.values["myres_secret_key"] != "SECRET" { + t.Errorf("provider did not receive secret_key; got %v", p.values) + } + if p.values["myres_access_key"] != "AK" { + t.Errorf("provider did not receive access_key; got %v", p.values) + } + if sanitized["secret_key"] != "secret_ref://myres_secret_key" { + t.Errorf("sanitized[secret_key] = %v, want placeholder", sanitized["secret_key"]) + } + if sanitized["access_key"] != "secret_ref://myres_access_key" { + t.Errorf("sanitized[access_key] = %v, want placeholder", sanitized["access_key"]) + } + if sanitized["bucket"] != "b" { + t.Errorf("non-sensitive bucket lost: %v", sanitized["bucket"]) + } + if hydrated["myres_secret_key"] != "SECRET" { + t.Errorf("hydrated missing secret_key: %v", hydrated) + } +} + +func TestRoute_SensitiveKeyAbsentFromOutputs_Skipped(t *testing.T) { + p := newFakeProvider() + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"access_key": "AK"}, + Sensitive: map[string]bool{"secret_key": true, "access_key": true}, + } + sanitized, hydrated, err := Route(context.Background(), p, "myres", out) + if err != nil { + t.Fatalf("Route: %v", err) + } + if _, ok := sanitized["secret_key"]; ok { + t.Errorf("absent sensitive key should NOT yield placeholder, got %v", sanitized["secret_key"]) + } + if _, ok := p.values["myres_secret_key"]; ok { + t.Errorf("provider should not have received secret_key (absent value)") + } + if hydrated["myres_secret_key"] != "" { + t.Errorf("hydrated should not contain absent key") + } + // access_key was present, should be routed + if sanitized["access_key"] != "secret_ref://myres_access_key" { + t.Errorf("access_key routing failed: %v", sanitized["access_key"]) + } +} + +func TestRoute_SensitiveFalseValue_NotRouted(t *testing.T) { + p := newFakeProvider() + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"bucket_uri": "https://b.example/"}, + Sensitive: map[string]bool{"bucket_uri": false}, + } + sanitized, _, err := Route(context.Background(), p, "myres", out) + if err != nil { + t.Fatalf("Route: %v", err) + } + if sanitized["bucket_uri"] != "https://b.example/" { + t.Errorf("Sensitive=false should not route, got %v", sanitized["bucket_uri"]) + } + if len(p.setLog) != 0 { + t.Errorf("Sensitive=false triggered Set: %v", p.setLog) + } +} + +func TestRoute_ProviderSetError_ReturnsError(t *testing.T) { + p := newFakeProvider() + p.setErr["myres_secret_key"] = errors.New("boom") + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"secret_key": "S"}, + Sensitive: map[string]bool{"secret_key": true}, + } + _, _, err := Route(context.Background(), p, "myres", out) + if err == nil { + t.Fatal("expected error from Set") + } +} + +func TestRoute_NilProviderWithSensitive_Errors(t *testing.T) { + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"secret_key": "S"}, + Sensitive: map[string]bool{"secret_key": true}, + } + _, _, err := Route(context.Background(), nil, "myres", out) + if err == nil { + t.Fatal("expected error when provider nil and Sensitive non-empty") + } + if !strings.Contains(err.Error(), "myres") { + t.Errorf("error should name resource, got %q", err.Error()) + } +} + +func TestRoute_NilProviderWithoutSensitive_OK(t *testing.T) { + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"bucket": "b"}, + } + sanitized, hydrated, err := Route(context.Background(), nil, "myres", out) + if err != nil { + t.Fatalf("Route: %v", err) + } + if sanitized["bucket"] != "b" { + t.Errorf("nil-provider non-sensitive corrupted: %v", sanitized) + } + if len(hydrated) != 0 { + t.Errorf("expected empty hydrated, got %v", hydrated) + } +} + +func TestRoute_EmptyResourceName_Errors(t *testing.T) { + p := newFakeProvider() + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"secret_key": "S"}, + Sensitive: map[string]bool{"secret_key": true}, + } + _, _, err := Route(context.Background(), p, "", out) + if err == nil { + t.Fatal("expected error on empty resourceName") + } +} + +func TestRoute_DeterministicSetOrder(t *testing.T) { + // Multiple sensitive keys: ensure Set order is sorted by key (deterministic). + p := newFakeProvider() + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"a_key": "A", "b_key": "B", "c_key": "C"}, + Sensitive: map[string]bool{"a_key": true, "b_key": true, "c_key": true}, + } + _, _, err := Route(context.Background(), p, "r", out) + if err != nil { + t.Fatalf("Route: %v", err) + } + want := []string{"r_a_key", "r_b_key", "r_c_key"} + if len(p.setLog) != len(want) { + t.Fatalf("setLog len: got %v want %v", p.setLog, want) + } + for i, w := range want { + if p.setLog[i] != w { + t.Errorf("setLog[%d] = %v want %v", i, p.setLog[i], w) + } + } +} + +func TestRoute_NonStringSensitiveValue_Errors(t *testing.T) { + p := newFakeProvider() + out := &interfaces.ResourceOutput{ + Outputs: map[string]any{"port": 8080}, + Sensitive: map[string]bool{"port": true}, + } + _, _, err := Route(context.Background(), p, "r", out) + if err == nil { + t.Fatal("expected error for non-string sensitive value") + } + if !strings.Contains(err.Error(), "not supported") { + t.Errorf("error should explain non-string: %q", err.Error()) + } +} + +func TestRoute_NilOut_Errors(t *testing.T) { + _, _, err := Route(context.Background(), newFakeProvider(), "r", nil) + if err == nil { + t.Fatal("expected error for nil out") + } +} + +func TestRevoke_DeletesAllKeys(t *testing.T) { + p := newFakeProvider() + p.values["r_secret_key"] = "S" + p.values["r_access_key"] = "A" + if err := Revoke(context.Background(), p, "r", []string{"secret_key", "access_key"}); err != nil { + t.Fatalf("Revoke: %v", err) + } + if _, ok := p.values["r_secret_key"]; ok { + t.Errorf("secret_key not deleted") + } + if _, ok := p.values["r_access_key"]; ok { + t.Errorf("access_key not deleted") + } +} + +func TestRevoke_AggregatesErrors(t *testing.T) { + p := newFakeProvider() + p.values["r_secret_key"] = "S" + p.values["r_access_key"] = "A" + p.delErr["r_secret_key"] = errors.New("boom1") + p.delErr["r_access_key"] = errors.New("boom2") + err := Revoke(context.Background(), p, "r", []string{"secret_key", "access_key"}) + if err == nil { + t.Fatal("expected aggregated error") + } + msg := err.Error() + if !strings.Contains(msg, "boom1") || !strings.Contains(msg, "boom2") { + t.Errorf("aggregated error missing one or both: %q", msg) + } +} + +func TestRevoke_ContinuesAfterError(t *testing.T) { + // First key errors; second key should still be Deleted. + p := newFakeProvider() + p.values["r_secret_key"] = "S" + p.values["r_access_key"] = "A" + p.delErr["r_secret_key"] = errors.New("boom") + _ = Revoke(context.Background(), p, "r", []string{"secret_key", "access_key"}) + if _, ok := p.values["r_access_key"]; ok { + t.Error("access_key not deleted (Revoke should continue past first error)") + } +} + +func TestRevoke_NotFoundIsSuccess(t *testing.T) { + p := newFakeProvider() + // Pre-populate one key; Delete on a missing one returns ErrNotFound. + p.values["r_secret_key"] = "S" + p.delErr["r_access_key"] = secrets.ErrNotFound + if err := Revoke(context.Background(), p, "r", []string{"secret_key", "access_key"}); err != nil { + t.Fatalf("Revoke should swallow ErrNotFound, got %v", err) + } +} + +func TestRevoke_NilProvider_NoOp(t *testing.T) { + if err := Revoke(context.Background(), nil, "r", []string{"k"}); err != nil { + t.Fatalf("nil provider should be no-op, got %v", err) + } +} + +func TestRevoke_EmptyResourceName_Errors(t *testing.T) { + if err := Revoke(context.Background(), newFakeProvider(), "", []string{"k"}); err == nil { + t.Fatal("expected error on empty resourceName") + } +} + +func TestIsPlaceholder(t *testing.T) { + cases := map[string]bool{ + "secret_ref://x": true, + "secret_ref://r_key": true, + "secret_ref://": true, // edge: empty key after prefix; still has prefix + "secret://x": false, + "plain": false, + "": false, + } + for in, want := range cases { + if got := IsPlaceholder(in); got != want { + t.Errorf("IsPlaceholder(%q) = %v, want %v", in, got, want) + } + } + // non-string input + if IsPlaceholder(42) { + t.Error("IsPlaceholder(int) should be false") + } + if IsPlaceholder(nil) { + t.Error("IsPlaceholder(nil) should be false") + } +} + +func TestPlaceholder(t *testing.T) { + got := Placeholder("myres", "secret_key") + want := "secret_ref://myres_secret_key" + if got != want { + t.Errorf("Placeholder = %q, want %q", got, want) + } +} + +func TestSecretKey(t *testing.T) { + got := SecretKey("myres", "secret_key") + want := "myres_secret_key" + if got != want { + t.Errorf("SecretKey = %q, want %q", got, want) + } +} + +func TestMaskSensitiveForDiff_MasksPlaceholdersAndDriverKeys(t *testing.T) { + desired := map[string]any{"region": "nyc3", "secret_key": "should-mask", "bucket": "b"} + current := map[string]any{"region": "nyc3", "secret_key": "secret_ref://r_secret_key", "bucket": "b"} + d2, c2 := MaskSensitiveForDiff([]string{"secret_key"}, desired, current) + if _, ok := d2["secret_key"]; ok { + t.Errorf("desired should have secret_key elided, got %v", d2["secret_key"]) + } + if _, ok := c2["secret_key"]; ok { + t.Errorf("current should have secret_key elided, got %v", c2["secret_key"]) + } + if d2["region"] != "nyc3" || c2["region"] != "nyc3" { + t.Errorf("non-sensitive keys must survive: d=%v c=%v", d2["region"], c2["region"]) + } +} + +func TestMaskSensitiveForDiff_PlaceholderInDesired(t *testing.T) { + // Edge: a desired map carrying a placeholder shouldn't leak it into Diff. + desired := map[string]any{"k": "secret_ref://r_k"} + current := map[string]any{"k": "secret_ref://r_k"} + d2, c2 := MaskSensitiveForDiff(nil, desired, current) + if _, ok := d2["k"]; ok { + t.Errorf("placeholder in desired should be elided") + } + if _, ok := c2["k"]; ok { + t.Errorf("placeholder in current should be elided") + } +} + +func TestMaskSensitiveForDiff_NilInputs(t *testing.T) { + d, c := MaskSensitiveForDiff(nil, nil, nil) + if d != nil || c != nil { + t.Errorf("nil inputs should yield nil outputs, got d=%v c=%v", d, c) + } +}