feat(wfctl): drift class enum + apply --refresh ghost-prune for production state recovery#519
Merged
Conversation
Add DriftClass string type with 4 constants: - DriftClassUnknown (zero value, omitempty-safe for backwards compat) - DriftClassInSync - DriftClassGhost (state has resource; cloud returns ErrResourceNotFound) - DriftClassConfig (both exist; configs differ) Extend DriftResult with Class DriftClass json:"class,omitempty" field (additive, backwards-compatible — consumers without the field see no JSON change due to omitempty). 4 tests covering constant values, omitempty-on-zero, ghost JSON rendering, and round-trip marshal/unmarshal for all 3 non-zero classes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…very New function runInfraApplyRefreshPhase calls provider.DetectDrift and prunes ghost-in-state entries (DriftClassGhost) from the state store: - Dry-run by default (no autoApprove): prints "would prune" per ghost - autoApprove=true: calls store.DeleteResource + emits audit log to stderr - Protected resources blocked unless allowProtectedPrune=true - Transient DetectDrift errors propagate immediately; no pruning happens - DriftClassConfig / DriftClassInSync entries skipped (regular plan path) 6 tests covering: dry-run no-mutate, auto-approve prune, protected-block, protected-with-flag, transient-error-propagation, in-sync-skip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…apply Add two flags to runInfraApply: - --refresh: runs runInfraApplyRefreshPhase before plan+apply, iterating all state-tracked provider groups via groupStatesByProvider and pruning any DriftClassGhost entries. - --allow-protected-prune: passed to runInfraApplyRefreshPhase to permit pruning resources with protected:true in state Outputs. Refresh phase only fires when --refresh is set and the config has infra.* modules; silently skipped for legacy platform.* configs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
driftInfraModules now prints drift class (GHOST / CONFIG / IN-SYNC)
using the DriftClass constants from interfaces:
GHOST <name> <type> — cloud reports not found
CONFIG <name> <type>
<field>: expected=<v> actual=<v>
IN-SYNC <name> <type>
Providers still returning DriftClassUnknown fall through to the legacy
Drifted-bool behavior for backwards compatibility.
Column-aligned format matches wfctl infra status output style.
Drift-found message updated to suggest --refresh flag.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs/wfctl/drift-recovery.md (~100 lines) covering: - Three drift classes (ghost / config / in-sync) with recovery actions - wfctl infra drift usage + example output with Class column - Dry-run-first workflow → auto-approve prune - Protected resource two-key contract (--allow-protected-prune) - Audit log format - Production safety checklist - CI integration patterns CHANGELOG.md Unreleased section: DriftClass enum, --refresh flag, --allow-protected-prune flag, drift output Class column, docs file. Notes omitempty additions to DriftResult.Expected/Actual/Fields. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Adds first set of wfctl drift-recovery primitives: a drift classification enum in interfaces, refreshed wfctl infra apply behavior to prune ghost-in-state entries, extended drift CLI output, and operator documentation.
Changes:
- Add
interfaces.DriftClass+DriftResult.Class(omitempty) and updateDriftResultJSON tags. - Implement
wfctl infra apply --refresh(with--allow-protected-prune) to prune ghost state entries before plan/apply. - Update
wfctl infra driftoutput to show drift class; add docs and new tests; update changelog.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| interfaces/iac_provider.go | Introduces DriftClass and adds Class + omitempty JSON tags to DriftResult. |
| interfaces/iac_provider_test.go | Adds JSON/round-trip tests for DriftClass and DriftResult.Class. |
| cmd/wfctl/infra_apply_refresh.go | New refresh phase implementation that detects drift and prunes ghost-in-state entries. |
| cmd/wfctl/infra_apply_refresh_test.go | New unit tests covering refresh dry-run, prune, protected gating, and error propagation. |
| cmd/wfctl/infra.go | Wires --refresh and --allow-protected-prune flags into infra apply. |
| cmd/wfctl/infra_status_drift.go | Updates drift output to include class-based formatting and messaging. |
| docs/wfctl/drift-recovery.md | Adds an operator runbook for drift detection and recovery workflows. |
| CHANGELOG.md | Documents the new drift-recovery functionality under Unreleased. |
Comment on lines
+141
to
+142
| See `docs/plans/2026-05-02-infra-drift-recovery.md` for the full recovery | ||
| design rationale. |
Comment on lines
+60
to
+67
| isProtected := isRefProtected(states, r.Name) | ||
| if isProtected && !allowProtectedPrune { | ||
| // Hard-block: return an error immediately so the caller sees the | ||
| // problem. No prunes have happened at this point. | ||
| fmt.Fprintf(stderr, "wfctl: BLOCKED: %s is protected; cannot prune without --allow-protected-prune\n", r.Name) | ||
| return fmt.Errorf("refresh: blocked on protected resource %q (use --allow-protected-prune to override)", r.Name) | ||
| } | ||
|
|
Comment on lines
+947
to
+950
| var refreshFlag bool | ||
| fs.BoolVar(&refreshFlag, "refresh", false, "Detect drift and prune ghost-in-state entries before applying") | ||
| var allowProtectedPruneFlag bool | ||
| fs.BoolVar(&allowProtectedPruneFlag, "allow-protected-prune", false, "Allow pruning state entries for resources marked protected: true (requires --refresh)") |
Comment on lines
+10
to
+14
| | Class | Description | Recovery | | ||
| |-------|-------------|----------| | ||
| | `ghost` | State says resource exists; cloud returns 404 | Prune state entry via `--refresh` | | ||
| | `config` | Both exist but configs differ (e.g. someone edited cloud-side) | Reconcile via normal `wfctl infra apply` | | ||
| | `in-sync` | State and cloud agree | No action needed | |
⏱ Benchmark Results✅ No significant performance regressions detected. benchstat comparison (baseline → PR)
|
Important #1 — pre-scan all ghosts for protected resources before any state mutation (infra_apply_refresh.go). The original loop could prune an unprotected ghost then fail on a protected one, leaving partial state. Two-pass pattern: collect all blocked names first, return error listing every blocked resource, then execute mutations only when pre-scan passes. Important #2 — validate --allow-protected-prune requires --refresh (infra.go). Without this check the flag was silently no-op'd, misleading operators. Now returns a clear pre-flight error before any work begins. Minor #3 — replace broken docs/plans/2026-05-02-infra-drift-recovery.md link in drift-recovery.md (design worktree path, never merged) with a pointer to the canonical source file. Minor #4 — markdown table was already correct standard format; no change needed (table separator rows are standard |---|---|). Tests added: - TestApplyRefresh_MultipleGhostsAllOrNothing (all-or-nothing invariant) - TestApplyRefresh_AllGhostsUnprotectedPrunesAll (pre-scan allows clean batch) - TestInfraApply_AllowProtectedPruneRequiresRefresh (flag validation) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI
added a commit
that referenced
this pull request
May 5, 2026
…y refresh - cmd/wfctl/infra.go: replace inline closer.Close() block in runInfraApply's refresh loop with a deferred warn-on-error closure immediately after the provErr nil-check. Guarantees the provider connection closes regardless of subsequent control flow, preventing a future-maintainer refactor trap. - cmd/wfctl/infra_apply_refresh_test.go: add strings.Contains check for "state mutation prune" keyword in TestApplyRefresh_AutoApprovePrunesAndApplies alongside the existing resource name check, so audit log format regressions (dropped operation keyword or misrouted log) are caught. Relates to #519 follow-up items. Agent-Logs-Url: https://github.com/GoCodeAlone/workflow/sessions/d19a7660-ca79-4186-867b-f47ec64f8435 Co-authored-by: intel352 <77607+intel352@users.noreply.github.com>
intel352
added a commit
that referenced
this pull request
May 5, 2026
…refresh (#548) * Initial plan * fix: closer-defer pattern and audit log assertion scope in infra apply refresh - cmd/wfctl/infra.go: replace inline closer.Close() block in runInfraApply's refresh loop with a deferred warn-on-error closure immediately after the provErr nil-check. Guarantees the provider connection closes regardless of subsequent control flow, preventing a future-maintainer refactor trap. - cmd/wfctl/infra_apply_refresh_test.go: add strings.Contains check for "state mutation prune" keyword in TestApplyRefresh_AutoApprovePrunesAndApplies alongside the existing resource name check, so audit log format regressions (dropped operation keyword or misrouted log) are caught. Relates to #519 follow-up items. Agent-Logs-Url: https://github.com/GoCodeAlone/workflow/sessions/d19a7660-ca79-4186-867b-f47ec64f8435 Co-authored-by: intel352 <77607+intel352@users.noreply.github.com> * fix: wrap refresh provider loop body in helper so closer defers per-group Deferring inside a for loop defers until the outer function returns, not per-iteration. Extract the loop body into a refreshGroup helper (same pattern as infra_plan_provider.go IIFE and infra_apply.go applyGroup), so each provider's connection is closed as soon as its group finishes rather than being held open for the remainder of the apply path. Capture provType in a local variable before the defer to avoid the loop-variable capture footgun. Agent-Logs-Url: https://github.com/GoCodeAlone/workflow/sessions/4b198b40-db65-4947-8a7d-c42eedd48679 Co-authored-by: intel352 <77607+intel352@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: intel352 <77607+intel352@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds reusable wfctl drift-recovery primitives (first PR in a 4-PR drift-recovery chain):
interfaces.DriftClassenum +DriftResult.Classfield (additive, backwards-compatible —omitemptyon zero value preserves existing serialization)wfctl infra apply --refreshflag prunes ghost-in-state entries (cloud 404s) before the normal plan+applywfctl infra apply --allow-protected-prunetwo-key flag for resources withprotected: truein state Outputswfctl infra driftCLI output extended with Class column (GHOST / CONFIG / IN-SYNC)docs/wfctl/drift-recovery.mdProduction safety:
--refreshis set without--auto-approvewfctl: state mutation prune <name> ...)DetectDriftAPI errors propagate without any state mutation--allow-protected-prunetwo-keyTest plan
go test ./interfaces/...green (4 DriftClass JSON marshaling tests)go test ./cmd/wfctl/...green (6 TestApplyRefresh_* tests covering dry-run, auto-approve, protected-block, protected-with-flag, transient-error-propagation, in-sync-skip)go test ./...green — zero failures across full suitewfctl infra apply --helpshows--refreshand--allow-protected-pruneflagsSequencing
This is PR-D2 in the chain. After merge + release tag (v0.20.5):
workflow-plugin-digitalocean feat/detect-drift-impl) rebases its go.mod onto this release to useinterfaces.DriftClass*constants🤖 Generated with Claude Code
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com