Skip to content

docs: spaces-key prune runbook + ADRs 0015-0018 + 0020 (PR6: Tasks 23+24)#585

Merged
intel352 merged 3 commits into
mainfrom
docs/spaces-key-runbook
May 9, 2026
Merged

docs: spaces-key prune runbook + ADRs 0015-0018 + 0020 (PR6: Tasks 23+24)#585
intel352 merged 3 commits into
mainfrom
docs/spaces-key-runbook

Conversation

@intel352
Copy link
Copy Markdown
Contributor

@intel352 intel352 commented May 9, 2026

Summary

PR6 of the spaces-key-iac-resource plan. Operator + design documentation for the spaces-key infrastructure shipped in PR0–PR5.

  • Task 23docs/runbooks/spaces-key-prune.md: operator runbook covering happy path (rotate-and-prune all-in-one), multi-step variant (audit then bootstrap --force-rotate then prune), recovery from partial-failure rotation (--recovery-from-last-rotation), allowlist for hand-created keys, GH Secrets naming convention, and operator FAQ.

  • Task 24 — five Architecture Decision Records:

    • 0015-spaces-key-as-iac-resource.md — Two-phase fix (canonical-schema + Hybrid IaC resource); Approach B over A.
    • 0016-enumerator-all-interface.md — New optional interfaces.EnumeratorAll for resource types whose API doesn't support tagging.
    • 0017-prune-cli-two-key-opt-in.mdprune discriminator (--type + --created-before + --exclude-access-key) + two-key opt-in (--confirm + WFCTL_CONFIRM_PRUNE=1).
    • 0018-bootstrap-key-exemption.md — Bootstrap key NOT an infra.spaces_key (chicken-and-egg with state + lifecycle separation).
    • 0020-storage-filter-sidecar-metadata.mdbootstrapSecrets storage loop filters JSON map by providerCredentialSubKeys[source] allow-list so created_at from generateDOSpacesKey doesn't leak as a phantom GH Secret.

Notes

  • ADR 0019 (bootstrap-key rotation reaper) is explicitly deferred per the design's decision log — not in this plan; carve-out documented in ADR 0018.
  • ADR 0021 (rewriteTransport test stubbing security fix) was authored separately by implementer-3 in PR4a — referenced by ADR 0020's "Related" section but not modified here.
  • Cross-references between the five new ADRs are populated in each ADR's "Related" section.

Verification

$ ls decisions/ | sort -V | tail -10
…
0015-spaces-key-as-iac-resource.md
0016-enumerator-all-interface.md
0017-prune-cli-two-key-opt-in.md
0018-bootstrap-key-exemption.md
0020-storage-filter-sidecar-metadata.md
0021-spaces-key-test-hook-rewriteTransport-not-env-var.md

$ grep -c '^##' docs/runbooks/spaces-key-prune.md
7

Test plan

  • decisions/ numbering: 0015-0018 + 0020 added; 0019 properly skipped (deferred); 0021 untouched
  • Runbook has 7 H2 sections: Overview / Happy path / Multi-step / Recovery / Allowlist / GH Secrets / FAQ
  • Each ADR follows the workspace standard: Status + Context + Decision + Consequences + Alternatives + Related
  • Spec-reviewer pass

🤖 Generated with Claude Code

Closes Tasks 23 + 24 of the spaces-key-iac-resource plan. PR6 ships
operator + design documentation for the spaces-key infrastructure
shipped in PR0-PR5.

Files:
- docs/runbooks/spaces-key-prune.md (Task 23) — operator runbook
  covering happy/multi-step/recovery/allowlist/GH-Secrets/FAQ.
- decisions/0015-spaces-key-as-iac-resource.md — Approach B (two-phase fix).
- decisions/0016-enumerator-all-interface.md — New optional interface
  for non-tag-supporting resource types.
- decisions/0017-prune-cli-two-key-opt-in.md — `--exclude-access-key`
  + `--created-before` discriminator + two-key opt-in.
- decisions/0018-bootstrap-key-exemption.md — Bootstrap key NOT an
  `infra.spaces_key` (chicken-and-egg + lifecycle separation).
- decisions/0020-storage-filter-sidecar-metadata.md — Storage filter
  by `providerCredentialSubKeys[source]` allow-list.

ADR 0019 (bootstrap-key rotation reaper) explicitly deferred per the
design's decision log. ADR 0021 (rewriteTransport security fix) was
authored separately by implementer-3 in PR4a; not modified here.

Plan: docs/plans/2026-05-08-spaces-key-iac-resource.md (commit 316559f7).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 9, 2026 07:18
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds operator and design documentation for the “spaces-key as IaC resource” workstream (spaces key rotation/pruning runbook + ADRs 0015–0018 and 0020).

Changes:

  • Added an operator runbook for Spaces key rotation + pruning workflows.
  • Added ADRs documenting design decisions around infra.spaces_key, enumeration, prune UX safety gates, bootstrap-key carve-out, and bootstrapSecrets sidecar-metadata filtering.
  • Connected the ADR set via “Related” cross-references.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
docs/runbooks/spaces-key-prune.md New operator runbook for auditing, rotating, recovering, and pruning Spaces keys.
decisions/0015-spaces-key-as-iac-resource.md ADR for the overall two-phase approach and why.
decisions/0016-enumerator-all-interface.md ADR introducing interfaces.EnumeratorAll for non-tag-supporting resources.
decisions/0017-prune-cli-two-key-opt-in.md ADR documenting prune discriminator + multi-factor opt-in UX.
decisions/0018-bootstrap-key-exemption.md ADR carving bootstrap key out of IaC resource management.
decisions/0020-storage-filter-sidecar-metadata.md ADR documenting the storage filter to prevent persisting sidecar metadata as secrets.

Comment on lines +11 to +17
- The workflow CLI is built from a release containing the `infra
audit-keys` / `infra prune` / `infra rotate-and-prune` subcommands
(`wfctl infra --help` lists all three).
- The DO plugin is loaded for the target env (`infra audit-keys` exits
with `no loaded provider implements EnumeratorAll` if not).
- The operator has shell access where `WFCTL_CONFIRM_PRUNE=1` can be
exported and `${WFCTL_STATE_DIR:-$HOME/.wfctl}` is writable.
- The DO plugin is loaded for the target env (`infra audit-keys` exits
with `no loaded provider implements EnumeratorAll` if not).
- The operator has shell access where `WFCTL_CONFIRM_PRUNE=1` can be
exported and `${WFCTL_STATE_DIR:-$HOME/.wfctl}` is writable.
Comment on lines +63 to +68
- The CLI dispatchers (`runInfraAuditKeysCmd`,
`runInfraPruneCmd`, `runInfraRotateAndPruneCmd`) iterate loaded
iac.provider modules and pick the first one implementing
`EnumeratorAll`. Multi-provider configs need to ensure at most one
provider per `--type`.

Comment on lines +116 to +119
- `cmd/wfctl/infra_prune.go` — implementation.
- `cmd/wfctl/infra_prune_test.go` — `TestInfraPrune_RequiresTwoKeyOptIn`,
`TestInfraPrune_RequiresExcludeAccessKey`,
`TestInfraPrune_FiltersByTimeAndExcludesAccessKey`.
Comment on lines +102 to +105
- `docs/plans/2026-05-08-spaces-key-iac-resource-design.md` — full
design doc with the Approach A vs B trade study.
- `docs/plans/2026-05-08-spaces-key-iac-resource-design.adversarial-review-1.md`
through `-7.md` — review history that converged on Approach B.
Comment on lines +43 to +49
Make two changes together (must land in the same merge commit per
this ADR's "same-commit constraint"):

1. `generateDOSpacesKey` returns a JSON map with `access_key` +
`secret_key` + `created_at` (full DO API response shape, not just
the credential pair).

…fy opt-in terminology

Two fixes from PR #585 code review:

1. **Important** (line 98) — the multi-step variant told operators to
   capture `WFCTL_NEW_KEY_ACCESS_KEY=<ak>` from `infra bootstrap
   --force-rotate` stderr. That line is never emitted: per the
   storage-filter contract (ADR 0020), `access_key` is a CANONICAL
   credential field (in `providerCredentialSubKeys["digitalocean.spaces"]`),
   so it gets STORED as a GH Secret named `SPACES_access_key`, NOT
   logged to stderr. Only SIDECAR fields (currently just `created_at`)
   emit `WFCTL_NEW_KEY_<UPPER>=` markers.

   Operators following the false instruction would have gotten stuck
   at Step 3 with no `--exclude-access-key` value, then either
   abandoned the multi-step path or — worse — passed an empty string
   and tripped the paranoia rail with a confusing error.

   Fix: rewrite Step 2's stderr capture comment to clarify what's
   actually emitted, then add an explicit access_key recovery block
   with two options:
   - Option A: `gh secret view SPACES_access_key --repo <owner>/<repo>`
     (read directly from the GH Secrets store).
   - Option B: `wfctl infra audit-keys` + match the row whose
     CREATED_AT equals the captured WFCTL_NEW_KEY_CREATED_AT=.

2. **Minor** (line 27 + FAQ) — the Overview said "all three opt-ins"
   while the FAQ said "two-key opt-in". Both are correct, but the
   relationship was implicit. Fix: rewrite the Overview list to
   spell out that opt-ins (1) + (2) are the "two-key" authorization
   (named after the two-person rule from physical security) and (3)
   is the runtime confirmation; cross-link from the FAQ.

No structural change — runbook still has 7 H2 sections.

Stacked on docs/spaces-key-runbook per
feedback_commit_label_by_content_not_round.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@intel352
Copy link
Copy Markdown
Contributor Author

intel352 commented May 9, 2026

Code review fixes — commit e68fe53 addresses two PR #585 review comments routed via team-lead:

  1. Important — runbook line 98 falsely instructed operators to capture WFCTL_NEW_KEY_ACCESS_KEY=<ak> from infra bootstrap --force-rotate stderr. That marker is never emitted: per the storage-filter contract (ADR 0020), access_key is in providerCredentialSubKeys["digitalocean.spaces"] so it gets stored as a GH Secret named SPACES_access_key, not logged. Only sidecar fields (currently just created_at) emit WFCTL_NEW_KEY_<UPPER>= markers.

    Multi-step variant rewritten to show the two recovery paths for access_key:

    • Option A: gh secret view SPACES_access_key --repo <owner>/<repo> (read from the GH Secrets store directly).
    • Option B: wfctl infra audit-keys + match by the captured created_at.
  2. Minor — Overview said "all three opt-ins" while FAQ said "two-key opt-in". Reconciled: Overview now spells out that opt-ins (1) + (2) are the "two-key" authorization (named after the two-person rule), and (3) is the runtime confirmation. FAQ cross-links to the Overview list.

Runbook still 7 H2 sections (no structural change). ADRs untouched (clean per code-reviewer).

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 9, 2026

⏱ Benchmark Results

No significant performance regressions detected.

benchstat comparison (baseline → PR)
## benchstat: baseline → PR
baseline-bench.txt:260: parsing iteration count: invalid syntax
baseline-bench.txt:429054: parsing iteration count: invalid syntax
baseline-bench.txt:811897: parsing iteration count: invalid syntax
baseline-bench.txt:1222085: parsing iteration count: invalid syntax
baseline-bench.txt:1582113: parsing iteration count: invalid syntax
baseline-bench.txt:1998577: parsing iteration count: invalid syntax
benchmark-results.txt:260: parsing iteration count: invalid syntax
benchmark-results.txt:295836: parsing iteration count: invalid syntax
benchmark-results.txt:616906: parsing iteration count: invalid syntax
benchmark-results.txt:958646: parsing iteration count: invalid syntax
benchmark-results.txt:1294917: parsing iteration count: invalid syntax
benchmark-results.txt:1591171: parsing iteration count: invalid syntax
goos: linux
goarch: amd64
pkg: github.com/GoCodeAlone/workflow/dynamic
cpu: AMD EPYC 7763 64-Core Processor                
                            │ benchmark-results.txt │
                            │        sec/op         │
InterpreterCreation-4                 4.786m ± 126%
ComponentLoad-4                       3.572m ±  11%
ComponentExecute-4                    1.915µ ±   1%
PoolContention/workers-1-4            1.075µ ±   1%
PoolContention/workers-2-4            1.074µ ±   3%
PoolContention/workers-4-4            1.078µ ±   2%
PoolContention/workers-8-4            1.074µ ±   2%
PoolContention/workers-16-4           1.083µ ±   4%
ComponentLifecycle-4                  3.595m ±   0%
SourceValidation-4                    2.296µ ±   2%
RegistryConcurrent-4                  807.9n ±   3%
LoaderLoadFromString-4                3.592m ±   0%
geomean                               17.97µ

                            │ benchmark-results.txt │
                            │         B/op          │
InterpreterCreation-4                  2.027Mi ± 0%
ComponentLoad-4                        2.180Mi ± 0%
ComponentExecute-4                     1.203Ki ± 0%
PoolContention/workers-1-4             1.203Ki ± 0%
PoolContention/workers-2-4             1.203Ki ± 0%
PoolContention/workers-4-4             1.203Ki ± 0%
PoolContention/workers-8-4             1.203Ki ± 0%
PoolContention/workers-16-4            1.203Ki ± 0%
ComponentLifecycle-4                   2.183Mi ± 0%
SourceValidation-4                     1.984Ki ± 0%
RegistryConcurrent-4                   1.133Ki ± 0%
LoaderLoadFromString-4                 2.182Mi ± 0%
geomean                                15.25Ki

                            │ benchmark-results.txt │
                            │       allocs/op       │
InterpreterCreation-4                   15.68k ± 0%
ComponentLoad-4                         18.02k ± 0%
ComponentExecute-4                       25.00 ± 0%
PoolContention/workers-1-4               25.00 ± 0%
PoolContention/workers-2-4               25.00 ± 0%
PoolContention/workers-4-4               25.00 ± 0%
PoolContention/workers-8-4               25.00 ± 0%
PoolContention/workers-16-4              25.00 ± 0%
ComponentLifecycle-4                    18.07k ± 0%
SourceValidation-4                       32.00 ± 0%
RegistryConcurrent-4                     2.000 ± 0%
LoaderLoadFromString-4                  18.06k ± 0%
geomean                                  183.3

cpu: AMD EPYC 9V74 80-Core Processor                
                            │ baseline-bench.txt │
                            │       sec/op       │
InterpreterCreation-4               2.775m ± 95%
ComponentLoad-4                     2.682m ±  4%
ComponentExecute-4                  1.416µ ±  0%
PoolContention/workers-1-4          781.9n ±  2%
PoolContention/workers-2-4          785.7n ±  1%
PoolContention/workers-4-4          783.2n ±  1%
PoolContention/workers-8-4          781.2n ±  3%
PoolContention/workers-16-4         784.8n ±  1%
ComponentLifecycle-4                2.695m ±  0%
SourceValidation-4                  1.612µ ±  1%
RegistryConcurrent-4                579.4n ±  6%
LoaderLoadFromString-4              2.723m ±  0%
geomean                             12.90µ

                            │ baseline-bench.txt │
                            │        B/op        │
InterpreterCreation-4               2.027Mi ± 0%
ComponentLoad-4                     2.180Mi ± 0%
ComponentExecute-4                  1.203Ki ± 0%
PoolContention/workers-1-4          1.203Ki ± 0%
PoolContention/workers-2-4          1.203Ki ± 0%
PoolContention/workers-4-4          1.203Ki ± 0%
PoolContention/workers-8-4          1.203Ki ± 0%
PoolContention/workers-16-4         1.203Ki ± 0%
ComponentLifecycle-4                2.183Mi ± 0%
SourceValidation-4                  1.984Ki ± 0%
RegistryConcurrent-4                1.133Ki ± 0%
LoaderLoadFromString-4              2.182Mi ± 0%
geomean                             15.25Ki

                            │ baseline-bench.txt │
                            │     allocs/op      │
InterpreterCreation-4                15.68k ± 0%
ComponentLoad-4                      18.02k ± 0%
ComponentExecute-4                    25.00 ± 0%
PoolContention/workers-1-4            25.00 ± 0%
PoolContention/workers-2-4            25.00 ± 0%
PoolContention/workers-4-4            25.00 ± 0%
PoolContention/workers-8-4            25.00 ± 0%
PoolContention/workers-16-4           25.00 ± 0%
ComponentLifecycle-4                 18.07k ± 0%
SourceValidation-4                    32.00 ± 0%
RegistryConcurrent-4                  2.000 ± 0%
LoaderLoadFromString-4               18.06k ± 0%
geomean                               183.3

pkg: github.com/GoCodeAlone/workflow/middleware
cpu: AMD EPYC 7763 64-Core Processor                
                                  │ benchmark-results.txt │
                                  │        sec/op         │
CircuitBreakerDetection-4                     293.0n ± 8%
CircuitBreakerExecution_Success-4             21.50n ± 0%
CircuitBreakerExecution_Failure-4             66.39n ± 1%
geomean                                       74.78n

                                  │ benchmark-results.txt │
                                  │         B/op          │
CircuitBreakerDetection-4                    144.0 ± 0%
CircuitBreakerExecution_Success-4            0.000 ± 0%
CircuitBreakerExecution_Failure-4            0.000 ± 0%
geomean                                                 ¹
¹ summaries must be >0 to compute geomean

                                  │ benchmark-results.txt │
                                  │       allocs/op       │
CircuitBreakerDetection-4                    1.000 ± 0%
CircuitBreakerExecution_Success-4            0.000 ± 0%
CircuitBreakerExecution_Failure-4            0.000 ± 0%
geomean                                                 ¹
¹ summaries must be >0 to compute geomean

cpu: AMD EPYC 9V74 80-Core Processor                
                                  │ baseline-bench.txt │
                                  │       sec/op       │
CircuitBreakerDetection-4                  230.9n ± 0%
CircuitBreakerExecution_Success-4          17.59n ± 1%
CircuitBreakerExecution_Failure-4          55.17n ± 0%
geomean                                    60.73n

                                  │ baseline-bench.txt │
                                  │        B/op        │
CircuitBreakerDetection-4                 144.0 ± 0%
CircuitBreakerExecution_Success-4         0.000 ± 0%
CircuitBreakerExecution_Failure-4         0.000 ± 0%
geomean                                              ¹
¹ summaries must be >0 to compute geomean

                                  │ baseline-bench.txt │
                                  │     allocs/op      │
CircuitBreakerDetection-4                 1.000 ± 0%
CircuitBreakerExecution_Success-4         0.000 ± 0%
CircuitBreakerExecution_Failure-4         0.000 ± 0%
geomean                                              ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/module
cpu: AMD EPYC 7763 64-Core Processor                
                                 │ benchmark-results.txt │
                                 │        sec/op         │
JQTransform_Simple-4                        857.9n ± 30%
JQTransform_ObjectConstruction-4            1.419µ ± 10%
JQTransform_ArraySelect-4                   3.276µ ±  1%
JQTransform_Complex-4                       37.89µ ±  1%
JQTransform_Throughput-4                    1.737µ ±  1%
SSEPublishDelivery-4                        63.85n ±  2%
geomean                                     1.600µ

                                 │ benchmark-results.txt │
                                 │         B/op          │
JQTransform_Simple-4                      1.273Ki ± 0%
JQTransform_ObjectConstruction-4          1.773Ki ± 0%
JQTransform_ArraySelect-4                 2.625Ki ± 0%
JQTransform_Complex-4                     16.22Ki ± 0%
JQTransform_Throughput-4                  1.984Ki ± 0%
SSEPublishDelivery-4                        0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

                                 │ benchmark-results.txt │
                                 │       allocs/op       │
JQTransform_Simple-4                        10.00 ± 0%
JQTransform_ObjectConstruction-4            15.00 ± 0%
JQTransform_ArraySelect-4                   30.00 ± 0%
JQTransform_Complex-4                       324.0 ± 0%
JQTransform_Throughput-4                    17.00 ± 0%
SSEPublishDelivery-4                        0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

cpu: AMD EPYC 9V74 80-Core Processor                
                                 │ baseline-bench.txt │
                                 │       sec/op       │
JQTransform_Simple-4                     643.3n ± 23%
JQTransform_ObjectConstruction-4         1.079µ ±  3%
JQTransform_ArraySelect-4                2.601µ ±  0%
JQTransform_Complex-4                    31.30µ ±  0%
JQTransform_Throughput-4                 1.300µ ±  0%
SSEPublishDelivery-4                     48.89n ±  1%
geomean                                  1.237µ

                                 │ baseline-bench.txt │
                                 │        B/op        │
JQTransform_Simple-4                   1.273Ki ± 0%
JQTransform_ObjectConstruction-4       1.773Ki ± 0%
JQTransform_ArraySelect-4              2.625Ki ± 0%
JQTransform_Complex-4                  16.22Ki ± 0%
JQTransform_Throughput-4               1.984Ki ± 0%
SSEPublishDelivery-4                     0.000 ± 0%
geomean                                             ¹
¹ summaries must be >0 to compute geomean

                                 │ baseline-bench.txt │
                                 │     allocs/op      │
JQTransform_Simple-4                     10.00 ± 0%
JQTransform_ObjectConstruction-4         15.00 ± 0%
JQTransform_ArraySelect-4                30.00 ± 0%
JQTransform_Complex-4                    324.0 ± 0%
JQTransform_Throughput-4                 17.00 ± 0%
SSEPublishDelivery-4                     0.000 ± 0%
geomean                                             ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/schema
cpu: AMD EPYC 7763 64-Core Processor                
                                    │ benchmark-results.txt │
                                    │        sec/op         │
SchemaValidation_Simple-4                      1.112µ ± 12%
SchemaValidation_AllFields-4                   1.696µ ±  4%
SchemaValidation_FormatValidation-4            1.593µ ±  2%
SchemaValidation_ManySchemas-4                 1.791µ ±  4%
geomean                                        1.523µ

                                    │ benchmark-results.txt │
                                    │         B/op          │
SchemaValidation_Simple-4                      0.000 ± 0%
SchemaValidation_AllFields-4                   0.000 ± 0%
SchemaValidation_FormatValidation-4            0.000 ± 0%
SchemaValidation_ManySchemas-4                 0.000 ± 0%
geomean                                                   ¹
¹ summaries must be >0 to compute geomean

                                    │ benchmark-results.txt │
                                    │       allocs/op       │
SchemaValidation_Simple-4                      0.000 ± 0%
SchemaValidation_AllFields-4                   0.000 ± 0%
SchemaValidation_FormatValidation-4            0.000 ± 0%
SchemaValidation_ManySchemas-4                 0.000 ± 0%
geomean                                                   ¹
¹ summaries must be >0 to compute geomean

cpu: AMD EPYC 9V74 80-Core Processor                
                                    │ baseline-bench.txt │
                                    │       sec/op       │
SchemaValidation_Simple-4                   865.6n ± 17%
SchemaValidation_AllFields-4                1.258µ ±  7%
SchemaValidation_FormatValidation-4         1.231µ ±  5%
SchemaValidation_ManySchemas-4              1.232µ ±  2%
geomean                                     1.134µ

                                    │ baseline-bench.txt │
                                    │        B/op        │
SchemaValidation_Simple-4                   0.000 ± 0%
SchemaValidation_AllFields-4                0.000 ± 0%
SchemaValidation_FormatValidation-4         0.000 ± 0%
SchemaValidation_ManySchemas-4              0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

                                    │ baseline-bench.txt │
                                    │     allocs/op      │
SchemaValidation_Simple-4                   0.000 ± 0%
SchemaValidation_AllFields-4                0.000 ± 0%
SchemaValidation_FormatValidation-4         0.000 ± 0%
SchemaValidation_ManySchemas-4              0.000 ± 0%
geomean                                                ¹
¹ summaries must be >0 to compute geomean

pkg: github.com/GoCodeAlone/workflow/store
cpu: AMD EPYC 7763 64-Core Processor                
                                   │ benchmark-results.txt │
                                   │        sec/op         │
EventStoreAppend_InMemory-4                   1.168µ ±  5%
EventStoreAppend_SQLite-4                     1.237m ±  8%
GetTimeline_InMemory/events-10-4              13.53µ ±  3%
GetTimeline_InMemory/events-50-4              75.16µ ± 20%
GetTimeline_InMemory/events-100-4             121.4µ ±  0%
GetTimeline_InMemory/events-500-4             624.1µ ±  0%
GetTimeline_InMemory/events-1000-4            1.272m ±  1%
GetTimeline_SQLite/events-10-4                105.3µ ±  0%
GetTimeline_SQLite/events-50-4                244.7µ ±  1%
GetTimeline_SQLite/events-100-4               416.7µ ±  1%
GetTimeline_SQLite/events-500-4               1.777m ±  2%
GetTimeline_SQLite/events-1000-4              3.465m ±  0%
geomean                                       214.2µ

                                   │ benchmark-results.txt │
                                   │         B/op          │
EventStoreAppend_InMemory-4                     791.0 ± 6%
EventStoreAppend_SQLite-4                     1.985Ki ± 2%
GetTimeline_InMemory/events-10-4              7.953Ki ± 0%
GetTimeline_InMemory/events-50-4              46.62Ki ± 0%
GetTimeline_InMemory/events-100-4             94.48Ki ± 0%
GetTimeline_InMemory/events-500-4             472.8Ki ± 0%
GetTimeline_InMemory/events-1000-4            944.3Ki ± 0%
GetTimeline_SQLite/events-10-4                16.74Ki ± 0%
GetTimeline_SQLite/events-50-4                87.14Ki ± 0%
GetTimeline_SQLite/events-100-4               175.4Ki ± 0%
GetTimeline_SQLite/events-500-4               846.1Ki ± 0%
GetTimeline_SQLite/events-1000-4              1.639Mi ± 0%
geomean                                       67.35Ki

                                   │ benchmark-results.txt │
                                   │       allocs/op       │
EventStoreAppend_InMemory-4                     7.000 ± 0%
EventStoreAppend_SQLite-4                       53.00 ± 0%
GetTimeline_InMemory/events-10-4                125.0 ± 0%
GetTimeline_InMemory/events-50-4                653.0 ± 0%
GetTimeline_InMemory/events-100-4              1.306k ± 0%
GetTimeline_InMemory/events-500-4              6.514k ± 0%
GetTimeline_InMemory/events-1000-4             13.02k ± 0%
GetTimeline_SQLite/events-10-4                  382.0 ± 0%
GetTimeline_SQLite/events-50-4                 1.852k ± 0%
GetTimeline_SQLite/events-100-4                3.681k ± 0%
GetTimeline_SQLite/events-500-4                18.54k ± 0%
GetTimeline_SQLite/events-1000-4               37.29k ± 0%
geomean                                        1.162k

cpu: AMD EPYC 9V74 80-Core Processor                
                                   │ baseline-bench.txt │
                                   │       sec/op       │
EventStoreAppend_InMemory-4                737.6n ±  8%
EventStoreAppend_SQLite-4                  1.764m ± 53%
GetTimeline_InMemory/events-10-4           10.46µ ±  8%
GetTimeline_InMemory/events-50-4           44.28µ ±  2%
GetTimeline_InMemory/events-100-4          89.32µ ±  1%
GetTimeline_InMemory/events-500-4          458.2µ ±  0%
GetTimeline_InMemory/events-1000-4         939.0µ ±  3%
GetTimeline_SQLite/events-10-4             67.02µ ±  1%
GetTimeline_SQLite/events-50-4             175.6µ ±  1%
GetTimeline_SQLite/events-100-4            309.8µ ±  1%
GetTimeline_SQLite/events-500-4            1.359m ±  1%
GetTimeline_SQLite/events-1000-4           2.619m ±  1%
geomean                                    160.9µ

                                   │ baseline-bench.txt │
                                   │        B/op        │
EventStoreAppend_InMemory-4                 848.5 ± 12%
EventStoreAppend_SQLite-4                 1.984Ki ±  2%
GetTimeline_InMemory/events-10-4          7.953Ki ±  0%
GetTimeline_InMemory/events-50-4          46.62Ki ±  0%
GetTimeline_InMemory/events-100-4         94.48Ki ±  0%
GetTimeline_InMemory/events-500-4         472.8Ki ±  0%
GetTimeline_InMemory/events-1000-4        944.3Ki ±  0%
GetTimeline_SQLite/events-10-4            16.74Ki ±  0%
GetTimeline_SQLite/events-50-4            87.14Ki ±  0%
GetTimeline_SQLite/events-100-4           175.4Ki ±  0%
GetTimeline_SQLite/events-500-4           846.1Ki ±  0%
GetTimeline_SQLite/events-1000-4          1.639Mi ±  0%
geomean                                   67.75Ki

                                   │ baseline-bench.txt │
                                   │     allocs/op      │
EventStoreAppend_InMemory-4                  7.000 ± 0%
EventStoreAppend_SQLite-4                    53.00 ± 2%
GetTimeline_InMemory/events-10-4             125.0 ± 0%
GetTimeline_InMemory/events-50-4             653.0 ± 0%
GetTimeline_InMemory/events-100-4           1.306k ± 0%
GetTimeline_InMemory/events-500-4           6.514k ± 0%
GetTimeline_InMemory/events-1000-4          13.02k ± 0%
GetTimeline_SQLite/events-10-4               382.0 ± 0%
GetTimeline_SQLite/events-50-4              1.852k ± 0%
GetTimeline_SQLite/events-100-4             3.681k ± 0%
GetTimeline_SQLite/events-500-4             18.54k ± 0%
GetTimeline_SQLite/events-1000-4            37.29k ± 0%
geomean                                     1.162k

Benchmarks run with go test -bench=. -benchmem -count=6.
Regressions ≥ 20% are flagged. Results compared via benchstat.

intel352 added a commit that referenced this pull request May 9, 2026
… + 3 Minor)

Addresses code-reviewer's HOLD on Task 19 (commit 3185c5b) of PR #584.
All five findings bundled into one commit per
feedback_commit_label_by_content_not_round.

Important #1 — flag rename: --allowlist → --preserve-names

The --allowlist flag preserved matching names (skipped them during
delete) but the name read as "list of resources allowed to be operated
on", an ambiguity that on a destructive command is a real operator-
error trap. Mental-model A would have an operator running
`--allowlist '^manual-'` expecting to delete only manual-* keys, and
instead deleting everything else (every production key).

Renamed to --preserve-names (verb in the name; matches what the impl
actually does: PRESERVE matching names, skip them during delete). No
backward-compat alias since the flag shipped today and no operator
has been instructed in any external doc to use --allowlist.

Updates:
- cmd/wfctl/infra_prune.go: runInfraPrune flag declaration + variable
  + error message
- cmd/wfctl/infra_prune.go: runInfraPruneCmd dispatcher captures
  --preserve-names + forwards as --preserve-names
- cmd/wfctl/infra_rotate_and_prune.go: runInfraRotateAndPrune flag
  declaration + variable; runInfraRotateAndPruneCmd dispatcher's
  pre-parse declaration; pruneArgs forwarding to runInfraPrune

Important #2 — typed ProviderID/Name fields with Outputs[*] fallback

The exclusion filter in runInfraPrune compared
`out.Outputs["access_key"]` against --exclude-access-key. If a future
EnumeratorAll-implementing provider populates ProviderID per the
documented contract but doesn't redundantly write
Outputs["access_key"], the exclude check would compare against an
empty string, the active credential would be added to toDelete, and
get silently deleted on a destructive run.

Same risk in audit-keys' rendering loop (Outputs[*] only).

Fix: prefer typed ProviderID/Name fields, fall back to Outputs[*] for
backward compat with providers that populate both. Applied to:
- cmd/wfctl/infra_prune.go: filter loop + dry-run rendering
- cmd/wfctl/infra_audit_keys.go: rendering loop

The DO provider (Task 15) populates both, so this is forward-looking
defense; no behavior change for the only current consumer.

Minor #1 — stray TODO fragment removed

cmd/wfctl/infra_prune.go: the line `// move opt-in checks further from
--confirm parsing.` was sandwiched between the exit-codes doc and the
//nolint:cyclop directive. It was actually the second line of the
nolint comment that gofmt split. Folded back into the nolint comment
where it belongs.

Minor #2 — conditional "recovery file retained" message

cmd/wfctl/infra_prune.go: the failure message
`%d delete(s) failed; recovery file retained at %s` fired
unconditionally on delete failure, but the recovery file only exists
if --recovery-from-last-rotation was set OR rotate-and-prune wrote it
before this invocation. Plain prune invocations were getting pointed
at a non-existent path. Conditional fix: only mention the recovery
file when one actually exists.

Minor #3 — surface os.Remove error as warning

cmd/wfctl/infra_prune.go: `_ = os.Remove(recoveryFilePath())` after
successful prune silently discarded errors. If perms changed or the
file was locked, the next --recovery-from-last-rotation invocation
would re-read stale data. Replaced with a non-fatal warning that's
loud enough for operators to hand-clean.

Verification:
  $ go test ./cmd/wfctl → ok
  $ wfctl infra prune --help        → shows -preserve-names (no -allowlist)
  $ wfctl infra rotate-and-prune --help → shows -preserve-names

Follow-up: PR #585 (docs) needs the runbook + ADR 0017 + WFCTL.md
updated to the renamed flag. Will push as a stacked commit on
docs/spaces-key-runbook.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
intel352 added a commit that referenced this pull request May 9, 2026
Follow-up to abe6b7b (the code rename). Updates the four docs/WFCTL.md
references that still pointed at the old --allowlist flag name:
- prune synopsis line
- prune flag table row
- rotate-and-prune synopsis line
- rotate-and-prune flag table row

These are the only remaining --allowlist references in the workflow
repo. The runbook + ADR 0017 in docs/spaces-key-runbook (PR #585) get
matching updates in a stacked commit on that branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ionale

Follow-up to abe6b7b (the code rename in PR #584). Updates the runbook
+ ADR 0017 to use --preserve-names everywhere the operator-facing flag
appears, plus adds a rename-rationale block in each that explains why
"allowlist" was retired (ambiguous verb on a destructive command —
some operators read it as "list allowed to be operated on" / delete).

Files:
- docs/runbooks/spaces-key-prune.md:
  - "Allowlist for manual keys" section retitled "Preserving
    hand-created keys"; --allowlist → --preserve-names throughout.
  - Added a blockquote-style note explaining the rename, anchored to
    ADR 0017.
- decisions/0017-prune-cli-two-key-opt-in.md:
  - Trade-study list updated (kept the prose generic — "Name regex
    as the primary filter" rather than naming a specific flag).
  - Decision section's --preserve-names paragraph extended with the
    naming rationale explicitly.

Two remaining mentions of "allowlist" (runbook line 190, ADR 0017
line 65) are intentional — they're inside the rename-rationale text.

No structural change — runbook still 7 H2 sections; ADR still
follows workspace standard format.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 9, 2026 07:40
@intel352
Copy link
Copy Markdown
Contributor Author

intel352 commented May 9, 2026

Stacked update for the --preserve-names rename — commit c784ae6.

PR #584 renamed the --allowlist prune flag to --preserve-names per code-reviewer's Important #1 (the verb in the name has to be unambiguous on a destructive command). This PR's runbook + ADR 0017 get matching updates:

  • docs/runbooks/spaces-key-prune.md: "Allowlist for manual keys" section retitled "Preserving hand-created keys"; all --allowlist references → --preserve-names. Added a blockquote-style note explaining the rename rationale, anchored to ADR 0017.
  • decisions/0017-prune-cli-two-key-opt-in.md: trade-study list made flag-name-agnostic; Decision section's --preserve-names paragraph extended with the naming rationale.

Two remaining mentions of "allowlist" (runbook line 190, ADR 0017 line 65) are intentional — they're inside the rename-rationale text.

Runbook still 7 H2 sections; ADR still follows workspace standard format. No structural change.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Comment on lines +4 to +13
cleanup, periodic hygiene, scheduled rotation), use `wfctl infra
rotate-and-prune` to rotate the canonical credential AND prune older keys
in one atomic flow. Inspect-then-prune and recovery flows are documented
below for the cases where the all-in-one is not the right tool.

This runbook assumes:

- The workflow CLI is built from a release containing the `infra
audit-keys` / `infra prune` / `infra rotate-and-prune` subcommands
(`wfctl infra --help` lists all three).
Comment on lines +63 to +67
- The CLI dispatchers (`runInfraAuditKeysCmd`,
`runInfraPruneCmd`, `runInfraRotateAndPruneCmd`) iterate loaded
iac.provider modules and pick the first one implementing
`EnumeratorAll`. Multi-provider configs need to ensure at most one
provider per `--type`.
Comment on lines +97 to +98
- `cmd/wfctl/infra_audit_keys.go`, `cmd/wfctl/infra_prune.go` — CLI
consumers.
Comment on lines +119 to +121
- `cmd/wfctl/infra_prune.go` — implementation.
- `cmd/wfctl/infra_prune_test.go` — `TestInfraPrune_RequiresTwoKeyOptIn`,
`TestInfraPrune_RequiresExcludeAccessKey`,
Comment on lines +47 to +60
`wfctl infra prune` requires:

1. `--type <T>` (required) — single resource type per invocation. Rejects
if missing.
2. `--created-before <RFC3339>` (required) — only resources older than
this are eligible.
3. `--exclude-access-key <AK>` (required) — this access_key is preserved
no matter what (paranoia rail).
4. `--confirm` flag — explicit per-invocation consent.
5. `WFCTL_CONFIRM_PRUNE=1` environment variable — two-key authorization.
6. Interactive `y/N` prompt — skipped only with `--non-interactive`.

If ANY of (1)-(5) is missing, prune exits with a structured error and
NO cloud calls are made.
intel352 added a commit that referenced this pull request May 9, 2026
…asks 16-22) (#584)

* test(wfctl): failing test for infra audit-keys

Adds TestInfraAuditKeys_ListsAll + fakeProviderEnumeratorAll fixture for
the new `wfctl infra audit-keys --type <T>` CLI. Verifies that the
command:
  - delegates to interfaces.EnumeratorAll.EnumerateAll(resourceType)
  - forwards the --type flag to the enumerator
  - renders every returned key's identifying fields (Name, ProviderID/
    access_key) into the writer
  - exits 0 on success

This is the failing test for Task 16 of the spaces-key-iac-resource plan
(PR5). Until Task 17 implements runInfraAuditKeys + the registration of
`wfctl infra audit-keys`, this test fails to compile with `undefined:
runInfraAuditKeys`.

Pre-staged on feat/spaces-key-clis off feat/spaces-key-storage-filter
(PR4a). Will be rebased onto origin/main after PR4a merges + workflow
v0.26.0 tag is cut, then pushed as PR5.

Plan: docs/plans/2026-05-08-spaces-key-iac-resource.md (commit 316559f7),
Task 16 of PR5.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(wfctl): infra audit-keys lists cloud resources via EnumeratorAll

Implements `wfctl infra audit-keys --type <T>` for Task 17 of the
spaces-key-iac-resource plan (PR5). The command loads iac.provider
modules from infra.yaml (honoring --config / --env), finds the first
provider that implements interfaces.EnumeratorAll, and prints every
resource of `<T>` it returns as a fixed-width table (NAME / ACCESS_KEY
/ CREATED_AT). Read-only — drift correction surface for the destructive
`wfctl infra prune` (Task 19) that comes next.

Design notes:

 - runInfraAuditKeys takes interfaces.EnumeratorAll directly (not the
   broader IaCProvider) so unit tests can pass a minimal fake without
   implementing every IaCProvider method. The IaCProvider →
   EnumeratorAll type-assertion happens at the dispatcher boundary in
   runInfraAuditKeysCmd, where producing a structured error is
   appropriate (provider plugin doesn't support the optional interface).
 - auditKeysLoadProviders is a seam variable defaulting to
   defaultCleanupLoadProviders so audit-keys inherits the same
   env-resolution + plugin-discovery contract as `infra cleanup`.
 - auditKeysStdout / auditKeysStderr seam variables mirror the cleanup
   pattern so parallel tests don't race on global os.Stdout.

Test coverage: TestInfraAuditKeys_ListsAll (Task 16, prior commit) now
PASSes with this implementation; full cmd/wfctl test suite stays green.

Smoke-tested:
  $ wfctl infra audit-keys --help    # exit 0, prints flags
  $ wfctl infra audit-keys            # exit 1, "no infra config found"

Plan: docs/plans/2026-05-08-spaces-key-iac-resource.md (commit 316559f7),
Task 17 of PR5.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(wfctl): failing tests for infra prune two-key opt-in + filter

Adds three failing tests for `wfctl infra prune` (Task 18 of PR5):

  TestInfraPrune_RequiresTwoKeyOptIn — destructive prune must require
    BOTH `--confirm` flag AND WFCTL_CONFIRM_PRUNE=1 env var. Either
    alone (or neither) → non-zero exit before any cloud call.
  TestInfraPrune_RequiresExcludeAccessKey — `--exclude-access-key` is
    mandatory so the operator must explicitly name the active credential
    they want preserved. Error message must mention the flag.
  TestInfraPrune_FiltersByTimeAndExcludesAccessKey — happy path: with
    both opt-ins + the exclude flag, prune deletes every key whose
    created_at is older than --created-before EXCEPT the excluded
    access_key. Tracks deletions by ProviderID on a fakeProviderWithDelete
    that implements EnumerateAll + DeleteResource.

Local helper `pruneContains` (renamed to avoid collision with the
existing `containsString` in infra_templates.go).

Currently fails with `undefined: runInfraPrune` (test build broken) —
expected red state. Task 19 will introduce the implementation to make
all three pass. Stacked onto PR #584; final squash will land all
six PR5 commits together.

Plan: docs/plans/2026-05-08-spaces-key-iac-resource.md (commit 316559f7),
Task 18 of PR5.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(wfctl): infra prune with two-key opt-in + time/access_key discriminator

Implements `wfctl infra prune` for Task 19 of PR5. Destructive cloud-side
resource pruning with three-factor authorization plus mandatory exclusion
target so a typo can't accidentally nuke the active credential.

Authorization gauntlet (all three required):
  - `--confirm` flag: explicit per-invocation consent
  - WFCTL_CONFIRM_PRUNE=1 env var: two-key authorization (set by
    operator; not by automation by default)
  - interactive y/N prompt: skipped under `--non-interactive` for CI

Mandatory filter args (paranoia rail):
  - `--created-before <RFC3339>`: only resources older than this eligible
  - `--exclude-access-key <AK>`: this access_key preserved no matter what

Optional filters:
  - `--allowlist <regex>`: skip names matching the regex
  - `--recovery-from-last-rotation`: read filter args from the recovery
    file written by `infra rotate-and-prune` (Task 21). On success the
    recovery file is removed; on failure it's retained for re-invocation.

Design notes:

  - runInfraPrune takes a narrow `pruneProvider` interface (EnumerateAll +
    DeleteResource) so unit tests can use a minimal fake. Production code
    wraps an interfaces.IaCProvider in pruneProviderAdapter which bridges
    to the existing interfaces.EnumeratorAll + ResourceDriver.Delete
    primitives at the boundary.
  - Separates the recovery-file machinery (defaultStateDir,
    recoveryFilePath, readRecoveryFile, recoveryFile struct) so Task 21
    can reuse the writer side without code duplication.
  - pruneStdout/pruneStderr seam vars + pruneLoadProviders seam mirror
    the cleanup pattern so prune-related tests don't race on global
    os.Stdout.

Test coverage: TestInfraPrune_RequiresTwoKeyOptIn,
TestInfraPrune_RequiresExcludeAccessKey, and
TestInfraPrune_FiltersByTimeAndExcludesAccessKey (Task 18, prior commit)
all PASS with this implementation. Full cmd/wfctl suite stays green.

Smoke-tested:
  $ wfctl infra prune --confirm ... → exits 0/1 per the gauntlet

Plan: docs/plans/2026-05-08-spaces-key-iac-resource.md (commit 316559f7),
Task 19 of PR5.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(wfctl): failing tests for rotate-and-prune happy path + recovery

Adds two failing tests for `wfctl infra rotate-and-prune` per Task 20 of
docs/plans/2026-05-08-spaces-key-iac-resource.md (commit 316559f7):

- TestInfraRotateAndPrune_HappyPath stubs the package-level
  bootstrapSecrets test hook (same pattern as the existing generateSecret
  hook in bootstrap_test.go) to return a single RotationResult, runs
  rotate-and-prune against a fake provider returning AK_OLD + AK_NEW from
  EnumerateAll, and asserts that AK_OLD is deleted, AK_NEW is preserved,
  and the recovery file at $WFCTL_STATE_DIR/last-rotation.json is removed
  on full success.

- TestInfraRotateAndPrune_RecoveryFileWrittenWithCorrectPerms forces
  DeleteResource to return an error, then asserts that the recovery file
  is RETAINED (so `wfctl infra prune --recovery-from-last-rotation` can
  complete the prune without re-rotating) AND has permissions 0600
  (owner-only — the file contains access_key + name metadata sufficient
  to identify the canonical credential).

Adds the fakeProviderEnumerableDriver test double — same shape as the
existing fakeProviderWithDelete in infra_prune_test.go but with an
additional deleteErr hook so the failure-path test can simulate
transient errors. Local rotateAndPruneContains helper avoids cross-file
collisions with sibling test helpers (existing pruneContains is in a
different file scope).

Currently FAILS with `undefined: runInfraRotateAndPrune` — the
failing-side signal Task 20 is supposed to produce. Task 21 lands the
implementation in cmd/wfctl/infra_rotate_and_prune.go to make these PASS.

Stacks on top of Task 19's implementation (commit 3185c5b) on the
PR5 (feat/spaces-key-clis) branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(wfctl): infra rotate-and-prune all-in-one + recovery file (ADR 0017)

Implements Task 21 of the spaces-key-iac-resource plan
(docs/plans/2026-05-08-spaces-key-iac-resource.md, commit 316559f7).

Adds `cmd/wfctl/infra_rotate_and_prune.go` with three pieces:

1. recoveryRecord struct — JSON-superset of recoveryFile (defined in
   infra_prune.go by Task 19) adding Source + RotatedAt for forensics.
   The prune reader ignores the extra fields so this is a backwards-
   compatible extension; both readers share the same canonical path.

2. writeRecoveryRecord(rec) — persists the JSON to
   $WFCTL_STATE_DIR/last-rotation.json with 0600 file perms + 0700
   parent dir. Sensitive credential metadata; only the owner reads it.

3. runInfraRotateAndPrune(args, provider, w) — the all-in-one CLI:
   - Two-key opt-in: --confirm flag + WFCTL_CONFIRM_PRUNE=1 env (same as
     plain prune, since rotation+prune is doubly destructive).
   - Step 1: rotate via the existing parseSecretsConfig +
     resolveSecretsProvider + resolveCredentialRevoker + bootstrapSecrets
     chain, with forceRotate={name: true}. Returns []RotationResult per
     ADR 0020 — no subprocess, no stderr parsing.
   - Step 2: persist recovery record BEFORE the prune step, so a
     mid-prune failure doesn't lose the data needed to finish cleanup
     (an operator can recover via prune --recovery-from-last-rotation
     without re-rotating, which would worsen any leak).
   - Step 3: delegate to runInfraPrune passing rotated.CreatedAt as
     --created-before and rotated.AccessKey as --exclude-access-key.
     Older keys for the same Type are deleted; the just-rotated key is
     preserved by the exclusion filter.
   - On full success: rotate-and-prune (the file's writer) deletes the
     recovery file. On prune failure: file is retained + recovery
     instructions printed to stdout.

Provider is typed as `pruneProvider` (the narrow interface from
infra_prune.go) so the unit tests share a single fake surface with the
prune CLI itself — keeps blast radius small.

Verification:
- GOWORK=off go test ./cmd/wfctl -run TestInfraRotateAndPrune -v →
  TestInfraRotateAndPrune_HappyPath PASS,
  TestInfraRotateAndPrune_RecoveryFileWrittenWithCorrectPerms PASS.
- GOWORK=off go test ./cmd/wfctl -count=1 → entire wfctl suite PASS.

Test fixture adjustments (Task 20's test file):
- Added writeMinimalRotationConfig helper that writes a minimal
  infra.yaml with secrets.provider=env (no external deps) so
  parseSecretsConfig + resolveSecretsProvider succeed before the
  bootstrapSecrets stub takes over. Plan's test code didn't pass a
  --config but the actual implementation chain requires one;
  surfacing the fixture in the test rather than special-casing the
  impl keeps the hot path config-driven.
- Both tests now pass --config <fixture> alongside the existing flags.

Stacks on top of Task 20 (commit 739a0ef) on PR5 branch
(feat/spaces-key-clis).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(wfctl): wire audit-keys + prune + rotate-and-prune subcommands (Task 22 + Task 17 args bug fix)

Closes Task 22 of PR5 — registers all three new infra subcommands under
the `wfctl infra` dispatcher and adds operator-facing reference docs in
docs/WFCTL.md.

Bundled bug fix (per code-reviewer + Copilot 3212662894 retro):
runInfraAuditKeysCmd and runInfraPruneCmd both forwarded the raw args
slice (which can include --config / --env / -c) to their inner
runInfraAuditKeys / runInfraPrune helpers whose narrow FlagSets only
declared --type and prune-specific flags. Real CLI invocations like
`wfctl infra audit-keys --type infra.spaces_key --config infra.yaml`
exited 2 with `flag provided but not defined: -config` — documented
flags broken. Unit tests of the inner functions never exercised the
dispatcher's args-passing path so the bug shipped past review.

Fix (team-lead's recommended option a): both dispatchers now CAPTURE
every flag they pre-parse and SYNTHESIZE a clean inner-args slice with
only flags the inner function understands. Added regression-sentinel
smoke tests `TestInfraAuditKeysCmd_AcceptsConfigFlag` and
`TestInfraPruneCmd_AcceptsConfigFlag` that prove `--config` + `--env`
are accepted end-to-end through the dispatcher.

rotate-and-prune dispatcher: Task 21's `runInfraRotateAndPrune` already
declares --config / --env on its own FlagSet (it needs them for
parseSecretsConfig in Step 1 of the rotate flow), so the dispatcher
forwards args verbatim — no synthesize-clean-args dance required there.

Files:
- cmd/wfctl/infra.go: register `audit-keys`, `prune`, `rotate-and-prune`
  cases under the infra dispatcher; add usage-doc lines.
- cmd/wfctl/infra_audit_keys.go: capture all dispatcher flags including
  --type; synthesize clean inner args for runInfraAuditKeys.
- cmd/wfctl/infra_prune.go: same fix for runInfraPruneCmd; capture all
  flags + synthesize clean inner args.
- cmd/wfctl/infra_rotate_and_prune.go: add runInfraRotateAndPruneCmd
  dispatcher + rotateAndPruneStdout / rotateAndPruneStderr seam vars +
  rotateAndPruneLoadProviders seam.
- cmd/wfctl/infra_audit_keys_test.go: TestInfraAuditKeysCmd_AcceptsConfigFlag
  smoke test + minimal fakeIaCProviderForAuditKeys stub.
- cmd/wfctl/infra_prune_test.go: TestInfraPruneCmd_AcceptsConfigFlag smoke
  test + fakeIaCProviderForPrune + fakeNoopDriver stubs.
- docs/WFCTL.md: add command reference for all three subcommands (~120
  lines: action table entries, full flag tables, exit codes, examples).

Verification:
  $ go test ./cmd/wfctl → ok (full suite green)
  $ go build -o /tmp/wfctl ./cmd/wfctl
  $ /tmp/wfctl infra --help → lists audit-keys, prune, rotate-and-prune
  $ /tmp/wfctl infra audit-keys --help → exit 0, prints flags
  $ /tmp/wfctl infra prune --help → exit 0, prints flags
  $ /tmp/wfctl infra rotate-and-prune --help → exit 0, prints flags

Note: `docs/dsl-reference-embedded.md` regen step from the plan skipped
— the file does not exist in the current main; the `docs gen-dsl-reference`
generator wasn't introduced in PR0/PR1/PR4a so there's nothing to regen.

Plan: docs/plans/2026-05-08-spaces-key-iac-resource.md (commit 316559f7),
Task 22 of PR5.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(wfctl): tighten rotate-and-prune — drop unused envName, clarify step numbering, actionable recovery-cleanup warning

Three Task 21 Minor follow-ups from code-reviewer (PR #584, on top of
implementer-2's Task 22 wiring commit 5d8ae88):

1. **Unused --env**: replaced `var envName string; fs.StringVar(&envName, ...);
   _ = envName` with `_ = fs.String("env", "", ...)`. The flag is still
   declared so the dispatcher (`runInfraRotateAndPruneCmd` in 5d8ae88)
   can forward args verbatim including --env without the inner FlagSet
   erroring on unknown-flag, but no local var is allocated for a value
   that's never read inside this function. Doc comment now explicitly
   states the flag is consumed by the dispatcher; secrets-config
   resolution here happens via --config alone.

2. **Step-numbering**: previously the inline comments labelled three
   internal steps (1/rotate, 2/persist, 3/prune) but the user only saw
   two banners ("Step 1: rotating", "Step 2: pruning"). Aligned the
   comments to match the user-visible numbering: rotate is Step 1
   (with recovery-write as a sub-step), prune is Step 2.

3. **Recovery-cleanup warning**: bare `failed to remove recovery file`
   replaced with explicit success-confirmation + manual-cleanup hint
   ("rotation+prune succeeded but failed to remove stale recovery file
   at PATH: ERR" + "this file is no longer needed; remove with `rm
   PATH` once safe."). Operator knows their data is fine and only the
   state file needs hand-clearing.

Verification:
- GOWORK=off go test ./cmd/wfctl -run TestInfraRotateAndPrune -v → both PASS
- GOWORK=off go test ./cmd/wfctl -count=1 → entire wfctl suite PASS

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(wfctl): code-reviewer findings on Task 19 prune impl (2 Important + 3 Minor)

Addresses code-reviewer's HOLD on Task 19 (commit 3185c5b) of PR #584.
All five findings bundled into one commit per
feedback_commit_label_by_content_not_round.

Important #1 — flag rename: --allowlist → --preserve-names

The --allowlist flag preserved matching names (skipped them during
delete) but the name read as "list of resources allowed to be operated
on", an ambiguity that on a destructive command is a real operator-
error trap. Mental-model A would have an operator running
`--allowlist '^manual-'` expecting to delete only manual-* keys, and
instead deleting everything else (every production key).

Renamed to --preserve-names (verb in the name; matches what the impl
actually does: PRESERVE matching names, skip them during delete). No
backward-compat alias since the flag shipped today and no operator
has been instructed in any external doc to use --allowlist.

Updates:
- cmd/wfctl/infra_prune.go: runInfraPrune flag declaration + variable
  + error message
- cmd/wfctl/infra_prune.go: runInfraPruneCmd dispatcher captures
  --preserve-names + forwards as --preserve-names
- cmd/wfctl/infra_rotate_and_prune.go: runInfraRotateAndPrune flag
  declaration + variable; runInfraRotateAndPruneCmd dispatcher's
  pre-parse declaration; pruneArgs forwarding to runInfraPrune

Important #2 — typed ProviderID/Name fields with Outputs[*] fallback

The exclusion filter in runInfraPrune compared
`out.Outputs["access_key"]` against --exclude-access-key. If a future
EnumeratorAll-implementing provider populates ProviderID per the
documented contract but doesn't redundantly write
Outputs["access_key"], the exclude check would compare against an
empty string, the active credential would be added to toDelete, and
get silently deleted on a destructive run.

Same risk in audit-keys' rendering loop (Outputs[*] only).

Fix: prefer typed ProviderID/Name fields, fall back to Outputs[*] for
backward compat with providers that populate both. Applied to:
- cmd/wfctl/infra_prune.go: filter loop + dry-run rendering
- cmd/wfctl/infra_audit_keys.go: rendering loop

The DO provider (Task 15) populates both, so this is forward-looking
defense; no behavior change for the only current consumer.

Minor #1 — stray TODO fragment removed

cmd/wfctl/infra_prune.go: the line `// move opt-in checks further from
--confirm parsing.` was sandwiched between the exit-codes doc and the
//nolint:cyclop directive. It was actually the second line of the
nolint comment that gofmt split. Folded back into the nolint comment
where it belongs.

Minor #2 — conditional "recovery file retained" message

cmd/wfctl/infra_prune.go: the failure message
`%d delete(s) failed; recovery file retained at %s` fired
unconditionally on delete failure, but the recovery file only exists
if --recovery-from-last-rotation was set OR rotate-and-prune wrote it
before this invocation. Plain prune invocations were getting pointed
at a non-existent path. Conditional fix: only mention the recovery
file when one actually exists.

Minor #3 — surface os.Remove error as warning

cmd/wfctl/infra_prune.go: `_ = os.Remove(recoveryFilePath())` after
successful prune silently discarded errors. If perms changed or the
file was locked, the next --recovery-from-last-rotation invocation
would re-read stale data. Replaced with a non-fatal warning that's
loud enough for operators to hand-clean.

Verification:
  $ go test ./cmd/wfctl → ok
  $ wfctl infra prune --help        → shows -preserve-names (no -allowlist)
  $ wfctl infra rotate-and-prune --help → shows -preserve-names

Follow-up: PR #585 (docs) needs the runbook + ADR 0017 + WFCTL.md
updated to the renamed flag. Will push as a stacked commit on
docs/spaces-key-runbook.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(wfctl): rename --allowlist references to --preserve-names

Follow-up to abe6b7b (the code rename). Updates the four docs/WFCTL.md
references that still pointed at the old --allowlist flag name:
- prune synopsis line
- prune flag table row
- rotate-and-prune synopsis line
- rotate-and-prune flag table row

These are the only remaining --allowlist references in the workflow
repo. The runbook + ADR 0017 in docs/spaces-key-runbook (PR #585) get
matching updates in a stacked commit on that branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
@intel352 intel352 merged commit b90857b into main May 9, 2026
22 checks passed
@intel352 intel352 deleted the docs/spaces-key-runbook branch May 9, 2026 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants