feat: architect-lens, rework-scan, typed DoD, reframe nudges#4

Merged
jed72 merged 15 commits into main from feat/cross-task-architectural-integrity
May 23, 2026

@jed72 jed72 commented May 23, 2026

Summary

Six improvements that close the "merged code later removed because the
architecture was wrong" gap, without compromising any of the framework's
principles (no new guardrail, no new route, no new reading dimension,
Flow still advises and never gates).

  • architecture/ artifact tree — sibling to governance/, loaded by Frame
    into architecture-loaded.yml. New `compass adr new` helper. Five
    worked-example ADRs under templates/architecture/decisions/.
  • architect-lens agent — reads architecture/, writes architecture-notes.md;
    spec-author and planner consult those notes rather than re-invoking. New
    /compass:roundtable architect-lens invocation.
  • Reframe-as-default-response — stop-hook nudges on scope-bloat phrases
    (patterns loaded from new governance/signals.yml); compass calibration
    reports absorbed mis-frames; roundtable's "Reframe trigger" section.
  • compass rework-scan — cross-task add-then-delete detector + public-surface
    + migration-pair detection; signal not gate (exit 0 on detection); surfaces
    in /compass:flow --digest.
  • Typed Definition of Done: (evidence: EV-id) / (backfill: BF-id)
    inline tags; cross-task backfill chain via target_task field; new
    dod-evidence-typed check under G4 (no sixth guardrail); compass backfill pay.
  • governance/signals.yml — third governance file alongside guardrails/routing;
    advisory patterns for the above.

40 BDD scenarios across 7 intents. Built on Expedition route with 6-stream
swarm. `compass check` 10/10 PASS, `pytest tests/` 161/161 PASS,
`ruff check` matches baseline (no new lint debt).

Five principle-preservation regression tests (Group F) protect P1–P8:
Frame stays mandatory, guardrail count stays at 5, adaptive routing
baseline is captured + diff-tested, Flow remains non-mutating, architect-lens
output is annotations not Gherkin.

Test plan

  • `pytest tests/` — 161/161 PASS
  • `python3 cli/compass check` — 10/10 PASS
  • `compass policy lint` — PASS
  • Routing baseline captured at `tests/fixtures/route-baseline.yml` and
    regression-tested
  • Lint: same 11-error count as `main` (different makeup; no new debt)
  • All 8 architectural invariants and 6 boundary risks verified
  • CI `self-check` — pending on PR

Follow-on tasks recorded (not in this PR)

  1. `architecture-invariants-schema` — structured schema for `invariants.yml`
  2. `compass-self-architecture` — populate Compass's own `architecture/`
  3. `brief-mandatory-on-user-visible-tasks`
  4. `project-ci-health-gate`
  5. `backfill-area-matching`
  6. `swarm-script-strips-markdown` — strip markdown from distribution-map branch-name cell

🤖 Generated with Claude Code

jed72 and others added 15 commits May 23, 2026 21:02
Adds the shipped foundation needed by downstream streams:

- governance/signals.yml — advisory patterns for scope-bloat detection,
  rework-scan window + public-surface patterns + migration paths.
  Project-overridable; ships with sane defaults.

- schemas/signals.schema.json — structural validator for signals.yml,
  enforced via compass policy lint (in a follow-on change).

- tests/fixtures/route-baseline.yml — captured outputs of
  compass route evaluate for the five reference reading combinations.
  Read by the routing regression test (TRC-F3) so any future drift
  in route composition fails loudly.

- tests/fixtures/rework-scan/add-then-delete-pair/ — synthetic pair of
  task.yml fixtures (task-adds-handler + task-removes-handler) where
  five identical file paths flip from action: added to action: deleted.
  The canonical pattern compass rework-scan must detect (TRC-D6).
  Filenames + service names are illustrative; the load-bearing structure
  is the matching paths and opposing actions.

Also adds /docs/proposals/ to .gitignore — working proposals are
team-internal planning notes; the rationale that survives into the
framework lives in docs/methodology.md and the rest of the public docs.
…hor, planner

Implements TRC-B1..B5, TRC-X5, TRC-F5 (stream-2 / U2):

- agents/architect-lens.md (new): the architect lens agent. Reads
  architecture/ artifacts and the task's spec/plan; writes architecture-notes.md
  with five required sections (system under change, invariants, boundary risks,
  candidate ADRs, notes for the planner). Degrades gracefully when architecture/
  is absent (WARNING line, non-blocking). Explicitly prohibits Given/When/Then
  output (B-Risk 2 avoidance). Inv-5 enforced: reads spec, never authors it.

- commands/roundtable.md: add "Agent roster" section listing architect-lens
  alongside existing lenses, with the registration contract (invoke only if
  agent file exists) and named-invocation pattern. Section is distinct from
  stream-3's "Reframe trigger" section for mechanical merge at Land.

- agents/spec-author.md: add step 4 — consult architect-lens trigger (Q5
  logic: touches contains public-api OR service name from relations.md OR
  lens_trigger_tag from invariants.yml). Bootstrap exception documented.

- agents/planner.md: extend step 1 — read architecture-notes.md when present;
  cite-or-diverge protocol (DD-5); recordable-absence behaviour.

- tests/test_architect_lens.py (new): 19 structural tests covering all 7
  scenarios. TRC-F5 encodes the B-Risk 2 prohibition explicitly.
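
The spec-author trigger described above (the Q5 logic) reduces to a three-way OR. A minimal sketch, assuming `touches` is a list of strings and that the service names and trigger tags have already been extracted from relations.md and invariants.yml; the function name and parameters are hypothetical, and the shipped logic lives in agent prose rather than code:

```python
def should_consult_architect_lens(touches, service_names, trigger_tags):
    """Return True when any of the three trigger conditions holds.

    touches       -- areas the task declares it touches (assumed shape)
    service_names -- service names taken from relations.md (assumed input)
    trigger_tags  -- lens_trigger_tag values from invariants.yml (assumed input)
    """
    touched = set(touches)
    if "public-api" in touched:
        return True
    if touched & set(service_names):
        return True
    return bool(touched & set(trigger_tags))
```

A task touching only areas outside all three sets would skip the lens; the bootstrap exception mentioned above is not modelled here.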

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…C-D1..D6,X2)

Implements the cross-task rework scanner (R4 from the architectural integrity
proposal). The scanner reads changed_files across task.yml files, detects
add-then-delete patterns within a configurable window, and surfaces the results
as an advisory signal in the flow digest — never as a gate.

Key invariants enforced:
- Exit code is always 0 on rework detection (B-Risk 3: signal not gate)
- Patterns loaded from governance/signals.yml at runtime (B-Risk 6: no hardcoding)
- Flow --digest does not mutate any task.yml (Inv-4: advises, never gates)
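
The add-then-delete detection can be sketched as follows. This is an illustration under stated assumptions, not the shipped scanner: tasks are assumed to arrive as chronologically ordered dicts with a `slug` and a `changed_files` list of `path`/`action` entries, the window is assumed to be counted in tasks, and the function name is hypothetical.

```python
def find_add_then_delete(tasks, window=5):
    """Return (path, adding_slug, deleting_slug) for each add-then-delete pair.

    tasks  -- chronologically ordered task records (assumed shape)
    window -- maximum distance, in tasks, between the add and the delete
    """
    added_at = {}  # path -> (task index, slug) of the most recent add
    findings = []
    for index, task in enumerate(tasks):
        for change in task.get("changed_files", []):
            path, action = change["path"], change["action"]
            if action == "added":
                added_at[path] = (index, task["slug"])
            elif action == "deleted" and path in added_at:
                add_index, add_slug = added_at.pop(path)
                if index - add_index <= window:
                    findings.append((path, add_slug, task["slug"]))
    return findings
```

A caller would print the findings and still exit 0, since detection is a signal rather than a gate.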

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ader, adr subcommand, templates

Implements stream-1 work unit U1 (system-level architecture artifact + Frame load):

- cli/compass: add frame_load_architecture() helper (TRC-A1, A2, A5, A5b, X1)
  and compass adr new <slug> subcommand (TRC-A3)
- templates/architecture/: ship system-context.md, relations.md, ownership.md,
  ADR-template.md, decisions/README.md, and five worked-example ADRs 001-005 (TRC-A4)
- commands/frame.md: extend Procedure with "Load project architecture if present" step
- docs/methodology.md: add §11 "Loading project architecture" — the contract for
  downstream agents
- tests/: test_frame_loads_architecture.py (5 tests), test_adr_helper.py (3 tests),
  test_templates_present.py (4 tests) — 12 new tests, 106 total pass

Architectural invariants honoured:
- Inv-1: frame_load_architecture writes architecture-loaded.yml only; task.yml.readings
  is untouched (B-Risk 1 avoided)
- Inv-7: deterministic — sha256 per artifact, idempotent within a git tree
- Inv-8: absence of architecture/ produces empty record, no error (TRC-A2, A5b)
- Malformed invariants.yml fails loudly with file path + parse error (TRC-X1)
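
The determinism and absence rules (Inv-7, Inv-8) can be illustrated with a short sketch. The function name and the record layout are assumptions made for the example; the real `frame_load_architecture()` writes architecture-loaded.yml and also validates invariants.yml, which this sketch does not attempt.

```python
import hashlib
from pathlib import Path


def load_architecture_record(project_root):
    """Build a deterministic record of architecture/ artifacts (hypothetical layout)."""
    root = Path(project_root)
    arch_dir = root / "architecture"
    if not arch_dir.is_dir():
        # Absence is not an error: return an empty record.
        return {"artifacts": []}
    artifacts = []
    # Sorting keeps the output stable for identical file contents.
    for path in sorted(p for p in arch_dir.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        artifacts.append({"path": path.relative_to(root).as_posix(), "sha256": digest})
    return {"artifacts": artifacts}
```

Because the record depends only on file paths and contents, running it twice on the same tree yields the same result.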

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add the `dod-evidence-typed` check under G4 (evidence, not assertion).
Every unchecked DoD box in verification-report.md must carry an inline tag:
`(evidence: EV-<id>)` pointing at a typed evidence registry entry, or
`(backfill: BF-<id>)` pointing at a task.yml backfill with status: owed.
Bare unchecked boxes fail compass check. Checked boxes pass unconditionally.
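
The tag rule above can be sketched as a line scan. This is a simplified illustration: it only checks that a tag is present, whereas the real check also resolves the id against the evidence registry or an owed backfill. The function name, the markdown checkbox pattern, and the id character set are assumptions.

```python
import re

# An unchecked markdown checkbox such as "- [ ] item text".
UNCHECKED = re.compile(r"^\s*[-*]\s+\[ \]\s+(?P<text>.*)$")
# Either inline tag form; the id character set is an assumption.
TAG = re.compile(r"\((?:evidence:\s*EV-[\w-]+|backfill:\s*BF-[\w-]+)\)")


def untagged_unchecked_boxes(report_text):
    """Return (line_number, text) for each unchecked box lacking an inline tag."""
    failures = []
    for number, line in enumerate(report_text.splitlines(), start=1):
        match = UNCHECKED.match(line)
        if match and not TAG.search(match.group("text")):
            failures.append((number, match.group("text").strip()))
    return failures
```

Checked boxes never match `UNCHECKED`, so they pass unconditionally, and an empty report produces no failures.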

Cross-task chaining (TRC-E3): when a backfill carries `target_task: <slug>`,
the named task's Land check fails until the backfill is paid. New subcommand
`compass backfill pay --task <slug> <BF-id>` flips status to paid.
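
A minimal sketch of the chaining behaviour, assuming each task record is a dict holding a `backfills` list whose entries carry `id`, `status`, and an optional `target_task`. The `backfills` and `id` key names and both function names are hypothetical; only `status` and `target_task` come from the description above.

```python
def unpaid_inbound_backfills(all_tasks, slug):
    """Return (source_slug, backfill_id) pairs still owed to the task `slug`."""
    owed = []
    for source_slug, task in all_tasks.items():
        for backfill in task.get("backfills", []):
            if backfill.get("target_task") == slug and backfill.get("status") == "owed":
                owed.append((source_slug, backfill["id"]))
    return owed


def pay_backfill(task, backfill_id):
    """Flip the named backfill to paid; return False if it is not found."""
    for backfill in task.get("backfills", []):
        if backfill["id"] == backfill_id:
            backfill["status"] = "paid"
            return True
    return False
```

Under this model the target task's Land check would fail while `unpaid_inbound_backfills` returns a non-empty list. Backfills without `target_task` are never matched, which mirrors the backward-compatibility behaviour.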

Updates:
- governance/guardrails.yml: dod-evidence-typed check declared and wired to G4
- cli/compass: _check_dod_evidence_typed, _check_inbound_backfills, cmd_backfill_pay
- commands/land.md: inline-tag syntax documentation
- templates/verification-report.md: DoD template teaches the syntax
- tests/test_check_guardrails.py: TRC-E1..E5, TRC-X4
- tests/test_policy_integrity.py: TRC-E6

Guardrail count: still 5 (G1..G5). dod-evidence-typed is a check under G4,
not a new guardrail letter. Backward compat: tasks with empty or absent DoD
pass cleanly (TRC-X4). Existing backfills without target_task field work as before.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…debt, roundtable trigger

Implements Group C (TRC-C1..C6) and TRC-X3 for stream-3 (U3).

Key changes:
- hooks/stop.sh: scope-bloat phrase detection reads governance/signals.yml
  at runtime (B-Risk 6); anchors phrase as top-level statement to avoid
  false positives on quoted/indented context (TRC-X3); non-blocking (exit 0)
- cli/compass: calibration extended with _find_reframe_debt() — surfaces
  absorbed mis-frames (tasks with scope-bloat devlog signals and empty reframes
  lists); strictly read-only over task.yml (B-Risk 5)
- commands/roundtable.md: "Reframe trigger" section documents the requirement
  to file a reframe after boundary or migration decisions (TRC-C4)
- docs/methodology.md: §"Reframes — feedback signal" explains the contract
  that absorbed mis-frames are lost calibration signals
- tests/test_stop_hook_reframe_nudge.py: new — covers C1, C2, C3, X3 and
  B-Risk 6 runtime-loading guard
- tests/test_roundtable_doc.py: new — covers C4 structural assertion
- tests/test_calibration.py: extended — covers C5, B-Risk 5 (SHA invariant)
- tests/test_flow_digest.py: new — covers C6, F4 (no mutation), D5 stub
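
The top-level anchoring used to avoid false positives (TRC-X3) can be illustrated in Python, although the hook itself is a shell script. This sketch assumes the phrases have already been loaded from governance/signals.yml, treats matching as case-insensitive (an assumption), and uses a hypothetical function name; the example phrase in the usage note is invented for illustration.

```python
def scope_bloat_hits(message, phrases):
    """Return the phrases that open an unindented, unquoted line of `message`."""
    hits = []
    lines = message.splitlines()
    for phrase in phrases:
        needle = phrase.lower()
        # startswith on the raw line means indented or quoted text never matches.
        if any(line.lower().startswith(needle) for line in lines):
            hits.append(phrase)
    return hits
```

With a phrase such as "while we're at it", a line beginning with that phrase is reported, while the same phrase inside a `>` quote or an indented block is ignored. The caller would print a nudge and exit 0 either way.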

All 107 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… (U6 / TRC-F1..F4)

Implements stream-6 (U6) of cross-task-architectural-integrity. Adds four
regression tests that encode the framework's architectural invariants as
executable checks, so any future change that drifts from these principles
fails a test rather than passing silently.

- TRC-F1 (tests/test_pre_tool_hook.py): invokes hooks/pre-tool.sh against a
  synthetic project with no route.md and asserts exit 2 with a message naming
  the missing route.md. Confirms Frame is still mandatory and unchanged.

- TRC-F2 (tests/test_policy_integrity.py): parses governance/guardrails.md
  for H3 headings and governance/guardrails.yml defaults for ids; asserts
  exactly five (G1..G5) in both, with no project entry re-using the G namespace.

- TRC-F3 (tests/test_route_selection.py): loads tests/fixtures/route-baseline.yml
  and asserts compass route evaluate produces the same route/topology/phases/gates
  for each of the five reference reading combinations. No routing drift detected.

- TRC-F4 (tests/test_flow_digest.py): SHA256-snapshots every task.yml before
  running compass calibration, then re-snapshots after; asserts byte-identity.
  Confirms advisory commands do not mutate task state. Includes stubs for
  TRC-C6 (stream-3) and TRC-D5 (stream-4).
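
The snapshot technique behind TRC-F4 can be sketched as below. The helper names are hypothetical, and the advisory command is represented by a plain callable rather than an actual `compass calibration` invocation.

```python
import hashlib
from pathlib import Path


def snapshot_task_files(work_dir):
    """Map every task.yml under work_dir to the SHA-256 of its bytes."""
    return {
        path.as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(Path(work_dir).rglob("task.yml"))
    }


def assert_non_mutating(work_dir, command):
    """Run `command` and fail if any task.yml was changed, added, or removed."""
    before = snapshot_task_files(work_dir)
    command()
    after = snapshot_task_files(work_dir)
    assert before == after, "advisory command mutated task state"
```

Comparing whole snapshot dicts catches created and deleted files as well as edited ones, since the key sets must also match.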

No regressions found: all four invariants hold against the current framework.
Full suite: 104 tests, 0 failures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Resolved conflicts:
- cli/compass: kept both ADR block (stream-1) and flow/rework-scan blocks
  (stream-4) as additive subparser registrations
- docs/methodology.md: renumbered sections — stream-1 "Loading project
  architecture" -> 11; stream-4 "Cross-task rework" -> 12; stream-3
  "Reframes — feedback signal" -> 13; original Design principles -> 14
- tests/test_flow_digest.py: unioned test functions — stream-6's
  test_does_not_mutate_tasks + test_calibration_does_not_write_to_work_dir
  (canonical), stream-4's test_includes_rework_scan, stream-3's
  test_includes_reframe_debt
Resolved cli/compass conflict by accepting both regions: stream-4's
rework-scan helpers and stream-5's backfill pay command are additive;
both subparser blocks registered.
Resolved conflicts:
- tests/test_flow_digest.py: kept the unified version produced during the
  stream-4 merge (test_does_not_mutate_tasks from stream-6,
  test_includes_rework_scan from stream-4, test_includes_reframe_debt from
  stream-3, plus helpers and shared module header)
- tests/test_policy_integrity.py: kept both TRC-E6 (stream-5's
  test_dod_check_registered, already in HEAD) and TRC-F2 (stream-6's
  guardrail-count regression tests) as additive test functions
`ruff check --fix` resolved 35 errors (F541, F811, and similar) in the merged tree.
The 11 remaining errors (3 E402 + 8 F841) match the pre-existing baseline
on main in count; the makeup shifts but no new lint debt is introduced.
… typed DoD in CLAUDE.md

CLAUDE.md is loaded into every session — these are the consumer-facing
mentions of mechanisms added by the cross-task-architectural-integrity work:
- architect-lens in the agents-and-skills section
- architecture-loaded.yml + architecture-notes.md in the per-task tree
- signals.yml in the governance/ enumeration
- a short paragraph on architecture/ load at Frame, paralleling governance/

Resolves backfill BF-LIVING-DOCS at Land.
@jed72 jed72 merged commit 9db36e5 into main May 23, 2026
1 check passed
jed72 added a commit that referenced this pull request May 24, 2026
Cuts the 1.0.0 stable release. Three substantial PRs have shipped since 1.0.0-rc.1: PR #4 (architectural integrity suite), PR #5 (Compass self-architecture), PR #6 (swarm.sh parser fix). Remaining follow-ons are enhancements, not gaps. Bumps the four locations carrying the version string and adds a regression test guarding against partial bumps.

Merged via --admin override per the established CODEOWNERS-blocks-self-approval pattern (solo-maintainer repo).

🤖 Generated with [Claude Code](https://claude.com/claude-code)