phase5-D: schema-version policing gate (TDD)#35

Merged
rafael5 merged 3 commits into main from phase5-D
May 11, 2026

rafael5 commented May 11, 2026

Summary

Phase 5 Track D per phase5-plan.md §5. Gate 4 of 4 — the final continuous-enforcement gate. Watches schema_compat bumps + non-additive shape changes on profile/tools.schema.json + profile/task_index.schema.json (the two schemas that carry the field).

What ships

  • profile/build/check-schema-compat.py — diffs each schema between a base ref (default origin/main) and a head ref (default HEAD). Enforces:

    1. If schema_compat bumps in either schema, profile/schema-changelog.md must appear in the same diff. Otherwise MISSING_CHANGELOG_ROW → exit 1.
    2. If a non-additive change happens without a schema_compat bump, surface NON_ADDITIVE_WITHOUT_BUMP → exit 1. Three heuristics:
      • required field removed
      • enum value removed
      • additionalProperties tightened from true (or unset) to false

    False positives are acceptable per plan §9 — the maintainer can bump schema_compat + add a changelog row. UNKNOWN-shape changes pass; the heuristics catch the three common breakage shapes, not every conceivable one.

    Pure-function core check_schema_compat_impl(pairs, changelog_modified) is independently unit-testable. Thin git wrapper reads each schema at base + head via git show <ref>:profile/<name> and detects changelog modification via git diff --name-only.

  • profile/build/test_check_schema_compat.py — 14 TDD cases:

    | Surface | Cases |
    | --- | --- |
    | Pure function | 10 (no change OK; bump + changelog OK; bump without changelog → MISSING_CHANGELOG_ROW; three non-additive heuristics → NON_ADDITIVE_WITHOUT_BUMP; two additive cases → OK; non-additive WITH bump+changelog → OK; multi-schema mixed → OK) |
    | CLI / git | 4 (ephemeral git init tmp_path repo: no change → rc 0; bump w/o changelog → rc 1; bump w/ changelog → rc 0; smoke against real main HEAD↔HEAD → rc 0) |
  • make check-schema-compat — invokes with defaults (--base origin/main --head HEAD).

  • .github/workflows/ci.yml: the check job gains a Schema-version policing step on push/PR. Not wired to the weekly cron: by design, schema_compat bumps land via PRs, so a periodic re-run adds no signal.
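The defaults named above (--base origin/main, --head HEAD) could be wired with argparse roughly as follows. This is a sketch and not the script's actual parser: the --repo flag is taken from the CLI test invocation in the commit message, while its default of "." and all help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented defaults."""
    parser = argparse.ArgumentParser(
        prog="check-schema-compat.py",
        description="Police schema_compat bumps between two git refs.",
    )
    # Documented defaults: origin/main as base, HEAD as head.
    parser.add_argument("--base", default="origin/main", help="base git ref")
    parser.add_argument("--head", default="HEAD", help="head git ref")
    # Assumption: --repo defaults to the current directory.
    parser.add_argument("--repo", default=".", help="path to the git repository")
    return parser
```

With no arguments this yields the same refs that make check-schema-compat is described as using.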

Critical wiring detail

The check job's actions/checkout@v4 now uses fetch-depth: 0 so origin/main is actually present in the runner's git history. Default depth-1 checkout would make git show origin/main:... fail. Cost is negligible — this repo is small.
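A pre-flight check can make the shallow-clone failure mode explicit. The sketch below is hypothetical (the helper name ref_exists is not from the script); it only asks git whether a ref resolves to a commit, using git rev-parse --verify.

```python
import subprocess
from pathlib import Path


def ref_exists(repo: Path, ref: str) -> bool:
    """Return True if ref resolves to a commit in repo (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", str(repo), "rev-parse", "--verify", "--quiet",
         f"{ref}^{{commit}}"],
        capture_output=True,
        text=True,
    )
    # Non-zero exit means git could not resolve the ref, which is the
    # situation a checkout without origin/main in its history would hit.
    return result.returncode == 0
```

On a runner where origin/main was never fetched, ref_exists(Path("."), "origin/main") would be expected to return False; fetch-depth: 0 is what avoids that condition.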

Verified locally

  • pytest profile/build/: 116/116 (102 prior + 14 new)
  • make check-schema-compat — clean (no schema bumps in this PR; no non-additive changes)
  • make check-freshness / check-links / check-licenses — still clean

Test plan

  • 14/14 new pytest cases pass
  • Full suite stays green (116/116)
  • All four Phase-5 gates clean locally
  • The ephemeral-git-repo CLI tests exercise the real git driver end-to-end
  • CI green

What's next

Track E — docs/ai-discoverability/phases/phase5-evidence.md close-out. Phase 5 then closed; the AI-discoverability framework's operational loop is complete.

rafael5 and others added 3 commits May 11, 2026 11:05
Org-wide documentation standard now lives at
docs/docs-discoverability/README.md, status: accepted (2026-05-11).
Records the 23-type vocabulary, frontmatter schema (incl. the
generated-doc exclusion rule), 4-phase CI rollout plan, and the
7 resolved questions from the proposal phase as the design record.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Live tabular tracker for the org-wide documentation standard accepted
in dad3e20. Covers four tables: phase summary, Phase 0 (done) and
Phase 1 detail, the 18 CI checks with target phase, and the legacy
remediation backlog (renames, moves, splits, reviews).

Update protocol is event-driven: tracker rows move status in the same
commit as the underlying change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Phase 5 Track D per phase5-plan.md §5. Gate 4 of 4. Final
continuous-enforcement gate. Watches schema_compat bumps + non-
additive shape changes on the two schemas that carry the field
(profile/tools.schema.json + profile/task_index.schema.json).

### profile/build/check-schema-compat.py

Diffs each schema between a base ref (default origin/main) and a
head ref (default HEAD). Enforces two rules:

1. If schema_compat bumps in either schema, schema-changelog.md must
   appear in the same diff. Otherwise: MISSING_CHANGELOG_ROW (exit 1).

2. If a non-additive change happens without a schema_compat bump,
   surface NON_ADDITIVE_WITHOUT_BUMP (exit 1). Three heuristics:

   * required field removed (a consumer still producing it now fails
     additionalProperties: false)
   * enum value removed (a consumer producing the removed value is
     now rejected)
   * additionalProperties tightened from true (or unset) to false
     (extras previously accepted are now rejected)

False positives are acceptable per plan §9: the maintainer can bump
schema_compat + add a changelog row. UNKNOWN-shape changes pass —
the heuristics catch common breakage, not every conceivable one.
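The two rules and three heuristics above can be sketched as a pure function. This is an illustration under stated assumptions, not the script's code: pairs is assumed to be a list of (name, base_schema, head_schema) tuples, schema_compat is assumed to be a top-level key, "required field removed" is read as a name dropped from a required array, and the names check_schema_compat_sketch and non_additive_changes are hypothetical.

```python
from typing import Any

Finding = tuple[str, str]  # (code, detail)


def _as_list(value: Any) -> list:
    """Return value if it is a list, else an empty list."""
    return value if isinstance(value, list) else []


def non_additive_changes(base: Any, head: Any, path: str = "$") -> list[str]:
    """Walk two schema nodes in parallel and report the three heuristics."""
    if not (isinstance(base, dict) and isinstance(head, dict)):
        return []
    found: list[str] = []
    # Heuristic 1: a name listed under "required" at base is gone at head.
    for name in _as_list(base.get("required")):
        if name not in _as_list(head.get("required")):
            found.append(f"{path}: required field removed: {name}")
    # Heuristic 2: an enum value present at base is gone at head.
    if isinstance(base.get("enum"), list) and isinstance(head.get("enum"), list):
        for value in base["enum"]:
            if value not in head["enum"]:
                found.append(f"{path}: enum value removed: {value!r}")
    # Heuristic 3: additionalProperties went from true (or unset) to false.
    if (base.get("additionalProperties", True) is not False
            and head.get("additionalProperties", True) is False):
        found.append(f"{path}: additionalProperties tightened to false")
    # Recurse only into sub-objects present on both sides; any other
    # shape of change is unknown to this sketch and passes.
    for key, child in base.items():
        if isinstance(child, dict) and isinstance(head.get(key), dict):
            found.extend(non_additive_changes(child, head[key], f"{path}.{key}"))
    return found


def check_schema_compat_sketch(
    pairs: list[tuple[str, dict, dict]], changelog_modified: bool
) -> list[Finding]:
    """Return findings; an empty list means the gate passes."""
    findings: list[Finding] = []
    for name, base, head in pairs:
        if base.get("schema_compat") != head.get("schema_compat"):
            # Rule 1: a bump needs the changelog in the same diff.
            if not changelog_modified:
                findings.append(
                    ("MISSING_CHANGELOG_ROW", f"{name}: schema_compat changed"))
            continue  # a bump covers any non-additive change
        # Rule 2: no bump, so non-additive shapes are violations.
        for detail in non_additive_changes(base, head):
            findings.append(("NON_ADDITIVE_WITHOUT_BUMP", f"{name} {detail}"))
    return findings
```

A caller would map a non-empty result to exit code 1. Keeping git out of this function is what lets each case be tested with plain dicts.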

Pure-function core ``check_schema_compat_impl(pairs,
changelog_modified)`` is independently unit-testable. Thin git
wrapper layered on top reads each schema at base + head via
``git show <ref>:profile/<name>`` and detects changelog modification
via ``git diff --name-only``.
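The git layer described above might look like the following sketch. The function names read_schema_at and changelog_modified are hypothetical, and returning None for a schema git cannot show is an assumption about error handling; only the two git invocations (git show <ref>:profile/<name> and git diff --name-only) come from the description.

```python
import json
import subprocess
from pathlib import Path
from typing import Optional

CHANGELOG = "profile/schema-changelog.md"


def read_schema_at(repo: Path, ref: str, name: str) -> Optional[dict]:
    """Load profile/<name> as JSON at ref, or None if git cannot show it."""
    result = subprocess.run(
        ["git", "-C", str(repo), "show", f"{ref}:profile/{name}"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Assumption: a missing file or unknown ref is reported as None.
        return None
    return json.loads(result.stdout)


def changelog_modified(repo: Path, base: str, head: str) -> bool:
    """Return True if the changelog path appears in the base..head diff."""
    result = subprocess.run(
        ["git", "-C", str(repo), "diff", "--name-only", base, head],
        capture_output=True,
        text=True,
        check=True,
    )
    return CHANGELOG in result.stdout.splitlines()
```

The wrapper stays thin on purpose: it turns refs into plain dicts and a boolean, which are the only inputs the pure-function core needs.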

### Test design — 14 cases

Pure function (10):

* no schema change → OK
* schema_compat 1→2 + changelog modified → OK (the happy migration
  path)
* schema_compat bumped without changelog → MISSING_CHANGELOG_ROW
* required field removed without bump → NON_ADDITIVE_WITHOUT_BUMP
  (with "license" detail in the message)
* enum value removed without bump → NON_ADDITIVE_WITHOUT_BUMP
  (with "MIT" + "enum" details)
* additionalProperties tightened true→false without bump →
  NON_ADDITIVE_WITHOUT_BUMP
* additive change (new optional field) without bump → OK
* additive enum value addition without bump → OK
* non-additive WITH bump AND changelog → OK
* two schemas, only one bumps, with changelog → OK

CLI (4):

* ephemeral git repo with no change → rc 0
* ephemeral repo, bump without changelog → rc 1
* ephemeral repo, bump WITH changelog → rc 0
* against the real committed main (HEAD vs HEAD) → rc 0

The ephemeral repo cases ``git init`` a tmp_path, commit a baseline
schema, optionally commit a modification, then invoke ``main(["--base",
<sha>, "--head", "HEAD", "--repo", str(tmp_path)])`` to assert end-
to-end behavior including the git driver.

### Make + CI wiring

* make check-schema-compat — invokes the script with defaults
  (origin/main → HEAD).
* .github/workflows/ci.yml: `check` job gets a `Schema-version
  policing` step on push/PR. **Not wired to the weekly cron** — by
  design, schema_compat bumps land via PRs, so a periodic re-run
  adds no signal.
* Critical wiring detail: the `check` job's checkout now uses
  `fetch-depth: 0` so origin/main is actually present in the runner's
  git history (default is depth 1 — `git show origin/main:...` would
  fail). The repo is small (mostly markdown + small Python), so the
  cost is negligible.

### Verification locally

* pytest profile/build/ — 116/116 (102 prior + 14 new)
* make check-schema-compat — clean (no schema bumps in this PR; no
  non-additive changes)
* All other Phase-5 gates (check-freshness / check-links /
  check-licenses) still clean
rafael5 merged commit 8792787 into main May 11, 2026
2 checks passed
rafael5 deleted the phase5-D branch May 11, 2026 15:14
rafael5 added a commit that referenced this pull request May 11, 2026
Captures Phase 5 exit per phase5-plan.md §6 + §10. Mirrors
phase4-evidence.md shape: one section per gate, "what this proves"
roll-up, then each §10 done-criterion cited green.

Verified locally (gate outputs in the evidence doc):

* pytest profile/build/ — 116/116 (51 prior + 65 across Phase 5)
* make check-freshness — clean (worst=OK)
* make check-links — clean (offline; 59 URLs catalogued)
* make check-licenses — clean (worst=SKIP; 9 INVENTORIED + 1 SKIP
  for m-modern-corpus's mixed-per-subdir)
* make check-schema-compat — clean (no bumps in this PR; no non-
  additive changes)
* make handshake — 8/8 steps
* make recipes-check — 4/4 clean
* make validate-catalog — OK
* make check-docs-prose — clean

All seven §10 done-criteria cited green in the evidence doc:

1. check-freshness.py + 15 TDD + make target (PR #32 / e9e00cb)
2. check-links.py + 13 TDD + B0 binary-URL fix in
   validate-repo-meta.py + 5 TDD (PR #33 / 16bbd08)
3. check-licenses.py + 18 TDD + per-license signature dict (PR #34
   / 5d9a995)
4. check-schema-compat.py + 14 TDD + fetch-depth:0 wiring (PR #35
   / 8792787)
5. CI per-PR runs all four --offline variants
6. Weekly cron firing runs --strict freshness + live link-check +
   full LICENSE-fetch
7. This evidence file

Also:

* docs/ai-discoverability/README.md phase table — Phase 5 row
  flipped from "in flight" → "Closed 2026-05-11" with evidence link.
* AI-discoverability-architecture.md "Phase 5 — in flight" section
  rewritten to "Phase 5 — closed 2026-05-11"; notes the operational
  loop is complete and future phases would address growth, not
  enforcement coverage.