docs(wiki): verify Phase 2 pilot on 1000-page sample (96.7% kept — passes ADR-2244 target) by cdeust · Pull Request #32 · cdeust/Cortex

cdeust · 2026-05-13T08:21:20Z

Summary

Re-runs the Phase 2 pilot from #31 with the calibrated registry at a 10× larger sample to verify the calibration generalises beyond the stratified 100-page sample that drove the tuning.

Headline result

| Metric | 100-page (#31) | 1000-page (this PR) |
| --- | --- | --- |
| Admitted | 88.0% | 94.2% |
| Kind kept (after legacy→modern map) | 87.5% | 96.7% ✅ |
| Kind changed | 12.5% | 3.3% |

96.7% kind agreement exceeds the ≥ 90% ADR-2244 §5 Phase 2 acceptance target. The classifier registry is ready for Phase 4 bulk migration.
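The percentages above can be rechecked from the raw counts given in this PR's commit message (942 admitted, 911 kept, 31 changed, out of 1000 sampled). The snippet below is a minimal sketch of that arithmetic, not the pilot script itself; the constant and function names are illustrative, and the 90% threshold is the ADR-2244 §5 target quoted above.

```python
from fractions import Fraction

# Raw counts as reported in the commit message for this PR.
SAMPLE_SIZE = 1000
ADMITTED = 942
KIND_KEPT = 911
KIND_CHANGED = 31

# Acceptance target quoted from ADR-2244 §5 (kind kept >= 90% of admitted pages).
TARGET = Fraction(90, 100)


def rate(part: int, whole: int) -> Fraction:
    """Exact ratio, so the threshold comparison has no float rounding."""
    return Fraction(part, whole)


admitted_rate = rate(ADMITTED, SAMPLE_SIZE)
kept_rate = rate(KIND_KEPT, ADMITTED)        # denominator is admitted pages, not the sample
changed_rate = rate(KIND_CHANGED, ADMITTED)

# Kept and changed must partition the admitted set.
assert KIND_KEPT + KIND_CHANGED == ADMITTED

passes_target = kept_rate >= TARGET

print(f"admitted: {float(admitted_rate):.1%}")
print(f"kind kept: {float(kept_rate):.1%}")
print(f"kind changed: {float(changed_rate):.1%}")
print(f"passes target: {passes_target}")
```

Note that the kept and changed rates use the admitted pages as the denominator, which is why they sum to 100% even though 58 sampled pages were rejected at the admission gate.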

Of the 31 "changed" pages

| Count | Transition | Verdict |
| --- | --- | --- |
| 11 | `specs → explanation` | Specs that aren't pre-decision RFCs; defensible re-bucketing |
| 11 | `guides → explanation` | Guides without "how to" prose; content-dependent, defensible |
| 5 | upgrades to specific kinds (`conventions → how-to`, `notes → runbook`, `notes → how-to`, `lessons → how-to`) | Correct: content pattern beats legacy directory |
| 2 | `adr → explanation` | Last remaining ADR-detection gap. Pages have neither the prose pattern nor the heading skeleton. Worth investigating in a follow-up but well under the 10% acceptance margin. |
| 2 | `README.md`, `architecture` → `explanation` | Unknown legacy dirs |
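As a sanity check on the breakdown above, the rows can be tallied to confirm they account for all 31 changed pages and to see which modern kinds absorb them. This is an illustrative sketch only: the dictionary layout is an assumption, and the per-transition split of the five upgrades (2, 1, 1, 1) is taken from the commit message for this PR.

```python
from collections import Counter

ADMITTED = 942  # admitted pages in the 1000-page run

# (legacy directory, classified kind) -> number of pages.
CHANGED = {
    ("specs", "explanation"): 11,
    ("guides", "explanation"): 11,
    ("conventions", "how-to"): 2,
    ("notes", "runbook"): 1,
    ("notes", "how-to"): 1,
    ("lessons", "how-to"): 1,
    ("adr", "explanation"): 2,
    # The report groups these two unknown legacy dirs into a single row of 2,
    # so they are kept together here rather than split by guesswork.
    ("README.md / architecture", "explanation"): 2,
}

total_changed = sum(CHANGED.values())

# Aggregate by the kind the classifier assigned.
by_new_kind = Counter()
for (_legacy_dir, new_kind), count in CHANGED.items():
    by_new_kind[new_kind] += count

# Share of admitted pages affected by the remaining ADR-detection gap.
adr_gap_share = CHANGED[("adr", "explanation")] / ADMITTED

print(total_changed)
print(dict(by_new_kind))
print(f"ADR gap: {adr_gap_share:.2%} of admitted pages")
```

On these numbers, 26 of the 31 changes land in `explanation`, and the ADR gap is 2 of 942 admitted pages.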

Facet distributions (admitted pages, n=942)

Lifecycle: 899 seedling · 43 proposed (ADRs default to proposed)
Audience: 935 developer · 90 ops · 48 security
Provenance: 809 auto-generated · 133 human

The auto-generated count is dominated by file-doc pages tagged `codebase` / `code-reference`, which are produced by `codebase_analyze`. Human-authored pages are ADRs, RFCs, specs, and lessons.
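One quick consistency check on the facet counts is to compare each facet's total against n=942. The sketch below does only that; the labels "exclusive" and "multi-valued" are this example's own terms, and the reading that a page may carry more than one audience value is an inference from the totals, not something the report states.

```python
ADMITTED = 942

# Facet counts copied from the report above.
FACETS = {
    "lifecycle": {"seedling": 899, "proposed": 43},
    "audience": {"developer": 935, "ops": 90, "security": 48},
    "provenance": {"auto-generated": 809, "human": 133},
}


def describe_facet(counts: dict[str, int], n: int) -> str:
    """Classify a facet by how its value counts relate to the page count.

    'exclusive' only means the totals are consistent with one value per page;
    'multi-valued' means some pages must carry more than one value.
    """
    total = sum(counts.values())
    if total == n:
        return "exclusive"
    if total > n:
        return "multi-valued"
    return "incomplete"


results = {name: describe_facet(counts, ADMITTED) for name, counts in FACETS.items()}

for name, verdict in results.items():
    print(f"{name}: total={sum(FACETS[name].values())} -> {verdict}")
```

Lifecycle and provenance each sum to exactly 942, while the audience counts sum to 1073, so at least some admitted pages must have more than one audience.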

Diff

Only `scripts/wiki-pilot-report.md` changes: the 1000-page run overwrites the 100-page report from #31. The 100-page report remains in git history for audit purposes.

How to reproduce

python scripts/wiki_pilot_migration.py --sample-size 1000

Seed and stratified sampling logic unchanged from #31.
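For readers unfamiliar with the technique, the following is a generic sketch of seeded, proportionally stratified sampling. It is not the logic in `scripts/wiki_pilot_migration.py`, which this PR does not show; the function name, the `(path, stratum)` input shape, the use of legacy directory as the stratum, and the largest-remainder rounding are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(pages, sample_size, seed):
    """Draw a reproducible sample whose strata mirror the population proportions.

    pages: list of (path, stratum) pairs; the stratum is hypothetical here,
    e.g. the legacy directory a page lives in.
    """
    if sample_size > len(pages):
        raise ValueError("sample_size exceeds population")

    by_stratum = defaultdict(list)
    for path, stratum in pages:
        by_stratum[stratum].append(path)

    total = len(pages)
    # Floor of each stratum's proportional share.
    quotas = {s: len(p) * sample_size // total for s, p in by_stratum.items()}
    # Hand out the leftover slots to the largest fractional remainders
    # (ties broken by stratum name so the result is deterministic).
    leftover = sample_size - sum(quotas.values())
    ranked = sorted(
        by_stratum,
        key=lambda s: (-(len(by_stratum[s]) * sample_size % total), s),
    )
    for s in ranked[:leftover]:
        quotas[s] += 1

    rng = random.Random(seed)  # private generator: same seed, same sample
    sample = []
    for s in sorted(by_stratum):
        # Sort paths first so input order does not affect the draw.
        sample.extend(rng.sample(sorted(by_stratum[s]), quotas[s]))
    return sample


# Toy population: 60 notes, 30 ADRs, 10 specs.
pages = (
    [(f"notes/{i}.md", "notes") for i in range(60)]
    + [(f"adr/{i}.md", "adr") for i in range(30)]
    + [(f"specs/{i}.md", "specs") for i in range(10)]
)
first = stratified_sample(pages, 10, seed=42)
second = stratified_sample(pages, 10, seed=42)
```

With a fixed seed and sorted inputs, two runs return the same pages, which is the property that makes a 100-page and a 1000-page run comparable when only `--sample-size` changes.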

Next phase

Phase 3 (stable IDs) or Phase 4 (bulk migration) — whichever you'd like next. The classifier is now empirically validated on real content.

🤖 Generated with Claude Code

Re-runs the pilot from #31 with the calibrated registry at a 10× larger sample to verify the calibration generalises.

Headline
--------
Sample size:  1000 (vs 100 in #31)
Admitted:     942 (94.2%)
Rejected:     58 (5.8%), all admission-gate (template skeletons, audit artefacts), no false negatives observed
Kind kept:    911 (96.7% of admitted)
Kind changed: 31 (3.3% of admitted)

96.7% kept exceeds the ≥ 90% ADR-2244 §5 Phase 2 acceptance target. The classifier registry is ready for Phase 4 bulk migration.

Of the 31 "changed" pages
-------------------------
11 specs → explanation: specs that aren't pre-decision RFCs; defensible re-bucketing
11 guides → explanation: guides without "how to" prose; content-dependent, defensible
5 upgrades to specific kinds where content beat the legacy dir:
    conventions → how-to (2)
    notes → runbook (1)
    notes → how-to (1)
    lessons → how-to (1)
2 adr → explanation: last remaining ADR-detection gap. Pages have neither the prose pattern nor the heading skeleton. Investigation deferred to a follow-up; well under the 10% acceptance margin.
2 README.md, architecture → explanation: unknown legacy dirs

Facet distributions on admitted pages
-------------------------------------
Lifecycle:  899 seedling · 43 proposed (ADR default)
Audience:   935 developer · 90 ops · 48 security
Provenance: 809 auto-generated · 133 human

The auto-generated count is dominated by file-doc pages tagged ``codebase`` / ``code-reference``. Human-authored = ADRs, RFCs, specs, lessons.

Run reproducibly
----------------
python scripts/wiki_pilot_migration.py --sample-size 1000

Seed and sampling logic unchanged from #31.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…n complete) (#41)

Bundles 11 merged PRs (#30-#40) since v3.15.4 closing out the ADR-2244 wiki classification cycle:

Phase 2    #31 #32  pilot migration analyzer + 1000-page verification (96.7% kind-kept, passes target)
Phase 3    #33      stable page IDs (UUID4) + redirect data model + backfill CLI
Phase 3.2  #34      handler-layer redirect mechanics (wiki_read follows transparently, wiki_list/wiki_reindex exclude stubs, new wiki_rename tool)
Phase 4.1  #35 #36  deterministic bulk migration for the 70 known pollution paths (.md.md, timestamp-slug, path-leak)
Phase 4.2  #37      file-doc re-bucket (8734 pages from notes/ to reference/ with modern frontmatter)
Phase 5    #39      filter auto-generated pages from default listings; INDEX.md splits human-authored from auto-gen
Phase 6    #38      producer audit: codebase_analyze output routes to kind=reference (root-causes the 8734-page misroute)
Phase 6.2  #40      producer audit: wiki_seed_codebase emits modern kind tags the classifier reads
Security   #30      authlib CVE-2026-44681 bump (dependabot #4)

Notes for users:
- Wiki on disk not migrated yet. Apply scripts (in scripts/) are dry-run by default. Three commands to fully migrate; each is idempotent and leaves redirect stubs.
- Phases 5/6/6.2 take effect on next MCP restart.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cdeust merged commit 344b2d8 into main May 13, 2026
11 checks passed

cdeust deleted the feat/wiki-pilot-1000-verification branch May 13, 2026 09:24

cdeust mentioned this pull request May 13, 2026

release: v3.16.0 — ADR-2244 Phases 2-6.2 (wiki classification redesign complete) #41

Merged

3 tasks

cdeust merged 1 commit into main from feat/wiki-pilot-1000-verification
