docs(wiki): verify Phase 2 pilot on 1000-page sample (96.7% kept — passes ADR-2244 target) #32

Merged
cdeust merged 1 commit into main from feat/wiki-pilot-1000-verification
May 13, 2026

@cdeust cdeust commented May 13, 2026

Summary

Re-runs the Phase 2 pilot from #31 with the calibrated registry at a 10× larger sample to verify the calibration generalises beyond the stratified 100-page sample that drove the tuning.

Headline result

| Metric | 100-page (#31) | 1000-page (this PR) |
|---|---|---|
| Admitted | 88.0% | 94.2% |
| Kind kept (after legacy→modern map) | 87.5% | 96.7% |
| Kind changed | 12.5% | 3.3% |

96.7% kind agreement exceeds the ≥ 90% ADR-2244 §5 Phase 2 acceptance target. The classifier registry is ready for Phase 4 bulk migration.
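The headline percentages can be re-derived from the raw counts reported in this PR's commit message (942 admitted of 1000 sampled, 911 kind-kept of those admitted). The snippet below is only an arithmetic sketch: the variable names are illustrative and are not taken from `scripts/wiki_pilot_migration.py`.

```python
# Raw counts reported for the 1000-page run in this PR.
SAMPLE_SIZE = 1000
ADMITTED = 942
KIND_KEPT = 911
TARGET = 0.90  # ADR-2244 §5 Phase 2 acceptance target (>= 90% kind kept)


def rate(part: int, whole: int) -> float:
    """Fraction of `whole` represented by `part`."""
    return part / whole


admitted_rate = rate(ADMITTED, SAMPLE_SIZE)          # share of sampled pages admitted
kept_rate = rate(KIND_KEPT, ADMITTED)                # share of admitted pages whose kind was kept
changed_rate = rate(ADMITTED - KIND_KEPT, ADMITTED)  # share of admitted pages whose kind changed
passes = kept_rate >= TARGET

print(f"admitted {admitted_rate:.1%}, kept {kept_rate:.1%}, "
      f"changed {changed_rate:.1%}, passes target: {passes}")
```

Note that the kept and changed rates use admitted pages as the denominator, not the full sample.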

Of the 31 "changed" pages

| # | Transition | Verdict |
|---|---|---|
| 11 | specs → explanation | Specs that aren't pre-decision RFCs; defensible re-bucketing |
| 11 | guides → explanation | Guides without "how to" prose; content-dependent, defensible |
| 5 | upgrades to specific kinds (conventions → how-to ×2, notes → runbook, notes → how-to, lessons → how-to) | Correct: content pattern beats legacy directory |
| 2 | adr → explanation | Last remaining ADR-detection gap. Pages have neither the prose pattern nor the heading skeleton. Worth investigating in a follow-up, but well under the 10% acceptance margin. |
| 2 | README.md, architecture → explanation | Unknown legacy dirs |
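As a cross-check, the per-transition counts can be tallied to confirm they add up to the 31 changed pages. The counts for the five "upgrades" are the ones listed in this PR's commit message; the data structure itself is an illustrative assumption, not the pilot script's actual report format.

```python
from collections import Counter

# (legacy kind, new kind) -> number of pages, as reported in this PR.
changed = Counter({
    ("specs", "explanation"): 11,
    ("guides", "explanation"): 11,
    ("conventions", "how-to"): 2,
    ("notes", "runbook"): 1,
    ("notes", "how-to"): 1,
    ("lessons", "how-to"): 1,
    ("adr", "explanation"): 2,
    # Reported as one combined row; the split between the two is not given.
    ("README.md, architecture", "explanation"): 2,
})

total_changed = sum(changed.values())

# Aggregate by destination kind to see where re-bucketed pages land.
by_target = Counter()
for (_, new_kind), count in changed.items():
    by_target[new_kind] += count

print(total_changed, dict(by_target))
```

Most of the changed pages land in `explanation`, which matches the verdicts above: the classifier falls back to that kind when a page lacks the prose pattern of a more specific one.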

Facet distributions (admitted pages, n=942)

  • Lifecycle: 899 seedling · 43 proposed (ADRs default to proposed)
  • Audience: 935 developer · 90 ops · 48 security
  • Provenance: 809 auto-generated · 133 human

The auto-generated count is dominated by file-doc pages tagged codebase/code-reference, which are produced by codebase_analyze. Human-authored pages are ADRs, RFCs, specs, and lessons.
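A quick sanity check on the facet counts: lifecycle and provenance each sum to the 942 admitted pages, while the audience counts sum to more than that. One possible reading, which is an assumption and not something this PR states, is that a page can carry more than one audience value. The sketch below only does the arithmetic.

```python
ADMITTED = 942

# Facet counts as reported for the admitted pages.
facets = {
    "lifecycle": {"seedling": 899, "proposed": 43},
    "audience": {"developer": 935, "ops": 90, "security": 48},
    "provenance": {"auto-generated": 809, "human": 133},
}

# Sum of assignments per facet, and how far each sum exceeds the page count.
totals = {name: sum(counts.values()) for name, counts in facets.items()}
excess = {name: total - ADMITTED for name, total in totals.items()}

for name in facets:
    print(f"{name}: {totals[name]} assignments, {excess[name]} over n={ADMITTED}")
```

A non-zero excess would be consistent with a multi-valued facet; a zero excess is what a strictly one-value-per-page facet produces.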

Diff

Only scripts/wiki-pilot-report.md changes: the 1000-page run overwrites the 100-page report from #31. The 100-page report remains in git history for audit purposes.

How to reproduce

python scripts/wiki_pilot_migration.py --sample-size 1000

Seed and stratified sampling logic unchanged from #31.
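The sampling code itself is not shown in this PR. For readers unfamiliar with the idea, here is a generic sketch of seeded, proportionally stratified sampling. Everything in it (the function name `stratified_sample`, grouping by legacy directory, the largest-remainder rounding) is a hypothetical illustration and may differ from what `scripts/wiki_pilot_migration.py` actually does.

```python
import random


def stratified_sample(strata: dict[str, list[str]], sample_size: int, seed: int) -> list[str]:
    """Draw a reproducible sample with each stratum represented proportionally.

    `strata` maps a stratum name (e.g. a legacy directory) to its page paths.
    """
    total = sum(len(pages) for pages in strata.values())
    if sample_size > total:
        raise ValueError("sample_size exceeds the number of available pages")

    # Proportional quota per stratum, rounded down first.
    exact = {name: len(pages) * sample_size / total for name, pages in strata.items()}
    quota = {name: int(share) for name, share in exact.items()}

    # Hand the remaining slots to the strata with the largest fractional parts.
    leftover = sample_size - sum(quota.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - quota[name], reverse=True)
    for name in by_remainder[:leftover]:
        quota[name] += 1

    # A dedicated, seeded generator keeps the draw independent of global state.
    rng = random.Random(seed)
    picked: list[str] = []
    for name in sorted(strata):
        picked.extend(rng.sample(sorted(strata[name]), quota[name]))
    return picked


# Toy data: 10 ADR pages and 90 notes pages, sampled at 20%.
pages = {
    "adr": [f"adr/{i}.md" for i in range(10)],
    "notes": [f"notes/{i}.md" for i in range(90)],
}
first = stratified_sample(pages, 20, seed=7)
second = stratified_sample(pages, 20, seed=7)
```

Sorting the strata and their pages before drawing means the same seed yields the same sample regardless of filesystem listing order, which is the property that makes a re-run comparable to an earlier one.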

Next phase

Phase 3 (stable IDs) or Phase 4 (bulk migration) — whichever you'd like next. The classifier is now empirically validated on real content.

🤖 Generated with Claude Code

Re-runs the pilot from #31 with the calibrated registry at a 10×
larger sample to verify the calibration generalises.

Headline
--------

  Sample size:     1000  (vs 100 in #31)
  Admitted:        942 (94.2%)
  Rejected:        58 (5.8%) — all admission-gate (template skeletons,
                              audit artefacts), no false negatives
                              observed
  Kind kept:       911 (96.7% of admitted)
  Kind changed:    31 (3.3% of admitted)

96.7% kept exceeds the ≥ 90% ADR-2244 §5 Phase 2 acceptance target.
The classifier registry is ready for Phase 4 bulk migration.

Of the 31 "changed" pages
-------------------------

  11  specs → explanation     — specs that aren't pre-decision RFCs;
                                defensible re-bucketing
  11  guides → explanation    — guides without "how to" prose;
                                content-dependent, defensible
   5  upgrades to specific kinds where content beat the legacy dir:
        conventions → how-to (2)
        notes → runbook       (1)
        notes → how-to        (1)
        lessons → how-to      (1)
   2  adr → explanation       — last remaining ADR-detection gap.
                                Pages have neither the prose pattern
                                nor the heading skeleton. Investigation
                                deferred to a follow-up; well under
                                the 10% acceptance margin.
   2  README.md, architecture → explanation — unknown legacy dirs

Facet distributions on admitted pages
-------------------------------------

  Lifecycle:   899 seedling · 43 proposed (ADR default)
  Audience:    935 developer · 90 ops · 48 security
  Provenance:  809 auto-generated · 133 human

The auto-generated count is dominated by file-doc pages tagged
``codebase`` / ``code-reference``. Human-authored = ADRs, RFCs,
specs, lessons.

Run reproducibly
----------------

  python scripts/wiki_pilot_migration.py --sample-size 1000

Seed and sampling logic unchanged from #31.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cdeust cdeust merged commit 344b2d8 into main May 13, 2026
11 checks passed
@cdeust cdeust deleted the feat/wiki-pilot-1000-verification branch May 13, 2026 09:24
cdeust added a commit that referenced this pull request May 13, 2026
…n complete) (#41)

Bundles 11 merged PRs (#30-#40) since v3.15.4 closing out the
ADR-2244 wiki classification cycle:

  Phase 2     #31 #32  pilot migration analyzer + 1000-page
                       verification (96.7% kind-kept, passes target)
  Phase 3     #33      stable page IDs (UUID4) + redirect data model
                       + backfill CLI
  Phase 3.2   #34      handler-layer redirect mechanics (wiki_read
                       follows transparently, wiki_list/wiki_reindex
                       exclude stubs, new wiki_rename tool)
  Phase 4.1   #35 #36  deterministic bulk migration for the 70
                       known pollution paths (.md.md, timestamp-slug,
                       path-leak)
  Phase 4.2   #37      file-doc re-bucket (8734 pages from notes/
                       to reference/ with modern frontmatter)
  Phase 5     #39      filter auto-generated pages from default
                       listings; INDEX.md splits human-authored
                       from auto-gen
  Phase 6     #38      producer audit — codebase_analyze output
                       routes to kind=reference (root-causes the
                       8734-page misroute)
  Phase 6.2   #40      producer audit — wiki_seed_codebase emits
                       modern kind tags the classifier reads
  Security    #30      authlib CVE-2026-44681 bump (dependabot #4)

Notes for users:
  - Wiki on disk not migrated yet. Apply scripts (in scripts/) are
    dry-run by default. Three commands to fully migrate; each is
    idempotent and leaves redirect stubs.
  - Phases 5/6/6.2 take effect on next MCP restart.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>