From ae270e18cd5c4e469fee9541a553d7e3d06526d8 Mon Sep 17 00:00:00 2001
From: Sutu Sebastian <sebiitv@gmail.com>
Date: Sun, 17 May 2026 22:37:30 +0300
Subject: [PATCH 1/3] docs(plans): perf-triangulation rollout sequence
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Captures the post-merge sequencing for PRs #95 + #96 — phases 0
(changeset + PR ready), 1 (merge + CI variance characterisation), 2
(PRAGMA + INSERT tuning trio), 3 (parse-insert pipeline + AST cache
plans), 4 (trigger-gated deferrals), 5 (audit-doc closure).

Also recaps the two methodology lessons from the triangulation audit's
Methodology gaps section so they outlive the audit when it closes.

Plan retires at Phase 5; tracks rollout, not implementation.
---
 docs/plans/perf-triangulation-rollout.md | 105 +++++++++++++++++++++++
 1 file changed, 105 insertions(+)
 create mode 100644 docs/plans/perf-triangulation-rollout.md

diff --git a/docs/plans/perf-triangulation-rollout.md b/docs/plans/perf-triangulation-rollout.md
new file mode 100644
index 0000000..6126dcc
--- /dev/null
+++ b/docs/plans/perf-triangulation-rollout.md
@@ -0,0 +1,105 @@
+# Perf-triangulation rollout
+
+Post-merge sequencing for the perf-architecture triangulation work in PRs #95 (audit docs) + #96 (Tier 1–5 code). Each phase below is independently shippable; phases are ordered by risk-adjusted ROI. Closes when Phase 5 (audit-doc closure) runs.
+
+**Provenance:** [`docs/audits/2026-05-17-performance-architecture-triangulation.md`](../audits/2026-05-17-performance-architecture-triangulation.md) — synthesis of 5 independent perf/architecture audits. This plan is the execution rollout for that audit; it doesn't add new findings.
+
+---
+
+## Phase 0 — pre-merge housekeeping
+
+- [ ] **Changeset** for PR #96 covering (a) the perf-wall improvements, (b) `IndexPerformanceReport` new fields (`bindings_ms`, `module_cycles_ms`, `re_export_chains_ms`), (c) `queryRows` `PRAGMA query_only = 1` hardening (the only user-visible behavior change). Format per the [`project-recipes-cli-pre-bootstrap`](https://github.com/stainless-code/codemap/blob/main/CHANGELOG.md) precedent.
+- [ ] **PR #95** ready for review.
+- [ ] **PR #96** ready for review.
+
+Recommend reviewers go commit-by-commit (Tier 1 → 5 → 5.1) since later tiers depend on earlier instrumentation.
+
+## Phase 1 — merge + CI variance characterisation
+
+- [ ] Merge PR #95 (audit docs, zero functionality change).
+- [ ] Merge PR #96 (Tier 1–5 perf code).
+- [ ] **Watch the `📈 Perf baseline (self-index)` CI job** for the first ~10 runs on `main` post-merge. Job is currently `continue-on-error: true`.
+- [ ] If GitHub-runner variance stays under 25% of baseline → promote to hard gate in a one-line CI PR (drop `continue-on-error: true`). If runners are noisier, bump `CODEMAP_PERF_REGRESSION_PCT` to 35-40% first.
+- [ ] If anyone lands an unrelated perf-affecting change between #96 merge and the gate flip, refresh `fixtures/benchmark/perf-baseline.json` via `bun run check:perf-baseline:update`.
+
+## Phase 2 — quick PRAGMA + INSERT tuning follow-up
+
+Single focused PR titled `perf: full-rebuild PRAGMA + INSERT tuning`. Three commits, each measured against the perf-baseline before and after. Expected total impact: ~200-500 ms on a 2k-file corpus.
+
+- [ ] `feat(perf): PRAGMA journal_mode = OFF during full-rebuild bulk-insert window` — SQLite docs estimate 20-30% on bulk INSERT throughput. Crash-recovery story for full rebuild is already "rerun `--full`" (rebuild starts with `dropAll` + `createTables`); WAL contributes nothing during that window.
+- [ ] `feat(perf): INSERT (not INSERT OR REPLACE) on full-rebuild bulk paths` — table is empty after `dropAll`, conflicts impossible, SQLite skips the conflict-check codepath. ~5-15%. Incremental paths keep `INSERT OR REPLACE` (re-indexing a changed file must overwrite).
+- [ ] `feat(perf): per-table BATCH_SIZE = min(500, floor(999/col_count))` — wide tables (20-col `symbols`) hit `SQLITE_MAX_VARIABLE_NUMBER=999` long before 500 rows; narrow tables (4-col `bindings`) under-batch. Per-table optimum.
+
+Run benchmark on both this repo and a real-world ~2k-file external corpus per commit. Refresh `perf-baseline.json` after the last commit.
+
+## Phase 3 — plan PRs before code (architectural changes)
+
+Each is a **docs-only PR** that lands a `docs/plans/<name>.md` for design review. No implementation until plan is approved. These are real architectural changes — a wrong design here means slow rollback.
+
+### Phase 3a — `docs/plans/parse-insert-pipeline.md`
+
+Overlap parse and insert phases. Today: workers finish all parsing → main thread sorts → main thread inserts. Pipelining: as worker chunks complete, main starts inserting that chunk while later workers parse. Theoretical save: ~300-500 ms on big trees.
+
+Open questions to settle in the plan:
+
+- Approximate vs exact sort order — B-tree locality (`architecture.md` § Sorted inserts) requires monotonic insert order, not strictly sorted. Pipelining means insert starts before final sort completes; does monotonic-within-chunk suffice?
+- One transaction per chunk vs one big transaction wrapping all inserts. Trade-off: per-chunk = better error isolation; single = lower commit overhead.
+- Error-isolation per chunk (a parse failure in one chunk shouldn't roll back inserts of earlier chunks).
+- `IndexPerformanceReport.insert_ms` semantics when phases overlap — does it report wall time or sum of insert work?
+- Worker → main IPC encoding (audit Tier 5.2 hypothesis: CBOR / transferables). Likely pre-requisite: ship a `parse_ms_pure_worker` timer split first so the IPC fraction is measurable.
+
+### Phase 3b — `docs/plans/ast-cache.md`
+
+The biggest unshipped horizontal-scaling primitive. Hash-keyed cache of `ParsedFile` rows → skip re-parse on hash hit. Massive win for big trees with low churn (typical CI watch loop, monorepo `git pull` that doesn't touch most files).
+
+Open questions:
+
+- Cache substrate: `.codemap/parse-cache/<hash>.json` (file per row) vs new `parsed_cache` table in `index.db` (rolled into the main file) vs sidecar `index-cache.db`.
+- Invalidation keys: `content_hash` is necessary but not sufficient. Also: `SCHEMA_VERSION`, parser version (oxc release), adapter version, `fts5_enabled` toggle, any extractor-affecting config.
+- Cache size management: LRU vs manual prune (`codemap cache prune`) vs unbounded. Disk-space tradeoff estimate: ~100-500 bytes per row × millions of rows on big trees.
+- Cache hit semantics in full rebuild vs incremental: full rebuild today is "drop + recreate"; with a cache, full becomes "drop + recreate from cache where possible". Need explicit `--no-cache` for cache invalidation cases.
+- Cold-start cost on first run (no cache yet) — no regression vs today.
+- Cache poisoning: what if a cached row is corrupt / wrong? Detection + recovery.
+- Interaction with the worker pool: workers consult cache before parsing; main reconciles.
+
+## Phase 4 — trigger-gated deferrals (not on the critical path)
+
+These ship when their respective trigger fires, not proactively. The triangulation audit documents each trigger inline. **Audit closure (Phase 5) folds these into a single `roadmap.md` line so this plan can retire.**
+
+- **Tier 5.2 IPC encoding** — fires after Phase 3a's `parse_ms_pure_worker` split shows IPC > ~30% of `parse_ms`.
+- **Tier 5.4 markers lineMap reuse** — fires only if marker extraction becomes hot on >10k-file trees.
+- **Tier 5.6 group-by bucketizer cache** — fires when a serve-mode user reports slow repeated `--group-by owner|package`.
+- **Tier 5.7 git subprocess collapse** — fires if git-subprocess time becomes measurable in incremental wall (Tier 2.3 mostly killed it).
+- **Tier 6.1 RO connection pool** — fires when `mcp` / `serve` indexing 10k+ trees reports contention. Pool is scoped to long-running transports only, NOT one-shot CLI (per audit reconciliation).
+- **Tier 6.2 CI dep vendoring** — fires after timing the existing CI install steps and confirming meaningful savings (currently unmeasured).
+
+## Phase 5 — close the triangulation audit
+
+Per [`docs-governance` § Closing an audit](../../.agents/skills/docs-governance/SKILL.md#closing-an-audit), the triangulation audit closes when Tier 1 + Tier 2 ship (done at Phase 1) AND surviving deferred items either ship or lift to `roadmap.md`.
+
+- [ ] Add one `roadmap.md` backlog line consolidating Phase 4 deferrals: `**Perf-triangulation deferrals (trigger-gated)** — Tier 5.2/5.4/5.6/5.7/6.1/6.2 from audit 2026-05-17; each ships when its trigger fires (see audit doc for triggers).`
+- [ ] Apply the re-derivable test from the closing-audit substrate variants: does a fresh audit today re-derive every finding from current code? If yes AND no source cites the audit doc by name → **delete** (no tombstones). If the methodology section (instrument before estimating) is unique policy → slim to that + add `Status: Closed` header.
+
+## Methodology learnings (already in audit doc, recapped here)
+
+The triangulation audit's `Methodology gaps` section records two lessons that should outlive the audit itself:
+
+1. **Audit cost models should be falsifiable, not estimated.** The Tier 5.1 deferral predicted a `Map.get` hoist win that didn't materialise (~38 ms estimate; measurement showed 0 to slight regression). The actual win came from a PRAGMA-window analysis that wasn't in any of the 5 source audits. Future audits should ship `performance.now()` instrumentation around suspected hot spots BEFORE recommending refactors, gated by an env var (e.g. `CODEMAP_<phase>_PROFILE=1`).
+2. **Profile reveals where time goes; estimates reveal where authors think time goes.** All 5 source audits assumed `bindings_ms` was dominated by `resolveBindings` (the loop). Profiling showed it's dominated by `persistBindings.insert` (the SQL INSERT). One profile-instrumented run would have caught this.
+
+**Optional follow-up:** lift the two lessons into a Tier-2 rule (e.g. `.agents/rules/perf-audits-must-be-falsifiable.md`) or extend the [`audit-pr-architecture`](../../.agents/skills/audit-pr-architecture/SKILL.md) skill so future audits inherit this discipline automatically. Out of scope for this plan; mention only.
+
+## What NOT to do
+
+- **Don't try Phase 3a / 3b items without their plan PR first.** They're each architectural problems worth their own grilling.
+- **Don't pursue multi-DB / sharding.** Measurement-grounded conclusion: SQLite single-writer is the moat; horizontal scaling within that constraint is caching (Phase 3b), not parallelism. See audit doc § Reconciled disagreements + the Tier 5.1 empirical update.
+- **Don't add more CPU workers beyond the existing default + env override.** Parse phase is sub-second on most corpora; diminishing returns; the [`docs/audits/2026-05-17-performance-architecture-triangulation.md`](../audits/2026-05-17-performance-architecture-triangulation.md) Tier 3.3 env knob covers the legitimate override case.
+- **Don't retry Tier 5.1's predicted hoist (TypedArrays, no-imports fast-path) without profiling first.** The previous attempt was a dud; the audit's cost model was wrong.
+
+## References
+
+- [Triangulation audit](../audits/2026-05-17-performance-architecture-triangulation.md) — full design context, consensus matrix, per-tier rationale, methodology gaps.
+- [`docs/benchmark.md` § Perf baseline](../benchmark.md#perf-baseline-regression-guardrail) — the regression-guard mechanism Phase 1's CI variance check uses.
+- [`docs-governance` § Closing an audit](../../.agents/skills/docs-governance/SKILL.md#closing-an-audit) — Phase 5 substrate variants.
+- [`tracer-bullets`](../../.agents/rules/tracer-bullets.md) — why each phase is its own focused PR rather than a megaproject.
+- [`verify-after-each-step`](../../.agents/rules/verify-after-each-step.md) — why each commit in Phase 2 gets its own baseline run.

From 838ad5f3ce5c10cded2e79312401777113b2becc Mon Sep 17 00:00:00 2001
From: Sutu Sebastian <sebiitv@gmail.com>
Date: Mon, 18 May 2026 09:35:39 +0300
Subject: [PATCH 2/3] docs(plans): mark Phase 0 + Phase 1 complete after
 #95/#96/#99/#100

---
 docs/plans/perf-triangulation-rollout.md | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/docs/plans/perf-triangulation-rollout.md b/docs/plans/perf-triangulation-rollout.md
index 6126dcc..487f2e6 100644
--- a/docs/plans/perf-triangulation-rollout.md
+++ b/docs/plans/perf-triangulation-rollout.md
@@ -8,19 +8,17 @@ Post-merge sequencing for the perf-architecture triangulation work in PRs #95 (a
 
 ## Phase 0 — pre-merge housekeeping
 
-- [ ] **Changeset** for PR #96 covering (a) the perf-wall improvements, (b) `IndexPerformanceReport` new fields (`bindings_ms`, `module_cycles_ms`, `re_export_chains_ms`), (c) `queryRows` `PRAGMA query_only = 1` hardening (the only user-visible behavior change). Format per the [`project-recipes-cli-pre-bootstrap`](https://github.com/stainless-code/codemap/blob/main/CHANGELOG.md) precedent.
-- [ ] **PR #95** ready for review.
-- [ ] **PR #96** ready for review.
-
-Recommend reviewers go commit-by-commit (Tier 1 → 5 → 5.1) since later tiers depend on earlier instrumentation.
+- [x] **Changeset** for PR #96 covering (a) the perf-wall improvements, (b) `IndexPerformanceReport` new fields (`bindings_ms`, `module_cycles_ms`, `re_export_chains_ms`), (c) `queryRows` `PRAGMA query_only = 1` hardening (the only user-visible behavior change). Shipped in [#96](https://github.com/stainless-code/codemap/pull/96).
+- [x] **PR #95** ready for review → merged.
+- [x] **PR #96** ready for review → merged.
 
 ## Phase 1 — merge + CI variance characterisation
 
-- [ ] Merge PR #95 (audit docs, zero functionality change).
-- [ ] Merge PR #96 (Tier 1–5 perf code).
-- [ ] **Watch the `📈 Perf baseline (self-index)` CI job** for the first ~10 runs on `main` post-merge. Job is currently `continue-on-error: true`.
-- [ ] If GitHub-runner variance stays under 25% of baseline → promote to hard gate in a one-line CI PR (drop `continue-on-error: true`). If runners are noisier, bump `CODEMAP_PERF_REGRESSION_PCT` to 35-40% first.
-- [ ] If anyone lands an unrelated perf-affecting change between #96 merge and the gate flip, refresh `fixtures/benchmark/perf-baseline.json` via `bun run check:perf-baseline:update`.
+- [x] Merge [PR #95](https://github.com/stainless-code/codemap/pull/95) (audit docs, zero functionality change).
+- [x] Merge [PR #96](https://github.com/stainless-code/codemap/pull/96) (Tier 1–5 perf code).
+- [x] **CI variance characterised on first run** — within-run variance ~5% (882/838/866 ms total across 3 internal samples), well inside the 25% threshold. Issue was systematic CI-slowness (2-4× vs local), not noise.
+- [x] **Re-baselined from CI runner medians** in [#99](https://github.com/stainless-code/codemap/pull/99) (`ebb862e`) — local was the wrong baseline source; GitHub's Ubuntu runners are systematically slower on parse + insert phases.
+- [x] **Promoted `📈 Perf baseline (self-index)` to hard gate** in [#100](https://github.com/stainless-code/codemap/pull/100) (`c0cdf0e`) — dropped `continue-on-error: true`, added to `ci-complete`'s required-jobs list, and added to branch protection's `required_status_checks.contexts` via `gh api`.
 
 ## Phase 2 — quick PRAGMA + INSERT tuning follow-up
 

From 2e51e7f7dbba5a1e565a3e38ca889d32bc3bfd1c Mon Sep 17 00:00:00 2001
From: Sutu Sebastian <sebiitv@gmail.com>
Date: Mon, 18 May 2026 09:56:10 +0300
Subject: [PATCH 3/3] docs(plans): mark Phase 5 (audit closure) done after #102

---
 docs/plans/perf-triangulation-rollout.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/plans/perf-triangulation-rollout.md b/docs/plans/perf-triangulation-rollout.md
index 487f2e6..8f2caf4 100644
--- a/docs/plans/perf-triangulation-rollout.md
+++ b/docs/plans/perf-triangulation-rollout.md
@@ -75,8 +75,8 @@ These ship when their respective trigger fires, not proactively. The triangulati
 
 Per [`docs-governance` § Closing an audit](../../.agents/skills/docs-governance/SKILL.md#closing-an-audit), the triangulation audit closes when Tier 1 + Tier 2 ship (done at Phase 1) AND surviving deferred items either ship or lift to `roadmap.md`.
 
-- [ ] Add one `roadmap.md` backlog line consolidating Phase 4 deferrals: `**Perf-triangulation deferrals (trigger-gated)** — Tier 5.2/5.4/5.6/5.7/6.1/6.2 from audit 2026-05-17; each ships when its trigger fires (see audit doc for triggers).`
-- [ ] Apply the re-derivable test from the closing-audit substrate variants: does a fresh audit today re-derive every finding from current code? If yes AND no source cites the audit doc by name → **delete** (no tombstones). If the methodology section (instrument before estimating) is unique policy → slim to that + add `Status: Closed` header.
+- [x] Added one `roadmap.md` backlog line consolidating Phase 4 deferrals: `**Perf-triangulation deferrals (trigger-gated)** — Tier 5.2/5.4/5.6/5.7/6.1/6.2 from audit 2026-05-17`; each item has explicit trigger conditions inline. Shipped in [#102](https://github.com/stainless-code/codemap/pull/102) (`76174dd`).
+- [x] Applied the re-derivable test → **slim + keep with `Status: Closed`**. Audit doc has unique synthesis (consensus matrix can't be reconstructed from the source audits without re-reading all five), decisions of record (Tier 5.1 empirical update saves the next agent from re-running the dead-end Map.get hoist), and unique methodology policy (profile-before-estimate; all 5 audits mis-located the bindings bottleneck). Verified via codemap (`SELECT … FROM symbols WHERE doc_comment LIKE '%performance-architecture-triangulation%'` → 0 rows) that no source file has a JSDoc anchor on the audit, so slimming was anchor-safe. Per-model source audits untouched as historical snapshots.
 
 ## Methodology learnings (already in audit doc, recapped here)