Batch 4 (phase-1) + v2 prompt overlay + COMPILER_STRATEGY#49
Merged
…e maggo (batch 4 opens; first teaching content)
After 8 phases of narrative framing (a-h: "Thus I have heard… the Buddha
addressed the monks… the monks replied… the Buddha said this"), phase-1
opens the satipaṭṭhāna teaching proper with its famous declaratory claim:
'This, monks, is the direct path.' 4 words / 8 segments.
Centerpiece: 'ekāyano' is one of the most-debated words in the entire
Pāli canon. ek + āyana (from root i, to go) can read as 'one-going /
direct' (Sujato), 'going to one / convergent', 'going alone / solitary'
(Bhikkhu Bodhi), 'one and only' (doctrinally controversial), or 'unified'.
The packet ALREADY had 5 senses encoding this debate; this commit grounds
each sense with curatorial basis + per-sense notes citing the tradition.
Changes:
- p1 Ekāyano: isAnchor=true. morph nom/sg/m on p1s3. DROPPED "Way TO"
type=ownership relation (didn't fit case-quirk palette per the
arrow-earning rule we just ratified in 9830ef1). 5 senses all
curatorial with per-sense notes: direct (Sujato, high) / one-way
(older, medium) / solitary (Bodhi, medium) / convergent (interpretive,
low) / only (controversial, low). Plain-first explanations of the
ek- and āyan- elements + the translator-debate framing.
- p2 ayaṁ: color-explanation facet + cross-references to etad (phase-h)
and te (phase-g) — same demonstrative system, different cases. morph
nom/sg/m. DROPPED "This IS" type=direction relation (universal grammar,
no case-quirk earns the arrow). 1 sense lexical + dpd:8757.
- p3 Bhikkhave: REFRAIN HIT #5 — refrain-explanation facet on p3s2
references all prior bhikkhu appearances. morph voc/pl/m. 5 senses:
2 lexical (Mendicants/Monks + dpd:49868), 2 etymological (Sharers
from bhaj-share, Seekers from bhī-kkh danger-seer), 1 curatorial
(Friends — Thanissaro-style relational rendering).
- p4 maggo: plain-first rewrite. morph nom/sg/m. 3 senses lexical:
path/road (dpd:50495) + method (dpd:50496 — the abstract sense that
frames satipaṭṭhāna as METHOD, not just road).
- 4 new packet.citations: dpd:8757 ayaṁ, dpd:49868 bhikkhave,
dpd:50495 magga-road, dpd:50496 magga-method. Total 28 → 32.
Schema tension #12 (arrow-earning rule): STABLE post-codification. Phase-1's
two pre-existing relations both failed the rule and were dropped.
Schema tension #1 (DPD stripper): STAYS RESOLVED. Ekāyano is the 5th
Lookup-gap surface across batch 3+4 (Bhikkhavo, Bhadante, etad, avoca,
Ekāyano). Pattern: certain inflected/compound forms fall outside DPD's
enumeration even when the lemma is attested. Defer upstream action;
revisit after phase-2/3 to see if morphology-generator fallback is
warranted.
NEW PATTERN observed: translator-debate as first-class curation. When a
word has multiple legitimate scholarly readings (Ekāyano: 5 readings),
surface them as distinct senses with 'curatorial' basis, per-sense notes
citing the tradition, and confidence ranking. Reader cycles through the
debate rather than receiving an authorial verdict. Proposed §3.4.2
amendment for next docs commit cycle.
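The translator-debate pattern above might be represented roughly as follows. This is a minimal sketch, not the packet's actual schema: the `Sense` interface and the `rankSenses` helper are hypothetical names introduced for illustration, and the field names follow the v2 overlay's description of sense metadata. The five glosses, their confidence levels, and the short notes are taken from the phase-1 change list.

```typescript
// Hypothetical shapes for illustration; the real packet schema may differ.
type Confidence = "high" | "medium" | "low";

interface Sense {
  gloss: string;
  epistemicBasis: "curatorial" | "lexical" | "etymological";
  confidence: Confidence;
  notes: string; // short pointer to the tradition behind the reading
}

// The five readings of ekāyano, deliberately listed out of order.
const ekayanoSenses: Sense[] = [
  { gloss: "only", epistemicBasis: "curatorial", confidence: "low", notes: "controversial" },
  { gloss: "direct", epistemicBasis: "curatorial", confidence: "high", notes: "Sujato" },
  { gloss: "convergent", epistemicBasis: "curatorial", confidence: "low", notes: "interpretive" },
  { gloss: "one-way", epistemicBasis: "curatorial", confidence: "medium", notes: "older" },
  { gloss: "solitary", epistemicBasis: "curatorial", confidence: "medium", notes: "Bodhi" },
];

const CONFIDENCE_RANK: Record<Confidence, number> = { high: 0, medium: 1, low: 2 };

// Order the cycle so the reader meets the highest-confidence reading first.
// The input array is copied, so the curated order in the packet is untouched.
function rankSenses(senses: Sense[]): Sense[] {
  return [...senses].sort(
    (a, b) => CONFIDENCE_RANK[a.confidence] - CONFIDENCE_RANK[b.confidence],
  );
}

console.log(rankSenses(ekayanoSenses).map((s) => `${s.gloss} (${s.confidence})`));
```

Ranking by confidence only orders the cycle; every reading stays available, which matches the intent that the reader cycles through the debate rather than receiving a verdict.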
Refrain status — fully mature:
- bhikkhu: 5/9 phases (e/f/g/h/1)
- bhagavā: 4/9 phases across 3 forms (b/e/g/h)
- viharati: 1/9 (expected to recur in phase-2's satipaṭṭhāna formula)
Curation log: docs/sutta-studio/curation/phase-1.md (§0-§8). Includes
proposed §3.4.2 translator-debate cycle rule.
Tests: worktree sandbox restricted test execution this commit (env issue,
not packet-related). JSON validates; structural assertions confirm
integrity (isAnchor, morph hints, citations, basis distribution).
Batch 4 opens. 9/51 phases curated (a-h + phase-1).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…on learnings as compiler-prompt overlay
Adds config/suttaStudioPromptContextV2.ts — 6 amendment blocks codifying
protocol learnings from MN10 hand-curation (batches 1-4):
1. TOOLTIP REGISTER — strengthens v1's "JARGON-WITH-EXPLANATION" to the
full §3.4 pay-rent rule. Drop bracketed grammar prefixes, √ symbols
without prose, and emoji defaults; keep technical terms only when
they pay rent (precision required + glossed inline).
2. ARROW-EARNING RULE — refines v1 relations guidance with the rule
ratified in FEATURES.md §1.3: relations earn their arrow when Pāli's
case-marker does work English doesn't have an analog for. NOT for
subject-of-active-verb, direct-object-of-verb, or demonstrative
agreement. Includes earned/not-earned examples from curated phases.
3. SENSE METADATA — new fields v1 doesn't mention: epistemicBasis
(lexical/grammatical/curatorial/etymological/commentarial/contextual/
comparative), sourceCitationIds (DPD wiring), confidence
(high/medium/low), notes (translator tradition references).
4. ANCHOR SELECTION — exactly one isAnchor per phase, semantic
centerpiece. Heuristics for verb-anchor / contested-word-anchor /
proper-noun-anchor / framing-anchor (from phase-a/c/d/e/g/h/1).
5. TRANSLATOR-DEBATE AWARENESS — for famously-contested words
(ekāyano, ātāpī, sampajāno, etc.), generate multiple senses
representing distinct scholarly readings, each with curatorial
epistemicBasis + per-tradition notes + confidence ranking.
Worked example from phase-1's Ekāyano 5-sense cycle.
6. CROSS-PHASE AWARENESS — when phase-state envelope provides prior-
phase context, recurring lemmas should get cross-reference facets
(≤4 phases back). Three pattern categories: same-lemma-new-form,
same-lemma-new-role, parallel-structures.
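The "exactly one isAnchor per phase" constraint in amendment 4 lends itself to a small structural check. The sketch below assumes a simplified phase shape: the `Word` and `Phase` interfaces and the `anchorErrors` function are hypothetical, not the types used in demoPacket.json. The word list mirrors phase-1 as described above.

```typescript
// Hypothetical, simplified shapes; the real packet carries many more fields.
interface Word {
  id: string;
  surface: string;
  isAnchor?: boolean;
}

interface Phase {
  id: string;
  words: Word[];
}

// Returns an empty array when the phase has exactly one anchor,
// otherwise a single human-readable error message.
function anchorErrors(phase: Phase): string[] {
  const anchorCount = phase.words.filter((w) => w.isAnchor === true).length;
  if (anchorCount === 1) return [];
  return [`${phase.id}: expected exactly one isAnchor word, found ${anchorCount}`];
}

const phase1: Phase = {
  id: "phase-1",
  words: [
    { id: "p1", surface: "Ekāyano", isAnchor: true },
    { id: "p2", surface: "ayaṁ" },
    { id: "p3", surface: "Bhikkhave" },
    { id: "p4", surface: "maggo" },
  ],
};

console.log(anchorErrors(phase1));
```

A check of this kind could run over compiler output as well as hand-curated phases, since the rule is the same in both cases.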
These amendments are NOT yet wired into the compiler — this commit ships
the overlay as a standalone module that can be imported by buildPhasePrompt
(and the relevant Anatomist/Lexico passes) behind a feature flag or
unconditionally in a future refactor.
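One way the "behind a feature flag" option could look is sketched below. Everything here is an assumption for illustration: `buildPhasePromptSketch`, `PromptOptions`, `useV2Overlay`, and the placeholder constants are hypothetical stand-ins, not the real buildPhasePrompt signature or the real contents of config/suttaStudioPromptContextV2.ts. Only the six block titles come from this commit message.

```typescript
// Placeholder for the existing v1 prompt context (contents not shown here).
const V1_CONTEXT = "[v1 prompt context]";

// Titles of the six v2 amendment blocks; the real module holds full text.
const V2_AMENDMENT_TITLES: string[] = [
  "TOOLTIP REGISTER",
  "ARROW-EARNING RULE",
  "SENSE METADATA",
  "ANCHOR SELECTION",
  "TRANSLATOR-DEBATE AWARENESS",
  "CROSS-PHASE AWARENESS",
];

interface PromptOptions {
  useV2Overlay?: boolean; // hypothetical flag name
}

// Assemble the prompt: v1 context, then (optionally) the v2 amendments,
// then the phase text. With the flag off, output matches v1-only behavior.
function buildPhasePromptSketch(phaseText: string, opts: PromptOptions = {}): string {
  const parts: string[] = [V1_CONTEXT];
  if (opts.useV2Overlay) {
    parts.push(...V2_AMENDMENT_TITLES.map((title) => `## ${title}`));
  }
  parts.push(phaseText);
  return parts.join("\n\n");
}

console.log(buildPhasePromptSketch("Ekāyano ayaṁ, bhikkhave, maggo", { useV2Overlay: true }));
```

Defaulting the flag to off keeps existing compiles unchanged until the overlay has been tested on phase-2.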
Companion to:
- docs/sutta-studio/CURATION_PROTOCOL.md §3.4 + §9.1 + §3.4.1
- docs/sutta-studio/FEATURES.md §1.3 arrow-earning rule
- 9 hand-curated phases in components/sutta-studio/demoPacket.json
+ curation/phase-{a,b,c,d,e,f,g,h,1}.md
Next step: wire v2 into prompts.ts and re-run compiler on phase-2 to
test. The companion analysis "what v2 would change about phase-2"
lives in this commit's PR description.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the strategic-economic analysis that emerged from MN10 batches
1-4 hand-curation. The conversation surfaced ~14 distinct insights about
pipeline vs hand-curation economics, cost telemetry, scaling, and the
strategic pivot — none of which lived in the codebase until this commit.
docs/sutta-studio/COMPILER_STRATEGY.md (289 lines, new):
§1 The economic shape — quality bands (35% v1 / 65% v2 / 85% +post-passes
/ 100% hand), Pareto distribution (10-15 phases pedagogically critical,
35-40 are routine recurrences), per-compile cost estimates ($0.10-0.30
Gemini Flash / $1-3 Sonnet / $3-10 Opus per MN10).
§2 What the pipeline does today vs what it could do — 11-row matrix
classifying each hand-curation move as: learnable by prompt /
deterministic post-processable / irreducibly human.
§3 What's irreducibly human — translator-tradition citations,
pedagogical taste, curation-log narrative. ~5-8 phases per sutta
fall in this bucket.
§4 Cost telemetry — surprise discovery that services/apiMetricsService.ts
already records every API call with tokens+cost+apiType=sutta_studio
to IndexedDB. Missing: phaseId attribution, UI, prompt caching,
local-vs-LLM split beyond DPD. 3-step plan to close gaps. ccusage
for Claude-Code-side conversation cost.
§5 Scaling roadmap — 5 stages: hand-curate MN10 exemplar (in progress)
→ wire v2 overlay → build 4 deterministic post-passes → run on DN22
with selective polish (~5-6 hr vs ~30 hr from scratch) → satipaṭṭhāna
sub-corpus → cross-pattern (~20-30 patterns covering most of the canon).
§6 Open questions — translator-tradition DB, DPD Lookup-gap pattern
resolution, prompt-caching tradeoff, when to wire v2, pedagogical-
fidelity floor for routine phases.
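The missing phaseId attribution noted in §4 could be closed along these lines. This is a sketch under stated assumptions: the `ApiCallRecord` shape, the `phaseId` and `costUsd` field names, and the `costByPhase` function are hypothetical, and the real records in services/apiMetricsService.ts may look different. The sample numbers are invented purely to exercise the function and are not measured costs.

```typescript
// Hypothetical record shape; the real service may use other field names.
interface ApiCallRecord {
  apiType: string;
  phaseId?: string; // the attribution field §4 reports as missing today
  costUsd: number;
}

// Sum cost per phase for one apiType. Records without a phaseId are
// grouped under "(unattributed)" so the gap stays visible.
function costByPhase(
  records: ApiCallRecord[],
  apiType: string = "sutta_studio",
): Map<string, number> {
  const totals = new Map<string, number>();
  for (const record of records) {
    if (record.apiType !== apiType) continue;
    const key = record.phaseId ?? "(unattributed)";
    totals.set(key, (totals.get(key) ?? 0) + record.costUsd);
  }
  return totals;
}

// Invented sample data, for illustration only.
const sampleRecords: ApiCallRecord[] = [
  { apiType: "sutta_studio", phaseId: "phase-1", costUsd: 0.25 },
  { apiType: "sutta_studio", phaseId: "phase-1", costUsd: 0.5 },
  { apiType: "sutta_studio", costUsd: 0.125 },
  { apiType: "other", phaseId: "phase-1", costUsd: 1 },
];

console.log(Object.fromEntries(costByPhase(sampleRecords)));
```

Keeping unattributed calls in their own bucket makes it easy to see how much spend still lacks a phaseId as the attribution work proceeds.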
docs/HANDOVER.md (180 lines, replaces prior 2026-05-12 handover):
- Full 17-commit inventory across 3 branches (PR #47 from prior session,
PR #48 batch-3 from today, batch-4 branch from today)
- DPD root-cause fix details (coverage 86.9% → 89.5%, 458 sqlite-lookup
vs 20 heuristic-fallback vs 56 unmatched, better-sqlite3 dep,
one-time 168 MB download)
- Schema tensions status (#1 RESOLVED at root, #7 RESOLVED prior,
#12 RESOLVED via documentation, Lookup-gap as new observation)
- 3 protocol amendments codified (§9.1, §3.4.1, FEATURES §1.3)
- 5 phase logs added (e/f/g/h/1)
- Refrain status (bhikkhu 5/9, bhagavā 4/9, viharati 1/9)
- 10 pending threads in priority order with effort estimates
- The pending strategic pivot decision flagged for next session
- Worktree convention + bash sandbox quirk + 3-branch base structure
documented as non-obvious context
- Resume instructions branch on pivot decision
Both docs written by parallel subagents with full context briefings;
reviewed and committed by main session. Companion to the v2 prompt
overlay (2d198f6) and the protocol amendments (c6b150f + 9830ef1).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Opens batch 4 of MN10 hand-curation with phase-1 (the first teaching-content phase), ships the v2 prompt overlay codifying all session learnings as a ready-to-wire compiler amendment, and adds the strategic-economic analysis document that emerged from this session.
Base note: This PR is stacked on #48 (batch-3). GitHub will auto-rebase the base to main when #48 merges. Review the 3 batch-4 commits independently.
Commits
- 110f1d0 (phase-1 curation: Ekāyano ayaṁ Bhikkhave maggo). First teaching-content phase after 8 framing phases. The famously-contested compound ekāyano gets a 5-sense translator-debate cycle (direct / one-way / solitary / convergent / only) with per-tradition notes citing Sujato, Bhikkhu Bodhi, and the Chinese parallel EA 12.1. New curation log at docs/sutta-studio/curation/phase-1.md.
- 2d198f6 (v2 prompt context amendments; config/suttaStudioPromptContextV2.ts, 280 lines new). Six amendment blocks codifying MN10 batches 1-4 learnings: tooltip register, arrow-earning rule, sense metadata, anchor selection, translator-debate awareness, cross-phase awareness. Standalone module, not yet wired into prompts.ts. Companion PR will wire + run on phase-2.
- ea528d9 (COMPILER_STRATEGY.md + HANDOVER.md replacement).
  - docs/sutta-studio/COMPILER_STRATEGY.md (289 lines, new): economic-strategic analysis of pipeline vs hand-curation. Quality bands (35% v1 / 65% v2 / 85% +post-passes / 100% hand), Pareto distribution (10-15 phases pedagogically critical, 35-40 routine), per-compile cost estimates, scaling roadmap, irreducibly-human gaps.
  - docs/HANDOVER.md (180 lines, replaces prior): 17-commit inventory across 3 branches, schema-tensions status, protocol-amendment summary, refrain progression, 10 pending threads, strategic-pivot decision flagged for next session.
Strategic context
Per COMPILER_STRATEGY.md, this PR closes batch 4 at one phase (phase-1) and recommends pivoting from linear hand-curation to wiring v2 + building 4 deterministic post-passes (morph-from-POS, citation-linker, cross-phase-facet detector, §3.4 linter). The pivot is ~7 hr cheaper for MN10 alone and inherits a multiplier for every future sutta. Decision is deferred to a follow-up PR.
Test plan
- npm test services/providers/dpd.test.ts (regression coverage from batch-3 DPD fix carries forward)
- /sutta/demo: hover phase-1 words, confirm 5-sense ekāyano cycle renders with translator-tradition notes in the audit modal
- config/suttaStudioPromptContextV2.ts is exported but not yet imported by services/compiler/prompts.ts (intentional: overlay is staged for follow-up wiring PR)
- docs/sutta-studio/COMPILER_STRATEGY.md: read end-to-end before next compiler work; it explains the pivot rationale