feat(sutta-studio): phase-2 hand-curation + A2 experiment scaffolding by anantham · Pull Request #52 · anantham/LexiconForge

anantham · 2026-05-13T22:47:26Z

Summary

Prepares the empirical experiment that validates PR #51's V2 prompt amendments. Phase-2 (`sattānaṁ visuddhiyā` — "for the purification of beings") is now hand-curated following all V2 amendments, with the v10 baseline snapshotted for diffing and scaffolding ready for the v11 compile run.

What's in this PR

- Prompt version bump: `SUTTA_STUDIO_PROMPT_VERSION → 'sutta-studio-v11-mn10-amendments'`. The amendments that PR #50 intended to ship landed transitively via PR #51's consolidation, but the version key itself was never bumped. Doing it now so benchmark/leaderboard runs distinguish v10 vs v11.
- Hand-curated phase-2: replaces the v10 LLM-generated phase-2 entry in `components/sutta-studio/demoPacket.json`. Full V2 amendments applied: morph fields on suffixes, epistemicBasis + confidence + notes on all senses, isAnchor on visuddhiyā, plain-first tooltip prose (no bracketed grammar prefixes or decorative emoji), translator-tradition citations for the 6 visuddhi senses (Sujato / Ñāṇamoli-Bodhi / Thanissaro), and cross-phase notes connecting sattā to bhikkhū (phases e-g) and visuddhiyā to samatikkamāya (phase-3).
- Experiment scaffolding at `docs/sutta-studio/experiments/`:
  - `README.md`: three-way diff plan + analysis follow-up
  - `phase-2-v10-baseline.json`: pre-curation snapshot (read-only diff target)
  - `phase-2-hand-curated.json`: source-of-truth for the new live entry
  - `phase-2-v11-output.json`: to land after the user runs A2
- Curation log at `docs/sutta-studio/curation/phase-2.md`: 185 lines following the established phase-{a..h,1}.md format. Documents every decision, predicts what v11 should produce, and identifies which gaps deterministic post-passes (A3) could close.

What this PR does NOT do

- Does not run the v11 compile (needs your API key + script tweak; see "Next steps" below).
- Does not write the three-way analysis (waits on v11 output).
- Does not populate `sourceCitationIds` on the hand-curated senses (DPD not queried in this session; noted in curation log as candidate for citation-linker post-pass).
- Does not modify other phases.

Next steps after merge

The user runs the v11 compile. Two paths:

CLI path:
```bash
# Edit scripts/sutta-studio/generate-new-phases.ts:
#   SEGMENT_RANGE = ['mn10:2.1']
#   add wordRange [4, 6] for phase-2's words (sattānaṁ visuddhiyā)
# Then:
OPENROUTER_API_KEY=... tsx scripts/sutta-studio/generate-new-phases.ts
```

UI path: open /sutta/demo, trigger compile, save phase-2 output.

Save the result to `docs/sutta-studio/experiments/phase-2-v11-output.json`. Then write the analysis at `phase-2-analysis.md`.

Test plan

- App loads: `/sutta/demo` renders phase-2 with the new senses + tooltips visible on hover
- Hover on `sattānaṁ`: see plain-first tooltip prose, no `[Genitive Plural]` brackets, no emoji
- Hover on `visuddhiyā`: see anchor underline (isAnchor:true), cycle through 6 senses with confidence rankings
- Cycle senses for visuddhiyā: confirm translator notes render in the audit modal / hover details
- Run `npm test`: should remain ~1332 passing

🤖 Generated with Claude Code

…r A2 validation

This commit prepares the empirical experiment that validates the V2 prompt amendments landed by PR #51 (compiler consolidation). The three-way comparison documented in docs/sutta-studio/experiments/README.md compares v10 baseline output vs v11 compiler output vs hand-curated gold standard.

Changes:

1. Bump SUTTA_STUDIO_PROMPT_VERSION → 'sutta-studio-v11-mn10-amendments'. V2 amendments have been active in the canonical prompt builders since PR #51 merged. This version key lets the benchmark / quality-scorer distinguish v10 and v11 outputs in the leaderboard. (Reverts the version state from v10 back to v11; PR #50 had attempted this bump in the now-closed approach.)

2. Hand-curate phase-2 (sattānaṁ visuddhiyā) following CURATION_PROTOCOL §3.4 + §3.4.1 with all 6 V2 amendments applied. Replaces the v10 LLM-generated entry in components/sutta-studio/demoPacket.json:
   - p5 sattānaṁ: 3 senses with epistemicBasis (lexical/etymological), confidence, and per-sense notes. morph field added on -ānaṁ suffix (gen, pl). Cross-phase tooltip notes the bhikkhū-vs-sattā audience-vs-beneficiary contrast.
   - p6 visuddhiyā: 6 senses (purification/purity/clarity/cleansing/brightening/refinement) with curatorial epistemicBasis, confidence rankings, and per-tradition notes (Sujato, Ñāṇamoli-Bodhi, Thanissaro). morph field added on -yā suffix (dat, sg, f). isAnchor: true (semantic centerpiece: what the path is FOR).
   - All tooltips rewritten in plain-first §3.4 prose. No bracketed grammar prefixes ([Genitive Plural], [Dative]), no decorative emoji (🔗 ✨ 🎯).
   - Relation p5s2 → p6 (genitive-of-possession) preserved; it earns its arrow per the V2 arrow-earning rule.

3. Add docs/sutta-studio/experiments/ scaffolding:
   - README.md explains the three-way comparison and follow-up plan.
   - phase-2-v10-baseline.json snapshots the pre-curation LLM output for diffing (read-only reference).
   - phase-2-hand-curated.json is the canonical source-of-truth for the new live entry; the demoPacket.json change above is generated from this file.
   - phase-2-v11-output.json (TBD) will land here after the v11 compile run.

4. Add docs/sutta-studio/curation/phase-2.md, the full curation log following the established phase-{a..h,1}.md format. Documents every curation decision, predicts what v11 should produce, and identifies which gaps a deterministic post-pass (the proposed A3 work) could close.

How the experiment closes:

After this commit lands, the user runs the compiler on phase-2 with v11 prompts active (script edit of scripts/sutta-studio/generate-new-phases.ts with phase-2's mn10:2.1 wordRange, or via the live UI) and saves the output to docs/sutta-studio/experiments/phase-2-v11-output.json. Then the three-way diff (v10 / v11 / hand) is written up at phase-2-analysis.md. Decision flows from there: invest in A3 post-passes if v11 hits 65%+ quality; iterate on prompts if it doesn't.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


Empirical-validation script for the V2-amended prompts. Targets a single phase from demoPacket.json, runs the V2-active passes (anatomist + lexicographer + phase composition) on its Pali words, and saves the output for diffing against the hand-curated entry.

Skips intentionally:
- skeleton: phase grouping is already known from the packet
- weaver, typesetter: orthogonal to V2 (V2 amendments don't touch token mapping or layout blocks)
- morphology: refinement pass; not core to V2 quality

Usage:

    tsx scripts/sutta-studio/run-phase-experiment.ts \
      --phase phase-2 \
      [--model google/gemini-3-flash-preview] \
      [--out docs/sutta-studio/experiments/phase-2-v11-output.json]

Reuses the OpenRouter LLM caller pattern from generate-new-phases.ts (same provider, same fetch shape, same pricing lookup). One-command run for any phase, generalised so phase-3 / phase-4 / DN22 phases can reuse without modification. Output includes _meta with tokens, cost, prompt version, and model, which supports cross-model comparison.

Companion to:
- docs/sutta-studio/experiments/README.md
- docs/sutta-studio/curation/phase-2.md
- PR #52 (the experiment scaffolding this script populates)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes the A2 experiment loop opened in PR #52 (phase-2 hand-curation + scaffolding).

Adds:

1. v11 outputs from 2 successful frontier models (Gemini Flash + Pro) and failure modes documented for 4 others (Sonnet, Grok, GPT-4o, GPT-5 all fail on our structured-output pipeline). Gemini-only is a known limitation worth noting but not blocking.
2. Phase-3 hand-curation in pipeline+polish mode. First demonstration of the workflow COMPILER_STRATEGY.md §5 predicted: run the v11 pipeline ($0.019), open the draft, add metadata + cross-phase + extra polysemy. Total ~22 min vs ~45 min from scratch. ~2x speedup.
3. phase-2-analysis.md: comprehensive three-way diff (v10 / v11 Flash / hand-curated) + cross-model comparison + failure mode notes + A3 post-pass priority ranking.

Empirical signal:

- V2 amendments lift STRUCTURAL fields strongly (tooltip register, anchor, morph, relations). v10 had bracketed grammar prefixes and emoji; v11 has plain-first prose throughout.
- V2 amendments DO NOT lift METADATA fields. epistemicBasis, confidence, notes citing traditions: LLMs ignore these regardless of model tier (Flash, Pro, Sonnet all skip them).
- Polysemy count goes BACKWARDS in some cases (visuddhi: v10 had 5 senses, v11 has 3). Hand-curation re-adds the missing senses.
- Cost: Gemini Flash $0.018 per phase; full MN10 re-compile $0.92 total. Pipeline+polish workflow takes ~3-4 hours curator time vs ~25-30 hours from-scratch. ~85% time reduction on routine phases.

A3 post-pass priorities (revised based on data):

1. citation-linker (HIGH): closes sourceCitationIds gap
2. morph-from-POS (HIGH): closes gender-on-morph gap
3. epistemicBasis inference (HIGH, new; wasn't in original A3 list)
4. cross-phase facet detector (MEDIUM)
5. §3.4 linter (LOW; LLM already does this well)

Together these form a "metadata-filler" post-pass module (~5-7 hrs of engineering). After this lands, v11 output should hit ~75% of hand quality automatically.

Files changed:
- components/sutta-studio/demoPacket.json: phase-3 swapped to hand-curated (line-based splice; other phases untouched)
- docs/sutta-studio/curation/phase-3.md: 90-line log of the polish workflow
- docs/sutta-studio/experiments/phase-2-analysis.md: 130-line empirical analysis
- docs/sutta-studio/experiments/phase-2-v11-output.json: Gemini Flash output
- docs/sutta-studio/experiments/phase-2-v11-output-geminipro.json: Gemini Pro output
- docs/sutta-studio/experiments/phase-3-v11-output.json: Gemini Flash output for phase-3
- docs/sutta-studio/experiments/phase-3-hand-curated.json: hand-polished gold standard

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Path 1 of the data-fields-evaluation work: makes the V2 metadata fields visible in the UI so you can empirically decide which ones earn their packet bulk vs which can be dropped.

Wires LensPanel.tsx into SuttaStudioView. The panel was defined but never imported anywhere: an orphaned audit drawer with rich content (Senses / Grammar / Relations tabs) that nothing rendered. Now it docks right when settings.auditPanel is on and a word/segment is hovered. Stays pinned to the most-recent hover target until user clicks ✕ or hovers a new target.

Adds 5 new settings toggles under "Audit fields (V2)" section:
- Audit panel: open/close the side drawer (default OFF)
- Anchor emphasis: toggle the subtle amber underline on PaliWord.isAnchor
- Sense notes: Sense.notes prose in the panel
- Citation chips: Sense.citationIds chips in the panel
- Confidence + basis: NEW per-sense confidence ('high'/'medium'/'low') and epistemicBasis ('lexical'/'curatorial'/'etymological'/...) badges, previously not rendered anywhere

LensPanel signature extended with showNotes / showCitationChips / showConfidenceBadges optional props (default true for back-compat).

PaliWord.tsx: isAnchor underline now gated by settings.anchorEmphasis (was unconditional).

Empirical use: load /sutta/demo, scroll to phase-1 (Ekāyono with 5 translator-debate senses) or phase-2 (sattānaṁ visuddhiyā with the V2 metadata hand-curated in this session). Hover a word. Toggle the audit panel ON in the gear menu. See the senses with their confidence/basis tags, notes citing scholars, and DPD citation chips. Toggle each field on/off to feel which ones add value.

Decision criterion: if after using the UI the metadata feels "useful", keep generating it. If it feels "noise", drop SENSE_METADATA from V2 amendments and ship a leaner pipeline. Path 2 (~30 min cleanup) is one commit away if that's the call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…mentary from tooltips 10 facets across 6 phases were using tooltip[0] to explain the UI's color palette instead of providing a gloss — meta-commentary about the renderer, not about the word. Pure removal: no rewrites, no replacement content. After deletion, facet[0] in each affected segment is now the actual gloss that was sitting at index [1]. The color palette explanation belongs in a Legend panel (future commit), not repeated per-word in every tooltip. Reverse-direction cleanup — see ~/.claude/CLAUDE.md "Lean toward the reverse direction" principle. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…affordance on English words

Three small UX changes triggered by user feedback after reloading the demo:

1. Trim evaṁ (a1 'eva' segment): drop the "Don't confuse with bare eva" facet. The distinction is already in the Sense.notes for the audit panel; it doesn't belong in a hover tooltip. (Reverse-direction subtraction.)
2. Replace ṁ (a1 'ṁ' segment) tooltip from "humming dot-m sound" to a concrete English-pronunciation analog: "Pronounced as a soft nasal close — like the 'um' in English 'hum' or 'sum'." Information that teaches by example, not by Pāli-internal naming.
3. EnglishWord: render small dot affordance under English words linked to multi-sense Pāli words. One filled dot per sense, current at bold-grey, others at slate-700. Subtle visual cue that the word is clickable AND that there are N alternative renderings available. Only shows when senseCount > 1; ghost words stay clean.

The dot affordance is the user's design; it replaces my earlier failure mode of leaving clickability hidden.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…UX upgrades

Three coordinated changes addressing user feedback after the initial audit-panel shipped:

1. PaliWord.pronunciation field. Pāli is oral; the same Roman letters carry different sounds based on syllable position, vowel length, and compound boundaries. Pronunciation can't be lemma-derived; it's per-word data. Added optional field on PaliWord type with curator-written syllable breakdown + optional English rhyme analog. Populated for 46 words across the 11 hand-curated phases (a, b, c, d, e, f, g, h, 1, 2, 3). Rendered in LensPanel header below the Pāli text.

   Note: the demoPacket.json diff is bigger than the semantic content added because json.dump reformatted single-element arrays to multi-line. Accepted the churn rather than break JSON with a line-based splice that mis-matched word ids vs English-token ids.

2. Legend panel (new: components/sutta-studio/Legend.tsx). Visual reference showing the color/symbol vocabulary ONCE: word colors (content/function/vocative), emphasis (anchor/refrain/ghost), diacritics (ā ē ī ō ū / ṁ / ṭ ḍ ṇ ḷ / ñ), relation arrows, cycle dots, audit panel reminder. Toggle via "Legend" in settings. Replaces the deleted "Colored differently because…" tooltip meta-commentary; this is the register-correct location for color teaching.

3. LensPanel UX upgrades:
   - Header now renders pronunciation under the Pāli text in mono font
   - Copy Pāli + Copy English moved OUT of the tab row, into a separate footer bar at the bottom of the panel. Added Copy Pron. button when pronunciation exists.
   - Panel now draggable via framer-motion `drag`. Position persists to localStorage (key: sutta-studio-lens-panel-pos) so it survives reloads.
   - Panel wider (360px → 440px) to show more text without scrolling.
   - Header has cursor-move + drag affordance hint.

All three changes are positive-additive but each replaces something subtractive: pronunciation replaces lossy lemma-guessing, Legend replaces per-word color meta-commentary, audit-panel UX replaces the cramped header-bar copy buttons.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three UX fixes after first reload-and-look review:

1. SettingsPanel z-index raised from z-50 to z-[100]. It was being covered by the audit panel (z-[90]), which made the gear menu unusable while the audit drawer was open.
2. New "Cycle dots" toggle in settings. Lets readers turn off the small dot affordance under multi-sense English words. Clicking still cycles; only the visual indicator hides. Plumbed through to EnglishWordEngine via the showCycleDots prop.
3. Legend panel rewrites:
   - Refrain & anchor underlines now hug the text instead of spanning the w-20 column (was extending visibly past 'bhikkhū' / 'visuddhi').
   - Ghost-word example "have" opacity dropped from CSS class opacity-60 to inline 0.3, which matches the actual renderer's ghostOpacity default so readers see the real thing in the legend.
   - Diacritics section rewritten example-first:
     • Drops "palatal", "niggahīta", "retroflex" from visible prose; they're technical labels that don't help a default reader.
     • Each long vowel gets its own English-word analog: ā = 'father', ē = 'they', ī = 'machine', ō = 'boat', ū = 'rule'. Reader knows the sound from English; doesn't need to hear "long vowel" first.
     • ṁ unchanged ("the 'um' in 'hum' or 'sum'"); already example-first.
     • ñ → "the 'ny' in 'canyon' or 'señor'" (was: "palatal n").
     • ṭ ḍ ṇ ḷ → "the soft 'd' in American 'water' or 'butter'" + honest acknowledgment that English lacks a clean equivalent.

User feedback: "the example words need to be common words that people actually know... the most important thing is explaining the example English word where naturally you end up doing a long vowel."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… scale

Builds the Tier-1 pronunciation post-pass agreed in the architecture discussion: rule-based syllabifier + Latin-penult stress placer that produces pronunciation hints for any Pāli word from spelling alone. Zero LLM cost; <1ms per word; runs on every word in the packet.

services/sutta-studio/postPasses/syllabify.ts (235 lines)
- tokenizePaliPhonemes: groups aspirated digraphs (kh, gh, ch, jh, ṭh, ḍh, th, dh, ph, bh) as single phonemes
- syllabify: produces CV(C) syllables per Pāli prosody rules (single C between vowels → next syllable; CC → split; CCC → first closes preceding, rest onset the following; ṁ always closes)
- isHeavySyllable: long vowel OR closed = heavy
- pickStressIndex: Latin penult rule (heavy penult → penult, else antepenult; disyllabic → initial)
- syllabifyPaliWord: full pipeline → "vi · SUD · dhi · yā" format

services/sutta-studio/postPasses/syllabify.test.ts (29 tests)
Worked examples + edge cases: aspirated digraphs, geminate splits, niggahīta closing, stress placement, capitalization preservation.

scripts/sutta-studio/backfill-pronunciation.ts
One-shot script that populates pronunciation for any PaliWord that doesn't already have one. Idempotent: hand-curated words (the 46 from the earlier commit) are left alone. New words use the algorithm.

scripts/sutta-studio/syllabify-compare.ts
Debug utility: shows algorithm output vs hand-curated pronunciation for every word that has both. Useful for the curator to audit where algorithm and hand-curation diverge.

Effect: demoPacket.json now has pronunciation on ALL 269 Pāli words. 46 are hand-curated (with optional "rhymes with X" suffixes for famously-tricky cases like visuddhiyā); 223 are algorithm-populated. Future packet generation can call syllabifyPaliWord() directly as a post-pass, with zero per-word curator effort.

Architecture sibling to A3's metadata-filler module:

    citation-linker      → sourceCitationIds
    morph-from-POS       → case/number/gender
    epistemicBasis-infer → epistemicBasis
    syllabify            → pronunciation (THIS)

Known limitations (documented in syllabify.ts):
- Stress traditions vary by region/school; we use the standard Latin penult rule. Curator can override per-word via the PaliWord.pronunciation field.
- The "rhymes with English-word X" English-analog hint is judgment; not produced by this pass. Curator-added for famously-tricky words.
- Sandhi alternations across word boundaries are out of scope.

Note: the demoPacket.json diff is large because json.dump reformatted single-element arrays. Semantic content added: 223 pronunciation strings. Same accept-the-churn tradeoff as the earlier pronunciation commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
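The syllabification and stress rules listed in this commit can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the contents of the real syllabify.ts: the function names are borrowed from the commit message, but the bodies, the treatment of e/o as always long, and the lowercasing of input (the real module is said to preserve capitalization) are simplifications made for this example.

```typescript
// Hypothetical sketch of the rules described above; not the real syllabify.ts.
const nfc = (s: string): string => s.normalize("NFC");
const VOWELS = new Set(["a", "ā", "i", "ī", "u", "ū", "e", "o"].map(nfc));
// Assumption: e and o are treated as long everywhere (a simplification).
const LONG_VOWELS = new Set(["ā", "ī", "ū", "e", "o"].map(nfc));
const ASPIRATES = new Set(
  ["kh", "gh", "ch", "jh", "ṭh", "ḍh", "th", "dh", "ph", "bh"].map(nfc)
);
const NIGGAHITA = nfc("ṁ");

// Group aspirated digraphs (kh, dh, ...) into single phonemes.
function tokenizePaliPhonemes(word: string): string[] {
  const chars = Array.from(nfc(word).toLowerCase());
  const phonemes: string[] = [];
  for (let i = 0; i < chars.length; i++) {
    const pair = chars[i] + (chars[i + 1] ?? "");
    if (ASPIRATES.has(pair)) {
      phonemes.push(pair);
      i++; // consume the "h"
    } else {
      phonemes.push(chars[i]);
    }
  }
  return phonemes;
}

// Split into syllables: a single consonant between vowels opens the next
// syllable; in a cluster the first consonant closes the preceding syllable;
// ṁ always closes.
function syllabify(word: string): string[] {
  const ph = tokenizePaliPhonemes(word);
  const nuclei = ph.map((p, i) => (VOWELS.has(p) ? i : -1)).filter((i) => i >= 0);
  if (nuclei.length === 0) return [ph.join("")];
  const starts = [0];
  for (let k = 1; k < nuclei.length; k++) {
    const prev = nuclei[k - 1];
    const cur = nuclei[k];
    const clusterSize = cur - prev - 1;
    if (clusterSize === 0) starts.push(cur);
    else if (clusterSize === 1) starts.push(ph[cur - 1] === NIGGAHITA ? cur : cur - 1);
    else starts.push(prev + 2);
  }
  return starts.map((s, k) => ph.slice(s, starts[k + 1] ?? ph.length).join(""));
}

// Heavy = contains a long vowel OR ends in a consonant (closed).
function isHeavySyllable(syllable: string): boolean {
  const ph = tokenizePaliPhonemes(syllable);
  const last = ph[ph.length - 1];
  const closed = last !== undefined && !VOWELS.has(last);
  return closed || ph.some((p) => LONG_VOWELS.has(p));
}

// Latin penult rule: heavy penult takes stress, else antepenult;
// words of one or two syllables stress the first.
function pickStressIndex(syllables: string[]): number {
  const n = syllables.length;
  if (n <= 2) return 0;
  return isHeavySyllable(syllables[n - 2]) ? n - 2 : n - 3;
}

function syllabifyPaliWord(word: string): string {
  const syllables = syllabify(word);
  const stress = pickStressIndex(syllables);
  return syllables.map((s, i) => (i === stress ? s.toUpperCase() : s)).join(" · ");
}

console.log(syllabifyPaliWord("visuddhiyā")); // vi · SUD · dhi · yā
```

In this sketch the penult of visuddhiyā (dhi) is light, so stress falls back to the antepenult (sud), which is how the "vi · SUD · dhi · yā" format quoted in the commit comes out.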

…or audit panel

Addressing concrete user feedback after first reload of audit panel on mobile + landscape desktop. Five coordinated changes, mostly subtractive:

1. confidenceBadges → default OFF. The 'high/medium/low' confidence + 'DPD-attested/curatorial/...' epistemicBasis badges are useful for curators auditing data, confusing for end readers ("what does 'high' mean next to 'DPD-attested'?"). Settings toggle preserved for opt-in curator review.

2. Inline copy icons replace the footer Copy bar. Removed: the 3-button footer (Copy Pāli / Copy English / Copy Pron.) that occupied vertical space across the panel bottom even when the senses tab had plenty of room above it. Added: small clipboard SVG icon next to each copyable item:
   • Pāli surface in header
   • Pronunciation (when present)
   • Each English sense in the Senses tab
   Net: less wasted space, contextual per-item copy that scales naturally to future language alignments (Tibetan/Chinese/Japanese rows would each get their own icon).

3. Toast on copy success. Bottom-center pill that fades in ("Copied Pāli: sutaṁ") and disappears after 1.6s. Was missing: no feedback that copy worked.

4. Mobile bottom-sheet layout. On <640px viewports, the panel becomes a bottom drawer:
   • Position: fixed bottom-0, full-width, max-h-65vh
   • Rounded top corners only
   • Drag-handle bar visible at top as dismiss-affordance cue
   • Drag-to-reposition disabled (use of motion `drag={!isMobile}`)
   Reader content stays visible above the panel. Was a full-screen-overlay side panel before, which was unusable on phones. `useIsMobile` hook polls window.innerWidth on resize.

5. Settings gear / About chip collision. AboutThisText container gets `pr-16 md:pr-6` so the about text doesn't wrap under the absolute-positioned gear icon on narrow viewports.

Tooltips on the (now opt-in) badges explain what they mean; this context was missing before.
- Confidence: "Curator's confidence in this rendering — independent of source."
- Epistemic basis: per-value explanation of where the sense came from (DPD-attested = from the Digital Pāli Dictionary, etc).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Plain-language label that says what the toggle is FOR (curator metadata) rather than what fields it contains. User couldn't find this when looking to switch off the 'DPD-attested + high/medium' badges that confused them. Reverse-direction edit: just rename, no new toggle, no new logic.

…obile The centered-on-parent tooltip overflowed the viewport when the parent word sat near a screen edge (visible on mobile where viewport is ~400px and tooltip max-width is 28rem = 448px). Measured: word "evaṁ" near the left edge, tooltip extended past the left side of the screen. Fix: useLayoutEffect now measures the tooltip rect after render. If the default centered position would overflow either viewport edge by more than 8px margin, set an absolute leftPx in offsetParent coords that pins the appropriate edge to viewport - 8px. Tailwind centering classes are swapped out for an inline style.left only when the clamp kicks in; desktop tooltips that fit naturally are unchanged. Also tightened max-width from min(28rem, 90vw) to min(28rem, calc(100vw - 1rem)) so the tooltip can never be wider than viewport - 16px — even before the JS clamp runs, the rendering is bounded.
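The edge-clamp decision described in this commit can be reduced to a small pure function. The sketch below is an assumption-laden illustration: `clampShiftPx` and `HorizontalRect` are hypothetical names, and it returns a horizontal shift in viewport pixels rather than the absolute `leftPx` in offsetParent coordinates that the real component is said to compute.

```typescript
// Hypothetical helper; the real tooltip code measures inside useLayoutEffect.
interface HorizontalRect {
  left: number;  // viewport x of the tooltip's left edge, in px
  right: number; // viewport x of the tooltip's right edge, in px
}

// Returns how far (in px) to shift the tooltip so it stays inside the
// viewport with the given margin. 0 means the centered position already fits.
function clampShiftPx(
  rect: HorizontalRect,
  viewportWidth: number,
  margin: number = 8
): number {
  if (rect.left < margin) {
    return margin - rect.left; // positive: push right
  }
  if (rect.right > viewportWidth - margin) {
    return viewportWidth - margin - rect.right; // negative: pull left
  }
  return 0;
}

// A tooltip hanging 40px off the left edge of a 400px viewport:
console.log(clampShiftPx({ left: -40, right: 260 }, 400)); // 48
```

Checking the left edge first means that a tooltip wider than the viewport stays pinned to the left margin, which is one reasonable choice; the commit does not say how the real code resolves that case.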

… nested min(...) Previous tooltip max-width was set via Tailwind arbitrary value: max-w-[min(28rem,calc(100vw-1rem))] Tailwind's JIT parser appears to choke on the nested CSS functions inside an arbitrary value bracket. Result: max-width was effectively unset, tooltip rendered at content's natural width (unwrapped), and on narrow viewports the tooltip extended way past the right edge. Moved to inline style.maxWidth which always works. Same effective rule: on desktop, capped at 28rem (~448px); on viewport < 28rem, capped at viewport - 1rem (16px margin). Also set inline style.width = 'max-content' so the tooltip is as wide as needed up to maxWidth — without this, the box would shrink to whatever natural width Tailwind decides, often too narrow.

…dered

Was: two visually-separated sections ("Settings" + "Audit fields (V2)") that artificially split toggles. The "V2" label leaked internal architecture language into UX; readers don't need to know what V2 means.

Now: one flat list, no extra header, ordered by what most affects the reader's experience:

1. Audit panel: opens large side/bottom drawer
2. Legend: opens visual reference panel
3. Tooltips: fundamental hover info
4. Grammar arrows: relation visualization
5. Alignment lines: Pāli↔English connections
6. Refrain colors: structural recurrence rhythm
7. Anchor emphasis: semantic-centerpiece underline
8. Cycle dots: multi-sense affordance under English words
9. Ghost words: English scaffolding visibility
10. Sense notes: audit panel detail
11. Citation chips: audit panel detail
12. Curator badges: audit panel detail (was hidden under "V2")
13. Emoji in tooltips: style preference
14. Grammar terms: jargon level

Reverse-direction edit: removes the section header + container wrapper that conditionally subdivided. Same number of toggles, fewer dividers.

…tips

Pure-subtraction data cleanup. The 11 V2-curated phases (a-h + 1-3) were already clean; the other 40 phases carried v10-style tooltips with emoji markers (🎯 Purpose marker, 💭 Mindfulness, etc) and bracketed grammar prefixes ([Genitive Plural], [Dative], etc). Two settings toggles ("Emoji in tooltips", "Grammar terms") were compensating at render time by running stripEmoji() / stripGrammarTerms() over every tooltip.

This commit moves the strip from render-time to source-of-truth:
- 261 segments with tooltips modified
- 82 tooltips that became empty after strip were removed entirely
- 0 emoji remain (across full Unicode emoji ranges incl. supplemental pictographs U+1FA70-U+1FAFF; the wood-log emoji escaped the first pass)
- 0 bracketed grammar prefixes remain

Net effect:
- Tooltips are now V2-clean at the data layer across all 51 phases
- The render-time strip toggles become no-ops (no cruft left to strip)
- Both toggles can be safely removed in a follow-up subtractive PR

Companion: scripts/sutta-studio/strip-tooltip-cruft.ts. Idempotent: re-running on clean data is a no-op.

Note: large diff because Python json.dump reformatted single-element arrays to multi-line. Semantic content removed: emoji characters + bracketed grammar labels. No new content.
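A strip pass of the kind this commit describes might look roughly like the sketch below. It is not the contents of strip-tooltip-cruft.ts: `stripTooltipCruft` and `cleanTooltips` are hypothetical names, and matching emoji via the Unicode `Extended_Pictographic` property (plus the variation selector and zero-width joiner) is an assumption about how "full Unicode emoji ranges" could be covered.

```typescript
// Hypothetical sketch; the real script may use explicit code-point ranges.
// Built with the RegExp constructor so the pattern is checked at runtime.
const EMOJI = new RegExp("[\\p{Extended_Pictographic}\\uFE0F\\u200D]", "gu");
// A leading bracketed grammar label such as "[Genitive Plural] ".
const BRACKET_PREFIX = /^\s*\[[^\]]*\]\s*/;

// Remove emoji first, then a leading [Grammar Label], then trim.
function stripTooltipCruft(tooltip: string): string {
  return tooltip.replace(EMOJI, "").replace(BRACKET_PREFIX, "").trim();
}

// Clean every tooltip and drop the ones that end up empty.
function cleanTooltips(tooltips: string[]): string[] {
  return tooltips.map(stripTooltipCruft).filter((t) => t.length > 0);
}

console.log(cleanTooltips(["🎯 [Dative] for the purification", "✨"]));
// [ 'for the purification' ]
```

Because the function only removes characters, running it again over its own output changes nothing, which is the idempotence property the commit calls out.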

…ases Bash wrapper around run-phase-experiment.ts that iterates through the 42 phases not yet V2-curated (phase-4-7, x/y/z, aa-bg) and produces v11 output JSON for each in docs/sutta-studio/experiments/. Loads OPENROUTER_API_KEY from .env.local. Sequential — ~$0.02 × 42 ≈ $0.85 with Gemini Flash. Wall time ~25 min. Idempotent re-run with SKIP_EXISTING=1 to skip phases whose output file already exists.

… repo Worktrees don't carry dotfiles from the main repo, so `source .env.local` failed when the script ran from the worktree. Now falls back to the canonical main-repo path if .env.local isn't present locally.

Empirical artifacts from the bulk v11 pipeline run on the 40 un-V2-curated phases of MN10 (phase-4-7, x/y/z, aa-bg). Total cost $0.96, 1.12M tokens, 40/40 succeeded. Each output is a self-contained anatomist + lexicographer + phaseView trio with _meta showing model, tokens, cost, prompt version.

Why commit these:
- They're the raw evidence for the pipeline+polish thesis. Future sessions (human or agent) reviewing the strategy can compare v11 output against the hand-curated phases and see exactly what V2 amendments produce vs what gets skipped.
- They are read-only research artifacts, not runtime data. The live app reads demoPacket.json; these files don't affect rendering.
- Conversation JSONL is local; the repo is the only durable record. Anyone trying to understand "why did we decide pipeline+polish was the right scaling path?" can read phase-2-analysis.md alongside these output files.

These outputs are the input to the Path B work that follows: per-phase hand-polish of v11 drafts, splicing into demoPacket.json with curation logs.

…ked example #1)

Third dative-of-purpose in the satipaṭṭhāna formula chain (dukkhadomanassānaṁ atthaṅgamāya, "for the disappearance of pain and dejection"). First worked example of the Path B workflow.

Kept from v11:
- Segmentation (dukkha + domanass + ānaṁ; attha + ṅ + gam + āya)
- isAnchor on p10 atthaṅgamāya (the verb-noun action)
- Relation arrow p9 → p10 (genitive of purpose)
- morph fields (gen-pl on ānaṁ, dat-sg on āya; added gender m)
- Plain-first tooltip register
- Etymological tooltips (axle-hole metaphor for dukkha, sun-setting for attha)

Added in polish:
- epistemicBasis + confidence + notes on all 6 senses
- Cross-phase note on -ānaṁ: connects to phase-2's sattānaṁ + phase-3's sokaparidevānaṁ (genitive-of-purpose spine)
- Cross-phase note on -āya: explicit "third of five datives" framing
- Gender m on -āya morph
- Caveat on the ṅ segment (traditional grammar treats it as a sandhi-trace, not a separate morpheme; v11's 4-segment parse is a pedagogical choice)
- Distinction note on dukkha vs soka/parideva/domanassa vocabulary cluster

Time spent: ~18 min. Expected to converge to ~10-12 min as rhythm settles.

Concerns surfaced in docs/sutta-studio/curation/phase-4.md:

1. Translator-tradition citations are claimed from training memory, not from a verified database. The F task (tradition DB) would ground them.
2. ~5 of the 8 minutes writing hand-curated JSON went to metadata fields that the A3 metadata-filler post-pass would produce deterministically. Continuing Path B vs pausing to build A3 first is a sequencing trade-off the curator should decide; both paths hit ~10 hr total but A3 leaves reusable infrastructure.

…ps clickable User empirical-audit finding: most of the V2 SENSE_METADATA payload is "trust me bro" — confidence levels are LLM hallucinations, DPD-attested tags can't be verified, and the fields are mostly hidden behind a curator-only toggle that's off by default. Verifiable evidence trails (clickable citation links) > asserted confidence levels. WHAT WAS PURGED Sense.epistemicBasis — LLM-classified "lexical/curatorial/etymological/…" tags. Cannot be self-verified by the model. Sense.confidence — "high/medium/low" labels. Hallucinated levels. Sense.sourceCitationIds — Never rendered. Redundant with `citationIds`. Segment.morph — Never rendered anywhere in the UI. Was study-mode infrastructure waiting for a study-mode consumer that doesn't exist. - Removed from V2 prompt amendments (SUTTA_STUDIO_V2_AMENDMENTS no longer includes SUTTA_STUDIO_V2_SENSE_METADATA). The named export is kept for historical reference but unused. - Stripped from demoPacket.json (101 senses, 37 segments). - Stripped from experiment hand-curated files (phases 2, 3, 4). - "Curator badges" toggle removed from settings. - confidenceBadges flag removed from StudioSettings type and defaults. - LensPanel no longer renders confidence/epistemicBasis badges or accepts showConfidenceBadges prop. - CONFIDENCE_COLORS + EPISTEMIC_BASIS_LABELS consts deleted. WHAT WAS ADDED (verifiable evidence, not asserted certainty) - LensPanel accepts `citations?: Citation[]` prop, indexed by id. - citationIds chips become clickable links when their citation has a `url` — opens in new tab, target=_blank, rel=noopener. - Falls back to non-clickable span when no url is available (chip still visible; just not navigable yet). - Chip label uses `citation.short` when available (e.g., "DPD s.v. me (pron — by me)") instead of the raw id. - SuttaStudioView now passes `packet.citations` into LensPanel. 
Result: when DPD entries are minted with URLs (via citationHelpers.ts, which already supports the `url` field on MaterializeOptions), the chips become a "go check the source" affordance. The architecture was always there; the rendering wires it in.

SUBTLER CYCLE DOTS

EnglishWord cycle-affordance dots default to opacity-30, brighten to opacity-100 on parent-word hover. They were too loud as primary visuals; now they're present-but-quiet, lighting up when the reader approaches.

NET EFFECT

- demoPacket.json shrinks (phantom data gone).
- v11 prompt shrinks (LLM stops generating fields nobody reads).
- Settings list drops one toggle.
- Audit panel has one less code path.
- "Trust me bro" badges replaced by "go check the source" links.

The schema definitions in types/suttaStudio.ts for the stripped fields are kept. That is a separate concern and can be cleaned up in a follow-up if the fields are confirmed permanently unused. Removing data is the leveraged subtraction; removing schema declarations is cosmetic.

The retired V2 amendment + the strip script + the citation-click affordance together signal the principle to future agents: don't carry data ahead of consumer demand, and don't render asserted confidence when verifiable evidence is available.
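As a rough illustration of the data strip described in this commit, the sketch below removes the purged fields from a JSON tree. It is not the actual strip script: the four field names come from the commit message, but the assumption that sense objects sit under a `senses` key and segment objects under a `segments` key, along with the names `PURGED_BY_PARENT_KEY` and `stripPurgedFields`, is hypothetical.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Field names are from the commit message. Keying the purge on the parent
// property name ("senses" / "segments") is an assumed packet layout.
const PURGED_BY_PARENT_KEY: Record<string, readonly string[]> = {
  senses: ["epistemicBasis", "confidence", "sourceCitationIds"],
  segments: ["morph"],
};

// Returns a stripped copy; the input tree is left untouched.
function stripPurgedFields(node: Json, parentKey?: string): Json {
  if (Array.isArray(node)) {
    // Array elements inherit the key of the array that holds them.
    return node.map((child) => stripPurgedFields(child, parentKey));
  }
  if (node === null || typeof node !== "object") {
    return node;
  }
  const purged = PURGED_BY_PARENT_KEY[parentKey ?? ""] ?? [];
  const out: { [key: string]: Json } = {};
  for (const [key, value] of Object.entries(node)) {
    if (purged.includes(key)) continue; // drop the purged field
    out[key] = stripPurgedFields(value, key);
  }
  return out;
}

// Usage with a made-up fragment; real packets will have more fields.
const before: Json = {
  segments: [{ id: "s1", morph: { case: "gen" } }],
  senses: [{ english: "beings", confidence: "high", citationIds: ["dpd-x"] }],
};
console.log(JSON.stringify(stripPurgedFields(before)));
```

Scoping each purge list to its parent key keeps a property such as `confidence` from being removed if it ever appears on an unrelated object.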

vercel Bot deployed to Preview May 13, 2026 22:47 View deployment

vercel Bot deployed to Preview May 14, 2026 01:27 View deployment

vercel Bot deployed to Preview May 14, 2026 01:55 View deployment

vercel Bot deployed to Preview May 14, 2026 12:34 View deployment

vercel Bot deployed to Preview May 14, 2026 13:04 View deployment

anantham and others added 2 commits May 14, 2026 09:14

vercel Bot deployed to Preview May 14, 2026 13:23 View deployment

vercel Bot deployed to Preview May 14, 2026 13:27 View deployment

vercel Bot deployed to Preview May 14, 2026 13:38 View deployment

vercel Bot deployed to Preview May 14, 2026 13:50 View deployment

vercel Bot deployed to Preview May 14, 2026 13:53 View deployment

vercel Bot deployed to Preview May 14, 2026 13:55 View deployment

vercel Bot deployed to Preview May 14, 2026 13:59 View deployment

vercel Bot deployed to Preview May 14, 2026 14:02 View deployment

vercel Bot deployed to Preview May 14, 2026 15:59 View deployment

vercel Bot deployed to Preview May 14, 2026 16:00 View deployment

fix(sutta-studio): batch-v11-pipeline.sh looks for .env.local in main repo

26d80c0

Worktrees don't carry dotfiles from the main repo, so `source .env.local` failed when the script ran from the worktree. Now falls back to the canonical main-repo path if .env.local isn't present locally.
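The lookup order this fix describes (local file first, main-repo copy second) can be mirrored in a few lines. The real change lives in a shell script; this TypeScript sketch only shows the fallback logic, and the function name `pickEnvFile`, the injected `exists` predicate, and both example paths are assumptions.

```typescript
// Hypothetical helper: return the first candidate path that exists.
// `exists` is injected so the lookup order can be checked without touching disk.
function pickEnvFile(
  candidates: string[],
  exists: (path: string) => boolean,
): string | undefined {
  return candidates.find((path) => exists(path));
}

// Usage: simulate a worktree where only the main-repo copy is present.
// Both paths are placeholders, not the script's real locations.
const present = new Set(["/path/to/main-repo/.env.local"]);
const chosen = pickEnvFile(
  [".env.local", "/path/to/main-repo/.env.local"],
  (path) => present.has(path),
);
console.log(chosen);
```

Ordering the candidates puts the local file ahead of the fallback, so a worktree that does have its own .env.local still takes precedence.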

vercel Bot deployed to Preview May 14, 2026 16:01 View deployment

vercel Bot deployed to Preview May 14, 2026 16:41 View deployment

vercel Bot deployed to Preview May 14, 2026 16:45 View deployment

vercel Bot deployed to Preview May 14, 2026 17:14 View deployment

anantham wants to merge 20 commits into main from feat/opus-phase2-experiment

Next steps after merge

Edit scripts/sutta-studio/generate-new-phases.ts:

SEGMENT_RANGE = ['mn10:2.1']

add wordRange [4, 6] for phase-2's words (sattānaṁ visuddhiyā)

Then:

Test plan
