Comprehensive improvements: Import system, DB operations, UI enhancements, and automation by anantham · Pull Request #1 · anantham/LexiconForge

anantham · 2025-10-27T07:05:40Z

Summary

This PR contains comprehensive improvements to LexiconForge, making the application more functional, maintainable, and user-friendly. The changes span multiple areas including database operations, import/export functionality, AI service architecture, UI enhancements, and development automation.

Key Features

1. Modern Database Operations (Feature-Flagged)

Feature-flagged IndexedDB paths for chapters and translations
Direct IndexedDB transactions without legacy service dependencies
Canonical URL mappings and chapter summaries
Translation versioning and active flag management
Runtime capability flags via LF_DB_V2_DOMAINS and localStorage

2. Enhanced Import/Export System

v2 export format support with streamed imports
Recognizes chapters[].translations structure
Stores chapters via ChapterOps, translations via TranslationOps
Reactivates original active versions post-import
Better handling of session translations during streaming

3. AI Service Modularization

Refactored monolithic aiService into focused modules
Provider-specific translators (Gemini, OpenAI)
Dedicated translation router
Separate modules for debug, params, text utils, cost tracking
Each module <200 LOC for better maintainability

4. UI/UX Improvements

Enhanced SettingsModal with additional configuration options (+212 lines)
Improved NovelLibrary component with better organization (+172 lines)
Better navigation flow and state management
Enhanced chapter, export, and image slice management
Improved debug utilities for troubleshooting

5. Development Automation

GitHub Actions workflow for automated Codex code review
Auto-triggers on pull requests for quality checks
Streamlines review process and catches issues early

6. Testing & Quality

Golden test harness with cassette replay
Coverage lift for aiService and adapters (43% vs. 16% prior)
Per-adapter mocks for JSON parsing and retry logic
Golden dataset validation with aggregate F1 = 1.0

7. Documentation & Asset Management

Marketing/Features/ screenshots now tracked in git
Updated WORKLOG.md with comprehensive development history
ISSUES.md reflects current state
README images will now display properly on GitHub

Commits Included

refactor: modularize ai service - Split monolithic service into focused modules
feat(db): add flaggable chapter ops implementation - Modern IndexedDB chapter operations
feat(db): add flaggable translation ops implementation - Feature-flagged translation persistence
fix(import): ingest v2 session translations during streaming - Enhanced import compatibility
[FEAT]: Track Marketing/Features for README screenshots - Asset management improvement
[FEAT]: Add UI enhancements, navigation improvements, and GitHub Actions - Comprehensive UI/automation updates

Technical Details

Database Schema

Formalized IndexedDB schema to v9
Canonical stores with index backfills
Auto-migration on browser load
Backward-compatible with legacy paths

Feature Flags

LF_DB_V2_DOMAINS controls new DB paths
localStorage overrides for granular testing
Gradual rollout capability
Easy rollback to legacy implementation

Import Workflow

JSON import pushes chapters/translations to IDB
Lazy hydration of translations + media on first navigation
Summaries API populates SessionInfo metadata only
Manual chapter selection post-import

Testing

✅ Unit tests pass: npm test -- --coverage --run
✅ Golden tests validated with F1 = 1.0
✅ Coverage thresholds met (aiService 43%, adapters ≥50%)
✅ Import/export workflows manually verified
✅ UI components tested with real data
✅ GitHub Actions workflow validated

Migration Notes

Existing browsers auto-migrate IndexedDB on next load
Feature flags default to legacy paths (safe)
Enable new paths via localStorage.setItem('LF_DB_V2_DOMAINS', 'chapters,translations')
If issues occur, clear lexicon-forge DB and reload
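
The flag handling described above can be sketched as follows. This is a minimal illustration, not the PR's code: `parseV2Domains`, `isV2Enabled`, and the `FlagStore` interface are hypothetical names, and the parsing rules (comma-separated, whitespace-tolerant, unknown names ignored) are assumptions. Only the key `LF_DB_V2_DOMAINS` and the values `chapters` and `translations` come from the notes above.

```typescript
// Hypothetical sketch: the PR does not show how the flag is actually parsed.
type DbDomain = "chapters" | "translations";
const KNOWN_DOMAINS: readonly DbDomain[] = ["chapters", "translations"];

// Minimal subset of the Web Storage API, so the helper can be tested without a browser.
interface FlagStore {
  getItem(key: string): string | null;
}

// Parse a comma-separated flag value into the set of enabled v2 domains.
function parseV2Domains(raw: string | null): Set<DbDomain> {
  const enabled = new Set<DbDomain>();
  if (!raw) return enabled; // unset flag: every domain stays on the legacy path
  for (const part of raw.split(",")) {
    const name = part.trim().toLowerCase();
    const match = KNOWN_DOMAINS.find((d) => d === name);
    if (match) enabled.add(match); // unknown names are ignored (assumption)
  }
  return enabled;
}

// True when the given domain should use the new IndexedDB path.
function isV2Enabled(store: FlagStore, domain: DbDomain): boolean {
  return parseV2Domains(store.getItem("LF_DB_V2_DOMAINS")).has(domain);
}
```

In a browser, `localStorage` satisfies `FlagStore`, so `isV2Enabled(localStorage, "chapters")` would return false until the flag is set, which matches the "default to legacy paths" note.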

Next Steps

After merge:

Monitor feature flag adoption and gather feedback
Gradually enable v2 DB paths for more users
Plan Tailwind build migration (currently CDN)
Design Phase 2 UI for cache management
Finalize instrumentation for image latency/cost tracking

Breaking Changes

None - all new functionality is feature-flagged or additive.

File Statistics:

13 files modified, 1,236 insertions, 224 deletions
Major changes: importService.ts (+609), SettingsModal.tsx (+212), NovelLibrary.tsx (+172), navigationService.ts (+107)

@codex review

MOTIVATION:
- README references feature screenshots that aren't visible on GitHub
- Images exist locally but were ignored by git
- Need to version control feature screenshots for documentation

APPROACH:
- Modified .gitignore to selectively ignore Marketing subdirectories
- Keep Marketing/DD/ and Marketing/Mynoghra/ ignored (large manga images)
- Allow Marketing/Features/ to be tracked (feature screenshots)
- Added comment explaining the selective ignore pattern

CHANGES:
- .gitignore: Replace blanket Marketing/ ignore with specific subdirectories
- Marketing/Features/: Add all feature screenshot images

IMPACT:
- README images will now display on GitHub
- Feature screenshots version controlled
- Repository size increase minimal (screenshots are reasonably sized)
- Manga image folders still ignored to keep repo lean

TESTING:
- Verified Marketing/Features/ is now tracked by git
- Confirmed DD/ and Mynoghra/ still ignored
- All 10 feature screenshots included

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

MOTIVATION:
- Improve user experience with enhanced settings and novel library UI
- Add navigation service improvements for better app flow
- Support GitHub Actions for automated Codex code review
- Update documentation to reflect recent development work
- Fix various issues and improve debugging capabilities

APPROACH:
- Enhanced SettingsModal with additional configuration options
- Improved NovelLibrary component with better organization and display
- Updated navigationService with enhanced routing and state management
- Added store slice improvements for chapters, export, and images
- Improved debug utilities for better troubleshooting
- Added GitHub Actions workflow for automated Codex reviews
- Updated ISSUES.md and WORKLOG.md with recent changes

CHANGES:
- .github/workflows/codex-review.yml: Add automated Codex review workflow
- components/NovelLibrary.tsx: Enhanced UI and organization (172 lines changed)
- components/SettingsModal.tsx: Added configuration options (212 lines added)
- services/importService.ts: Improvements to import handling (609 lines changed)
- services/indexeddb.ts: Database operation enhancements (72 lines changed)
- services/navigationService.ts: Enhanced routing and state (107 lines changed)
- store/index.ts: Store configuration updates
- store/slices/chaptersSlice.ts: Chapter management improvements (91 lines changed)
- store/slices/exportSlice.ts: Export functionality fixes
- store/slices/imageSlice.ts: Image handling enhancements
- utils/debug.ts: Improved debugging utilities
- docs/WORKLOG.md: Updated with recent development entries
- ISSUES.md: Updated issue tracking

IMPACT:
- Better user experience with improved UI components
- More reliable navigation and state management
- Automated code review via GitHub Actions
- Improved debugging and troubleshooting capabilities
- Better documentation of development progress
- Enhanced import/export functionality

TESTING:
- Manual testing of UI components
- Navigation flow verification
- Import/export functionality tested
- GitHub Actions workflow validated

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

vercel · 2025-10-27T07:05:45Z

Project	Deployment	Preview	Comments	Updated (UTC)
lexicon-forge	Canceled			Oct 27, 2025 2:39pm

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

chatgpt-codex-connector · 2025-10-27T07:10:39Z

+const sanitiseTranslation = (translation: string) => {
+  const trimmed = translation.trim();
+  if (trimmed.length === 0) return trimmed;
+  return sanitizeTranslationHTML(trimmed, { allowHr: true });


Fix sanitizeTranslationHTML call signature

The new Gemini provider calls sanitizeTranslationHTML(trimmed, { allowHr: true }), but sanitizeHtml from services/translate/HtmlSanitizer only accepts a single string argument. TypeScript will report error TS2554 ("Expected 1 arguments, but got 2"), causing the build to fail. Either extend the sanitizer to accept an options object or drop the second parameter so the module compiles.
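
The first option (extending the sanitizer) could look roughly like the sketch below. It is not the project's HtmlSanitizer: the `SanitizeOptions` type, the `BASE_TAGS` allow-list, and the regex-based stripping are all assumptions made for illustration, and a regex filter like this is not a substitute for a real HTML sanitizer. The point is only that an optional second parameter with a default keeps existing one-argument callers compiling.

```typescript
// Hypothetical options shape; only the allowHr flag appears in the review comment.
interface SanitizeOptions {
  allowHr?: boolean;
}

// Assumed allow-list for illustration; the real sanitizer's list is not shown in the PR.
const BASE_TAGS: readonly string[] = ["i", "b", "em", "strong", "br"];

// Optional second parameter: sanitizeHtml(text) and sanitizeHtml(text, { allowHr: true })
// both type-check, so TS2554 no longer fires at the Gemini call site.
function sanitizeHtml(input: string, options: SanitizeOptions = {}): string {
  const allowed = new Set<string>(BASE_TAGS);
  if (options.allowHr) allowed.add("hr");
  // Drop any tag whose name is not allow-listed; text content is kept.
  // Illustrative only: attributes on allowed tags are NOT inspected here.
  return input.replace(/<\/?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g, (tag: string, name: string) =>
    allowed.has(name.toLowerCase()) ? tag : ""
  );
}
```

Dropping the second argument at the call site is the smaller change; extending the signature only makes sense if the Gemini output really needs horizontal rules preserved.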


MOTIVATION:
- Memory warnings lacked actionable information about what was consuming memory
- Users needed visibility into chapters, translations, and image data sizes
- High memory usage warnings were vague without specific recommendations
- Debugging memory issues required manual inspection of store state

APPROACH:
- Extended telemetryService with getMemoryBreakdown() method
- Added getMemoryRecommendations() to provide context-specific advice
- Exposed store globally as __APP_STORE__ for telemetry access
- Calculate chapter counts, translation sizes, and base64 image data
- Provide detailed breakdown when memory threshold (90%) is exceeded

CHANGES:
- services/telemetryService.ts:89-114: Added detailed memory breakdown collection
  - Track chapter counts, translations, and images separately
  - Calculate KB sizes for translation data
  - Count base64 image data sizes
  - Generate context-aware recommendations
- services/telemetryService.ts:107-218: Implemented getMemoryBreakdown() method
  - Access store state to analyze data structures
  - Count chapters with translations and images
  - Estimate memory footprint of each component
- services/telemetryService.ts:220-242: Added getMemoryRecommendations() method
  - Generate actionable advice based on what's consuming memory
  - Prioritize recommendations by impact
- store/index.ts:331: Exposed __APP_STORE__ for telemetry access
  - Allows telemetryService to read store state
  - Maintains existing useAppStore for components

IMPACT:
- Memory warnings now include detailed breakdown of usage
- Users can identify if chapters, translations, or images are the problem
- Provides actionable recommendations (clear old chapters, reduce images, etc.)
- Better debugging capability for memory-related issues
- Addresses ISSUES.md item anantham#1: "breakdown of telemetryService.ts:159"

TESTING:
- Trigger memory warning by loading many chapters with translations
- Verify breakdown shows in console with specific counts and sizes
- Check that recommendations are relevant to the breakdown
- Confirm store access doesn't impact performance

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

MOTIVATION:
- kimi-k2.5 ranked #1 with 0.92 score but showed NO alignment edges in UI
- Quality scorer checked !isGhost but not actual linkedSegmentId presence
- Leaderboard picked "best run per model" which could be a single easy phase
- This masked models that failed 14/15 phases but aced the first one

APPROACH:
- Add alignmentCoverage metric: ratio of tokens with actual linkedSegmentId
- Weight alignmentCoverage at 50% of coverage score (most visible to users)
- Aggregate ALL unique phases across ALL runs per model for leaderboard
- Use best score per phase when same phase appears in multiple runs

CHANGES:
- scripts/sutta-studio/quality-scorer.ts:
  - Add alignmentCoverage to scoreWeaver()
  - Update coverageScore formula: 33% pali + 17% mapping + 50% alignment
  - Add alignmentCoverage to QualityScore type
- scripts/sutta-studio/generate-leaderboard.ts:
  - Rewrite to aggregate all phases per model, not pick best single run
  - Track unique phases by phase ID to avoid double-counting
  - Show phases/15 in output for transparency
- docs/benchmarks/sutta-studio.md:
  - Document alignmentCoverage metric with 50% weight
  - Update leaderboard methodology to explain all-phases aggregation

IMPACT:
- minimax-m2 now #1 (15/15 phases, 0.88 overall)
- kimi-k2.5 drops to #4 (only 4/15 phases completed)
- Leaderboard now reflects true model capability across full test set

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
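
A rough sketch of the two changes described in this commit: the weighted coverage score and the per-phase aggregation. The weights (33% / 17% / 50%) and the "best score per phase" rule come from the commit message; the type names, field names, and function names below are hypothetical, and how the per-phase scores are combined into the final leaderboard number is not stated in the commit, so the sketch stops short of that step.

```typescript
// Hypothetical input shape; each field is assumed to be a ratio in [0, 1].
interface CoverageInputs {
  paliCoverage: number;
  mappingCoverage: number;
  alignmentCoverage: number; // share of tokens with an actual linkedSegmentId
}

// Weights taken from the commit message: 33% pali + 17% mapping + 50% alignment.
function coverageScore(c: CoverageInputs): number {
  return 0.33 * c.paliCoverage + 0.17 * c.mappingCoverage + 0.5 * c.alignmentCoverage;
}

// Hypothetical flattened record: one scored phase from one run of one model.
interface PhaseResult {
  model: string;
  phaseId: string;
  score: number;
}

// For every model, keep the best score seen for each unique phase across ALL runs.
function bestPerPhase(results: PhaseResult[]): Map<string, Map<string, number>> {
  const byModel = new Map<string, Map<string, number>>();
  for (const r of results) {
    const phases = byModel.get(r.model) ?? new Map<string, number>();
    const previous = phases.get(r.phaseId) ?? -Infinity;
    phases.set(r.phaseId, Math.max(previous, r.score));
    byModel.set(r.model, phases);
  }
  return byModel;
}
```

With this shape, the "phases/15" figure in the output is simply the size of a model's inner map, which makes a model that only completed a few phases visible at a glance.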

… ADR audit

Adds an issues/ working-space for investigating items from Issues.md without touching production code. Each issue gets a numbered subfolder with a 9-section template (claim → repro → verdict → A/B/C classification → evidence → tests → archaeology → generator → fix sketches).

Three layers:

1. Per-issue READMEs (16 stubs seeded from Issues.md verbatim claims). Issue #1 (boot time) is filled in fully as a worked example: live repro confirms the 31s symptom (21.7s on this hardware), traces it to 6 distinct defects including a silent v1-composite → v1-st-enhanced version remap that causes a doomed 20s import.
2. _themes/: cross-cutting failure classes (jit-vs-precompute, completion-only-guards, silent-feedback-gaps, silent-failure-deep, co-mingled-commits) with instances, leverage points, and proposed ADRs. Five themes covering 13 of 16 issues plus one process-level generator.
3. issues/README.md: index with a 3-axis classification matrix (A: spec state, B: code-vs-spec, C: vision alignment) and post-audit findings that upgrade #1, #6, #9, #12 from "spec gap" to "ADR-vs-code drift": CORE-006 already commits to "render shell immediately, lazy non-critical"; FEAT-001 already commits to "ensure *a* translation is available, prevent waiting".

Tooling:
- scripts/issue-archaeology.py: given a file substring, scans Claude Code transcripts at ~/.claude/projects/.../*.jsonl and reports per-session: model, first user prompt, tool-use counts, matched edits, sibling files. Optional --git cross-references with git log to attribute commits to sessions.

No production code touched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…E-008 draft Continues the investigation framework with two more issues investigated end-to-end and the first proposed-ADR draft. #11 (comparison panel follows chapter) — verdict: already fixed. Commit 0c5162b (2026-04-10) added a useEffect in useComparisonPortal.ts that dismisses on currentChapterId change. Authored manually by Aditya with no agent transcript captured (a useful data point about archaeology coverage: it works for agent-driven changes, blind to human-direct edits). Test gap remains — no regression test for the dismissal behavior. #2 (fan toggle restarts translation) — verdict: matrix prediction partially refuted. The codebase has dual-layer in-flight guards (mediator's shouldAutoTranslate check + handleTranslate's pendingTranslations.has entry guard), so the completion-only-guards theme does NOT instance here. The jit-vs-precompute theme partially instances at the settings-fingerprint axis (any settings change forks the version tree, costing API calls). Paused on user repro — without specific reproduction steps, can't tell if a real bug exists or if user perceived a settings-side-effect. The theme docs are updated to record the falsification. CORE-008 draft (Derived Views Are Recomputed, Not Stored) lands at issues/_themes/proposed-adrs/. Working draft only — not in docs/adr/ until Aditya ratifies. Names the umbrella principle that CORE-006, FEAT-001, and FEAT-003 already apply at narrower scopes. Identifies the v1-composite-as-raw-vs-derived question as the load-bearing decision needed before drafting can complete. 
Theme updates: - completion-only-guards: removes #2 from instances; N=4 (was 5); documents the falsification as evidence the matrix is doing real work - jit-vs-precompute: links to the new CORE-008 draft as leverage point - _themes/README.md: adds a "Proposed ADRs" table tracking draft status ADR audit findings (from earlier in this session) integrated into: - issues/README.md status table — #1, #6, #9, #12 upgraded from spec-gap to ADR-vs-code drift (CORE-006 commits to "render shell immediately", FEAT-001 commits to "ensure *a* translation is available") - issue #1's section 4 promoted from (A2,B2,C2) to (A1*,B2,C2) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three independent improvements to the investigation pipeline. 1. Promote co-mingled-commits theme N=1 → N=3 confirmed. scripts/co-mingled-commits-survey.py runs over the last N commits and classifies each by stated intent (cleanup_only / chore_other / docs_only / test_only / build_chore / fix_bug / feat / refactor) vs diff content (added control-flow signals, hotspot files, scope-aware out-of-scope modifications). v1 had false-positive issues with the "comment" cleanup keyword and chore-prefix routing; v2 fixes both. Output for the 2026-05-03 run preserved at issues/_themes/co-mingled-commits-survey-2026-05-03.md. Confirmed instances: ff3106c, 486a2e4, e1de26a (was just ff3106c). 2. Fix timezone bug in scripts/issue-archaeology.py. best_commit_for_session was string-comparing ISO timestamps with mixed timezone offsets (-04:00, +05:30, Z), which gives wrong day-boundary ordering. Now parses to UTC datetime before comparison and picks most-recent-inside (a session typically commits at the END of its work). GPT review caught this; verified with synthetic-data unit test in scripts. 3. Issue #1 README: cite both repro runs (21,696 ms console-only and 22,926 ms persisted) instead of just the first. Persisted re-run is the canonical artifact. The co-mingled survey re-discovered ff3106c independently, validating the heuristic. Plus surfaced 486a2e4 (the "Hydrate existing translation if missing" race-fix bundled into an as-any cleanup) and e1de26a (chore title naming MaintenanceOps but also touching bootstrap and useChapterTelemetry unnamed). All three share the same shape: cleanup title, smuggled control flow. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
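
The timezone bug in item 2 is easy to reproduce in any language. The sketch below is TypeScript rather than the Python of scripts/issue-archaeology.py, and every name in it (`Commit`, `toEpochMs`, `bestCommitForSession`) is hypothetical; it only illustrates the idea of converting to a single UTC timeline before comparing, then picking the most recent commit inside the session window.

```typescript
// Hypothetical commit record; authoredAt is an ISO 8601 timestamp with any offset.
interface Commit {
  sha: string;
  authoredAt: string;
}

// Convert an ISO 8601 timestamp to milliseconds since the Unix epoch (UTC).
function toEpochMs(iso: string): number {
  const ms = Date.parse(iso);
  if (Number.isNaN(ms)) throw new Error(`Unparseable timestamp: ${iso}`);
  return ms;
}

// Pick the most recent commit whose timestamp falls inside the session window.
function bestCommitForSession(
  commits: Commit[],
  sessionStart: string,
  sessionEnd: string
): Commit | undefined {
  const start = toEpochMs(sessionStart);
  const end = toEpochMs(sessionEnd);
  let best: Commit | undefined;
  let bestMs = -Infinity;
  for (const c of commits) {
    const t = toEpochMs(c.authoredAt);
    if (t >= start && t <= end && t > bestMs) {
      best = c;
      bestMs = t;
    }
  }
  return best;
}
```

For example, "2026-05-03T23:30:00-04:00" sorts before "2026-05-04T05:00:00+05:30" as a string, yet it is the later instant (03:30 UTC on May 4 versus 23:30 UTC on May 3), which is exactly the day-boundary misordering the commit describes.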

… gate Two pipeline upgrades from real conversation about how this should work: 1. needs-human-clarification verdict and escalate_to_human action. ADRs aren't sacred. When the spec is genuinely ambiguous, two ADRs disagree, an ADR's spirit is unclear, or the fix-direction depends on user intent that isn't documented anywhere, the right move is to ask Aditya, not pick a side and proceed. The template now lists when to escalate (ADR conflicts another doc, example-vs-prose mismatch, genuine interpretation gap) and what to deliver (named disagreement, plausible interpretations sketched, what concrete user input would unblock). 2. Regression-test obligations as a HARD CLOSING GATE. Section 6 of the template now requires naming specific tests for every defect — file path, what to assert, why it would fail pre-fix. Section 9a is the explicit closing checklist: fix in, test in, verified to fail-pre-fix, theme roster updated, ADR enforcement linked from test if applicable. No "fixed" status without it. The Action decision tree (§9 of template, summary table in index README) makes the six options visible: fix_local / fix_generator / enforce_existing_ADR / draft_new_ADR / escalate_to_human / wait. Three ordering rules: prefer enforce_existing_ADR over draft_new_ADR when an existing ADR plausibly covers; ADRs aren't sacred (escalate when genuinely ambiguous); fixed = test in. Issue #1 now has the new sections demonstrated: - §5b: action assignment (enforce_existing_ADR for defects 1-4,7; needs-human-clarification for 5,6 because v1-composite raw-vs-derived is the load-bearing question). - §6: regression-test obligations table — 7 specific test obligations, one per defect, with file paths and assertions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…unblocked After conversation with Aditya about v1-composite semantics, all the load-bearing questions are answered. Sharpening the framework. CORE-008 v2 — two-level versioning: Level 1: chapter-translation (raw, immutable, indexed by provider+model+prompt+settings+timestamp). Level 2: book-version (recipe over chapter-translations). Either curated (explicit list) or rule-based (function from chapterId to chapter-translation-id). v1-composite is rule-based; its rule is "resolve to whichever chapter-translation has isActive=true for that chapter." "Best" is defined as the user's isActive flag — not temporal latest. No global quality ranking; the system tracks user choices. Auto-active on translate is OK because the dropdown override exists. Comments anchor to chapter-translation (immutable, durable). Materialize-on-demand is required (EPUB export, freeze-this-reading). Issue #1 §5b — defects 5/6 unblocked from needs-human-clarification: Defect 5 (silent v1-composite remap): registry shouldn't try to resolve a rule-based name as a stored scope. v1-composite is a rule, not a key. Fix is local to RegistryService. Defect 6 (deep-fail scope validation): session JSON chapters should be tagged with their actual settings-fingerprint scope, never with a rule-based book-version name like v1-composite. Issue #16 — full triage and priority bump. The verbatim claim was always correct ("comments tied to that version"); my earlier framings tried to decouple. Reverted. The bug is a UI re-render lifecycle: switch chapter-translation away → comments hidden; switch back → should reappear, currently doesn't. Same shape as the 0c5162b useComparisonPortal fix. HIGH PRIORITY because it's the load-bearing override mechanism that makes auto-active-on-translate acceptable. Without working switch-back-and-see-comments, generating a new translation effectively traps users on the latest. 
Three regression tests named: comments reappear on switch-back, floating-icons reappear, comments don't bleed across chapter-translations. Template — two calibration rules added at §1 (Claim): - Re-read the verbatim claim after every architectural decision (caught me drifting on #16 twice). - Code-first rule: read the codebase instead of asking the user "does the system already do X?" The user is not a search engine. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…no push Captures session state for the next instance. Four phases per the expansion:handover skill: Phase 1 — Commit checkpoint: all session work in 16 commits on main, none pushed (per multi-agent CLAUDE.md rule). Pre-existing dirty files (Issues.md, WORKLOG.md, etc.) explicitly NOT committed — they predate this session. Phase 2 — Thread inventory: active threads (manual validation, #1, #16, skill-update Patch 6), blocked threads (#2 awaiting user repro, CORE-008 awaiting ratification, skill push awaiting authorization), deferred (useAsyncAction design, 8 un-investigated issues). Phase 3 — Session learnings: project-level (investigation framework location, bridge service path), cross-project (the investigation pipeline pattern, useAsyncAction 3-shape taxonomy), skill update candidates (Patch 6, Patch 7), ADR candidates (CORE-008/009/010/011). Phase 4 — Handover document at docs/HANDOVER.md with resume instructions and 6 calibration moments. Lives at docs/HANDOVER.md so next instance can read on session start. Git-tracked so it survives compaction and persists across sessions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Companion to docs/HANDOVER.md. Where HANDOVER.md is a state snapshot ("here's what happened"), this is a prompt ("here's what to do next"). Aditya can copy-paste it into a fresh Claude Code session. Tone is peer-to-peer ("welcome, friend"), not directive checklist. Opens with a "get the vibe first" section that points at the project's Vision.md + IndrasNet companion + 3 key ADRs — encouraging the next agent to internalize the JIT philosophy before reaching for tools. Recommended first task: build scripts/issue-status.py (~30 min). The case for it is operational (eliminates markdown-drift risk for issue status) and pedagogical (parsing target = every per-issue README, so the agent reads the entire framework while building the script). Followed by ranked options: A. Fix issue #1 (boot time) — highest impact, test obligations pre-specified B. Fix issue #16 — render-layer fix, sketched C. Investigate #12 — same enforce-existing-ADR pattern as #1 D. Apply skill-update Patch 6 — clarify twin-issues nuance Plus explicit "don't do" guardrails: don't push the skill to the marketplace yet, don't extract useAsyncAction prematurely, don't mass-investigate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

User reported (2026-05-06 screenshot): two "Continue Reading" cards for the same novel (Forty Millenniums of Cultivation), one showing Chapter 338 and the other Chapter 2; chapter count rendered as 6528 when the registry declares 3521.

Both root causes are duplicate IDB records that the V2 dedup migration (repairScopedStableIdDuplicates) was supposed to clean, but its one-time flag is set on existing databases, so any duplicates created after V2 ran sit there forever.

Three layers of fix, defense in depth:

1. NovelLibrary.tsx: render-side dedup of Continue Reading entries. Group by novelId, keep the most recent lastReadAtIso. Catches any duplicates that slip past the migration or appear after it runs.
2. fetchNovelChapterCounts (services/db/operations/summaries.ts): dedupe by (novelId, chapterNumber) before counting. OR translation status across duplicate rows. Falls back to stableId for summaries without chapterNumber. Pre-fix: FMC's 3521 chapters → 6528 displayed count due to 1.85x duplication. Post-fix: counts unique chapters.
3. MaintenanceOps.consolidateBookshelfDuplicates (V3): new boot-time migration gated on bookshelfDedupedV3 flag. Reads bookshelf-state, groups by novelId, keeps the most-recent entry per novel, re-keys under buildLibraryScopeKey so legacy unscoped winners get pulled forward to scoped form using a sibling's versionId when available. Idempotent on clean state; runs once per database.

The render-side fixes (#1, #2) take effect immediately for ANY user. The migration (#3) cleans the underlying IDB on next boot. Together they cover both "fix the symptom now" and "prevent re-display."

Files:
- components/NovelLibrary.tsx (+~20 lines): dedupedBookshelfEntries derived map keyed by novelId, fed into continueReadingEntries.
- services/db/operations/summaries.ts: fetchNovelChapterCounts rewritten with seenByNovel Map<novelId, Map<dedupKey, ...>>.
- services/db/operations/maintenance.ts:
  - SETTINGS.BOOKSHELF_DEDUPED_V3 flag added
  - MaintenanceOps.consolidateBookshelfDuplicates static method
  - ~150 lines including JSDoc explaining why this is V3 and not inside the existing repairScopedStableIdDuplicates path
- store/bootstrap/initializeStore.ts: wired into the bootRepairs list after syncSummaries (last in chain so all prior backfills land first).
- tests/current-system/bookshelf-dedup.test.ts: 9 regression tests covering both new pieces (5 for the migration, 4 for the count dedup).

Verified:
- npx vitest run → 1227 pass, 16 skip (1218 baseline + 9 new tests, no regressions)
- npx tsc --noEmit clean for changed files (3 pre-existing errors on main unchanged)

What this does NOT do:
- The duplicate chapter records themselves (the source of the inflated count) are NOT removed from IDB. That's a heavier migration that needs careful canonical-row selection (which version's stableId wins?) and reference repair (translations, feedback, amendment logs all key by stableId). Defer until a separate investigation. This fix counts correctly even with the duplicates present.
- Same for the duplicate "Chapter 339" entries with different capitalization in the dropdown: separate stableId-generation cleanliness bug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
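
The two render-side fixes can be sketched as below. This is an illustration under stated assumptions, not the code in NovelLibrary.tsx or summaries.ts: the record shapes (`BookshelfEntry`, `ChapterSummary`, `ChapterCount`) and the function names are hypothetical, and only the rules come from the commit message (keep the most recent lastReadAtIso per novelId; dedupe chapters by chapterNumber with a stableId fallback; OR the translation status across duplicates).

```typescript
// Hypothetical record shapes; only novelId, lastReadAtIso, chapterNumber and
// stableId are named in the commit message.
interface BookshelfEntry {
  novelId: string;
  chapterNumber: number;
  lastReadAtIso: string;
}

interface ChapterSummary {
  novelId: string;
  stableId: string;
  chapterNumber?: number;
  hasTranslation: boolean;
}

interface ChapterCount {
  total: number;
  translated: number;
}

// Fix 1: keep only the most recently read entry per novel.
function dedupeBookshelf(entries: BookshelfEntry[]): BookshelfEntry[] {
  const latest = new Map<string, BookshelfEntry>();
  for (const e of entries) {
    const current = latest.get(e.novelId);
    if (!current || Date.parse(e.lastReadAtIso) > Date.parse(current.lastReadAtIso)) {
      latest.set(e.novelId, e);
    }
  }
  return Array.from(latest.values());
}

// Fix 2: count unique chapters per novel, OR-ing translation status across duplicates.
function countUniqueChapters(summaries: ChapterSummary[]): Map<string, ChapterCount> {
  const seenByNovel = new Map<string, Map<string, boolean>>();
  for (const s of summaries) {
    // Prefix the key so a chapter number can never collide with a stableId.
    const key = s.chapterNumber !== undefined ? `n:${s.chapterNumber}` : `s:${s.stableId}`;
    const seen = seenByNovel.get(s.novelId) ?? new Map<string, boolean>();
    seen.set(key, (seen.get(key) ?? false) || s.hasTranslation);
    seenByNovel.set(s.novelId, seen);
  }
  const counts = new Map<string, ChapterCount>();
  seenByNovel.forEach((seen, novelId) => {
    let translated = 0;
    seen.forEach((has) => {
      if (has) translated += 1;
    });
    counts.set(novelId, { total: seen.size, translated });
  });
  return counts;
}
```

Both helpers are pure functions over already-loaded records, which is why this layer can take effect immediately without waiting for the boot-time migration.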

…d diff (awaiting gate) Compressed two-gate format: brief + snapshot + evidence + proposed diff in one log. Awaits gate verdict before packet edit. §0 Phase brief — Kurūsu viharati ("[was] dwelling among the Kurus") 3 tensions: locative-plural-as-preposition (primary; no ghost needed unlike phase-b's 'At'), historical-present (Pāli pres-tense rendered as English past), place-vs-people-vs-region (3 readings of the locative) §2 Evidence bundle - viharati: clean DPD coverage (cite:dpd:dpd:69661 + 69662) — both pr=present, finite. EXCELLENT structured morphology data. - kurūsu: **DPD STEM-STRIPPER BUG HITS AGAIN.** Resolved to 'kura [nt]: rice' (cite:dpd:dpd:22496) — totally wrong; kurūsu is locative plural of 'kuru' (the Kuru people), not the unrelated noun 'rice'. Second hit of the same conflation bug from phase-a (evaṁ → eva). DO NOT cite the kura entry. - Variants: zero for mn10:1.2 (line stable). §3 Proposed packet diff - c1 kurūsu: morph (number=pl on kurū stem; case=loc, number=pl on su suffix); relation 'Dwelling IN' gets confidence + basis; senses get nuance + basis but NO sourceCitationIds (provider bug). Honest grounding: 'etymological' (Pāli grammar inference). - c2 viharati: **isAnchor: true** (the action verb of the geographical-frame clause); morph on ti suffix (person=3, number=sg, tenseAspect=present, form=finite); 3 senses get cite:dpd:dpd:69661/69662 with confidence: high. - 2 new packet.citations (viharati's DPD entries). - No ghost upgrades (phase-c has zero English ghosts — locative case absorbs into sense gloss). §5 Schema tensions - Tension #1 (DPD stem-stripper conflation) hits 2nd time: phase-a evaṁ→eva, phase-c kurūsu→kura. 2 of 3 phases. Suggested: if phase-d hits a 3rd time, fix is overwhelming. Patch is small (~10 LOC in scripts/build-dpd.ts). - Tension #7 (EpistemicBasis 'grammatical' gap) hits 3rd time: phase-a (evaṁ + relation), phase-b (relation), phase-c (c1 senses + relation). 3 of 3 phases. Very strong signal. 
- No new tensions from phase-c. §6 Plain-register flag (not in diff): c2s3's only tooltip '[Thematic vowel] Class I verb marker' strips to empty when grammar-terms is off. Fails §3.4 check. Defer to plain-first tooltip rewrite chunk. §7 Open questions: - File the DPD stripper fix before phase-d, or batch later? - At what tension-hit count do we cut a small fix commit? Suggested heuristic: 5 hits OR after batch 2 complete. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ration

Phase-a (evaṁ→eva) and phase-c (kurūsu→kura) both surfaced the same class of provider bug. Aditya: "more data before deciding." Two of three phases hit a flavour of this, so investigating now.

The root cause was THREE distinct bugs, all in scripts/build-dpd.ts:

Bug 1: Niggahīta diacritic mismatch (root cause for evaṁ).
DPD uses ṃ (U+1E43, m-with-dot-BELOW; IAST). SuttaCentral bilara uses ṁ (U+1E41, m-with-dot-ABOVE; ISO 15919). Same Pāli sound, different Unicode code points. Direct lookup of bilara's 'evaṁ' against DPD's 'evaṃ' headword failed; the stem-stripper then fell through to the unrelated particle `eva` ('only/just/indeed'), which has entirely different semantics from the adverbial evaṁ ('thus; in this way').
Fix: normalize DPD's ṃ → bilara's ṁ during parsing AND during surface-form extraction. Single source of truth.

Bug 2: Over-greedy 3-char endings 'ūsu' / 'ūhi'.
These were listed as single morphological endings, but the locative-plural ending is actually -su (the long ū belongs to a sandhi-lengthened stem: kuru → kurū before -su). The over-greedy strip collapsed kurūsu → 'kur', then tried 'kur'+a = 'kura' (the noun 'rice', totally unrelated to the Kuru people).
Fix: removed 'ūsu' and 'ūhi' from PALI_ENDINGS.

Bug 3: Missing bare 'su' / 'hi' endings + missing vowel-shortening.
Once 'ūsu' was removed, kurūsu still didn't match because:
- 'su' wasn't in the endings list at all (only 'esu' for a-stems)
- even when stripped, kurū → kuru required vowel-shortening (the locative-plural rule lengthens stem-final vowels)
Fix: added 'su' / 'hi' to endings. Added vowel-shortening logic: after stripping any case ending, if the stem ends in long ā/ī/ū, also try the short variant (kurū → kuru, bhikkhū → bhikkhu).

Verified by end-to-end re-ingestion:
- evaṁ → ['evaṁ', 'eva']
  PRIMARY: 'thus; this; like this; similarly; in the same manner; just as; such' (DPD's evaṃ, now normalized). This is exactly the sense phase-a's curation correctly identified as the right reading.
  SECONDARY: bare 'eva' (still derivationally related; surfaced for transparency rather than hidden).
- kurūsu → ['kurū', 'kuru']
  PRIMARY: 'name of the people of Kuru; Kurus' (DPD's kurū entry, long-vowel stem)
  SECONDARY: 'name of a country' (DPD's kuru entry)
  Both are real Kuru entries; the unrelated 'kura' (rice) is GONE.
- Coverage: 81.6% → 86.5% on MN10 (462/534 surface forms; 26 newly-matched surfaces).
- 76 surfaces still unmatched (was 98); remaining gaps are compounds the stem-stripper can't decompose.

Implications for curation:
- phase-a (8e7b197) intentionally did NOT cite DPD for evaṁ because the conflation made the citation misleading. Those citations are now available and HONEST. Backfill is optional enrichment work; not done in this commit.
- phase-c gate-pending diff (b46aa64) similarly assumed no DPD for kurūsu. The kurū / kuru entries are now citable. Backfill is part of phase-c gate-2 amendments.

Tension #1 (DPD stem-stripper conflation) is fixed. Cumulative hit count was 2/3 phases; future phases should be cleaner.

Verified: 78 tests pass, Vite build green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
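The three fixes described above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/build-dpd.ts: the names PALI_ENDINGS and tryStemStrips come from the commit messages, but their bodies here, the helper normalizeNiggahita, and the shortened endings list are assumptions made for the example.

```typescript
// Hypothetical sketch of the stripper fixes; the real list in
// scripts/build-dpd.ts is longer and may be organised differently.
const PALI_ENDINGS: readonly string[] = ["esu", "su", "hi"]; // no 'ūsu' / 'ūhi'

// Long stem-final vowels and their short counterparts (ā, ī, ū).
const SHORTEN: Record<string, string> = {
  "\u0101": "a",
  "\u012B": "i",
  "\u016B": "u",
};

// Bug 1: map DPD's ṃ (U+1E43, dot below) to bilara's ṁ (U+1E41, dot above).
function normalizeNiggahita(text: string): string {
  return text.normalize("NFC").replace(/\u1E43/g, "\u1E41");
}

// Bugs 2 and 3: strip only real endings, then also try the short-vowel stem.
function tryStemStrips(surface: string): string[] {
  const word = normalizeNiggahita(surface);
  const candidates: string[] = [word];
  for (const ending of PALI_ENDINGS) {
    if (word.length <= ending.length || !word.endsWith(ending)) continue;
    const stem = word.slice(0, -ending.length);
    candidates.push(stem);
    const short = SHORTEN[stem.slice(-1)];
    if (short !== undefined) {
      candidates.push(stem.slice(0, -1) + short); // kurū → kuru
    }
  }
  return Array.from(new Set(candidates)); // de-duplicate, keep order
}
```

In this sketch the caller would look each candidate up against the normalized DPD headword index and keep only the hits; for kurūsu the candidates include kurū and kuru but never kur or kura.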

Third MN10 phase re-curated. 5 localized packet changes + 4 new citations.

Aditya's gate-1 surfaced the DPD stripper bug; gate-2 became "fix the bug, then apply with the now-honest citations." Mid-curation provider fix shipped as c33b115 (three stripper bugs: niggahīta normalization, over-greedy -ūsu/-ūhi endings, missing bare -su/-hi + vowel-shortening). Coverage 81.6% → 86.5%.

Packet changes (phase-c):

1. c1 (kurūsu): morph (number=pl on kurū stem; case=loc, number=pl on su suffix). Relation c1s2 → c2 "Dwelling IN" gets confidence high + epistemicBasis etymological (placeholder for 'grammatical' once the enum extends; Tension #7). **DPD citations now real**: kurūsu correctly resolves to two distinct DPD entries (no longer the kura/rice misfire):
   - "among the Kurus" → cite:dpd:dpd:22524 (kurū masc, "name of the people of Kuru; Kurus")
   - "in Kuru territory" → cite:dpd:dpd:22502 (kuru masc, "name of a country")
   - "with the Kuru people" → cite:dpd:dpd:22524 (secondary use)
   All 3 senses: epistemicBasis 'lexical', confidence high/medium.

2. c2 (viharati): **isAnchor: true** (the action verb of the geographical-frame clause). Morph on ti suffix (person=3, number=sg, tenseAspect=present, form=finite), exactly what DPD's pos=pr declares. 3 senses get cite:dpd:dpd:69661/69662 with confidence high:
   - "was dwelling" → :69661 (primary)
   - "was staying" → :69661 + :69662 (both senses contain this)
   - "was abiding" → :69662 (primary)

3. packet.citations: 4 new entries (kurū, kuru, viharati x2). Total now 13 (4 phase-a + 5 phase-b + 4 phase-c).

Curation log: §8 records the mid-flight provider fix and how it changed the citation landscape (kurūsu went from "no DPD citation" to two real ones in the same draft). §9 Outcome filled. Tension #1 (DPD stripper) marked RESOLVED.

Verified:
- JSON parses; spot-checks confirm every field landed
- c2.isAnchor = True (viharati anchor); c1 morph cases correct
- 21 component + type tests pass
- Vite build green

Tier-1 done + first 3 MN10 phases curated + first provider quality fix shipped.

Stack: feat/opus-grounded-data-layer on PR #38.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… link honest

Phase-a (8e7b197) deliberately left a1.senses[0] (evaṁ "Thus") without a DPD citation because the stem-stripper at the time conflated evaṁ with the unrelated bare particle `eva`. The conflation was Tension #1 on the schema-tensions list, fixed in commit c33b115 (niggahīta normalization + endings-list fixes + vowel-shortening).

DPD's evaṃ headword (now normalized to evaṁ in our index) carries exactly the sense phase-a's curation correctly identified: "thus; this; like this; similarly; in the same manner; just as; such" (cite:dpd:dpd:18134, ind).

Backfill applied:

a1.senses[0]:
- epistemicBasis: 'etymological' → 'lexical'
- sourceCitationIds: + ['cite:dpd:dpd:18134']
- confidence: 'high' (new)
- notes: updated to reference DPD's own treatment of evaṁ/eva as distinct headwords (the curation's "do not confuse evaṁ with bare eva" framing is VALIDATED by DPD, not contradicted by it)

packet.citations:
- + 1 new entry (cite:dpd:dpd:18134)
- Total packet.citations: 13 → 14

phase-a curation log §13 records the backfill.

Verified:
- JSON parses; a1 has lexical basis + citation + confidence
- 21 tests still pass
- Vite build will be triggered with next renderer change

The "Do not confuse evaṁ with bare eva" tooltip on a1.s1 remains correct; DPD's separate headword treatment makes it MORE accurate post-fix than at original curation time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…curatorial' to EpistemicBasis

Tension #7 surfaced in phase-a/b/c curation: claims grounded in syntactic/morphological rules were being labeled 'etymological' as the closest enum fit. But etymology is word-history (sandhi, cognate); these claims are GRAMMATICAL (agent-in-genitive-of-passive-participle, accusative-of-time-when, locative-as-location). Hit 3/3 phases, a strong signal per the user's "more data first" rule.

types/suttaStudio.ts
- EpistemicBasis enum: added 'grammatical' + 'curatorial'. Now 7 values: etymological / grammatical / commentarial / contextual / lexical / comparative / curatorial.
- Doc block explains the resolution history and what each new value covers ('grammatical' for syntactic rules; 'curatorial' for explicit inference grammatically grounded but not from a single attestation).

components/sutta-studio/demoPacket.json
- 3 placeholder usages migrated from 'etymological' → 'grammatical':
  * phase-a a2.s1.relation 'Heard BY' (agent-genitive-of-passive-participle pattern)
  * phase-b b2.s3.relation 'Time WHEN' (accusative-of-time-when)
  * phase-c c1.s2.relation 'Dwelling IN' (locative-as-location)
- All 3 are syntactic rules, not etymology. Migration is honest; new enum value makes the basis accurate.
- Zero remaining 'etymological' values in packet (all migrated; no legitimate word-history claims in phase-a/b/c yet).

types/suttaStudio.test.ts
- New test: EpistemicBasis enum round-trips all 7 values.

Tension #7 closed at 3/3 phase hits.

Cumulative schema-tensions resolved by today's work:
- #1 DPD stem-stripper conflation (commit c33b115)
- #7 EpistemicBasis 'grammatical'/'curatorial' (this commit)

Verified: 14 types tests pass; types compile clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
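For readers without the repository open, the extended enum and its round-trip check might look roughly like this. It is a sketch only: the seven values are taken from the commit message, but whether types/suttaStudio.ts declares them as a string-literal union or a TypeScript enum is not stated, and the helper names isEpistemicBasis and roundTrip are invented for illustration.

```typescript
// Assumed representation: a string-literal union backed by a const array.
const EPISTEMIC_BASES = [
  "etymological",
  "grammatical",
  "commentarial",
  "contextual",
  "lexical",
  "comparative",
  "curatorial",
] as const;

type EpistemicBasis = (typeof EPISTEMIC_BASES)[number];

// Runtime guard so values read back from packet JSON can be validated.
function isEpistemicBasis(value: unknown): value is EpistemicBasis {
  return (
    typeof value === "string" &&
    (EPISTEMIC_BASES as readonly string[]).indexOf(value) !== -1
  );
}

// Serialize a basis into a packet-like object and read it back.
function roundTrip(basis: EpistemicBasis): EpistemicBasis {
  const parsed: unknown = JSON.parse(
    JSON.stringify({ epistemicBasis: basis })
  ).epistemicBasis;
  if (!isEpistemicBasis(parsed)) {
    throw new Error("unknown epistemicBasis: " + String(parsed));
  }
  return parsed;
}
```

Keeping the values in one const array means the type and the runtime guard cannot drift apart when another basis is added later.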

…ings

Phase-d surfaced the same stem-stripper conflation pattern that c33b115 fixed for -ūsu/-ūhi: kurūnaṁ (gen pl of Kuru) over-stripped to 'kur' and conflated with 'kura' (rice). Hit #3 in 3/3 batch-2 phases, which crosses the threshold phase-c §5 set for "overwhelming case to fix the stripper."

Parallel fix mirroring c33b115:
- Removed 'ūnaṁ' and 'unaṁ' from PALI_ENDINGS (u-stem gen pl is vowel-lengthening + bare -naṁ, not a single 4-char ending)
- Added bare 'naṁ' paired with the existing vowel-shortening rule
- Kept 'ānaṁ' (a-stem gen pl IS a real single ending in standard analyses: dhammānaṁ = dhamm + ānaṁ, not dhammā + naṁ)

Coverage: 86.5% → 86.9% on MN10 (+2 surface forms now resolved).

kurūnaṁ now resolves to ['kurū', 'kuru'], identical to phase-c's kurūsu: the correct DPD entries (dpd:22524 "name of the people of Kuru" + dpd:22502 "name of a country").

Tests: added 10 regression cases (PALI_ENDINGS membership assertions for ūnaṁ/unaṁ/naṁ/ānaṁ, tryStemStrips coverage for kurūnaṁ + bhikkhūnaṁ, a-stem gen pl preservation regression net). 37 pass under build-dpd.test; 20 pass under dpd.test against the rebuilt dataset.

Resolves schema tension #1 (DPD stripper conflation). Both -su/-hi and -naṁ patterns now closed for u-stems.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…urūnaṁ nigamo

Closes batch 2 of CURATION_PROTOCOL §6 (phase-b, phase-c, phase-d).

Changes (one phase, one commit, per protocol):
- d1 Kammāsadhammaṁ: isAnchor=true; d1s3 morph {number:sg, gender:n} (per gate-2 amendment: don't overclaim case; neuter sg has identical nom/acc forms, and the ambiguity is noted in the tooltip). 3 senses with separated epistemic basis: lexical (DPD dpd:20396) / curatorial (Jātaka-derived "Spotted-One Tamed"; note softened per amendment to make the traditional derivation honest, not lexical-asserted) / etymological (compound parse).
- d2 nāma: sense lexical + dpd:36427 (the naming-particle DPD entry, selected from 7 nāma homonyms).
- d3 kurūnaṁ: d3s1 morph {number:pl}; d3s2 morph {case:gen, number:pl}; relation extended with confidence + epistemicBasis=grammatical ("Town OF" is case-derived). 3 senses lexical + REUSES phase-c citations (dpd:22524 kurū + dpd:22502 kuru): same stem, new case.
- d4 nigamo: d4s3 morph {case:nom, number:sg, gender:m}. 3 senses: market-town/township lexical (dpd:36863 + dpd:74785); "trading center" curatorial + low confidence per amendment (DPD doesn't attest it).
- packet.citations: 14 → 18 (added dpd:20396, dpd:36427, dpd:36863, dpd:74785). 2 phase-c entries reused.

Methodological win (per Aditya's framing): phase-d forces the system to separate four kinds of claim: lexical attestation, grammatical inference, traditional/commentarial etymology, curatorial pedagogical expansion. First phase to exercise the new 'curatorial' EpistemicBasis value (added in 4323310) for real, on the Jātaka derivation + trading-center expansion.

Schema tensions:
- #1 (DPD stripper conflation): RESOLVED across both -su/-hi (c33b115) and -naṁ (be2b141, this session). All u-stem oblique plurals now correctly handled.
- #7 (EpistemicBasis enum): first real load on 'curatorial'; no laundering of curator inference as etymology.
- No new tensions surfaced from phase-d.

Curation log at docs/sutta-studio/curation/phase-d.md (gates, amendments, plain-register deferrals, open questions).

Tests: 321/322 pass (1 flake in SessionInfo unrelated to phase-d; passes in isolation).

Batch 2 complete → re-evaluate protocol before starting batch 3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…-d) + renderer arc 29 commits. Tier-1 provider abstraction (DPD + SC bilara + suttaplex), Grounded Curation Loop protocol, four MN10 phases curated (evaṁ-me-sutaṁ, ekaṁ-samayaṁ-bhagavā, kurūsu-viharati, Kammāsadhammaṁ-nāma-kurūnaṁ-nigamo), renderer arc making grounded data visible (anchors, pin model, citation chips, About-this-text panel), and two schema tensions resolved: #1 DPD stripper conflations (c33b115 + be2b141 across all u-stem oblique plurals, with regression tests b1b7fdb) and #7 EpistemicBasis enum extension (4323310, first real load on 'curatorial' in phase-d). Full handover in docs/HANDOVER.md.

…kkhū āmantesi

First curation phase running on the corrected DPD pipeline (aaa1ff9). All 5 surfaces resolve cleanly via SQLite Lookup: bhikkhū → bhikkhu (monk), not bhikkhā (alms). The pipeline fix lands in real curation flow as predicted.

Changes (CURATION_PROTOCOL §1 step 10):
- e1 tatra: color-explanation facet (function-word pointing role) + plain-first rewrites. Lexical basis via dpd:29605.
- e2 kho: color-explanation facet (narrative-beat particle) + 4-facet discourse-particle teaching. e2.senses[1] "untranslated" marked as curatorial with note explaining the translator-decision rationale.
- e3 bhagavā: REFRAIN-EXPLANATION FACET shipped (first in corpus; bhagavā now 2/4 phases, which qualifies per §3.3). morph hint nom/m/sg. 3 senses with lexical basis + dpd:49147 REUSED from phase-b. "Lucky One" sense honestly tagged as etymological with notes (compositional reading, not standard DPD gloss).
- e4 bhikkhū: morph hint acc/pl/m. Plain-first rewrites; dropped √ symbols. 3 senses lexical + dpd:49885 (the masc-noun resolution that only became possible after aaa1ff9; pre-fix this was bhikkhā/alms).
- e5 āmantesi: isAnchor=true (action verb pivots frame→scene). Aorist morph on e5s4 from DPD's pos='aor' tag. 3 senses lexical + dpd:12086. Plain-first across all 4 segment-tooltip arrays.
- DROPPED both phase-e relations (e3s2 "Addressed BY", e4s2 "Addressed TO"); see schema tension #12 below.
- 5 new packet.citations (1 reused). Total 18 → 23.

Schema tension #12 NEWLY surfaced: the RelationType palette (ownership/direction/location/action) was designed for case-quirks where English doesn't have an analog (genitive-as-agent phase-a; accusative-of-time phase-b; locative-of-membership phase-c). Phase-e is an active-voice S-V-O sentence; neither subject-of-verb nor direct-object-of-verb fits the existing 4-color palette. The existing packet's "Addressed BY"/"Addressed TO" labels were semantically off (passive reading; wrong case).

Resolution: dropped both relations rather than mis-label. Schema decision deferred per §3.3 (hit count 1/5 batches). Revisit if phase-f/g/h surfaces the same gap.

Schema tension #1 (DPD stripper): STAYS RESOLVED post-aaa1ff9. First phase to show the fix in real curation flow.

Refrain bhagavā confirmed (2/4 phases; qualifies). Follow-up: backport refrain-explanation facet to phase-b b3s2 for consistency (small commit).

Curation log: docs/sutta-studio/curation/phase-e.md (§0-§8). Includes applied diff, tension #12 analysis, plain-register cross-phase follow-up, and four open questions for future batches.

Tests: 220 component pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Captures the meta-lesson from phase-e: when the same provider bug surfaces across multiple phases with a similar fix-shape (more endings, more exceptions to an enumeration), that's the Quick-Fixer / Goodharting anti-pattern from CLAUDE.md. Stop and ask whether the source data has a non-heuristic answer we're not using.

The new §9.1 codifies:
- Trigger: same fix-shape ≥2 phases (or explicit human flag)
- Four diagnostic questions to map symptom → architectural choice
- Empirical example: DPD stripper #1 hit count grew to 4 (evaṁ, kurūsu, kurūnaṁ, bhikkhū); Aditya called the anti-pattern; root-cause fix landed in aaa1ff9 (SQLite Lookup table), closing the whole class.
- Application rule: hit count 1-2 → note in phase log; hit count ≥3 OR human flag → STOP curation, escalate to architectural investigation.

Companion to the existing failures table in §9. The role-lock from §4 (schema/script changes in their own commits, not mid-phase) still applies; the gate just pauses the curation cadence so the architectural fix lands before the next phase runs.

Future agents encountering accumulating per-symptom patches will now have explicit guidance to escalate rather than continuing the patch-and-move cycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
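The §9.1 application rule is small enough to write down as a function. The sketch below only illustrates the thresholds quoted in the commit message; the function name fixShapeGate, the GateDecision labels, and the "no-action" case for zero hits are assumptions, and the protocol itself lives in prose, not code.

```typescript
// Hypothetical encoding of the §9.1 escalation gate.
type GateDecision = "no-action" | "note-in-phase-log" | "stop-and-escalate";

function fixShapeGate(hitCount: number, humanFlag: boolean = false): GateDecision {
  if (!Number.isInteger(hitCount) || hitCount < 0) {
    throw new RangeError("hitCount must be a non-negative integer");
  }
  // A human flag or three or more hits of the same fix-shape pauses curation.
  if (humanFlag || hitCount >= 3) return "stop-and-escalate";
  // One or two hits are recorded in the phase log and curation continues.
  if (hitCount >= 1) return "note-in-phase-log";
  return "no-action";
}
```

Under this reading, the DPD stripper history (4 hits across evaṁ, kurūsu, kurūnaṁ, bhikkhū) would have returned "stop-and-escalate" one phase before the root-cause fix actually landed.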

Small phase (2 words, 3 segments) that opens the Buddha's direct speech. Three firsts in the corpus:
- First VOCATIVE case in the morph index (f1s2: case=voc, num=pl, gen=m)
- First iti / ti direct-speech marker (f2, the closing-quote particle)
- Third bhikkhu refrain hit (phase-e bhikkhū-acc + phase-f Bhikkhavo-voc + root in both), which definitively confirms recurrence per §3.3

Changes:
- f1 Bhikkhavo: isAnchor=true (Buddha's first word; semantic centerpiece). Morph on f1s2. Plain-first rewrites; dropped "[Vocative Plural]" bracket. Refrain-explanation facet on f1s2 (1/3); refrain claim now empirically grounded after 3 hits. Cross-reference facet (3/3) teaches case-contrast (acc in phase-e → voc here, same noun, different role).
- f2 ti: color/role facet (function-word quote-marker framing) + iti etymology + cross-language pattern (Pāli lexical vs English typographic quote marks). Plain-first; dropped "[Quotation marker]" bracket.
- 4 senses with lexical basis. bhikkhu (dpd:49885) REUSED from phase-e; ti (dpd:30431, the end-of-speech-marker entry, NOT the cardinal 'three' or the verb-ending entry) NEW.
- 1 new packet.citation. Total 23 → 24.

Schema notes:
- Tension #1 (DPD stripper) stays resolved. Bhikkhavo's "no DPD entry" is a Lookup coverage gap (bhikkhavo dialectal variant of bhikkhave), not a stripper conflation. Grounded curatorially in the lemma.
- No new schema tensions surfaced.

Refrain status updated:
- bhikkhu: 3/5 phases (definitively recurring)
- bhagavā: 2/5 phases (recurring per §3.3)
- viharati: 1/5 phases (single appearance)

Curation log: docs/sutta-studio/curation/phase-f.md (§0-§8).

Batch 3 progress: 2/4 phases complete (e ✓, f ✓ / g / h).

Tests: 220 component pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Captures the strategic-economic analysis that emerged from MN10 batches 1-4 hand-curation. The conversation surfaced ~14 distinct insights about pipeline vs hand-curation economics, cost telemetry, scaling, and the strategic pivot, none of which lived in the codebase until this commit.

docs/sutta-studio/COMPILER_STRATEGY.md (289 lines, new):
- §1 The economic shape: quality bands (35% v1 / 65% v2 / 85% +post-passes / 100% hand), Pareto distribution (10-15 phases pedagogically critical, 35-40 are routine recurrences), per-compile cost estimates ($0.10-0.30 Gemini Flash / $1-3 Sonnet / $3-10 Opus per MN10).
- §2 What the pipeline does today vs what it could do: 11-row matrix classifying each hand-curation move as learnable by prompt / deterministic post-processable / irreducibly human.
- §3 What's irreducibly human: translator-tradition citations, pedagogical taste, curation-log narrative. ~5-8 phases per sutta fall in this bucket.
- §4 Cost telemetry: surprise discovery that services/apiMetricsService.ts already records every API call with tokens+cost+apiType=sutta_studio to IndexedDB. Missing: phaseId attribution, UI, prompt caching, local-vs-LLM split beyond DPD. 3-step plan to close gaps. ccusage for Claude-Code-side conversation cost.
- §5 Scaling roadmap, 5 stages: hand-curate MN10 exemplar (in progress) → wire v2 overlay → build 4 deterministic post-passes → run on DN22 with selective polish (~5-6 hr vs ~30 hr from scratch) → satipaṭṭhāna sub-corpus → cross-pattern (~20-30 patterns covering most of the canon).
- §6 Open questions: translator-tradition DB, DPD Lookup-gap pattern resolution, prompt-caching tradeoff, when to wire v2, pedagogical-fidelity floor for routine phases.

docs/HANDOVER.md (180 lines, replaces prior 2026-05-12 handover):
- Full 17-commit inventory across 3 branches (PR #47 from prior session, PR #48 batch-3 from today, batch-4 branch from today)
- DPD root-cause fix details (coverage 86.9% → 89.5%, 458 sqlite-lookup vs 20 heuristic-fallback vs 56 unmatched, better-sqlite3 dep, one-time 168 MB download)
- Schema tensions status (#1 RESOLVED at root, #7 RESOLVED prior, #12 RESOLVED via documentation, Lookup-gap as new observation)
- 3 protocol amendments codified (§9.1, §3.4.1, FEATURES §1.3)
- 5 phase logs added (e/f/g/h/1)
- Refrain status (bhikkhu 5/9, bhagavā 4/9, viharati 1/9)
- 10 pending threads in priority order with effort estimates
- The pending strategic pivot decision flagged for next session
- Worktree convention + bash sandbox quirk + 3-branch base structure documented as non-obvious context
- Resume instructions branch on pivot decision

Both docs written by parallel subagents with full context briefings; reviewed and committed by main session. Companion to the v2 prompt overlay (2d198f6) and the protocol amendments (c6b150f + 9830ef1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
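To make the §4 telemetry gap concrete, per-phase attribution could be aggregated along these lines. Everything about the record shape is an assumption: the commit message only says that tokens, cost, and apiType=sutta_studio are recorded and that phaseId is missing, so ApiCallRecord, costUsd, and summarizeCostByPhase are hypothetical names rather than the apiMetricsService API.

```typescript
// Hypothetical record shape; phaseId is the attribution the commit says is missing.
interface ApiCallRecord {
  apiType: string;
  tokens: number;
  costUsd: number;
  phaseId?: string;
}

interface PhaseCost {
  calls: number;
  tokens: number;
  costUsd: number;
}

// Sum tokens and cost per phase for Sutta Studio calls only.
function summarizeCostByPhase(
  records: readonly ApiCallRecord[]
): Map<string, PhaseCost> {
  const totals = new Map<string, PhaseCost>();
  for (const record of records) {
    if (record.apiType !== "sutta_studio") continue;
    const key = record.phaseId ?? "unattributed";
    const current = totals.get(key) ?? { calls: 0, tokens: 0, costUsd: 0 };
    totals.set(key, {
      calls: current.calls + 1,
      tokens: current.tokens + record.tokens,
      costUsd: current.costUsd + record.costUsd,
    });
  }
  return totals;
}
```

Calls recorded before attribution exists fall into an "unattributed" bucket, so historical data stays visible instead of being dropped.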

…ked example #1)

Third dative-of-purpose in the satipaṭṭhāna formula chain (dukkhadomanassānaṁ atthaṅgamāya, "for the disappearance of pain and dejection"). First worked example of the Path B workflow.

Kept from v11:
- Segmentation (dukkha + domanass + ānaṁ; attha + ṅ + gam + āya)
- isAnchor on p10 atthaṅgamāya (the verb-noun action)
- Relation arrow p9 → p10 (genitive of purpose)
- morph fields (gen-pl on ānaṁ, dat-sg on āya; added gender m)
- Plain-first tooltip register
- Etymological tooltips (axle-hole metaphor for dukkha, sun-setting for attha)

Added in polish:
- epistemicBasis + confidence + notes on all 6 senses
- Cross-phase note on -ānaṁ: connects to phase-2's sattānaṁ + phase-3's sokaparidevānaṁ (genitive-of-purpose spine)
- Cross-phase note on -āya: explicit "third of five datives" framing
- Gender m on -āya morph
- Caveat on the ṅ segment (traditional grammar treats it as a sandhi-trace, not a separate morpheme; v11's 4-segment parse is a pedagogical choice)
- Distinction note on dukkha vs soka/parideva/domanassa vocabulary cluster

Time spent: ~18 min. Expected to converge to ~10-12 min as rhythm settles.

Concerns surfaced in docs/sutta-studio/curation/phase-4.md:
1. Translator-tradition citations are claimed from training memory, not from a verified database. The F task (tradition DB) would ground them.
2. ~5 of the 8 minutes writing hand-curated JSON went to metadata fields that the A3 metadata-filler post-pass would produce deterministically. Continuing Path B vs pausing to build A3 first is a sequencing trade-off the curator should decide; both paths hit ~10 hr total, but A3 leaves reusable infrastructure.

anantham and others added 6 commits October 26, 2025 10:32

refactor: modularize ai service

5b73232

feat(db): add flaggable chapter ops implementation

baa90d9

feat(db): add flaggable translation ops implementation

e482ab9

fix(import): ingest v2 session translations during streaming

0c21e60

chatgpt-codex-connector Bot reviewed Oct 27, 2025


anantham added 4 commits October 27, 2025 21:38

fix(gemini): align sanitize html call signature

7f60354

fix: support horizontal rule option in sanitizer

d66abc2

test: add comparison workflow coverage

956fed3

docs: record typescript debt inventory

5f51e76

vercel Bot deployed to Preview October 27, 2025 14:39

anantham merged commit 8680293 into main Oct 27, 2025
3 of 4 checks passed

anantham deleted the feature/import-improvements-and-flaggable-ops branch October 27, 2025 15:58

anantham mentioned this pull request May 13, 2026

DPD root-cause fix + batch 3 (phases e-h) + protocol amendments #48

Merged

5 tasks

anantham merged 10 commits into main from feature/import-improvements-and-flaggable-ops


