refactor(db): remove legacy indexedDB facade by anantham · Pull Request #2 · anantham/LexiconForge

anantham · 2025-11-18T04:27:37Z

Summary

Decompose ChapterView/SettingsModal into modular components and hooks with dedicated tests.
Normalize image version deletion, auto-hydrate image state on navigation, and backfill translation metadata snapshots.
Add Gemini 3 Pro image preview pricing and expose strict XHTML sanitizer; polish SessionInfo layout and log work.

Root Cause (if bug fix)

Missing translation metadata snapshots and absent image hydration after navigation caused delete controls to vanish and retranslation prompts to misfire; legacy image version shapes also broke deletion across refreshes.

Changes

components/chapter/, hooks/use for reader extraction; tests/components/chapter/; tests/hooks/
services/navigationService.ts, services/db/operations/maintenance.ts, store/bootstrap/initializeStore.ts, store/slices/chaptersSlice.ts
services/db/operations/imageVersions.ts, store/slices/imageSlice.ts, tests/services/db/ImageOps.test.ts
components/settings/*, hooks/useAdvancedPanelStore.ts, useAudioPanelStore.ts, useExportPanelStore.ts, useProvidersPanelStore.ts, useNovelMetadata.ts
config/constants.ts, config/costs.ts, services/translate/HtmlSanitizer.ts, components/SessionInfo.tsx, docs/WORKLOG.md

Testing

npx tsc --noEmit
npm run test -- run tests/components/chapter
npm run test -- run tests/hooks/useTranslationTokens.test.tsx tests/hooks/useFootnoteNavigation.test.tsx tests/hooks/useChapterTelemetry.test.tsx
npm run test -- run tests/components/diff/ChapterView.mapMarker.test.tsx
npm run test -- run tests/services/db/ImageOps.test.ts
npm run test -- run tests/store/bootstrap/bootstrapHelpers.test.ts
npm run test -- run components/settings

Review Checklist

No direct commits to main
Follows commit format
Ready for Codex review
No unrelated changes mixed in

MOTIVATION:
- Manual browser testing was time-consuming and error-prone
- Previous work fixed a deadlock issue in ensureChapterSummaries()
- Need automated testing to prevent regression of initialization issues
- Schema drift errors and re-entrant openDatabase() calls were causing hangs

APPROACH:
- Installed Playwright for browser automation testing
- Created comprehensive E2E test suite covering:
  * Fresh install initialization (clearing IndexedDB)
  * Database schema verification (all 10 stores present)
  * Deadlock detection (re-entrant openDatabase calls)
  * Prompt template initialization
  * Existing database upgrades
- Added debug logging throughout indexeddb.ts initialization sequence
- Fixed database name mismatch (LexiconForge → lexicon-forge)
- Updated page.reload() to use waitUntil: 'domcontentloaded'

CHANGES:
- playwright.config.ts: New Playwright configuration for E2E tests
- tsconfig.playwright.json: TypeScript config for Playwright
- tests/e2e/initialization.spec.ts: 5 comprehensive initialization tests
- tests/e2e/debug-console.spec.ts: Debug test for console log capture
- tests/e2e/novel-library-flow.test.tsx: Moved to tests/ (was Vitest, not Playwright)
- package.json: Added Playwright scripts (test:e2e, test:e2e:ui, test:e2e:debug)
- package-lock.json: Added @playwright/test@^1.56.1
- services/indexeddb.ts: Added [DEBUG:*] console logging to trace initialization
- config/constants.ts: Fixed JSON import with 'with { type: json }' attribute

IMPACT:
- E2E tests can now automatically verify database initialization
- Debug logging reveals exact initialization sequence and timing
- Tests confirmed app initializes successfully in ~5 seconds
- Tests currently failing due to console message capture timing issues
- Foundation laid for automated regression testing

TESTING:
- Manual verification: App loads successfully with debug logs visible
- Debug test (debug-console.spec.ts): PASSES - shows full init sequence
- 5 initialization tests: Currently FAILING (console capture issue)
- Next: Fix console message capture timing to make tests reliable

KNOWN ISSUES:
- Tests set up console listeners after page navigation, missing early logs
- Console message array remains empty or incomplete
- Need to investigate React StrictMode double-rendering impact
- Tests timeout waiting for messages that may have already been logged

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
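The first known issue (listeners attached after navigation miss early logs) comes down to ordering. The sketch below is a minimal illustration using a stand-in `FakePage` class, not the Playwright API; the class, its methods, and the log strings are all hypothetical. In a real Playwright test the equivalent move would be calling `page.on('console', handler)` before `page.goto(url)`.

```typescript
// Stand-in for a browser page. NOT the Playwright API: just enough to show ordering.
type ConsoleListener = (text: string) => void;

class FakePage {
  private listeners: ConsoleListener[] = [];

  // Register a listener; only messages emitted after this call are delivered.
  onConsole(listener: ConsoleListener): void {
    this.listeners.push(listener);
  }

  // Simulates navigation: the app logs its init sequence while loading.
  // The log strings are made up for illustration.
  goto(): void {
    const initLogs = ["[DEBUG:init] openDatabase start", "[DEBUG:init] openDatabase done"];
    for (const text of initLogs) {
      this.listeners.forEach((listener) => listener(text));
    }
  }
}

// Listener attached AFTER navigation: the init logs were already emitted.
function captureLate(): string[] {
  const page = new FakePage();
  const messages: string[] = [];
  page.goto();
  page.onConsole((text) => messages.push(text));
  return messages;
}

// Listener attached BEFORE navigation: every init log is captured.
function captureEarly(): string[] {
  const page = new FakePage();
  const messages: string[] = [];
  page.onConsole((text) => messages.push(text));
  page.goto();
  return messages;
}
```

With the late listener the message array stays empty, which matches the "empty or incomplete" symptom listed above; the early listener sees both messages.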

MOTIVATION:
- New Playwright E2E testing infrastructure needs documentation
- Developers need guidance on running and debugging tests
- Document known issues and workarounds for future reference

APPROACH:
- Created docs/E2E-TESTING.md with complete testing guide
- Documented setup, configuration, and usage
- Listed all test suites and their purposes
- Explained debug logging system
- Documented known issues and future improvements

CHANGES:
- docs/E2E-TESTING.md: New comprehensive E2E testing documentation

IMPACT:
- Developers can quickly understand and use E2E tests
- Clear documentation of current status (working vs broken)
- Reference for debugging test failures
- Foundation for improving test infrastructure

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

vercel · 2025-11-18T04:27:42Z


Project	Deployment	Preview	Comments	Updated (UTC)
lexicon-forge	Ready	Preview	Comment	Nov 22, 2025 8:33am

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.


chatgpt-codex-connector · 2025-11-18T04:33:27Z

+  static async storeEnhanced(enhanced: any): Promise<void> {
    const chapter: Chapter = {
      title: enhanced.title,
      content: enhanced.content,
      originalUrl: enhanced.canonicalUrl || enhanced.originalUrl,
+      canonicalUrl: enhanced.canonicalUrl,
      nextUrl: enhanced.nextUrl,
      prevUrl: enhanced.prevUrl,
      chapterNumber: enhanced.chapterNumber,


Preserve raw URLs when storing enhanced chapters

The modern backend’s storeEnhanced path assigns originalUrl as enhanced.canonicalUrl || enhanced.originalUrl. When both properties are present (the common case when normalizing a fetched chapter), this stores the canonical URL as the primary key and storeChapterModern only generates URL mappings for that value. The original/raw URL is never persisted, so subsequent lookups by the raw URL (e.g. findChapterByUrl, navigation to the original link after a reload, or imports that reference the raw URL) will fail because no mapping exists. The memory implementation still prefers enhanced.originalUrl, so the inconsistency is easy to miss. Prefer the raw originalUrl when available and fall back to the canonical URL to keep both URLs mapped and maintain backward compatibility.
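A minimal sketch of the suggested fallback order, under stated assumptions: the `EnhancedUrls` interface and both helper functions are hypothetical names introduced for illustration, and the repository's actual `storeEnhanced` and mapping code may be shaped differently.

```typescript
// Hypothetical shape: only the two URL fields discussed in the review comment.
interface EnhancedUrls {
  originalUrl?: string;
  canonicalUrl?: string;
}

// Prefer the raw URL for originalUrl and fall back to the canonical URL,
// so the raw URL is not overwritten when both are present.
function resolveStoredUrls(enhanced: EnhancedUrls): EnhancedUrls {
  return {
    originalUrl: enhanced.originalUrl || enhanced.canonicalUrl,
    canonicalUrl: enhanced.canonicalUrl,
  };
}

// Collect every distinct, non-empty URL that should receive a lookup mapping.
function urlsToMap(enhanced: EnhancedUrls): string[] {
  const urls = [enhanced.originalUrl, enhanced.canonicalUrl].filter(
    (url): url is string => typeof url === "string" && url.length > 0
  );
  return Array.from(new Set(urls));
}
```

When both URLs are present and differ, `urlsToMap` yields two entries, so a later lookup by either the raw or the canonical URL could resolve to the same chapter.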



…E-008 draft

Continues the investigation framework with two more issues investigated end-to-end and the first proposed-ADR draft.

#11 (comparison panel follows chapter), verdict: already fixed. Commit 0c5162b (2026-04-10) added a useEffect in useComparisonPortal.ts that dismisses on currentChapterId change. Authored manually by Aditya with no agent transcript captured (a useful data point about archaeology coverage: it works for agent-driven changes, blind to human-direct edits). Test gap remains: no regression test for the dismissal behavior.

#2 (fan toggle restarts translation), verdict: matrix prediction partially refuted. The codebase has dual-layer in-flight guards (mediator's shouldAutoTranslate check + handleTranslate's pendingTranslations.has entry guard), so the completion-only-guards theme does NOT instance here. The jit-vs-precompute theme partially instances at the settings-fingerprint axis (any settings change forks the version tree, costing API calls). Paused on user repro: without specific reproduction steps, can't tell if a real bug exists or if user perceived a settings-side-effect. The theme docs are updated to record the falsification.

CORE-008 draft (Derived Views Are Recomputed, Not Stored) lands at issues/_themes/proposed-adrs/. Working draft only; not in docs/adr/ until Aditya ratifies. Names the umbrella principle that CORE-006, FEAT-001, and FEAT-003 already apply at narrower scopes. Identifies the v1-composite-as-raw-vs-derived question as the load-bearing decision needed before drafting can complete.

Theme updates:
- completion-only-guards: removes #2 from instances; N=4 (was 5); documents the falsification as evidence the matrix is doing real work
- jit-vs-precompute: links to the new CORE-008 draft as leverage point
- _themes/README.md: adds a "Proposed ADRs" table tracking draft status

ADR audit findings (from earlier in this session) integrated into:
- issues/README.md status table: #1, #6, #9, #12 upgraded from spec-gap to ADR-vs-code drift (CORE-006 commits to "render shell immediately", FEAT-001 commits to "ensure *a* translation is available")
- issue #1's section 4 promoted from (A2,B2,C2) to (A1*,B2,C2)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…no push

Captures session state for the next instance. Four phases per the expansion:handover skill:

Phase 1 (commit checkpoint): all session work in 16 commits on main, none pushed (per multi-agent CLAUDE.md rule). Pre-existing dirty files (Issues.md, WORKLOG.md, etc.) explicitly NOT committed; they predate this session.

Phase 2 (thread inventory): active threads (manual validation, #1, #16, skill-update Patch 6), blocked threads (#2 awaiting user repro, CORE-008 awaiting ratification, skill push awaiting authorization), deferred (useAsyncAction design, 8 un-investigated issues).

Phase 3 (session learnings): project-level (investigation framework location, bridge service path), cross-project (the investigation pipeline pattern, useAsyncAction 3-shape taxonomy), skill update candidates (Patch 6, Patch 7), ADR candidates (CORE-008/009/010/011).

Phase 4 (handover document): at docs/HANDOVER.md with resume instructions and 6 calibration moments. Lives at docs/HANDOVER.md so next instance can read on session start. Git-tracked so it survives compaction and persists across sessions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

User reported (2026-05-06 screenshot): two "Continue Reading" cards for the same novel (Forty Millenniums of Cultivation), one showing Chapter 338 and the other Chapter 2; chapter count rendered as 6528 when the registry declares 3521.

Both root causes are duplicate IDB records that the V2 dedup migration (repairScopedStableIdDuplicates) was supposed to clean, but its one-time flag is set on existing databases, so any duplicates created after V2 ran sit there forever.

Three layers of fix, defense in depth:

1. NovelLibrary.tsx: render-side dedup of Continue Reading entries. Group by novelId, keep the most recent lastReadAtIso. Catches any duplicates that slip past the migration or appear after it runs.
2. fetchNovelChapterCounts (services/db/operations/summaries.ts): dedupe by (novelId, chapterNumber) before counting. OR translation status across duplicate rows. Falls back to stableId for summaries without chapterNumber. Pre-fix: FMC's 3521 chapters → 6528 displayed count due to 1.85x duplication. Post-fix: counts unique chapters.
3. MaintenanceOps.consolidateBookshelfDuplicates (V3): new boot-time migration gated on bookshelfDedupedV3 flag. Reads bookshelf-state, groups by novelId, keeps the most-recent entry per novel, re-keys under buildLibraryScopeKey so legacy unscoped winners get pulled forward to scoped form using a sibling's versionId when available. Idempotent on clean state; runs once per database.

The render-side fixes (#1, #2) take effect immediately for ANY user. The migration (#3) cleans the underlying IDB on next boot. Together they cover both "fix the symptom now" and "prevent re-display."

Files:
- components/NovelLibrary.tsx (+~20 lines): dedupedBookshelfEntries derived map keyed by novelId, fed into continueReadingEntries.
- services/db/operations/summaries.ts: fetchNovelChapterCounts rewritten with seenByNovel Map<novelId, Map<dedupKey, ...>>.
- services/db/operations/maintenance.ts:
  - SETTINGS.BOOKSHELF_DEDUPED_V3 flag added
  - MaintenanceOps.consolidateBookshelfDuplicates static method
  - ~150 lines including JSDoc explaining why this is V3 and not inside the existing repairScopedStableIdDuplicates path
- store/bootstrap/initializeStore.ts: wired into the bootRepairs list after syncSummaries (last in chain so all prior backfills land first).
- tests/current-system/bookshelf-dedup.test.ts: 9 regression tests covering both new pieces (5 for the migration, 4 for the count dedup).

Verified:
- npx vitest run → 1227 pass, 16 skip (1218 baseline + 9 new tests, no regressions)
- npx tsc --noEmit clean for changed files (3 pre-existing errors on main unchanged)

What this does NOT do:
- The duplicate chapter records themselves (the source of the inflated count) are NOT removed from IDB. That's a heavier migration that needs careful canonical-row selection (which version's stableId wins?) and reference repair (translations, feedback, amendment logs all key by stableId). Defer until a separate investigation. This fix counts correctly even with the duplicates present.
- Same for the duplicate "Chapter 339" entries with different capitalization in the dropdown: a separate stableId-generation cleanliness bug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
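The two render-side layers described in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the code in NovelLibrary.tsx or summaries.ts: the `BookshelfEntry`, `ChapterSummary`, and `ChapterCounts` shapes, the `hasTranslation` field, and both function names are hypothetical; only `novelId`, `chapterNumber`, `stableId`, `lastReadAtIso`, and `seenByNovel` come from the commit message.

```typescript
// Hypothetical record shapes, reduced to the fields the dedup logic needs.
interface BookshelfEntry {
  novelId: string;
  chapterNumber: number;
  lastReadAtIso: string;
}

interface ChapterSummary {
  novelId: string;
  stableId: string;
  chapterNumber?: number;
  hasTranslation: boolean;
}

interface ChapterCounts {
  total: number;
  translated: number;
}

// Layer 1: keep one Continue Reading entry per novel, the most recently read.
function dedupeBookshelfEntries(entries: BookshelfEntry[]): BookshelfEntry[] {
  const latestByNovel = new Map<string, BookshelfEntry>();
  for (const entry of entries) {
    const current = latestByNovel.get(entry.novelId);
    if (!current || Date.parse(entry.lastReadAtIso) > Date.parse(current.lastReadAtIso)) {
      latestByNovel.set(entry.novelId, entry);
    }
  }
  return Array.from(latestByNovel.values());
}

// Layer 2: count unique chapters per novel, keyed by chapterNumber when present
// and by stableId otherwise; translation status is OR-ed across duplicate rows.
function countUniqueChapters(summaries: ChapterSummary[]): Map<string, ChapterCounts> {
  const seenByNovel = new Map<string, Map<string, boolean>>();
  for (const summary of summaries) {
    const dedupKey =
      summary.chapterNumber !== undefined ? `num:${summary.chapterNumber}` : `id:${summary.stableId}`;
    let seen = seenByNovel.get(summary.novelId);
    if (!seen) {
      seen = new Map<string, boolean>();
      seenByNovel.set(summary.novelId, seen);
    }
    seen.set(dedupKey, (seen.get(dedupKey) || false) || summary.hasTranslation);
  }

  const counts = new Map<string, ChapterCounts>();
  seenByNovel.forEach((seen, novelId) => {
    let translated = 0;
    seen.forEach((isTranslated) => {
      if (isTranslated) translated += 1;
    });
    counts.set(novelId, { total: seen.size, translated });
  });
  return counts;
}
```

Both functions leave the underlying records untouched, which is consistent with the commit's note that duplicate chapter rows remain in IDB and are only collapsed at read time.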

…1 commit B.1)

Per ADR SUTTA-008 §Build order step 3, lands the DPD data layer for MN10. Provider impl follows in B.2; compiler wiring in B.3.

ms-dpd vs full dpd-db decision: ms-dpd is verb-blind. Its inflection table has zero verb conjugation rows, only declensions. For the assasati verb family central to MN10 this kills it. Using full dpd-db.

Storage strategy (resolved during this spike, ADR §Open Questions #2): Per-sutta subsets, not full corpus. Full DPD export is 80-120MB JSON; committing that for one sutta is disproportionate. The script extracts only headwords referenced by surface forms in the target sutta. Total committed for MN10: ~656KB. Each new sutta adds its own subset directory.

Surface→lemma resolution: Heuristic stem-stripping over dpd.txt (the 4MB human-readable release; no SQLite required). Initial pass: 34%. After parser fix for single-digit homonyms (DPD uses both "me 1" and "a 1.1" styles): 81.6% coverage on MN10 (436/534 surface forms). Remaining 18% are mostly compounds (sammāsambuddhassa, ajjhattikabāhiresu) and inflected verb forms that live in DPD's SQLite inflection table. Documented as unmatchedSurfaces in manifest.json; SQLite escalation is a future commit if curation needs higher coverage.

Files:
- scripts/build-dpd.ts: Node TS, no native deps. Downloads pinned DPD release dpd-txt.zip (4MB) on first run, caches in data/_raw/ (gitignored), parses to structured DpdRecord, fetches bilara MN10 Pāli root, extracts surface forms, resolves via stem-stripping + quotative marker handling + locative→stem restoration. Projects to LexiconEntry shape per the providers types added in commit A.
- data/dpd/mn10/headwords.json (618KB): lemma → LexiconEntry[]
- data/dpd/mn10/forms.json (20KB): surface → lemma candidates
- data/dpd/mn10/manifest.json: coverage stats + unmatched surfaces
- data/LICENSE-DATA.md: CC BY-NC-SA 4.0 with DPD attribution + placeholders for VRI / bilara / future providers
- .gitignore: data/_raw/ added (upstream zip + extracted txt)
- package.json: `npm run build:dpd` script entry

Pinned release: dpd-db v0.4.20260501 (May 2026). Re-ingest with `npm run build:dpd -- --force` after bumping DPD_RELEASE_TAG in the script. Monthly cadence upstream.

Verified: 42 pre-existing provider tests still pass. No app code changed in this commit; the DpdProvider that consumes this data lands in B.2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
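The homonym parser fix and the coverage figure can be illustrated with a small sketch. It assumes a headword is a lemma optionally followed by whitespace and a homonym number such as `1` or `1.1`; the real dpd.txt layout and the logic in scripts/build-dpd.ts may differ, and `ParsedHeadword`, `parseHeadword`, and `coveragePercent` are hypothetical names.

```typescript
// Hypothetical parsed form of a headword such as "me 1" or "a 1.1".
interface ParsedHeadword {
  lemma: string;
  homonym?: string;
}

// Split a trailing homonym number (single-digit "1" or dotted "1.1") from the lemma.
// Headwords without a trailing number are returned unchanged as the lemma.
function parseHeadword(raw: string): ParsedHeadword {
  const trimmed = raw.trim();
  const match = /^(.+?)\s+(\d+(?:\.\d+)?)$/.exec(trimmed);
  return match ? { lemma: match[1], homonym: match[2] } : { lemma: trimmed };
}

// Coverage as a percentage rounded to one decimal place.
function coveragePercent(matched: number, total: number): number {
  return total === 0 ? 0 : Math.round((matched / total) * 1000) / 10;
}
```

With the counts quoted in the commit, `coveragePercent(436, 534)` evaluates to 81.6, matching the reported MN10 coverage.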

…ū bhagavato paccassosuṁ

The monks' response to the Buddha's call: 6 words, 10 segments. Three pedagogical firsts surfaced:
- Same surface 'bhikkhū' in its THIRD grammatical role across e/f/g (acc-pl / voc-pl-via-Bhikkhavo / nom-pl). Same noun, three contexts disambiguate. g4s2's cross-reference facet teaches the pattern explicitly.
- bhagavā stem in GENITIVE (1st gen-bhagavā vs phase-b/e's nom-sg). g5s2's case-contrast facet teaches nom→gen by reference. Genitive functions as dative for verbs of speaking, a Pāli quirk worth the arrow.
- Aorist 3PL (paccassosuṁ -suṁ) vs phase-e's aorist 3sg (āmantesi -si). Same tense, different number. Verb-side number marking.

Changes:
- g1 Bhadante: morph voc/sg/m. Senses curatorial + etymological (Bhadante isn't in DPD Lookup, a coverage gap like Bhikkhavo phase-f; grounded in bhadanta lemma).
- g2 ti: cross-reference to phase-f's ti; lexical + dpd:30431 REUSED.
- g3 te: color-explanation facet + morph nom/pl/m + cross-reference to phase-e's tatra (same demonstrative system); lexical + dpd:31134.
- g4 bhikkhū: morph nom/pl/m. 3-CASE CROSS-REFERENCE FACET on g4s2. DROPPED "Replied BY" relation (tension #12: subject-of-verb doesn't fit the case-quirk-only palette). 3 senses lexical + dpd:49885 REUSED.
- g5 bhagavato: morph gen/sg/m. KEPT "Replied TO" relation with epistemicBasis='grammatical' (legitimate dative-recipient reading: Pāli gen-as-dative for verbs of speaking; DPD attests both readings). Case-contrast facet on g5s2. 3 senses: lexical (dpd:49136 dative + dpd:49137 genitive).
- g6 paccassosuṁ: isAnchor=true (verb of the response). morph aor/3pl/finite on g6s2. 3 senses lexical + dpd:39413. Plain-first explanation of paṭi+su compound logic ("reply IS hearing-back").

Schema tension #12 (S-V-O palette gap), HIT #2 across batch 3: Phase-e dropped two relations (both mis-shaped); phase-g drops one (subject-of-verb) and KEEPS one (dative-recipient).

ARROW-EARNING RULE clarified through curation: relations earn their arrow when the Pāli case-marker does work English doesn't have an analog for. Genitive-as-dative ✓ (Pāli quirk). Subject-of-verb ✗ (universal). Hit count 2/3 batch-3 phases qualifies per §3.3 to file as GH issue / FEATURES.md §1.3 clarification (separate commit).

Refrain status update:
- bhikkhu: 4/6 phases (definitively recurring)
- bhagavā: 3/6 phases (definitively recurring, multiple cases now)
- viharati: 1/6 (single appearance)

EpistemicBasis distribution this phase: 13 lexical / 1 etymological / 1 curatorial = 86% lexical (highest yet; confirms the DPD root-cause fix is delivering clean evidence in subsequent curation).

Curation log: docs/sutta-studio/curation/phase-g.md (§0-§8). Includes proposed §3.4.X cross-phase facet rule and the arrow-earning rule clarification.

Batch 3 progress: 3/4 phases complete (e ✓ / f ✓ / g ✓ / phase-h).

Tests: 220 component pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Aditya A P and others added 9 commits October 28, 2025 08:29

Align translation DTO plumbing and refresh provider adapters

f74a143

Tighten audio provider types and refresh Claude payload

5430040

refactor(db): align repo contracts and migration plumbing

bb5b2b3

feat(epub): preserve cache keys and richer chapter metadata

902be09

feat(epub): delegate worker jobs to export service

4bc5871

chore(diff): import missing types for analysis service

12212bc

refactor(db): remove legacy indexedDB facade

056bb75

vercel Bot deployed to Preview November 18, 2025 04:27 View deployment

Merge branch 'main' into feature/import-improvements-and-flaggable-ops

43311c0

vercel Bot deployed to Preview November 18, 2025 04:30 View deployment

chatgpt-codex-connector Bot reviewed Nov 18, 2025


fix(db): preserve raw URLs in storeEnhanced

6618a3c

vercel Bot deployed to Preview November 18, 2025 05:57 View deployment

docs/tests: capture 2025-10-27 tsc snapshot and sanitizer updates

b57f2b5

vercel Bot had a problem deploying to Preview November 18, 2025 06:38 Failure

anantham added 9 commits November 19, 2025 09:59

fix(db): preserve raw enhanced chapter urls

2ffcc32

refactor(reader): extract reader view stack

5c397a1

fix(bootstrap): backfill translation metadata and hydrate images

8468a9c

fix(images): keep version controls after deletion

4b62131

chore(epub): expose strict xhtml sanitizer

99f3e73

refactor(settings): modularize settings panels

377b301

chore(session): tighten SessionInfo layout

f8ac3c5

feat(config): add gemini-3-pro image pricing

8d6cc63

docs: log image hydration and settings split

532d879

vercel Bot deployed to Preview November 22, 2025 08:14 View deployment

Merge pull request #3 from anantham/codex/fix-bug-in-storing-raw-urls…

6a351ae

…-for-chapters fix(db): preserve raw enhanced chapter urls

anantham merged commit 5508c01 into main Nov 22, 2025
2 of 4 checks passed

anantham deleted the feature/import-improvements-and-flaggable-ops branch November 22, 2025 08:33

vercel Bot deployed to Preview November 22, 2025 08:33 View deployment

anantham mentioned this pull request May 11, 2026

feat(sutta-studio): provider abstraction + Citation extension (Tier-1 commit A of 5) #38

Merged

5 tasks

anantham merged 22 commits into main from feature/import-improvements-and-flaggable-ops


1 participant
