chore: repo hygiene ignores#4
Merged
Merged
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
anantham
added a commit
that referenced
this pull request
Jan 31, 2026
MOTIVATION: - kimi-k2.5 ranked #1 with 0.92 score but showed NO alignment edges in UI - Quality scorer checked !isGhost but not actual linkedSegmentId presence - Leaderboard picked "best run per model" which could be a single easy phase - This masked models that failed 14/15 phases but aced the first one APPROACH: - Add alignmentCoverage metric: ratio of tokens with actual linkedSegmentId - Weight alignmentCoverage at 50% of coverage score (most visible to users) - Aggregate ALL unique phases across ALL runs per model for leaderboard - Use best score per phase when same phase appears in multiple runs CHANGES: - scripts/sutta-studio/quality-scorer.ts: - Add alignmentCoverage to scoreWeaver() - Update coverageScore formula: 33% pali + 17% mapping + 50% alignment - Add alignmentCoverage to QualityScore type - scripts/sutta-studio/generate-leaderboard.ts: - Rewrite to aggregate all phases per model, not pick best single run - Track unique phases by phase ID to avoid double-counting - Show phases/15 in output for transparency - docs/benchmarks/sutta-studio.md: - Document alignmentCoverage metric with 50% weight - Update leaderboard methodology to explain all-phases aggregation IMPACT: - minimax-m2 now #1 (15/15 phases, 0.88 overall) - kimi-k2.5 drops to #4 (only 4/15 phases completed) - Leaderboard now reflects true model capability across full test set Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
anantham
added a commit
that referenced
this pull request
May 8, 2026
… confirmed Final load-test of the proposed-skill workflow on a fresh A3 case. Issue #4 (portal symbol no feedback): investigated end-to-end following the workflow in issues/_meta/proposed-skill/SKILL.md. Findings: - Verdict: real-bug, confidence 0.94 - (A3, B2, C2) — no spec covers UX feedback policy; code lacks button in-flight state; vision drifted (acausality / anticipative UX violated) - Theme: silent-feedback-gaps, instance confirmed end-to-end - Twin of #5 (illustration icon, same shape) and kin of #14 (retry spinner) - Action: fix_local — add useState(false) + disabled + spinner to the portal button in BOTH FeedbackPopover.tsx (desktop) and SelectionOverlay.tsx (mobile) - 3 regression test obligations named (button-pending-state in both components, plus toast-on-handler-entry timing) - Generator-fix sketch deferred: silent-feedback-gaps now has N=3 with one confirmed instance, crossing the "consider fix_generator" threshold. Proposed CORE-010 ADR draft and useAsyncAction/AsyncButton primitive are the natural follow-on but not in this commit. Theme update at issues/_themes/silent-feedback-gaps.md: - Roster table now distinguishes "investigated end-to-end" from "suspected" - Status: promoted from "noticed pattern" to "verified generator with concrete fix shape" Index updated to reflect #4's new state. Validates the proposed-skill workflow on: - An A3 case (no existing spec to enforce; pure spec-gap) - A theme being promoted from N=3-suspected to N=3-with-one-confirmed - Code-reading-confirmed §2 (the skill's hard rule allows this when the bug is mechanically determinable from static analysis without hidden async lifecycle complexity — issue #4 qualifies, issue #16 didn't) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
anantham
added a commit
that referenced
this pull request
May 8, 2026
Fixes issue #4 — clicking the portal symbol (SillyTavern self-insert) left the button visually unchanged until either an early-bail toast fired or the network round-trip resolved. Users re-clicked thinking nothing happened. Implementation in two sites (desktop + mobile twin): components/FeedbackPopover.tsx: Added isSelfInsertPending state + isSelfInsertPendingRef ref. Button onClick wrapped in async handler that sets pending → awaits → unsets in finally. Renders animated SVG spinner when pending. Button gets disabled, aria-busy, data-testid for tests. components/chapter/SelectionOverlay.tsx: Same fix shape for the mobile portal button. Uses ⟳ glyph (matches surrounding emoji-button style) instead of SVG. The state+ref pair: state drives visible UI (disabled, spinner); ref guards against synchronous re-entry. Discovered via test failure when multiple fireEvent.clicks within one act() saw stale state — the ref provides the synchronous check that state can't. Regression tests: tests/components/FeedbackPopover.spec.tsx (new, 4 tests) - portal button renders when SillyTavern enabled - doesn't render when disabled - disables + shows spinner immediately on click - blocks re-clicks while pending tests/components/chapter/SelectionOverlay.test.tsx (extended, +2) - mobile portal disables on click - blocks re-clicks while pending Pre-fix: 5 of 9 fail (the new tests that look for data-testid which didn't exist pre-fix). Post-fix: all 9 pass. Closing gate satisfied. 11 unrelated test failures in NotificationToast and NovelLibrary tests exist before AND after this change (verified by stashing the fix and re-running) — they are pre-existing, not caused by this work. Theme silent-feedback-gaps: promoted from "N=3 suspected" to "N=1 fixed + N=2 suspected" (issues #5 illustration and #14 retry spinner are twins, fix shape is now established). Validates the proposed-skill workflow on a fix end-to-end: - A3 case (no existing spec) — investigated, classified, fixed - Code-reading-confirmed §2 (rule from skill-update Patch 1) - Closing gate exercised (verify-fail-pre-fix via git stash) - Theme roster updated with status column (skill-update Patch 5) Manual validation in the dev server pending from Aditya: click the portal icon, confirm spinner+disabled. Both desktop and mobile. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
anantham
added a commit
that referenced
this pull request
May 8, 2026
…down Three changes addressing the SillyTavern bridge UX (#4 follow-on), in response to user feedback "if the connection isn't there, why even show the button in the toolbar?" 1. start-lexiconforge.command extends to also start the bridge. When ~/Documents/Ongoing Local/ST/novel-analyzer/bridge.py exists and uvicorn is on PATH (or .venv/bin/uvicorn), the script launches uvicorn alongside Vite. Both processes are killed on Ctrl+C. 2. Bridge reachability check + auto-hide portal button. New services/sillyTavernBridge.ts with pingSillyTavernBridge() using fetch with mode:'no-cors' to bypass the bridge's CORS-only-POST config — opaque response = reachable, network error = unreachable. New hooks/useSillyTavernBridgeStatus.ts subscribes to the user's enableSillyTavern + bridgeUrl settings, pings on mount + when those change, exposes isReachable + manual refresh. ChapterView wires onSelfInsert = sillyTavernBridge.isReachable ? handleSelfInsert : undefined, so the popover button auto-hides when the bridge can't be reached. 3. Test Connection + Copy Command in settings panel. SillyTavernPanel.tsx now has a Test Connection button next to the bridge URL input that pings and shows green/red status pill, plus a Copy button next to the uvicorn command. Status resets to unknown when the URL is edited. Tests: tests/services/sillyTavernBridge.test.ts — 5 tests for ping behavior (reachable, unreachable, empty URL, null/undefined, type guards) tests/components/settings/SillyTavernPanel.test.tsx — 6 tests (button rendering, disabled-when-empty, reachable pill, unreachable pill, copy-to-clipboard, hidden-when-disabled) Existing FeedbackPopover (4) and SelectionOverlay (5) tests still pass — they pass enableSillyTavern + onSelfInsert directly to the popover/overlay components, bypassing the new ChapterView gating, so they continue to verify popover behavior in isolation. Total: 20 tests, all passing. Typecheck clean. The whole thing closes the silent-feedback-gaps loop end-to-end for #4: - Click works → spinner + disabled (the original #4 fix) - Bridge down → button hidden so users don't click into a dead end - Settings panel surfaces connection status + copy-to-run command - Startup script auto-launches both processes A user with no manual setup on first install just runs the .command and gets a working portal button if novel-analyzer is set up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
anantham
added a commit
that referenced
this pull request
May 8, 2026
After running the pipeline on issue #4 (and the bridge-UX follow-on), applied the patches drafted from skill-update observations. Patches landed in SKILL.md: §3 HARD RULE for §2 (Reproduction): Added code-reading-confirmed as a fourth completion mode alongside live-repro / vitest / Playwright. Heuristic: if you can write the failing-pre-fix unit test directly from reading the code without observing any runtime state, code-reading suffices. If you find yourself saying "we'd have to see what happens at runtime" — live repro is mandatory. Worked example contrast: #4 (qualified) vs #16 (forced live repro). §1 Scaffold — Twin issues handling: Skip-and-reference pattern: scaffold folder, copy verbatim, write "Twin of #N — see that issue's investigation. Differences: [list]." Skip §4-§9 if no meaningful differences. Fix lands as one PR; issues close together; theme roster lists each separately for accurate N counting. Worked example: #4/#5/#14 are twins. §1 Scaffold — Calibration check inline pattern: When a verdict or fix-direction depends on a non-obvious reading of the verbatim claim, add a "Calibration check" sub-paragraph stating which interpretation. Use sparingly. Worked example: #16 needed three framings; explicit calibration would have caught at least one wrong turn. §7 Archaeology — skip-conditions: Run when: a previously-working flow broke (introduced commit findable). Skip when: structural bug, no introducing commit (#4's portal button never had a pending state from inception). §5 Theme-noticing — suspected vs confirmed status: Theme rosters MUST distinguish suspected (instance proposed but not investigated) from confirmed (investigated end-to-end) from addressed (confirmed + fixed). Mandatory column shape. Promotion thresholds named: N≥2 confirmed → consider fix_generator; N≥3 suspected → invest in investigating one to promote it. Theme docs updated to use the new Status column shape: - jit-vs-precompute.md: 10 instances now have status (1 confirmed, 1 addressed, 8 suspected). Fixed table-column-mismatch caused by earlier partial edit. - silent-failure-deep.md: 1 suspected. (silent-feedback-gaps.md and completion-only-guards.md already had the column from prior commits; co-mingled-commits.md uses a different table shape that already encodes status implicitly.) Version bumped 0.1.0 → 0.2.0. NOT pushed to expansion-marketplace yet — staying in LexiconForge for 1-2 more iteration cycles before externalizing. Validation gates from the skill's own README still open: no non-author agent has used it cold; no human has followed rules through one issue without being a co-designer; no theme has gone full ratify→enforce; scripts haven't been tested on a non-LexiconForge codebase. Next move (in this session): fix issue #5 (illustration icon) as the mechanical twin port of #4 — exercises the new twin-issues handling rule. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
anantham
added a commit
that referenced
this pull request
May 8, 2026
…#4) Twin port of #4's silent-feedback-gaps fix to the 🎨 illustration button. First exercise of the skill v0.2 twin-issues handling pattern (skip-and -reference): same bug shape at a different site, fix lands alongside the original. Implementation differences from #4 (documented in the issue README): Handler shape: onFeedback is sync fire-and-forget (returns void), not awaitable like #4's onSelfInsert. Can't tie pending state to await resolution. Pending duration: fixed timeout (1200ms) instead of bound-to-await. Long enough to register as "click acknowledged" without overstaying the downstream image-generation spinner's welcome. Empirical knob; may auto-tune later. Re-entry guard: timeout-bounded, so state-only guard suffices (no separate ref). Click during pending window is ignored. Sites updated (same shape both): components/FeedbackPopover.tsx (desktop selection popover) components/chapter/SelectionOverlay.tsx (mobile touch overlay) Regression tests: tests/components/FeedbackPopover.spec.tsx — 2 new tests - shows pending state on click and clears after timeout - blocks re-clicks during the pending window Pre-fix: both fail (no data-testid="illustration-button" attribute). Post-fix: 11/11 across desktop and mobile, both portal and illustration. silent-feedback-gaps theme: N count moves from 1 fixed + 2 suspected to 2 fixed + 1 suspected. After #14 (retry spinner) is also fixed, three confirmed instances justify extracting <AsyncButton> / useAsyncAction as the generator-fix. Validates skill v0.2's twin-issues handling rule end-to-end: - #5 README is a skip-and-reference document, not a duplicate investigation. Shorter, points to #4, lists deltas. - Both issues' fixes ship in the same PR (this commit). - Theme roster lists each instance separately for accurate N counting. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
anantham
added a commit
that referenced
this pull request
May 8, 2026
Per skill v0.2 twin-issues handling: short README that points to #4's full investigation and lists only the deltas (handler shape, pending duration, re-entry guard). No duplicate investigation. Should have landed in the previous fix commit; missed because Write required a Read first. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
anantham
added a commit
that referenced
this pull request
May 8, 2026
Closes silent-feedback-gaps theme to N=3 confirmed-fixed. Pre-fix: components/ChapterView.tsx:104 canManualRetranslate = !!translationResult → in failed state (no result), button stays gray + disabled components/chapter/ChapterContent.tsx renders the "Translation Failed" red box with no retry control — failed state is a dead-end. User has to navigate away or change a setting to force re-translation. Fix in two parts (both in this commit): 1. Header retranslate button enabled in failed state. canManualRetranslate now also true when translationError is set (viewMode === 'english' && !translationResult && !!error). The header button becomes clickable; user can retry from there. 2. Inline retry button in the failed-state UI. ChapterContent.tsx now optionally renders a "Retry translation" button right inside the red error box, where the user is already looking. Wires through onRetryTranslation prop from ChapterView, which calls the existing handleRetranslateClick. Button uses the silent-feedback-gaps pending-state pattern (same as #4/#5): useState + useRef guard, spinner SVG swap on pending, "Retrying…" label. Pending state externally clears when translationError clears (e.g. retry started). This is NOT a pure twin of #4/#5 (skill v0.2 twin-issues handling explicitly distinguishes same-theme-different-fix-shape from twin). The differences: - #4/#5: button silent during click → add pending state to button - #14: failure state has no recovery → add a button + then make THAT button satisfy the #4/#5 pattern Same theme, different bug at root, different fix shape. Tests added in tests/components/chapter/ChapterContent.test.tsx (+4): - no retry button when onRetryTranslation undefined (default-off) - renders when prop provided - click fires handler, shows pending state (disabled, aria-busy, "Retrying", spinner svg) - re-clicks blocked while pending - pending state clears when translationError clears Pre-fix: 4 of 4 new tests fail. Post-fix: 14/14 (10 existing + 4 new). Closing gate satisfied. silent-feedback-gaps theme now at N=3 confirmed-fixed (#4 portal, #5 illustration, #14 retry). The three variants share a button-level pending-state shape but differ in pending-bound semantics (await / timeout / external-signal) and re-entry guard mechanism. Generator-fix (<AsyncButton> / useAsyncAction) is now empirically grounded but needs to support all three shapes — design spec lives in the three issue READMEs collectively. Deferred to a dedicated design session rather than rushing as a follow-on. Manual validation pending from Aditya: trigger a translation failure (bad API key works), confirm the inline retry button is visible and clickable, AND that the header retranslate button is also clickable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
anantham
added a commit
that referenced
this pull request
May 11, 2026
…s (Tier-1 commit D)
Closes Tier-1 of ADR SUTTA-008 §Build order alongside the deferred C
(Aṭṭhakathā provider — to land as a follow-up). Phase-a re-curation
now has every Tier-1 data source feeding the curation helper.
services/providers/scBilaraVariants.ts
- SuttaCentralBilaraVariantsProvider fetches
raw.githubusercontent.com/suttacentral/bilara-data/published/
variant/pli/ms/sutta/<basket>/<uid>_variant-pli-ms.json
- Parses bilara's free-form "original → reading (witness1, witness2)"
notation; multi-variant entries split on `|`
- Returns VariantReading[] per canonical segment id, compatible with
DeepLoomPacket.provenance.segmentVariants
- Per-sutta cache; missing variant files (common for stable openings
like mn10:1.1) cached as empty so we don't refetch
- License: CC BY-SA 4.0
services/providers/scSuttaplex.ts
- SuttaCentralSuttaplexParallelProvider implements ParallelProvider
- Calls https://suttacentral.net/api/parallels/{uid}
- Handles work-level keys (top-level "mn10") and segment-level keys
("mn10#44.1", with # separator) — the latter projected to
canonicalSegmentId form ("mn10:44.1") for consistency with bilara
- Returns ParallelRef[] with deterministic citationIds + sourceIds
services/providers/index.ts
- Re-exports both providers + their result types
scripts/sutta-studio/lookup-phase.ts
- At phase start, prints "Phase-level evidence":
• Parallels for the sutta (top 8 of N) — workId, type, title, acronym
• Variant readings for the phase's canonicalSegmentIds, or "(none)"
when stable across witnesses
- --json mode includes the parallels + variants in the structured blob
Verified end-to-end against phase-a / mn10:1.1:
## Phase-level evidence
Parallels for mn10 (top 8 of 16):
→ dn22: full · The Long Discourse about the Ways of Attending to Mindfulness · DN 22
→ ea12.1: full · The One Way In Sūtra · EA 12.1
→ ma98: full · 念處 · MA 98
→ mn119: full · Mindfulness of the Body · MN 119
...
Variant readings: (none for mn10:1.1 — stable across witnesses)
Tests:
- 7 tests for SuttaCentralBilaraVariantsProvider (parse, cache,
network failure, malformed entries)
- 9 tests for SuttaCentralSuttaplexParallelProvider (work-level,
segment-level, # → : conversion, merging, fallbacks)
- 173 total tests pass across 11 suites; no regressions
Tier-1 status:
A 9168b5a provider abstraction + Citation extension
B.1 82fae37 DPD ingestion + MN10 subset (81.6%)
B.2 49d3eba DpdProvider impl + tests
E 5ff46c0 curation helper script
B.3 bc46e47 compiler wired to DPD via Vite bundle
D (this) SC bilara variants + suttaplex parallels providers
C pending VRI edition + Aṭṭhakathā commentary (deferred per ADR
§Open Questions #4; alignment unknown; not blocking
phase-a re-curation)
Task #14 MN10 phase-a re-curation is now fully unblocked with every
Tier-1 data source flowing through the curation helper.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
anantham
added a commit
that referenced
this pull request
May 12, 2026
…ed citations visible ADR SUTTA-008 §UI Vision #4 ("Why does this gloss say X?") promised a way to surface Sense.sourceCitationIds to the reader. 14 DPD citations have accumulated in packet.citations across phase-a/b/c, all invisible until now. Design choice — show only when pinned, not on hover: - Hover = reading flow → minimal noise - Pin = audit moment → source revealed Matches the existing pin-as-engagement model. components/sutta-studio/SuttaStudioView.tsx citationsById = useMemo lookup table from packet.citations. Threaded down to PaliWordEngine. components/sutta-studio/PaliWord.tsx - New citationsById prop. - In the segment render loop, resolve activeSense (segment-level senses take priority over word-level for compounds). - When activeSense.sourceCitationIds is non-empty, resolve via citationsById and pass as citations[] to Tooltip. - Imports Citation type. components/sutta-studio/Tooltip.tsx - New citations?: Citation[] prop. - When pinned AND citations.length > 0, render a footer below the main tooltip body: * "SOURCES" header in emerald-500/70 uppercase tracking-widest * Per-citation: short ref in slate-300 + italic excerpt in slate-500 (quoted) - Hidden entirely on hover (matches the audit-on-pin discipline). Verified via Playwright on /sutta/demo: - Click sutaṁ's ṁ segment → tooltip shows facet 2/2 ("In the formula 'me sutaṁ'...") plus SOURCES footer: DPD s.v. suta (pp — heard) "suta [pp]: heard" - Click eva segment → tooltip shows evaṁ deictic prose plus SOURCES footer with cite:dpd:dpd:18134 (the backfilled evaṁ entry). - Zero console errors. Net effect: the entire grounded-curation arc is now READABLE in the UI. Pin any Pāli segment whose word/segment senses have citations, see exactly which DPD entries support the gloss + the excerpt. 58 tests pass across providers, sutta-studio, types. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
anantham
added a commit
that referenced
this pull request
May 12, 2026
…gressions Adds 27 unit tests for the three root causes fixed in c33b115: 1. normalizeNiggahita — ṃ (U+1E43) → ṁ (U+1E41) conversion; idempotency; mixed-form handling; direct codepoint assertions. 2. PALI_ENDINGS — assertions that 'ūsu'/'ūhi' are absent (the over-greedy endings that conflated kurūsu → kura/rice) and that bare 'su'/'hi' are present (paired with vowel-shortening). 3. tryStemStrips — kurūsu produces 'kuru' (via vowel-shortening) and does NOT produce 'kur' or 'kura' (the bug-path); same pattern verified for bhikkhūsu; coverage of all three long vowels (ā→a, ī→i, ū→u); evaṁ direct-match preservation; quotative-tail handling. Refactor: build-dpd.ts now exports the pure helpers (normalizeNiggahita, PALI_ENDINGS, QUOTATIVE_TAILS, stripQuotative, tryStemStrips) and gates main() with a standard ESM Node entrypoint guard so importing the module during tests is side-effect-free. Verified: \`npm run build:dpd\` still runs correctly. Per HANDOVER §Pending #4: c33b115 shipped with only end-to-end verification. These unit tests catch regressions before they surface in coverage drops on the next phase's curation. Tests: 112 passed under \`services/providers\` + \`scripts\`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Keep local AI assistant configs out of git status and prevent accidental commits.
Root Cause (if bug fix)
N/A
Changes
.gitignore: ignore.claude/andCLAUDE.mddocs/WORKLOG.md: log entryTesting
Review Checklist