feat(lyrics): word-level karaoke (enhanced LRC + TTML)#25
Conversation
add parsers, render, edit, and persist word-level synchronized lyrics on top of the existing line-level LRC pipeline. enhanced LRC arrives via .elrc imports + embedded USLT frames; TTML via .ttml / .xml imports. the karaoke fullscreen and side panel animate the active word, and the editor gains a "word by word" granularity toggle (Space / Enter / Backspace shortcuts) that serializes back to enhanced LRC. TTML writes target ItemKey::Lyrics and skip cleanly on MP3 (no clean ID3v2 mapping), surfacing a warning so the user knows the file itself wasn't touched. i18n keys propagated to all 17 locales.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThis PR introduces word-level karaoke lyrics support: backend TTML/Enhanced LRC detection and tag-write handling (including MP3 TTML skip), unified parsing into word-timed ChangesWord-Level Lyrics Support
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning Billing warning: we have not been able to collect payment for this subscription for more than 72 hours. Please update the payment method or pay any pending invoices in Billing to avoid service interruption. Comment |
CodeQL flagged the previous `formatLrcTimestamp(...).replace("[", "<")`
chain as incomplete string escaping — it only handles the first match,
which is fine for the known-good output but is a fragile pattern.
Extract a private `formatTimestamp(ms, open, close)` helper and use it
to emit `<mm:ss.xx>` directly when serializing enhanced LRC word stamps.
the previous badge used the full localized label ("Synchronisé
mot-à-mot", "TTML mot-à-mot", …) which overflowed the footer slot and
got clipped by the `truncate` on the wrapper span. switch to a short
brand-token style ("WORD" / "TTML") with the full label exposed via
`title` for hover, add `shrink-0` so the badge survives narrow widths,
and move `truncate` onto the source-label child so it can collapse
gracefully instead.
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/components/player/FullscreenLyrics.tsx (1)
61-67:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse the shared modal a11y contract for the fullscreen overlay.
This behaves like a modal, but it only handles Escape manually and exposes no dialog semantics, so keyboard users can still tab into the page behind it. Please wire it through
useModalA11y(...)and addrole="dialog"witharia-modal/aria-labelledbyon the foreground container.As per coding guidelines,
src/components/**/*.tsx:Every modal must call useModalA11y(isOpen, onClose)andModal containers must have role="dialog", aria-modal="true", aria-labelledby (stable heading id) or aria-label (conditional heading).Also applies to: 79-133
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/components/player/FullscreenLyrics.tsx` around lines 61 - 67, The component currently implements its own Escape key handler in the useEffect and lacks dialog semantics; replace the manual key handling by calling useModalA11y(isOpen, onClose) at the top of FullscreenLyrics (pass the component's isOpen prop and onClose callback), remove the custom handleKey useEffect, and add proper dialog attributes to the foreground container element: role="dialog", aria-modal="true", and either aria-labelledby pointing to a stable heading id inside the modal (create a stable id for the title) or an appropriate aria-label if the heading is conditional; ensure useModalA11y is imported and used and the heading id is stable so keyboard users cannot tab into the page behind the overlay.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/features/integrations.md`:
- Line 143: The docs state a 120 ms opacity transition for active-word
highlighting but the implemented spec and PR describe 150 ms; update the
sentence referencing LyricsPanel and FullscreenLyrics to use "150 ms" (or
explicitly note an intentional difference) so the Rendering section matches the
implemented active-word animation timing and QA expectations.
- Line 145: The docs use mixed British/American spelling here: change the word
"synchronised" in the description for the LyricsEditorModal synchronised tab to
match the project's existing variant ("synchronized" or "synced") used
elsewhere; update the text in docs/features/integrations.md (the line describing
**Editor — word mode.** and the reference to LyricsEditorModal) to use the
chosen variant consistently and run a quick search for other occurrences of
"synchronised" to replace them as well.
In `@src-tauri/src/commands/lyrics.rs`:
- Around line 1045-1055: Currently only the target key (ItemKey::Lyrics for TTML
or ItemKey::UnsyncLyrics for others) is inserted via tag.insert_text, leaving
the opposite key intact which can cause stale reads by read_embedded_lyrics; fix
this by removing both keys before inserting: call tag.remove(ItemKey::Lyrics)
and tag.remove(ItemKey::UnsyncLyrics) (or the equivalent removal API on tag) in
the branch that saves non-empty lyrics, then insert the new content with
tag.insert_text as you already do, ensuring no old lyric format shadows the new
one.
In `@src/components/common/LyricsEditorModal.tsx`:
- Around line 419-435: The mapping before calling serializeEnhancedLrc is
converting uncaptured times to -1 which serializeEnhancedLrc turns into
<00:00.00> and is also filtering out non-empty rows with no captured stamps;
update the rowsForSave construction so you only call shift(...) when timeMs >= 0
and otherwise omit the timeMs/endMs properties (or set them to undefined/null)
for both row.level (use syncedRows -> map) and word.level (words?.map) so
unfinished rows/words are preserved as untimed rather than forced to 00:00; also
adjust the final sort to handle undefined times (treat undefined as +Infinity so
untimed lines sort after timed ones) — reference syncedRows, rowsForSave, shift,
serializeEnhancedLrc, and the words mapping when making this change.
In `@src/components/layout/LyricsPanel.tsx`:
- Around line 254-269: The side-panel word rendering in LyricsPanel.tsx (the
{hasWords ? ... line.words!.map(...) block) only animates color/opacity with
120ms; update the word span rendering to use a 150ms transition and include
transform scale so the active word gets the same scale+opacity animation as
fullscreen. Concretely, adjust the style/transition to "opacity 150ms ease,
transform 150ms ease" and apply a transform (e.g., scale(1) for active via wi
=== activeWordIndex and a slightly reduced scale for non-active) and keep the
existing color class logic (activeWordIndex) to match the fullscreen behavior.
In `@src/lib/tauri/lyrics.ts`:
- Around line 244-290: The parser currently drops any text before the first
inline word stamp and reuses the same words array for all repeated line stamps;
fix by (1) including a leading segment from body.slice(0, wordStamps[0].at) when
that prefix is non-empty and treat it as the first word whose timeMs is
wordStamps[0].timeMs (so adjust segments/words creation to account for a prefix
segment), and (2) when pushing per-line entries in the final loop, deep‑clone
the words array (e.g. words.map(w => ({ ...w }))) so each lines entry gets its
own words copy that fillEndTimestamps() can mutate independently; use the
existing symbols wordStamps, segments, words, lineStamps, matchedStampLength,
and lines to locate where to change.
---
Outside diff comments:
In `@src/components/player/FullscreenLyrics.tsx`:
- Around line 61-67: The component currently implements its own Escape key
handler in the useEffect and lacks dialog semantics; replace the manual key
handling by calling useModalA11y(isOpen, onClose) at the top of FullscreenLyrics
(pass the component's isOpen prop and onClose callback), remove the custom
handleKey useEffect, and add proper dialog attributes to the foreground
container element: role="dialog", aria-modal="true", and either aria-labelledby
pointing to a stable heading id inside the modal (create a stable id for the
title) or an appropriate aria-label if the heading is conditional; ensure
useModalA11y is imported and used and the heading id is stable so keyboard users
cannot tab into the page behind the overlay.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6a93d6ec-c593-4217-ac14-1a61032734b9
📒 Files selected for processing (25)
CLAUDE.mddocs/features/integrations.mdsrc-tauri/migrations/app/20260516120000_lyrics_ttml_format.sqlsrc-tauri/src/commands/lyrics.rssrc/components/common/LyricsEditorModal.tsxsrc/components/layout/LyricsPanel.tsxsrc/components/player/FullscreenLyrics.tsxsrc/i18n/locales/ar.jsonsrc/i18n/locales/de.jsonsrc/i18n/locales/en.jsonsrc/i18n/locales/es.jsonsrc/i18n/locales/fr.jsonsrc/i18n/locales/hi.jsonsrc/i18n/locales/id.jsonsrc/i18n/locales/it.jsonsrc/i18n/locales/ja.jsonsrc/i18n/locales/kr.jsonsrc/i18n/locales/nl.jsonsrc/i18n/locales/pt-BR.jsonsrc/i18n/locales/pt.jsonsrc/i18n/locales/ru.jsonsrc/i18n/locales/tr.jsonsrc/i18n/locales/zh-CN.jsonsrc/i18n/locales/zh-TW.jsonsrc/lib/tauri/lyrics.ts
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/components/layout/LyricsPanel.tsx (1)
254-269:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAlign side-panel word animation with the 150ms opacity+scale karaoke spec.
Line 268 still uses
120msand no scale transform, so the side panel diverges from the stated behavior and fullscreen implementation.Proposed fix
- <span + <span key={wi} - className={ - wi === activeWordIndex - ? "text-pink-500 dark:text-pink-400" - : wi < activeWordIndex - ? "" - : "opacity-60" - } + className={`inline-block ${ + wi === activeWordIndex + ? "text-pink-500 dark:text-pink-400" + : wi < activeWordIndex + ? "" + : "opacity-60" + }`} style={{ - transition: - "color 120ms ease, opacity 120ms ease", + transition: + "color 150ms ease, opacity 150ms ease, transform 150ms ease", + transform: + wi === activeWordIndex ? "scale(1)" : "scale(0.97)", }} >🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/components/layout/LyricsPanel.tsx` around lines 254 - 269, The side-panel word span uses a 120ms color/opacity transition and lacks scale, so update the transition and add a scale transform to match the 150ms karaoke spec: in LyricsPanel (the span inside hasWords mapping over line.words in the JSX where activeWordIndex is compared) change the inline style transition from "color 120ms ease, opacity 120ms ease" to use 150ms (e.g., "color 150ms ease, opacity 150ms ease, transform 150ms ease") and ensure the class/style logic applies a scale transform (e.g., scale(1.06) when wi === activeWordIndex, scale(1) when wi < activeWordIndex or otherwise a slightly smaller scale/opacity) so the active word animates scale+opacity over 150ms consistent with fullscreen behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/components/layout/LyricsPanel.tsx`:
- Around line 254-269: The side-panel word span uses a 120ms color/opacity
transition and lacks scale, so update the transition and add a scale transform
to match the 150ms karaoke spec: in LyricsPanel (the span inside hasWords
mapping over line.words in the JSX where activeWordIndex is compared) change the
inline style transition from "color 120ms ease, opacity 120ms ease" to use 150ms
(e.g., "color 150ms ease, opacity 150ms ease, transform 150ms ease") and ensure
the class/style logic applies a scale transform (e.g., scale(1.06) when wi ===
activeWordIndex, scale(1) when wi < activeWordIndex or otherwise a slightly
smaller scale/opacity) so the active word animates scale+opacity over 150ms
consistent with fullscreen behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4d275864-c93b-48f1-b341-d8f6b32ff166
📒 Files selected for processing (1)
src/components/layout/LyricsPanel.tsx
- `write_lyrics_to_file` now purges both UnsyncLyrics and Lyrics keys before inserting the new content. Switching format (e.g. plain LRC → TTML) used to leave the old USLT frame intact, which `read_embedded_lyrics` reads first — the user would see the stale format on next fetch. - `parseEnhancedLrc` captures any text before the first inline word stamp as a virtual leading word at the line's own timestamp (matches what other Enhanced LRC tooling does), and deep-clones the `words` array per repeated line stamp so `fillEndTimestamps` mutations don't bleed across siblings. - `serializeEnhancedLrc` no longer emits `<00:00.00>word` for words with `timeMs < 0` — uncaptured words fold their text into the previous segment instead. Prevents a half-finished save from introducing phantom zero-second stamps on the next load. - `FullscreenLyrics` now uses `useModalA11y` (focus trap, Escape, focus restore) instead of a bespoke Escape listener, and the container carries `role="dialog"` + `aria-modal="true"` + `aria-labelledby`. Matches the modal convention applied across the rest of the codebase. - Harmonize the panel's active-word transition to 150 ms (was 120 ms) to match the fullscreen view and the doc. No scale on the panel — kept intentional given the 14 px line height inside the 320 px slot would jitter under scale. - Drop a stray British spelling in the docs.
apply the same `scale(1.04)` + transform transition to the panel's active word so the karaoke effect stays consistent across both views. the panel keeps its accent-color tint that the fullscreen leaves out, but durations and the scale ramp are now identical (150 ms, opacity / color / transform). also update the docs accordingly.
the word-mode filter only kept rows with at least one captured stamp (line or word level), so a user who typed a verse but hadn't stamped it yet would lose the text on save. line mode kept those rows via `text.trim().length > 0` — the asymmetry was incidental, not by design. extend the filter so word mode behaves the same, and bake an explicit untimed-rows-last sort so the serialised body stays monotonically ordered (matching line mode's tail behaviour). `serializeEnhancedLrc` already handles `timeMs < 0` rows with the `[--:--.--]` placeholder and folds uncaptured words into the previous segment, so the round-trip stays clean.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/tauri/lyrics.ts`:
- Around line 270-310: The prefix virtual word is currently built once with
timeMs: lineStamps[0] and then cloned for every duplicate line, causing
duplicated lines to inherit the wrong first-word time; update the code that
creates per-line entries (the loop over lineStamps that pushes into lines) to,
for each timeMs, clone the words and if a prefix exists (the first built word
corresponds to prefix), set the cloned first word's timeMs to that timeMs before
pushing — reference symbols: lineStamps, built/words, lines, and the per-line
push where words.map(...) is used.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: a02cefe9-179d-4a9e-b0dd-427395d5545d
📒 Files selected for processing (6)
docs/features/integrations.mdsrc-tauri/src/commands/lyrics.rssrc/components/common/LyricsEditorModal.tsxsrc/components/layout/LyricsPanel.tsxsrc/components/player/FullscreenLyrics.tsxsrc/lib/tauri/lyrics.ts
a rare LRC convention is multiple line stamps on the same body (`[00:01][00:30]Hello <00:31>world`). the previous code built the virtual prefix word once with `lineStamps[0]` and cloned it for every duplicate, leaving later clones with a first-word time that didn't match their own line stamp. drop the placeholder timeMs from the built entry and rewrite it per-duplicate inside the per-line cloning loop instead.
Summary
[mm:ss.xx]La <mm:ss.xx>nuit <mm:ss.xx>tombe) and TTML (Apple Music XML).detect_format) so embedded USLT frames carrying Enhanced LRC are auto-recognised — no per-track config.lyrics.toast.tagWriteSkippedwarning so the user knows the file itself wasn't touched.'ttml'added to theapp.lyricsCHECK constraint via a new migration (table rebuild — SQLite has no ALTER CONSTRAINT).lyrics.format.*,lyrics.toast.tagWriteSkipped,lyricsEditor.granularity.*,lyricsEditor.captureHintWord,lyricsEditor.notCaptured) propagated to all 17 locales.Test plan
bun run typecheckbun run lintcargo check --manifest-path src-tauri/Cargo.toml --all-targetscargo test commands::lyrics— 8 tests green (Plain / LRC / EnhancedLrc / TTML with + without XML decl / LRC headers non-regression / word_stamp_present)ItemKey::Lyrics, re-fetch returns from cache.tag_write_skippedwarning surfaces, DB cache updated.write_to_file=true, verify USLT contains<mm:ss.xx>stamps via Mp3tag.Summary by CodeRabbit
New Features
Documentation
Chores
Localization