Skip to content

Conversation

@yujonglee
Copy link
Contributor

No description provided.

@coderabbitai
Copy link

coderabbitai bot commented Nov 6, 2025

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Walkthrough

Adds end-to-end batch transcription: new batch models and client, server BatchActor, listener plugin batch commands/events, desktop store slice and useRunBatch hook, audio utilities (Vorbis/chunking/import), UI integrations and many import/path adjustments. Also bumps deepgram dependency to 0.7.

Changes

Cohort / File(s) Change Summary
Dependency
Cargo.toml
Updated deepgram dependency from 0.6.80.7 (keeps default-features = false).
Interface: batch & stream types
owhisper/owhisper-interface/src/{batch.rs,stream.rs,lib.rs}
Added batch module with serializable batch models and From conversions; applied Specta renames to stream structs; restructured lib exports and added conversions from both stream and batch words.
Client: batch + live
owhisper/owhisper-client/src/{lib.rs,batch.rs,live.rs,error.rs}, owhisper/owhisper-client/Cargo.toml
New BatchClient (builder + transcribe_file), live WebSocket client (single/dual), centralized client Error enum; added reqwest/tokio/thiserror deps and URL/request builders.
Server: realtime imports
owhisper/owhisper-server/src/commands/run/{realtime.rs,state.rs}, owhisper/owhisper-server/src/misc.rs
Shortened type paths to use owhisper_interface::stream::StreamResponse in signatures/patterns; minor print reformat.
Listener plugin: batch actor, ext, events, commands, errors
plugins/listener/src/actors/{batch.rs,mod.rs,recorder.rs}, plugins/listener/src/{ext.rs,events.rs,commands.rs,error.rs,lib.rs}, plugins/listener/build.rs
New BatchActor and message types; BatchParams/BatchProvider and run_batch extension; new SessionEvent variants (BatchStarted, BatchResponse, BatchResponseStreamed, BatchFailed); run_batch command registered; new error variants for batch/spawn/start.
Desktop: store, hook, integration
apps/desktop/src/store/zustand/listener/{batch.ts,general.ts,index.ts,transcript.ts}, apps/desktop/src/hooks/useRunBatch.ts
Added batch Zustand slice and actions; nested per-session live state and SessionMode (includes "running_batch"), getSessionMode API, runBatch wiring, transcript transform/type updates, and exported useRunBatch(sessionId) hook.
Desktop: UI changes & progress
apps/desktop/src/components/** (many files)
Replaced global status/sessionId checks with per-session getSessionMode; added TranscriptionProgress component; refactored floating listen controls (StartButton → ListenSplitButton + OptionsMenu), removed generate UI and raw editor upload logic; added FloatingButton className prop and layout tweaks.
Audio utilities: Vorbis, chunking, import
crates/audio-utils/src/{lib.rs,vorbis.rs,error.rs}, crates/audio-utils/Cargo.toml, plugins/misc/src/{audio.rs,commands.rs,error.rs,lib.rs}, plugins/misc/Cargo.toml, plugins/misc/build.rs
New vorbis module (encode/decode), chunking API (ChunkedAudio, chunk_audio_file), expanded audio error variants, import_audio pipeline and tauri audio_import command; added hound, vorbis_rs, rodio and hypr-audio-utils deps.
Actors & recorder updates
plugins/listener/src/actors/recorder.rs
Replaced manual Vorbis/WAV handling with audio-utils helpers and unified error mapping.
Utilities, tests, and small refactors
assorted files (hooks, tinybase, ws, tiptap, tests, package.json updates)
Added useLatestRef hook; memoized STT connection; tinybase metric for custom vocabs; transformWords type updates; placeholder error boundary; WS client parse logging improvements; tests updated and many import/path adjustments.

Sequence Diagram(s)

sequenceDiagram
    participant Desktop as Desktop App (useRunBatch)
    participant Store as Zustand Store (general + batch)
    participant Plugin as Listener Plugin (run_batch)
    participant Client as BatchClient (owhisper-client)
    participant Server as BatchActor / owhisper

    Desktop->>Store: call runBatch(params)
    Store->>Store: validate session mode, create transcript proto, build BatchParams
    Store->>Plugin: invoke run_batch(params)
    Plugin->>Client: transcribe_file(file_path) or spawn BatchActor
    Client->>Server: POST /batch or stream audio to actor
    Server-->>Client: returns BatchResponse / emits progress events
    Client-->>Plugin: forwards BatchResponse / progress events
    Plugin->>Store: emit SessionEvent::BatchResponse / BatchProgress
    Store->>Store: handleBatchResponseStreamed/handleBatchResponse -> transform & persist words/hints
    Store-->>Desktop: update UI (TranscriptionProgress, transcript)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas requiring focused review:

  • plugins/listener/src/actors/batch.rs — actor lifecycle, shutdown, timeouts, and stream processing.
  • crates/audio-utils/src/vorbis.rs and plugins/misc/src/audio.rs — Vorbis encode/decode, resampling, interleaving, and cross-device rename semantics.
  • apps/desktop/src/hooks/useRunBatch.ts and listener store slices — param correctness, persist callback, per-session routing and mode gating.
  • owhisper-interface and owhisper-client conversions — Serde/Specta renames and From impls alignment with callsites.
  • Error propagation from owhisper-client → plugins/listener → frontend.

Possibly related PRs

  • Batch transcribe support #1638 — implements the same "batch transcribe" feature and touches Batch types/actors, listener events, and client integration.
  • Deepgram compat v2 #1307 — overlaps on batch transcription support and Deepgram/AM integration across owhisper interface and listener layers.
  • Binary diarization #1102 — related live dual-audio / ListenClient API changes (dual audio / diarization work).

Suggested reviewers

  • duckduckhero

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.21% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Description check ❓ Inconclusive No pull request description was provided by the author, making it impossible to assess relevance to the changeset. Add a description explaining the batch transcription feature, its purpose, and how it integrates with the existing transcription system.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The PR title 'Batch transcribe support' directly and clearly describes the primary functionality addition - batch transcription capability across the codebase.

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 984f21a and 7a169d5.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (8)
  • apps/desktop/package.json (5 hunks)
  • apps/desktop/src/store/zustand/listener/transcript.ts (5 hunks)
  • apps/web/package.json (2 hunks)
  • packages/codemirror/package.json (1 hunks)
  • packages/obsidian/package.json (1 hunks)
  • packages/tiptap/src/shared/extensions/placeholder.tsx (4 hunks)
  • packages/ui/package.json (1 hunks)
  • packages/utils/package.json (1 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/desktop/src/store/zustand/listener/transcript.ts (1)

152-187: Add guard check for alternative.words before calling transformWords at line 115-119.

The code removes optional chaining but provides no runtime guard. The defensive check in batch.ts (line 39: !alternative.words) proves that alternative.words can be undefined at runtime despite the type definition. Without checking alternative.words before calling transformWords, this will crash if words is undefined.

Add a check similar to batch.ts:

const alternative = response.channel.alternatives[0];
- if (channelIndex === undefined || !alternative) {
+ if (channelIndex === undefined || !alternative || !alternative.words) {
  return;
}
🧹 Nitpick comments (2)
apps/desktop/src/store/zustand/listener/general.ts (1)

207-263: Consider adding timeout handling for batch operations.

The runBatch implementation correctly handles cleanup and error paths. However, if the backend fails to emit a batchResponse event after successfully accepting the command, the promise would hang indefinitely.

Consider adding a timeout mechanism to reject the promise after a reasonable duration.

Example approach:

 runBatch: async (params, options) => {
+  const BATCH_TIMEOUT_MS = 300000; // 5 minutes
   const shouldResetPersist = Boolean(options?.handlePersist);
   
   if (options?.handlePersist) {
     get().setTranscriptPersist(options.handlePersist);
   }
   
   let unlisten: (() => void) | undefined;
+  let timeoutId: NodeJS.Timeout | undefined;
   
   const cleanup = () => {
     if (unlisten) {
       unlisten();
       unlisten = undefined;
     }
+    if (timeoutId) {
+      clearTimeout(timeoutId);
+      timeoutId = undefined;
+    }
     
     if (shouldResetPersist) {
       get().setTranscriptPersist(undefined);
     }
   };
   
   await new Promise<void>((resolve, reject) => {
+    timeoutId = setTimeout(() => {
+      cleanup();
+      reject(new Error('Batch operation timed out'));
+    }, BATCH_TIMEOUT_MS);
+
     listenerEvents.sessionEvent
       .listen(({ payload }) => {
apps/desktop/src/routes/app/onboarding/shared.tsx (1)

22-44: Consider accessibility and text formatting improvements.

The component works well, but could benefit from minor enhancements:

  1. The action button displays action.kind directly (line 38), which would render as lowercase "skip" or "next". Consider capitalizing these for better UI presentation.
  2. Adding an aria-label to the action button would improve screen reader accessibility by providing more context.

Example:

         {action && !action.hide && (
           <button
+            aria-label={action.kind === "skip" ? "Skip onboarding step" : "Continue to next step"}
             className="self-center text-sm font-medium text-neutral-400 transition hover:text-neutral-600"
             onClick={action.onClick}
           >
-            {action.kind}
+            {action.kind.charAt(0).toUpperCase() + action.kind.slice(1)}
           </button>
         )}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e75ebb and 5ca4386.

⛔ Files ignored due to path filters (3)
  • Cargo.lock is excluded by !**/*.lock
  • apps/desktop/src/routeTree.gen.ts is excluded by !**/*.gen.ts
  • plugins/listener/js/bindings.gen.ts is excluded by !**/*.gen.ts
📒 Files selected for processing (26)
  • Cargo.toml (1 hunks)
  • apps/desktop/src/hooks/useRunBatch.ts (1 hunks)
  • apps/desktop/src/routes/app/onboarding/calendar.tsx (1 hunks)
  • apps/desktop/src/routes/app/onboarding/index.tsx (1 hunks)
  • apps/desktop/src/routes/app/onboarding/permissions.tsx (1 hunks)
  • apps/desktop/src/routes/app/onboarding/shared.tsx (1 hunks)
  • apps/desktop/src/routes/app/onboarding/welcome.tsx (1 hunks)
  • apps/desktop/src/store/zustand/listener/batch.ts (1 hunks)
  • apps/desktop/src/store/zustand/listener/general.ts (6 hunks)
  • apps/desktop/src/store/zustand/listener/index.ts (2 hunks)
  • apps/desktop/src/store/zustand/listener/transcript.ts (2 hunks)
  • crates/transcribe-moonshine/src/service/streaming.rs (1 hunks)
  • crates/transcribe-whisper-local/src/service/streaming.rs (1 hunks)
  • owhisper/owhisper-client/Cargo.toml (1 hunks)
  • owhisper/owhisper-client/src/lib.rs (6 hunks)
  • owhisper/owhisper-interface/src/batch.rs (1 hunks)
  • owhisper/owhisper-interface/src/lib.rs (2 hunks)
  • owhisper/owhisper-interface/src/stream.rs (5 hunks)
  • owhisper/owhisper-server/src/commands/run/realtime.rs (5 hunks)
  • owhisper/owhisper-server/src/commands/run/state.rs (2 hunks)
  • plugins/listener/src/actors/listener.rs (2 hunks)
  • plugins/listener/src/commands.rs (2 hunks)
  • plugins/listener/src/error.rs (1 hunks)
  • plugins/listener/src/events.rs (2 hunks)
  • plugins/listener/src/ext.rs (3 hunks)
  • plugins/listener/src/lib.rs (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (22)
apps/desktop/src/routes/app/onboarding/welcome.tsx (4)
apps/desktop/src/routes/app/onboarding/shared.tsx (1)
  • OnboardingNext (7-7)
packages/ui/src/components/ui/text-animate.tsx (1)
  • TextAnimate (302-386)
packages/utils/src/cn.ts (1)
  • cn (20-22)
apps/desktop/src/routes/app/onboarding.tsx (1)
  • Welcome (55-96)
owhisper/owhisper-server/src/commands/run/state.rs (1)
plugins/listener/js/bindings.gen.ts (1)
  • StreamResponse (114-114)
crates/transcribe-whisper-local/src/service/streaming.rs (2)
plugins/listener/js/bindings.gen.ts (1)
  • StreamResponse (114-114)
apps/desktop/src/store/tinybase/schema-external.ts (1)
  • Word (118-118)
plugins/listener/src/events.rs (1)
plugins/listener/js/bindings.gen.ts (2)
  • BatchResponse (103-103)
  • StreamResponse (114-114)
plugins/listener/src/commands.rs (3)
plugins/listener/js/bindings.gen.ts (2)
  • SessionParams (108-108)
  • BatchParams (101-101)
plugins/listener/src/ext.rs (2)
  • run_batch (50-50)
  • run_batch (150-181)
owhisper/owhisper-client/src/lib.rs (1)
  • params (55-58)
plugins/listener/src/lib.rs (3)
plugins/listener/js/bindings.gen.ts (1)
  • commands (9-82)
plugins/listener/src/commands.rs (1)
  • run_batch (77-82)
plugins/listener/src/ext.rs (2)
  • run_batch (50-50)
  • run_batch (150-181)
crates/transcribe-moonshine/src/service/streaming.rs (2)
plugins/listener/js/bindings.gen.ts (1)
  • StreamResponse (114-114)
apps/desktop/src/store/tinybase/schema-external.ts (1)
  • Word (118-118)
apps/desktop/src/routes/app/onboarding/calendar.tsx (1)
apps/desktop/src/routes/app/onboarding/shared.tsx (4)
  • OnboardingNext (7-7)
  • OnboardingContainer (22-44)
  • IntegrationRow (54-85)
  • Divider (87-95)
apps/desktop/src/routes/app/onboarding/index.tsx (5)
apps/desktop/src/routes/app/onboarding/welcome.tsx (1)
  • Welcome (13-52)
apps/desktop/src/routes/app/onboarding/permissions.tsx (1)
  • Permissions (13-76)
apps/desktop/src/routes/app/onboarding/calendar.tsx (1)
  • Calendars (10-63)
plugins/windows/src/ext.rs (1)
  • navigate (19-41)
apps/desktop/src/routes/app/onboarding/shared.tsx (1)
  • OnboardingNext (7-7)
owhisper/owhisper-interface/src/lib.rs (1)
plugins/db/js/bindings.gen.ts (2)
  • Word2 (204-204)
  • SpeakerIdentity (200-200)
plugins/listener/src/actors/listener.rs (1)
plugins/listener/js/bindings.gen.ts (1)
  • StreamResponse (114-114)
owhisper/owhisper-interface/src/batch.rs (1)
apps/desktop/src/store/tinybase/schema-external.ts (1)
  • Word (118-118)
plugins/listener/src/ext.rs (3)
plugins/listener/js/bindings.gen.ts (4)
  • BatchProvider (102-102)
  • BatchParams (101-101)
  • SessionEvent (107-107)
  • BatchResponse (103-103)
plugins/listener/src/commands.rs (1)
  • run_batch (77-82)
owhisper/owhisper-client/src/lib.rs (3)
  • params (55-58)
  • builder (293-295)
  • builder (313-315)
apps/desktop/src/routes/app/onboarding/permissions.tsx (4)
apps/desktop/src/routes/app/onboarding/shared.tsx (2)
  • OnboardingNext (7-7)
  • OnboardingContainer (22-44)
apps/desktop/src/hooks/use-permissions.ts (1)
  • usePermissions (8-140)
apps/desktop/src/components/shared/permission-row.tsx (1)
  • PermissionRow (8-139)
apps/desktop/src/routes/app/onboarding.tsx (1)
  • Permissions (98-162)
apps/desktop/src/store/zustand/listener/batch.ts (3)
plugins/listener/js/bindings.gen.ts (2)
  • BatchResponse (103-103)
  • BatchAlternatives (99-99)
apps/desktop/src/store/zustand/listener/transcript.ts (2)
  • HandlePersistCallback (9-9)
  • fixSpacingForWords (189-213)
apps/desktop/src/utils/segment.ts (2)
  • WordLike (11-16)
  • RuntimeSpeakerHint (26-29)
owhisper/owhisper-server/src/commands/run/realtime.rs (1)
plugins/listener/js/bindings.gen.ts (1)
  • StreamResponse (114-114)
owhisper/owhisper-client/src/lib.rs (2)
plugins/listener/js/bindings.gen.ts (2)
  • BatchResponse (103-103)
  • StreamResponse (114-114)
crates/am/src/client.rs (2)
  • response (87-87)
  • status (25-34)
apps/desktop/src/routes/app/onboarding/shared.tsx (1)
packages/utils/src/cn.ts (1)
  • cn (20-22)
apps/desktop/src/store/zustand/listener/transcript.ts (2)
plugins/listener/js/bindings.gen.ts (1)
  • StreamAlternatives (109-109)
apps/desktop/src/utils/segment.ts (2)
  • WordLike (11-16)
  • RuntimeSpeakerHint (26-29)
apps/desktop/src/hooks/useRunBatch.ts (6)
apps/desktop/src/store/zustand/listener/transcript.ts (1)
  • HandlePersistCallback (9-9)
plugins/listener/js/bindings.gen.ts (2)
  • runBatch (74-81)
  • BatchParams (101-101)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/hooks/useSTTConnection.ts (1)
  • useSTTConnection (14-85)
apps/desktop/src/hooks/useKeywords.ts (1)
  • useKeywords (7-20)
apps/desktop/src/config/use-config.ts (1)
  • useConfigValue (20-32)
apps/desktop/src/store/zustand/listener/index.ts (3)
apps/desktop/src/store/zustand/listener/general.ts (2)
  • GeneralState (19-28)
  • GeneralActions (30-41)
apps/desktop/src/store/zustand/listener/transcript.ts (2)
  • TranscriptState (11-15)
  • TranscriptActions (17-21)
apps/desktop/src/store/zustand/listener/batch.ts (3)
  • BatchState (9-9)
  • BatchActions (11-13)
  • createBatchSlice (15-29)
apps/desktop/src/store/zustand/listener/general.ts (3)
plugins/listener/js/bindings.gen.ts (2)
  • BatchParams (101-101)
  • BatchResponse (103-103)
apps/desktop/src/store/zustand/listener/transcript.ts (2)
  • HandlePersistCallback (9-9)
  • TranscriptActions (17-21)
apps/desktop/src/store/zustand/listener/batch.ts (1)
  • BatchActions (11-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: ci (macos, macos-14)
🔇 Additional comments (32)
owhisper/owhisper-interface/src/stream.rs (1)

7-7: LGTM! Clear type naming for generated bindings.

The specta(rename) attributes provide clearer distinction between stream and batch types in the generated TypeScript bindings without affecting Rust code.

Also applies to: 20-20, 31-31, 38-38, 47-47, 59-59

owhisper/owhisper-client/Cargo.toml (1)

19-21: LGTM! Dependencies support batch HTTP client.

The additions of reqwest, thiserror, and tokio appropriately support the new BatchClient functionality for file transcription.

plugins/listener/src/lib.rs (1)

45-45: LGTM! Batch command properly registered.

The run_batch command is correctly added to the Specta command collection, consistent with the existing pattern.

crates/transcribe-moonshine/src/service/streaming.rs (1)

24-25: LGTM! Import paths updated for module reorganization.

The import paths correctly reflect the new structure where stream-related types are now in the owhisper_interface::stream submodule.

plugins/listener/src/commands.rs (1)

1-1: LGTM! Clean batch command implementation.

The run_batch command follows the established pattern, with proper annotations and error handling. The delegation to app.run_batch(params) maintains clean separation of concerns.

Also applies to: 75-82

crates/transcribe-whisper-local/src/service/streaming.rs (1)

22-23: LGTM! Import paths updated for module reorganization.

The import restructuring correctly aligns with the new module organization where stream types are in the owhisper_interface::stream submodule.

Cargo.toml (1)

229-229: No issues found; upgrade to deepgram 0.7 is compatible and secure.

No security vulnerabilities are reported for deepgram 0.7, and the 0.7.0 release notes show no breaking changes—only additive features for detected-languages support and batch responses. The codebase already uses 0.7.0-compatible APIs throughout (transcription streaming, Options builder pattern, Language/Model/Encoding types), confirming compatibility.

plugins/listener/src/error.rs (1)

23-26: LGTM!

The new error variants are well-defined. The transparent wrapper for BatchError and the simple message for UnsupportedBatchProvider follow Rust error handling best practices.

apps/desktop/src/routes/app/onboarding/welcome.tsx (1)

1-52: LGTM!

The component is well-structured and follows React best practices. The extraction from the main onboarding file improves modularity, and the prop-based onNext pattern provides good flexibility.

owhisper/owhisper-server/src/commands/run/state.rs (2)

3-3: LGTM!

The import simplification improves readability without changing functionality.


61-76: LGTM!

Using the locally imported StreamResponse type instead of the fully qualified name improves code readability while maintaining the same functionality.

apps/desktop/src/store/zustand/listener/index.ts (3)

3-3: LGTM!

The batch slice import follows the established pattern used for other slices.


7-8: LGTM!

The type composition correctly integrates batch state and actions alongside existing slices.


13-18: LGTM!

The batch slice is properly integrated into the store using the same pattern as the general and transcript slices.

plugins/listener/src/actors/listener.rs (3)

8-9: LGTM!

The import organization is cleaner with the explicit imports from owhisper_interface::stream.


18-26: LGTM!

Using the locally imported StreamResponse type improves readability.


98-109: LGTM!

The handler correctly uses the StreamResponse type with the same functionality as before.

apps/desktop/src/routes/app/onboarding/permissions.tsx (1)

1-76: LGTM!

The Permissions component is well-structured and follows React best practices. The logic for checking all permissions is correct, and the conditional rendering of the Continue button and helper message provides clear user feedback.

apps/desktop/src/store/zustand/listener/general.ts (3)

6-7: LGTM!

The import additions are clean and properly typed.

Also applies to: 16-16


37-40: LGTM!

The type signature follows the same pattern as the start method with proper Promise return type.

Also applies to: 63-63


143-148: LGTM!

The batch response handling follows the same pattern as the existing stream response handler.

plugins/listener/src/ext.rs (3)

11-32: LGTM!

The BatchProvider enum and BatchParams struct are well-defined with appropriate serde attributes for optional fields.


50-50: LGTM!

The trait method signature is consistent with other async methods in the trait.


149-181: LGTM!

The run_batch implementation correctly handles both provider types and properly propagates errors. The error mapping on line 175 reuses StartSessionFailed for batch emit failures, which is acceptable though semantically not perfect.

owhisper/owhisper-server/src/commands/run/realtime.rs (1)

5-5: LGTM!

This refactoring improves code readability by importing StreamResponse and using the short path throughout the file instead of the fully qualified path.

Also applies to: 26-26, 96-96, 126-126, 187-187

apps/desktop/src/routes/app/onboarding/shared.tsx (1)

87-95: LGTM!

The Divider component follows a common pattern for text-separated horizontal rules.

owhisper/owhisper-client/src/lib.rs (6)

2-3: LGTM!

The new imports are appropriate for the batch transcription functionality.

Also applies to: 6-10


60-161: LGTM!

The URL building refactor cleanly separates URL construction from WebSocket scheme selection, enabling reuse for the batch HTTP endpoint. The .expect() calls are appropriate for a builder pattern where invalid configuration should fail fast.


177-191: LGTM!

The build_batch method correctly constructs a BatchClient using the HTTP URL (not WebSocket) from build_url.


209-224: LGTM!

The BatchClient struct and BatchError enum are well-defined with appropriate error variants covering file I/O, HTTP, and status code failures.


312-315: LGTM!

The builder method follows the same pattern as ListenClient::builder().


317-347: LGTM!

The transcribe_file implementation correctly handles file reading, HTTP request construction with proper headers, and error cases. The file is read entirely into memory (line 322), which is acceptable for typical batch audio files but could be optimized with streaming for very large files if needed in the future.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ca4386 and 38d9e2c.

📒 Files selected for processing (4)
  • plugins/listener/src/actors/batch.rs (1 hunks)
  • plugins/listener/src/actors/mod.rs (1 hunks)
  • plugins/listener/src/error.rs (1 hunks)
  • plugins/listener/src/ext.rs (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • plugins/listener/src/error.rs
🧰 Additional context used
🧬 Code graph analysis (2)
plugins/listener/src/actors/batch.rs (2)
plugins/listener/js/bindings.gen.ts (3)
  • StreamResponse (114-114)
  • SessionEvent (107-107)
  • BatchResponse (103-103)
crates/audio-utils/src/lib.rs (3)
  • source_from_path (72-79)
  • resample_audio (81-130)
  • f32_to_i16_bytes (51-61)
plugins/listener/src/ext.rs (4)
plugins/listener/js/bindings.gen.ts (5)
  • SessionParams (108-108)
  • SessionEvent (107-107)
  • BatchProvider (102-102)
  • BatchParams (101-101)
  • BatchResponse (103-103)
plugins/listener/src/commands.rs (1)
  • run_batch (77-82)
owhisper/owhisper-client/src/lib.rs (1)
  • params (55-58)
plugins/listener/src/actors/batch.rs (1)
  • name (40-42)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
owhisper/owhisper-client/Cargo.toml (1)

4-4: Invalid Rust edition "2024".

Rust editions are currently limited to 2015, 2018, and 2021. The edition "2024" does not exist and will cause compilation failure. Correct this to a valid edition (likely "2021").

-edition = "2024"
+edition = "2021"
apps/desktop/src/components/main/body/sessions/floating/listen.tsx (1)

104-146: Configure action must not start listening. Calling onPrimaryClick() inside handleAction means pressing the “Configure” button (or tooltip action) kicks off listening before navigation. When the button is disabled because prerequisites aren’t met, this still fires the start-session logic, producing the exact failure the warning was trying to prevent. Please drop the onPrimaryClick() call and open settings without starting the session.

🧹 Nitpick comments (9)
packages/tiptap/src/shared/extensions/placeholder.tsx (1)

162-189: LGTM: Error boundary protects editor from placeholder rendering failures.

The error boundary implementation correctly follows React patterns and prevents rendering errors in custom placeholder content from breaking the entire editor. Returning null on error (line 185) ensures graceful degradation—the placeholder simply disappears rather than crashing the editor.

Optional: Consider whether logging to console.error is sufficient or if errors should be reported to a centralized error tracking service (if one exists in the project).

apps/desktop/src/components/main/body/sessions/note-input/raw.tsx (1)

25-30: Verify that empty mention search is intentional.

The mention configuration's handleSearch always returns an empty array. If mentions are not needed in this context, consider removing the mentionConfig entirely to simplify the component. If this is a placeholder for future implementation, consider adding a comment.

apps/desktop/src/hooks/useSTTConnection.ts (1)

64-90: Consider removing isLocalModel from the dependency array.

Since isLocalModel is derived from current_stt_provider and current_stt_model (both already in the dependency array), including it is redundant. The memoized function will already recompute when those primitive values change.

Apply this diff to streamline the dependency array:

   }, [
     current_stt_provider,
     current_stt_model,
-    isLocalModel,
     localConnection,
     baseUrl,
     apiKey,
   ]);

Alternatively, if you prefer to keep isLocalModel in the deps for clarity, consider memoizing it with useMemo to avoid recomputing it on every render:

+  const isLocalModel = useMemo(
+    () => current_stt_provider === "hyprnote"
+      && (current_stt_model?.startsWith("am-") || current_stt_model?.startsWith("Quantized")),
+    [current_stt_provider, current_stt_model]
+  );
apps/desktop/src/store/zustand/listener/general.test.ts (1)

48-59: Consider testing edge cases for batch progress.

The batch state tests cover happy path scenarios but could benefit from:

  • Multiple simultaneous sessions with different progress states
  • Progress updates with missing or zero duration values
  • Clearing a session that was never tracked

Apply this diff to add edge case tests:

+    test("handleBatchProgress handles multiple sessions independently", () => {
+      const { handleBatchProgress } = store.getState();
+      
+      handleBatchProgress("session-1", { audioDuration: 10, transcriptDuration: 5 });
+      handleBatchProgress("session-2", { audioDuration: 20, transcriptDuration: 15 });
+      
+      expect(store.getState().batchProgressBySession["session-1"]).toEqual({ audioDuration: 10, transcriptDuration: 5 });
+      expect(store.getState().batchProgressBySession["session-2"]).toEqual({ audioDuration: 20, transcriptDuration: 15 });
+    });
+
+    test("clearBatchSession is idempotent for non-existent sessions", () => {
+      const { clearBatchSession } = store.getState();
+      expect(() => clearBatchSession("never-existed")).not.toThrow();
+    });
apps/desktop/src/components/main/body/sessions/note-input/transcript/progress.tsx (1)

52-63: Consider handling stale progress after batch completion.

If batchProgressBySession[sessionId] is not cleared immediately when the mode transitions away from "running_batch", the progress UI could briefly render stale data before disappearing.

Verify that clearBatchSession is called as part of the batch completion flow to prevent this edge case:

#!/bin/bash
# Description: Check if clearBatchSession is called on batch completion

rg -n 'clearBatchSession' --type=ts --type=tsx -B 5 -A 5
apps/desktop/src/hooks/useRunBatch.ts (1)

132-146: Consider validating file path before batch invocation.

The hook doesn't validate that filePath exists or is accessible before calling runBatch. Consider adding a check or documenting that callers are responsible for validation.

Add a validation step:

     return useCallback(
       async (filePath: string, options?: RunOptions) => {
         if (!store || !conn || !runBatch) {
           console.error("no_batch_connection");
           return;
         }
+
+        // Validate file path exists
+        // Note: Add actual validation using Tauri fs commands if needed
+        if (!filePath || filePath.trim().length === 0) {
+          console.error("invalid_file_path");
+          return;
+        }
 
         const provider: BatchParams["provider"] | null = (() => {
owhisper/owhisper-client/src/lib.rs (1)

56-130: Consider extracting common query parameter logic.

There's significant duplication between build_batch_url and build_url. Both methods:

  • Call listen_endpoint_url()
  • Create query pairs
  • Append language queries
  • Set common parameters like model, sample_rate, encoding, etc.
  • Append keyword queries

Consider extracting shared logic into helper methods to improve maintainability:

fn append_common_params(
    query_pairs: &mut Serializer<UrlQuery>,
    params: &owhisper_interface::ListenParams,
) {
    let model = params.model.as_deref().unwrap_or("hypr-whisper");
    let sample_rate = RESAMPLED_SAMPLE_RATE_HZ.to_string();
    
    query_pairs.append_pair("model", model);
    query_pairs.append_pair("encoding", "linear16");
    query_pairs.append_pair("sample_rate", &sample_rate);
    query_pairs.append_pair("diarize", "true");
    query_pairs.append_pair("punctuate", "true");
    query_pairs.append_pair("smart_format", "true");
    query_pairs.append_pair("numerals", "true");
    query_pairs.append_pair("mip_opt_out", "true");
}

Then call this from both methods and only append the method-specific parameters afterward.

apps/desktop/src/store/zustand/listener/general.ts (1)

298-349: Consider simplifying nested callback structure.

The event handling uses deeply nested callbacks that could be harder to maintain. While functional, consider refactoring to use a more linear async/await pattern if the underlying event listener API supports it.

For example, if you can await the event listener setup:

runBatch: async (params, options) => {
  // ... validation ...
  
  try {
    const unlisten = await listenerEvents.sessionEvent.listen(/* ... */);
    
    try {
      const result = await listenerCommands.runBatch(params);
      if (result.status === "error") {
        throw new Error(result.error);
      }
    } catch (error) {
      unlisten();
      throw error;
    }
  } catch (error) {
    cleanup();
    throw error;
  }
}

This improves readability by reducing nesting depth.

plugins/misc/src/audio.rs (1)

52-85: Broaden the cross-device rename fallback

Right now we only trigger the copy/remove fallback when raw_os_error() == 18 on Unix. On Windows and other platforms, a cross-volume rename surfaces as ErrorKind::CrossDeviceLink (or ERROR_NOT_SAME_DEVICE) and this branch never fires, so the command will bubble up an error if we ever move the temp file outside the session directory (e.g. NamedTempFile on a different drive). Please switch the check to err.kind() == ErrorKind::CrossDeviceLink and add the corresponding Windows raw code guard so the fallback stays reliable across platforms.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 38d9e2c and d1f9a55.

⛔ Files ignored due to path filters (12)
  • Cargo.lock is excluded by !**/*.lock
  • plugins/listener/js/bindings.gen.ts is excluded by !**/*.gen.ts
  • plugins/listener/permissions/autogenerated/commands/run_batch.toml is excluded by !plugins/**/permissions/**
  • plugins/listener/permissions/autogenerated/reference.md is excluded by !plugins/**/permissions/**
  • plugins/listener/permissions/default.toml is excluded by !plugins/**/permissions/**
  • plugins/listener/permissions/schemas/schema.json is excluded by !plugins/**/permissions/**
  • plugins/misc/js/bindings.gen.ts is excluded by !**/*.gen.ts
  • plugins/misc/permissions/autogenerated/commands/audio_import.toml is excluded by !plugins/**/permissions/**
  • plugins/misc/permissions/autogenerated/reference.md is excluded by !plugins/**/permissions/**
  • plugins/misc/permissions/default.toml is excluded by !plugins/**/permissions/**
  • plugins/misc/permissions/schemas/schema.json is excluded by !plugins/**/permissions/**
  • plugins/permissions/src/error.rs is excluded by !plugins/**/permissions/**
📒 Files selected for processing (50)
  • apps/desktop/src/components/main/body/sessions/floating/generate.tsx (0 hunks)
  • apps/desktop/src/components/main/body/sessions/floating/index.tsx (1 hunks)
  • apps/desktop/src/components/main/body/sessions/floating/listen.tsx (3 hunks)
  • apps/desktop/src/components/main/body/sessions/floating/shared.tsx (3 hunks)
  • apps/desktop/src/components/main/body/sessions/index.tsx (1 hunks)
  • apps/desktop/src/components/main/body/sessions/note-input/header.tsx (1 hunks)
  • apps/desktop/src/components/main/body/sessions/note-input/raw.tsx (1 hunks)
  • apps/desktop/src/components/main/body/sessions/note-input/transcript/index.tsx (1 hunks)
  • apps/desktop/src/components/main/body/sessions/note-input/transcript/progress.tsx (1 hunks)
  • apps/desktop/src/components/main/body/sessions/note-input/transcript/shared/index.tsx (1 hunks)
  • apps/desktop/src/components/main/body/sessions/outer-header/listen.tsx (1 hunks)
  • apps/desktop/src/components/main/body/sessions/outer-header/overflow.tsx (2 hunks)
  • apps/desktop/src/components/main/body/sessions/shared.tsx (2 hunks)
  • apps/desktop/src/components/onboarding/permissions.tsx (1 hunks)
  • apps/desktop/src/components/settings/memory/custom-vocabulary.tsx (6 hunks)
  • apps/desktop/src/hooks/useAITaskTask.ts (1 hunks)
  • apps/desktop/src/hooks/useLatestRef.ts (1 hunks)
  • apps/desktop/src/hooks/useRunBatch.ts (1 hunks)
  • apps/desktop/src/hooks/useSTTConnection.ts (2 hunks)
  • apps/desktop/src/routes/app/onboarding/index.tsx (1 hunks)
  • apps/desktop/src/store/tinybase/main.ts (2 hunks)
  • apps/desktop/src/store/zustand/listener/batch.ts (1 hunks)
  • apps/desktop/src/store/zustand/listener/general.test.ts (2 hunks)
  • apps/desktop/src/store/zustand/listener/general.ts (10 hunks)
  • apps/desktop/src/store/zustand/listener/index.ts (1 hunks)
  • crates/audio-utils/Cargo.toml (1 hunks)
  • crates/audio-utils/src/error.rs (1 hunks)
  • crates/audio-utils/src/lib.rs (3 hunks)
  • crates/audio-utils/src/vorbis.rs (1 hunks)
  • owhisper/owhisper-client/Cargo.toml (1 hunks)
  • owhisper/owhisper-client/src/batch.rs (1 hunks)
  • owhisper/owhisper-client/src/error.rs (1 hunks)
  • owhisper/owhisper-client/src/lib.rs (4 hunks)
  • owhisper/owhisper-client/src/live.rs (1 hunks)
  • owhisper/owhisper-interface/src/batch.rs (1 hunks)
  • owhisper/owhisper-server/src/misc.rs (1 hunks)
  • packages/tiptap/src/shared/extensions/placeholder.tsx (4 hunks)
  • plugins/listener/build.rs (1 hunks)
  • plugins/listener/src/actors/batch.rs (1 hunks)
  • plugins/listener/src/actors/recorder.rs (4 hunks)
  • plugins/listener/src/error.rs (1 hunks)
  • plugins/listener/src/events.rs (2 hunks)
  • plugins/listener/src/ext.rs (3 hunks)
  • plugins/local-stt/src/server/external.rs (1 hunks)
  • plugins/misc/Cargo.toml (2 hunks)
  • plugins/misc/build.rs (1 hunks)
  • plugins/misc/src/audio.rs (1 hunks)
  • plugins/misc/src/commands.rs (2 hunks)
  • plugins/misc/src/error.rs (1 hunks)
  • plugins/misc/src/lib.rs (2 hunks)
💤 Files with no reviewable changes (1)
  • apps/desktop/src/components/main/body/sessions/floating/generate.tsx
✅ Files skipped from review due to trivial changes (1)
  • owhisper/owhisper-server/src/misc.rs
🚧 Files skipped from review as they are similar to previous changes (4)
  • plugins/listener/src/error.rs
  • owhisper/owhisper-interface/src/batch.rs
  • apps/desktop/src/store/zustand/listener/index.ts
  • plugins/listener/src/events.rs
🧰 Additional context used
🧬 Code graph analysis (25)
plugins/misc/src/commands.rs (1)
plugins/misc/src/audio.rs (1)
  • import_audio (14-88)
apps/desktop/src/components/main/body/sessions/note-input/transcript/index.tsx (1)
apps/desktop/src/components/main/body/sessions/note-input/transcript/progress.tsx (1)
  • TranscriptionProgress (13-64)
apps/desktop/src/components/main/body/sessions/note-input/transcript/progress.tsx (1)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/components/main/body/sessions/index.tsx (1)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/components/main/body/sessions/floating/index.tsx (2)
apps/desktop/src/components/main/body/sessions/floating/listen.tsx (1)
  • ListenButton (28-48)
apps/desktop/src/components/main/body/sessions/outer-header/listen.tsx (1)
  • ListenButton (15-28)
owhisper/owhisper-client/src/live.rs (2)
plugins/listener/js/bindings.gen.ts (1)
  • StreamResponse (114-114)
owhisper/owhisper-client/src/batch.rs (1)
  • builder (40-42)
plugins/misc/src/audio.rs (2)
crates/audio-utils/src/vorbis.rs (2)
  • encode_vorbis_mono (103-109)
  • mix_down_to_mono (177-189)
crates/audio-utils/src/lib.rs (1)
  • resample_audio (84-133)
crates/audio-utils/src/lib.rs (2)
crates/moonshine/src/model.rs (1)
  • decoder (279-279)
crates/audio/src/resampler.rs (1)
  • source (117-117)
plugins/misc/src/error.rs (1)
crates/moonshine/src/model.rs (1)
  • decoder (279-279)
plugins/listener/src/actors/recorder.rs (1)
crates/audio-utils/src/vorbis.rs (2)
  • decode_vorbis_to_wav_file (111-150)
  • encode_wav_to_vorbis_file (152-175)
owhisper/owhisper-client/src/batch.rs (1)
crates/audio-utils/src/lib.rs (3)
  • f32_to_i16_bytes (55-65)
  • resample_audio (84-133)
  • source_from_path (76-82)
apps/desktop/src/components/main/body/sessions/note-input/transcript/shared/index.tsx (1)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/components/main/body/sessions/outer-header/listen.tsx (1)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/hooks/useRunBatch.ts (6)
apps/desktop/src/store/zustand/listener/transcript.ts (1)
  • HandlePersistCallback (9-9)
plugins/listener/js/bindings.gen.ts (1)
  • BatchParams (101-101)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/hooks/useSTTConnection.ts (1)
  • useSTTConnection (15-91)
apps/desktop/src/hooks/useKeywords.ts (1)
  • useKeywords (7-20)
apps/desktop/src/config/use-config.ts (1)
  • useConfigValue (20-32)
apps/desktop/src/components/settings/memory/custom-vocabulary.tsx (1)
apps/desktop/src/store/tinybase/main.ts (3)
  • UI (59-59)
  • METRICS (475-479)
  • STORE_ID (25-25)
apps/desktop/src/components/main/body/sessions/note-input/header.tsx (2)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/components/main/body/sessions/shared.tsx (1)
  • useHasTranscript (13-21)
plugins/misc/src/lib.rs (2)
plugins/misc/js/bindings.gen.ts (1)
  • commands (9-85)
plugins/misc/src/commands.rs (1)
  • audio_import (76-84)
apps/desktop/src/components/main/body/sessions/floating/listen.tsx (5)
apps/desktop/src/components/main/body/sessions/floating/shared.tsx (2)
  • FloatingButton (9-71)
  • ActionableTooltipContent (7-7)
apps/desktop/src/components/main/body/sessions/shared.tsx (1)
  • ActionableTooltipContent (84-109)
plugins/listener/js/bindings.gen.ts (1)
  • runBatch (74-81)
apps/desktop/src/hooks/useRunBatch.ts (1)
  • useRunBatch (24-147)
packages/utils/src/cn.ts (1)
  • cn (20-22)
apps/desktop/src/store/zustand/listener/batch.ts (3)
plugins/listener/js/bindings.gen.ts (2)
  • BatchResponse (103-103)
  • BatchAlternatives (99-99)
apps/desktop/src/store/zustand/listener/transcript.ts (2)
  • HandlePersistCallback (9-9)
  • fixSpacingForWords (189-213)
apps/desktop/src/utils/segment.ts (2)
  • WordLike (11-16)
  • RuntimeSpeakerHint (26-29)
owhisper/owhisper-client/src/lib.rs (2)
crates/transcribe-whisper-local/src/service/streaming.rs (1)
  • params (99-103)
crates/am/src/client.rs (2)
  • new (11-16)
  • new (99-106)
apps/desktop/src/components/main/body/sessions/outer-header/overflow.tsx (2)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/hooks/useStartListening.ts (1)
  • useStartListening (11-109)
apps/desktop/src/components/main/body/sessions/shared.tsx (4)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/store/zustand/ai-task/task-configs/index.ts (1)
  • createTaskId (57-62)
apps/desktop/src/hooks/useAITaskTask.ts (1)
  • useAITaskTask (31-97)
apps/desktop/src/hooks/useSTTConnection.ts (1)
  • useSTTConnection (15-91)
plugins/listener/src/ext.rs (4)
plugins/listener/js/bindings.gen.ts (5)
  • SessionParams (108-108)
  • SessionEvent (107-107)
  • BatchProvider (102-102)
  • BatchParams (101-101)
  • BatchResponse (103-103)
plugins/listener/src/commands.rs (1)
  • run_batch (77-82)
owhisper/owhisper-client/src/lib.rs (1)
  • params (33-36)
plugins/listener/src/actors/batch.rs (1)
  • name (81-83)
plugins/listener/src/actors/batch.rs (3)
plugins/listener/js/bindings.gen.ts (3)
  • StreamResponse (114-114)
  • SessionEvent (107-107)
  • BatchResponse (103-103)
owhisper/owhisper-interface/src/batch.rs (5)
  • channel (85-89)
  • from (52-63)
  • from (67-80)
  • from (84-92)
  • from (96-98)
crates/audio-utils/src/lib.rs (1)
  • chunk_audio_file (141-167)
apps/desktop/src/store/zustand/listener/general.ts (3)
plugins/listener/js/bindings.gen.ts (5)
  • SessionEvent (107-107)
  • BatchParams (101-101)
  • SessionParams (108-108)
  • StreamResponse (114-114)
  • BatchResponse (103-103)
apps/desktop/src/store/zustand/listener/transcript.ts (2)
  • HandlePersistCallback (9-9)
  • TranscriptActions (17-21)
apps/desktop/src/store/zustand/listener/batch.ts (1)
  • BatchActions (18-22)
🪛 Biome (2.1.2)
packages/tiptap/src/shared/extensions/placeholder.tsx

[error] 96-96: Avoid passing children using a prop

The canonical way to pass children in React is to use additional arguments to React.createElement

(lint/correctness/noChildrenProp)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: ci (macos, macos-14)
🔇 Additional comments (53)
plugins/listener/build.rs (1)

10-10: LGTM! Command registration follows the correct pattern.

The addition of "run_batch" to the COMMANDS array properly registers the new batch transcription command with the Tauri plugin builder.

plugins/local-stt/src/server/external.rs (1)

60-60: Verify the impact of increased log verbosity.

The simplified filter now only excludes /v1/status lines, which will increase logging from the external STT process. While this cleaner condition is easier to maintain and may help debug the new batch transcription feature, ensure the additional log output (previously filtered lines like [TranscriptionHandler], [WebSocket], "Sent interim") doesn't negatively impact production log readability.

apps/desktop/src/components/onboarding/permissions.tsx (1)

5-6: LGTM – Import paths updated correctly.

The import paths have been updated to reflect the reorganization of files into apps/desktop/src/. The relative paths are correct based on the current file location.

apps/desktop/src/routes/app/onboarding/index.tsx (1)

8-11: LGTM – Components correctly moved to shared location.

The onboarding components have been moved from the route directory to a shared components directory, which is a good architectural practice. The updated import paths are correct.

owhisper/owhisper-client/Cargo.toml (2)

14-22: Dependency structure aligns well with batch transcription needs.

The new dependencies support the PR's batch transcription feature:

  • reqwest (with "json" feature) enables HTTP client for batch requests
  • tokio provides async runtime for concurrent operations
  • futures-util supports async stream/future composition
  • thiserror provides ergonomic error handling for the client
  • url supports URL parsing/manipulation

All dependencies use workspace = true, maintaining consistent versioning across the crate.


14-22: Tokio duplication is intentional and necessary.

The verification confirms that tokio appears in both dependencies and dev-dependencies for legitimate reasons:

  • Main code usage: batch.rs uses tokio::task and error.rs uses tokio::task::JoinError for core async functionality
  • Test code usage: Multiple #[tokio::test] macros in lib.rs require tokio in dev-dependencies for test compilation

This is not redundant; the duplication is correct and should be kept as-is.

packages/tiptap/src/shared/extensions/placeholder.tsx (1)

1-1: LGTM: Selective imports for better tree-shaking.

Switching to selective imports is a good practice that can reduce bundle size.

apps/desktop/src/components/main/body/sessions/note-input/raw.tsx (2)

7-35: Clean component structure.

The RawEditor component is well-structured with proper forwardRef typing and integration with the TinyBase store. The simplification from removing file upload logic makes the component more focused on its core editing functionality.


37-53: LGTM!

The placeholder implementation is clean and provides clear user guidance. The conditional rendering ensures the placeholder only appears in the appropriate context (first paragraph at position 0).

owhisper/owhisper-client/src/error.rs (3)

1-2: LGTM!

The imports are minimal and properly scoped for the error types below.


4-5: LGTM!

Appropriate use of thiserror::Error for idiomatic Rust error handling. The derives are suitable given that some wrapped error types (like JoinError) don't support Clone or PartialEq.


6-11: LGTM!

Good use of thiserror attributes:

  • #[from] enables automatic error conversion
  • #[error(transparent)] correctly delegates to the underlying error's formatting

The error variants appropriately cover audio processing failures and HTTP client errors.

plugins/listener/src/actors/recorder.rs (4)

6-8: Good refactoring to use audio utility library.

Replacing manual Vorbis encoding/decoding with library utilities improves maintainability and reduces code duplication.


54-57: LGTM! Proper error handling for OGG decode.

The decoding logic correctly handles existing OGG files by converting to WAV and cleaning up the original, with appropriate error mapping.


140-142: LGTM! Clean error conversion helper.

The error mapping function is a clean and idiomatic way to convert between error types.


117-123: The web search didn't provide the specific repository history. Let me search the codebase directly for what was used before this change.

Now let me search the recorder.rs file directly to see if there are any hints about previous encoding settings:

Based on my investigation of the codebase, I found that VorbisEncodeSettings::default() returns sensible, well-reasoned defaults: quality 0.7 (using Variable BitRate encoding) and block_size 4096. Vorbis quality is typically from -0.1 to 1.0, making 0.7 a mid-to-high quality setting. The encoder is built using VorbisBitrateManagementStrategy::QualityVbr with these settings, which is the recommended approach.

The defaults are not arbitrary but intentional refactoring. The DEFAULT_VORBIS_QUALITY and DEFAULT_VORBIS_BLOCK_SIZE constants are explicitly defined in crates/audio-utils/src/vorbis.rs, indicating these settings were consciously chosen for the encoding pipeline.

No issues found. The use of VorbisEncodeSettings::default() is appropriate and maintains consistent encoding behavior.

apps/desktop/src/hooks/useSTTConnection.ts (1)

39-58: LGTM: Simplified error handling.

The removal of the try/catch block and simplified null returns for error states makes the code cleaner and more readable. React Query will handle any thrown errors appropriately.

owhisper/owhisper-client/src/batch.rs (3)

16-37: LGTM: Appropriate use of spawn_blocking for audio processing.

The CPU-intensive audio decoding is correctly offloaded to a blocking thread. The empty samples validation is a good safeguard.


51-56: Defensive channel count handling is appropriate.

The .max(1) on line 52 ensures at least one channel is reported, which is reasonable defensive programming. While a decoder returning 0 channels likely indicates a corrupt file (already caught by the empty samples check), this guards against edge cases without adding complexity.


76-84: LGTM: Error handling is appropriate.

Using unwrap_or_default() for the error body on line 82 is a pragmatic choice. If reading the response body fails (rare), an empty string is acceptable fallback for the error message context.

apps/desktop/src/hooks/useAITaskTask.ts (1)

8-8: LGTM: Good refactor to use centralized hook.

Extracting useLatestRef to a shared module is a good practice for reusability and maintainability.

apps/desktop/src/components/main/body/sessions/note-input/transcript/index.tsx (1)

2-8: LGTM: Clean integration of progress indicator.

The TranscriptionProgress component is appropriately positioned before the transcript content, and the layout adjustments accommodate it correctly.

apps/desktop/src/hooks/useLatestRef.ts (1)

3-11: LGTM: Standard and correct implementation.

This is a well-established React pattern for tracking the latest value in a ref. The implementation correctly updates the ref after each render when the value changes, which is useful for accessing the latest value in callbacks without causing re-renders.

apps/desktop/src/components/main/body/sessions/note-input/transcript/shared/index.tsx (1)

48-50: LGTM: Clean refactor to session-mode-based logic.

The shift from status-based checks to sessionMode centralization improves consistency and clarity. The derived currentActive and editable flags correctly reflect the session state for UI rendering.

apps/desktop/src/components/main/body/sessions/floating/index.tsx (1)

12-19: LGTM: Simplified rendering logic.

The refactored condition is more explicit, and removing the intermediate button variable makes the code cleaner. The updated container classes prepare for flexible multi-button layouts.

apps/desktop/src/components/main/body/sessions/floating/shared.tsx (1)

19-45: LGTM: Standard className prop pattern.

Adding the optional className prop and merging it at the end of the cn() array is a well-established pattern that allows consumer customization while preserving default styling behavior.

apps/desktop/src/components/main/body/sessions/note-input/header.tsx (1)

218-223: LGTM—Batch mode included in tab rendering logic.

This correctly includes "running_batch" when determining which editor tabs to show during active transcription. Note that apps/desktop/src/components/main/body/sessions/index.tsx (line 29) excludes "running_batch" from its isActive check, which may be intentional but should be verified for consistency.

apps/desktop/src/store/zustand/listener/general.test.ts (1)

31-46: LGTM—Session mode helpers validated.

The tests correctly validate:

  • Default "inactive" mode for unmapped sessions
  • Mode updates via setSessionMode
  • Cleanup when mode is set to "inactive"
apps/desktop/src/components/main/body/sessions/outer-header/overflow.tsx (1)

106-143: LGTM—Proper batch mode handling with UI safeguards.

The implementation correctly:

  • Derives session mode via getSessionMode(sessionId)
  • Prevents user interaction during batch processing
  • Provides clear UI feedback for all states (listening, finalizing, batching)
  • Disables the dropdown item when finalizing or batching
apps/desktop/src/components/main/body/sessions/shared.tsx (2)

23-42: LGTM—Tab selection updated for session modes.

The logic correctly derives the current editor tab based on:

  • User's explicit tab selection
  • Session mode (active/finalizing → raw)
  • Transcript availability (→ enhanced or raw)

59-82: LGTM—Batch awareness added to listen button state.

The hook properly:

  • Derives batch state from session mode
  • Disables the listen button during batch processing
  • Provides a clear warning message for batch operations
apps/desktop/src/components/main/body/sessions/note-input/transcript/progress.tsx (2)

21-37: LGTM—Robust progress calculation with proper clamping.

The progress display logic correctly:

  • Handles missing progress data
  • Clamps transcript duration to audio duration
  • Guards against division by zero
  • Applies Math.max(0, ...) to prevent negative values

66-77: LGTM—Time formatting handles hours correctly.

The formatSeconds helper properly formats durations with zero-padded minutes and seconds, and includes hours when needed.

apps/desktop/src/components/main/body/sessions/outer-header/listen.tsx (1)

85-100: LGTM—Mode-based indicator rendering.

The InMeetingIndicator correctly:

  • Derives active and finalizing states from session mode
  • Shows only during actual listening (not batch processing)
  • Disables the stop button during finalization
apps/desktop/src/hooks/useRunBatch.ts (3)

82-130: LGTM—Persistence logic mirrors established patterns.

The default handlePersist callback correctly:

  • Creates word rows with proper foreign keys
  • Handles speaker hints with provider metadata
  • Guards against empty word arrays and missing data
  • Uses fresh timestamps for each row

This mirrors the implementation in useStartListening (as shown in relevant code snippets), ensuring consistency.


69-77: No issues found. Code correctly matches schema definition.

The implementation is schema-compliant and intentional:

  • created_at: new Date().toISOString(){ type: "string" }
  • started_at: Date.now(){ type: "number" }

The different timestamp types serve distinct purposes (ISO string for sorting/display, numeric milliseconds for calculations) and are correctly implemented per the schema defined in schema-external.ts.

Likely an incorrect or invalid review comment.


48-63: Provider support is correctly limited by backend constraints; clarify that only "deepgram" and "am" are supported by the API.

The code correctly handles all available providers. The BatchProvider type defined in the generated bindings (plugins/listener/js/bindings.gen.ts:102) explicitly restricts batch support to only "deepgram" | "am". This reflects backend limitations, not incomplete code:

  • deepgram → supported ✓
  • hyprnote with am-* models → supported via "am"
  • custom, groq, fireworks → unsupported (backend constraint)

The silent console error for unsupported providers is a valid UX concern—users attempting batch with custom provider receive no feedback. However, no other STT provider can support batch unless backend adds support.

apps/desktop/src/components/main/body/sessions/index.tsx (1)

28-29: Current implementation is correct—no changes needed.

The exclusion of "running_batch" from the isActive calculation is intentional. The red highlighting applied when active={isActive} in TabItemBase is meant to indicate live session activity ("running_active" or "finalizing"), not background batch processing. The codebase consistently distinguishes between "active/listening" states and "batching" states across multiple components (e.g., shared.tsx:61-62, outer-header/overflow.tsx:111), indicating deliberate design.

plugins/listener/src/ext.rs (1)

150-208: Nice separation of provider flows. Cleanly constructing listen_params once and branching into the AM actor vs. Deepgram direct client keeps the API easy to extend; 👍 on emitting SessionEvent::BatchResponse directly from here.

owhisper/owhisper-client/src/live.rs (1)

22-101: Audio interleaving looks solid. The helper pads uneven mic/speaker buffers and the WebSocketIO impl cleanly converts control vs. audio frames—this should keep the dual-stream path stable.

apps/desktop/src/store/zustand/listener/batch.ts (1)

78-146: Thanks for fixing the hint offsets. Offsetting wordIndex by the accumulated word count before pushing hints keeps multi-channel transcripts aligned with the combined word array; exactly what we needed.

plugins/listener/src/actors/batch.rs (1)

96-109: Good call preserving the requested channel count. Capturing listen_params.channels up front and reusing it for both the accumulator and builder avoids the multi-channel drop we saw earlier.

owhisper/owhisper-client/src/lib.rs (6)

1-13: LGTM! Good modular organization.

The module structure and public API surface are well-organized. Extracting RESAMPLED_SAMPLE_RATE_HZ as a constant improves maintainability.


132-144: LGTM! WebSocket scheme selection is correct.

The localhost detection for choosing between ws:// and wss:// schemes is appropriate.


146-156: LGTM! Clean conditional authorization header.

The pattern of conditionally adding the Authorization header only when api_key is present is correct.


158-181: LGTM! Well-designed builder convenience methods.

The builder methods provide a clean API for creating clients with different channel configurations. The BatchClient construction pattern follows established patterns in the codebase.


183-227: LGTM! Correct language and keyword query handling.

The logic correctly handles:

  • Auto-detection when no languages specified
  • Single language with both language and languages params
  • Multi-language with language=multi and individual languages entries
  • Nova-3 model detection for keyterm vs keywords parameter names

229-379: LGTM! Comprehensive test coverage.

The tests cover multiple scenarios:

  • Single-channel streaming (Deepgram and local)
  • Dual-channel streaming
  • Different providers and models
  • Integration with the new module structure
apps/desktop/src/store/zustand/listener/general.ts (5)

1-20: LGTM! Clean type definitions.

The use of Extract to derive LiveSessionStatus from SessionEvent["type"] ensures type safety. The SessionMode union appropriately extends live statuses with batch mode.


22-49: LGTM! Well-structured state and action types.

The addition of sessionModes for per-session tracking and the utility methods (isSessionLive, isSessionRunningBatch) provide a clean API for session state management.


77-226: LGTM! Robust session lifecycle management.

The start method includes good safeguards:

  • Validates targetSessionId is present (lines 80-83)
  • Prevents starting live session during batch processing (lines 85-89)
  • Updates session mode at all lifecycle transitions
  • Handles both live streaming and batch response events

The per-session state tracking ensures multiple sessions can be managed independently.


254-296: Good validation and cleanup logic.

The method includes appropriate safeguards:

  • Validates sessionId is provided
  • Prevents concurrent live/batch operations
  • Properly cleans up state including transcript persistence

350-380: LGTM! Clean session mode utilities.

The implementation correctly:

  • Removes inactive sessions from the map (lines 357-360)
  • Checks the current active session first, then falls back to the map (lines 372-374)
  • Provides convenient boolean checks for common states

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
apps/desktop/src/store/zustand/listener/general.ts (1)

288-358: Remove debug logging and add timeout protection.

The event handling logic in runBatch is well-structured, properly routing batch lifecycle events. However, two issues remain from the previous review:

  1. Line 322 contains debug logging that should be removed or replaced with proper logging
  2. No timeout mechanism exists - if the backend never sends batchResponse, the promise hangs indefinitely

Apply this diff to address both issues:

+ const BATCH_TIMEOUT_MS = 300000; // 5 minutes
+
  await new Promise<void>((resolve, reject) => {
+   const timeoutId = setTimeout(() => {
+     console.error("[runBatch] timeout waiting for batch response");
+     cleanup();
+     reject(new Error("Batch processing timeout"));
+   }, BATCH_TIMEOUT_MS);
+
    listenerEvents.sessionEvent
      .listen(({ payload }) => {
        // ... handle batchStarted, batchProgress, batchFailed ...
        
        if (payload.type !== "batchResponse") {
          return;
        }
        
-       console.log("[runBatch] batch response", payload.response);
        
        try {
+         clearTimeout(timeoutId);
          get().handleBatchResponse(sessionId, payload.response);
          cleanup();
          resolve();
        } catch (error) {
+         clearTimeout(timeoutId);
          console.error("[runBatch] error handling batch response", error);
          cleanup();
          reject(error);
        }
      })
      .then((fn) => {
        unlisten = fn;
        
        listenerCommands
          .runBatch(params)
          .then((result) => {
            if (result.status === "error") {
+             clearTimeout(timeoutId);
              console.error("[runBatch] command failed", result.error);
              cleanup();
              reject(result.error);
            }
          })
          .catch((error) => {
+           clearTimeout(timeoutId);
            console.error("[runBatch] command error", error);
            cleanup();
            reject(error);
          });
      })
      .catch((error) => {
+       clearTimeout(timeoutId);
        console.error("[runBatch] listener setup failed", error);
        cleanup();
        reject(error);
      });
  });
🧹 Nitpick comments (6)
crates/ws/src/client.rs (1)

120-127: Consider simplifying the pre-match metadata extraction.

The is_text and is_binary flags are redundant—within the Message::Text(_) | Message::Binary(_) branch, you can determine the type by checking text_preview.is_some() (or by directly matching on the original message type in a nested match).

Additionally, text_preview is computed unconditionally via to_string(), even when from_message succeeds. While not critical, this incurs a small allocation overhead on the happy path.

One approach to eliminate the flags and defer the clone:

-                                let is_text = matches!(msg, Message::Text(_));
-                                let is_binary = matches!(msg, Message::Binary(_));
-                                let text_preview = if let Message::Text(ref t) = msg {
-                                    Some(t.to_string())
-                                } else {
-                                    None
-                                };
-
                                match msg {
                                    Message::Text(_) | Message::Binary(_) => {
                                        if let Some(output) = T::from_message(msg) {
                                            yield Ok(output);
                                        } else {
-                                            if is_text {
-                                                if let Some(text) = text_preview {
-                                                    tracing::warn!("ws_message_parse_failed: {}", text);
-                                                }
-                                            } else if is_binary {
-                                                tracing::warn!("ws_binary_message_parse_failed");
+                                            // msg is consumed; re-inspect the message type if needed
+                                            // Note: this requires cloning/storing the message before from_message
                                            }
                                        }
                                    },

However, because from_message consumes msg, you'd need to clone or store the original before calling it. The current approach is reasonable given that constraint; just note that simpler conditional logic is possible if you're willing to restructure.

apps/desktop/src/hooks/useRunBatch.ts (2)

43-46: Improve error handling instead of silent failures.

Currently, errors are only logged to console with no user feedback or error propagation. Consider returning a result object or throwing errors so callers can handle failures appropriately.

For example:

-    async (filePath: string, options?: RunOptions) => {
+    async (filePath: string, options?: RunOptions): Promise<{ success: boolean; error?: string }> => {
       if (!store || !conn || !runBatch) {
         console.error("no_batch_connection");
-        return;
+        return { success: false, error: "No batch connection available" };
       }

82-130: Add error handling for database operations in persist callback.

The persist callback performs multiple database writes (store.setRow) but doesn't handle potential failures. If a write fails, subsequent operations may be inconsistent.

Consider wrapping database operations in try-catch or checking return values:

try {
  store.setRow("words", wordId, { /* ... */ });
  wordIds.push(wordId);
} catch (error) {
  console.error("Failed to persist word:", error);
  // Consider what to do on failure - continue or abort?
}
plugins/listener/src/ext.rs (1)

226-251: Consider consistent error types across provider paths.

The Deepgram provider implementation is straightforward and correct. However, note that error handling differs from the Am path:

  • Am path uses BatchStartFailed for start failures (lines 214, 220)
  • Deepgram path maps errors to StartSessionFailed (lines 233, 248)

This inconsistency may cause confusion. Consider using BatchStartFailed consistently for batch-related errors, or document why different error types are appropriate for each provider.

plugins/listener/src/actors/batch.rs (1)

312-410: Consider increasing the batch stream timeout.

The stream processing logic is sound, with proper shutdown handling and response routing. However, the BATCH_STREAM_TIMEOUT_SECS is set to 10 seconds (line 14), which may be too aggressive for batch transcription:

  • With DEFAULT_CHUNK_MS = 500 and DEFAULT_DELAY_MS = 25 (lines 15-16), chunks are sent every ~25ms
  • Processing latency from the server could legitimately exceed 10 seconds between responses for longer audio files or under load
  • A timeout between responses (not total processing time) of 10 seconds might cause false failures

Consider increasing the timeout to 30-60 seconds, or making it configurable based on expected processing characteristics.

apps/desktop/src/components/main/body/sessions/floating/listen.tsx (1)

169-207: Consider detecting audio channels from file metadata.

The file handling logic is well-structured with proper validation and Effect-based composition. However, line 203 hardcodes channels: 1 for all audio imports:

Effect.flatMap((importedPath) => Effect.promise(() => runBatch(importedPath, { channels: 1 }))),

Audio files can be stereo (2 channels). Consider:

  • Detecting channel count from the imported audio file
  • Allowing users to specify channel handling (mono/stereo/auto)
  • Documenting why mono is always used
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d1f9a55 and e02039b.

⛔ Files ignored due to path filters (11)
  • .cursor/rules/simple.mdc is excluded by !**/.cursor/**
  • plugins/analytics/js/bindings.gen.ts is excluded by !**/*.gen.ts
  • plugins/apple-calendar/js/bindings.gen.ts is excluded by !**/*.gen.ts
  • plugins/db/js/bindings.gen.ts is excluded by !**/*.gen.ts
  • plugins/listener/js/bindings.gen.ts is excluded by !**/*.gen.ts
  • plugins/mcp/js/bindings.gen.ts is excluded by !**/*.gen.ts
  • plugins/sse/js/bindings.gen.ts is excluded by !**/*.gen.ts
  • plugins/store2/js/bindings.gen.ts is excluded by !**/*.gen.ts
  • plugins/webhook/js/bindings.gen.ts is excluded by !**/*.gen.ts
  • plugins/windows/js/bindings.gen.ts is excluded by !**/*.gen.ts
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (19)
  • apps/desktop/src/components/main/body/sessions/floating/listen.tsx (3 hunks)
  • apps/desktop/src/components/main/body/sessions/index.tsx (2 hunks)
  • apps/desktop/src/components/main/body/sessions/note-input/header.tsx (3 hunks)
  • apps/desktop/src/components/main/body/sessions/note-input/index.tsx (1 hunks)
  • apps/desktop/src/components/main/body/sessions/note-input/transcript/index.tsx (1 hunks)
  • apps/desktop/src/components/main/body/sessions/note-input/transcript/progress.tsx (1 hunks)
  • apps/desktop/src/components/main/body/sessions/outer-header/listen.tsx (1 hunks)
  • apps/desktop/src/components/settings/ai/stt/configure.tsx (1 hunks)
  • apps/desktop/src/config/use-config.ts (1 hunks)
  • apps/desktop/src/hooks/useAutoEnhance.ts (1 hunks)
  • apps/desktop/src/hooks/useRunBatch.ts (1 hunks)
  • apps/desktop/src/store/zustand/listener/batch.ts (1 hunks)
  • apps/desktop/src/store/zustand/listener/general.test.ts (1 hunks)
  • apps/desktop/src/store/zustand/listener/general.ts (9 hunks)
  • crates/ws/src/client.rs (1 hunks)
  • plugins/listener/src/actors/batch.rs (1 hunks)
  • plugins/listener/src/error.rs (1 hunks)
  • plugins/listener/src/events.rs (2 hunks)
  • plugins/listener/src/ext.rs (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • apps/desktop/src/components/main/body/sessions/index.tsx
  • apps/desktop/src/store/zustand/listener/general.test.ts
  • apps/desktop/src/components/main/body/sessions/note-input/transcript/index.tsx
  • apps/desktop/src/components/main/body/sessions/outer-header/listen.tsx
  • apps/desktop/src/components/main/body/sessions/note-input/transcript/progress.tsx
  • apps/desktop/src/store/zustand/listener/batch.ts
🧰 Additional context used
🧬 Code graph analysis (11)
apps/desktop/src/config/use-config.ts (1)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/hooks/useRunBatch.ts (7)
apps/desktop/src/store/zustand/listener/transcript.ts (1)
  • HandlePersistCallback (9-9)
plugins/listener/js/bindings.gen.ts (2)
  • runBatch (74-81)
  • BatchParams (101-101)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/store/zustand/tabs/index.ts (1)
  • useTabs (21-30)
apps/desktop/src/hooks/useSTTConnection.ts (1)
  • useSTTConnection (15-91)
apps/desktop/src/hooks/useKeywords.ts (1)
  • useKeywords (7-20)
apps/desktop/src/config/use-config.ts (1)
  • useConfigValue (20-32)
apps/desktop/src/components/settings/ai/stt/configure.tsx (1)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/hooks/useAutoEnhance.ts (1)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/components/main/body/sessions/note-input/header.tsx (4)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/components/main/body/sessions/note-input/transcript/progress.tsx (1)
  • TranscriptionProgress (7-38)
apps/desktop/src/store/zustand/tabs/schema.ts (1)
  • EditorView (11-11)
apps/desktop/src/components/main/body/sessions/shared.tsx (1)
  • useHasTranscript (13-21)
plugins/listener/src/ext.rs (5)
plugins/listener/js/bindings.gen.ts (4)
  • SessionEvent (107-107)
  • BatchProvider (102-102)
  • BatchParams (101-101)
  • BatchResponse (103-103)
plugins/listener/src/commands.rs (1)
  • run_batch (77-82)
owhisper/owhisper-client/src/lib.rs (1)
  • params (33-36)
plugins/listener/src/actors/batch.rs (1)
  • name (89-91)
owhisper/owhisper-client/src/batch.rs (1)
  • builder (40-42)
apps/desktop/src/components/main/body/sessions/note-input/index.tsx (1)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
plugins/listener/src/events.rs (1)
plugins/listener/js/bindings.gen.ts (2)
  • BatchResponse (103-103)
  • StreamResponse (114-114)
apps/desktop/src/components/main/body/sessions/floating/listen.tsx (5)
apps/desktop/src/components/main/body/sessions/shared.tsx (2)
  • useListenButtonState (59-82)
  • ActionableTooltipContent (84-109)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/components/main/body/sessions/floating/shared.tsx (2)
  • FloatingButton (9-71)
  • ActionableTooltipContent (7-7)
apps/desktop/src/hooks/useRunBatch.ts (1)
  • useRunBatch (24-148)
packages/utils/src/cn.ts (1)
  • cn (20-22)
apps/desktop/src/store/zustand/listener/general.ts (3)
plugins/listener/js/bindings.gen.ts (5)
  • SessionEvent (107-107)
  • BatchParams (101-101)
  • SessionParams (108-108)
  • StreamResponse (114-114)
  • BatchResponse (103-103)
apps/desktop/src/store/zustand/listener/transcript.ts (2)
  • HandlePersistCallback (9-9)
  • TranscriptActions (17-21)
apps/desktop/src/store/zustand/listener/batch.ts (2)
  • BatchActions (19-24)
  • BatchState (9-17)
plugins/listener/src/actors/batch.rs (4)
plugins/listener/js/bindings.gen.ts (2)
  • StreamResponse (114-114)
  • SessionEvent (107-107)
plugins/listener/src/ext.rs (1)
  • tokio (166-166)
crates/audio-utils/src/lib.rs (1)
  • chunk_audio_file (141-167)
owhisper/owhisper-client/src/batch.rs (1)
  • builder (40-42)
🔇 Additional comments (32)
apps/desktop/src/config/use-config.ts (1)

45-45: LGTM - State structure migration.

The change correctly migrates from top-level state.status to nested state.live.status, aligning with the new per-session state management introduced in this PR.

apps/desktop/src/components/settings/ai/stt/configure.tsx (1)

443-443: LGTM - Consistent state structure migration.

The inactive check now correctly uses state.live.status, maintaining the same guard logic while adopting the new nested state structure.

apps/desktop/src/hooks/useAutoEnhance.ts (1)

17-17: LGTM - State structure updated.

The status observation correctly migrates to state.live.status, and the subsequent comparison logic in line 65 remains valid with this change.

apps/desktop/src/components/main/body/sessions/note-input/index.tsx (1)

21-21: LGTM - State structure migration.

The inactive check correctly uses state.live.status === "inactive", consistent with the broader state structure changes in this PR.

plugins/listener/src/error.rs (1)

23-28: LGTM - Proper error handling for batch operations.

The new error variants are well-structured:

  • Batch and SpawnError use #[from] for automatic conversion
  • BatchStartFailed provides contextual error messages
  • All maintain consistency with existing error handling patterns
apps/desktop/src/components/main/body/sessions/note-input/header.tsx (3)

176-176: LGTM - Batch processing state check.

The check sessionId in state.batch correctly determines if batch processing is active for this session by checking for the presence of the session in the batch state object.


182-182: LGTM - Progress indicator integration.

The conditional rendering of TranscriptionProgress appropriately shows progress during batch processing or when inactive, replacing the previous editing controls approach.

Also applies to: 210-210


216-221: LGTM - Session mode-based tab logic.

The refactor to use getSessionMode(sessionId) provides cleaner session-specific state management, properly handling both running_active and running_batch modes.

apps/desktop/src/hooks/useRunBatch.ts (3)

146-146: Dependencies array looks comprehensive.

The dependency array includes all external values used in the callback, ensuring proper memoization behavior.


48-63: Batch processing support is intentionally limited to Deepgram and HyprNote's "am-*" models—other providers will silently fail with console error only.

The concern is valid. The codebase defines 5 STT providers (hyprnote, deepgram, custom, groq, fireworks), but batch processing only supports:

  • Deepgram (all models)
  • HyprNote's "am-parakeet-v2" and "am-parakeet-v3" only

Other HyprNote models (QuantizedTinyEn, QuantizedSmallEn), custom, and disabled providers (groq, fireworks) will silently fail. The error is only logged to console, not surfaced to the user. Consider:

  • Adding explicit validation and user-facing error message when batch is triggered with unsupported providers
  • Documenting supported providers in UI settings or API
  • Disabling batch UI option for unsupported providers

122-122: Incorrect scope in concern—line 122 is separate from BatchParams type checking.

Line 122 stores conn.provider (type ProviderId) in a JSON string for speaker hints metadata, which is distinct from BatchParams construction. The BatchParams.provider field uses an explicitly mapped variable (lines 48–54) that safely converts conn.provider to BatchProvider ("deepgram" | "am") through conditional logic. No type incompatibility exists between line 122 and BatchParams.

Likely an incorrect or invalid review comment.

plugins/listener/src/events.rs (2)

1-2: LGTM - Clean imports for batch support.

The imports for BatchResponse and StreamResponse are properly scoped from owhisper_interface.


26-34: LGTM - Comprehensive batch event coverage.

The new event variants provide complete batch operation lifecycle management:

  • BatchStarted signals initiation
  • BatchResponse delivers final results
  • BatchResponseStreamed enables progress tracking with percentage
  • BatchFailed handles errors

All variants use appropriate serde renames for consistent JSON serialization and carry the necessary payload data.

plugins/listener/src/ext.rs (4)

12-34: LGTM!

The BatchProvider enum and BatchParams struct are well-designed with appropriate serde attributes for serialization. The optional fields with #[serde(default)] provide good flexibility for batch transcription configuration.


52-52: LGTM!

The trait method signature is well-defined and follows async patterns consistently with other methods in the trait.


151-162: LGTM!

The ListenParams construction is correct, with sensible defaults for batch processing (channels defaulting to 1, and redemption_time_ms set to None).


164-225: Code implementation verified and correct.

The crate::Error::BatchStartFailed error variant is defined in plugins/listener/src/error.rs:28 as a tuple variant accepting a String. The Am provider implementation correctly orchestrates actor spawning, start notification, and error handling, with proper usage of this error variant.

plugins/listener/src/actors/batch.rs (6)

18-27: LGTM!

The BatchMsg enum comprehensively covers batch lifecycle events, and the BatchStartNotifier type appropriately uses Arc<Mutex<...>> for thread-safe one-time signaling.


28-84: LGTM!

The structs and helper methods are well-designed:

  • BatchArgs is appropriately Clone while BatchState is not (since it owns task handles)
  • Duration validation in on_audio_duration is defensive, clamping infinite/negative values to 0.0
  • Progress percentage calculation properly clamps to [0.0, 1.0] and handles division-by-zero

86-126: LGTM!

The actor lifecycle methods are correctly implemented:

  • pre_start properly spawns the batch task and initializes state
  • post_stop ensures graceful shutdown by signaling via shutdown_tx and awaiting task completion

128-175: LGTM!

Message handling is well-structured:

  • StreamResponse processing filters to only final responses (line 138-141), reducing event noise while still tracking progress
  • Error cases properly emit failure events before stopping the actor
  • Clean actor termination on StreamEnded

277-310: LGTM!

The streaming setup is correctly implemented:

  • Audio chunks are properly wrapped as MixedMessage::Audio and chained with a Finalize message
  • Throttling prevents flooding the server with chunks
  • Error handling ensures the start notifier receives failure status before the task exits
  • Success notification (line 303) occurs only after the stream successfully starts

177-276: LGTM!

The chunk calculation and audio loading logic is robust:

  • chunk_samples() uses saturating arithmetic with proper rounding (ceiling division via +999)
  • Audio loading correctly uses spawn_blocking for potentially blocking I/O
  • Channel count is properly clamped to 1-2 (line 270)
  • Error paths properly notify the start notifier and send StreamStartFailed messages
  • Verified: chunk_audio_file function exists in crates/audio-utils/src/lib.rs
apps/desktop/src/components/main/body/sessions/floating/listen.tsx (4)

1-48: LGTM!

The imports are well-organized, and the ListenButton component correctly accesses the nested live.loading state, consistent with the store refactoring.


50-150: LGTM!

The refactored ListenSplitButton cleanly separates primary action from auxiliary options:

  • Primary button maintains existing functionality with clear onPrimaryClick prop
  • OptionsMenu is properly positioned and receives necessary props (sessionId, disabled, warningMessage)
  • Tooltip integration provides good UX for disabled states

209-264: LGTM!

The file selection and upload handlers are well-implemented:

  • Effect pipeline properly sequences async operations (get directory → open dialog → process file)
  • File type filters match validation logic in handleFilePath
  • Disabled state is checked consistently before allowing actions
  • Popover closes before dialog opens, providing good UX

266-336: LGTM with note on incomplete feature.

The OptionsMenu rendering logic correctly handles all states (disabled with/without tooltip, enabled). The UI is well-structured with appropriate accessibility attributes.

Note that "Upload transcript" is currently disabled (line 327), indicating this feature is not yet complete. Consider adding a TODO comment or tracking issue if this is intended for future implementation.

apps/desktop/src/store/zustand/listener/general.ts (5)

1-48: LGTM!

The type refactoring improves clarity by nesting live session state under a live object, making the distinction between live and batch sessions explicit. The SessionMode type appropriately extends LiveSessionStatus with "running_batch".


78-218: LGTM!

The start method refactoring is thorough and correct:

  • Validates session_id is present (lines 81-84)
  • Prevents conflicts by checking if the session is already in batch mode (lines 86-90)
  • Consistently uses nested live.* state throughout
  • Properly routes batchResponse events with targetSessionId (lines 166-172)
  • Comprehensive cleanup on failure (lines 198-213)

219-245: LGTM!

The stop and setMuted methods are correctly updated to use the nested live.* state structure.


246-287: LGTM!

The runBatch setup and validation logic is robust:

  • Requires sessionId for per-session tracking (lines 249-252)
  • Prevents conflicts with live sessions and duplicate batch runs (lines 254-263)
  • Cleanup function properly manages listeners, persistence, and batch state (lines 275-286)

359-375: LGTM!

The getSessionMode method correctly determines session state by checking both live.sessionId and the batch record, returning the appropriate SessionMode for the given session.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
apps/desktop/src/store/zustand/listener/general.ts (1)

288-354: Add timeout protection to prevent indefinite hangs.

The batch processing promise lacks timeout protection. If the backend fails to send batchResponse or batchFailed, the promise will hang indefinitely.

While lines 303-307 provide a fallback via the batchProgress completion check, this assumes progress events are consistently emitted. Network issues or backend failures could prevent any events from arriving.

Consider adding timeout protection:

+ const BATCH_TIMEOUT_MS = 300000; // 5 minutes, adjust as needed
+
  await new Promise<void>((resolve, reject) => {
+   const timeoutId = setTimeout(() => {
+     console.error(`[runBatch] timeout waiting for batch completion (sessionId: ${sessionId})`);
+     cleanup();
+     reject(new Error("Batch processing timeout"));
+   }, BATCH_TIMEOUT_MS);
+
+   const cleanupWithTimeout = () => {
+     clearTimeout(timeoutId);
+     cleanup();
+   };
+
    listenerEvents.sessionEvent
      .listen(({ payload }) => {
        if (payload.type === "batchStarted") {
          get().handleBatchStarted(payload.session_id);
          return;
        }

        if (payload.type === "batchProgress") {
          get().handleBatchResponseStreamed(
            sessionId,
            payload.response,
            payload.percentage,
          );

          const batchState = get().batch[sessionId];
          if (batchState?.isComplete) {
-           cleanup();
+           cleanupWithTimeout();
            resolve();
          }
          return;
        }

        if (payload.type === "batchFailed") {
-         cleanup();
+         cleanupWithTimeout();
          reject(payload.error);
          return;
        }

        if (payload.type !== "batchResponse") {
          return;
        }

        try {
          get().handleBatchResponse(sessionId, payload.response);
-         cleanup();
+         cleanupWithTimeout();
          resolve();
        } catch (error) {
          console.error("[runBatch] error handling batch response", error);
-         cleanup();
+         cleanupWithTimeout();
          reject(error);
        }
      })
      .then((fn) => {
        unlisten = fn;

        listenerCommands
          .runBatch(params)
          .then((result) => {
            if (result.status === "error") {
              console.error(result.error);
-             cleanup();
+             cleanupWithTimeout();
              reject(result.error);
            }
          })
          .catch((error) => {
            console.error(error);
-           cleanup();
+           cleanupWithTimeout();
            reject(error);
          });
      })
      .catch((error) => {
        console.error(error);
-       cleanup();
+       cleanupWithTimeout();
        reject(error);
      });
  });
🧹 Nitpick comments (1)
plugins/listener/src/actors/batch.rs (1)

384-410: Consider whether iterating all alternatives is necessary.

The function computes the maximum end timestamp across all alternatives in the channel. Typically, only the first alternative (best hypothesis) contains the relevant timing information for progress calculation.

If only the first alternative should be considered, this would be clearer:

 fn transcript_end_from_response(response: &StreamResponse) -> Option<f64> {
     let StreamResponse::TranscriptResponse {
         start,
         duration,
         channel,
         ..
     } = response
     else {
         return None;
     };

     let mut end = (*start + *duration).max(0.0);

-    for alternative in &channel.alternatives {
-        for word in &alternative.words {
+    if let Some(alternative) = channel.alternatives.first() {
+        for word in &alternative.words {
             if word.end.is_finite() {
                 end = end.max(word.end);
             }
         }
     }

     if end.is_finite() {
         Some(end)
     } else {
         None
     }
 }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e02039b and 1b8ff02.

📒 Files selected for processing (3)
  • apps/desktop/src/store/zustand/listener/general.ts (9 hunks)
  • apps/desktop/src/store/zustand/listener/transcript.ts (6 hunks)
  • plugins/listener/src/actors/batch.rs (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
apps/desktop/src/store/zustand/listener/transcript.ts (2)
apps/desktop/src/utils/segment.ts (2)
  • WordLike (11-16)
  • RuntimeSpeakerHint (26-29)
plugins/listener/js/bindings.gen.ts (1)
  • StreamAlternatives (109-109)
apps/desktop/src/store/zustand/listener/general.ts (4)
plugins/listener/js/bindings.gen.ts (5)
  • SessionEvent (107-107)
  • BatchParams (101-101)
  • SessionParams (108-108)
  • StreamResponse (114-114)
  • BatchResponse (103-103)
apps/desktop/src/store/zustand/listener/index.ts (1)
  • SessionMode (12-12)
apps/desktop/src/store/zustand/listener/transcript.ts (2)
  • HandlePersistCallback (9-9)
  • TranscriptActions (18-22)
apps/desktop/src/store/zustand/listener/batch.ts (2)
  • BatchActions (19-24)
  • BatchState (9-17)
plugins/listener/src/actors/batch.rs (3)
plugins/listener/js/bindings.gen.ts (2)
  • StreamResponse (114-114)
  • SessionEvent (107-107)
plugins/listener/src/ext.rs (1)
  • tokio (166-166)
crates/audio-utils/src/lib.rs (1)
  • chunk_audio_file (141-167)
🔇 Additional comments (17)
plugins/listener/src/actors/batch.rs (7)

18-24: LGTM!

The BatchMsg enum provides a clear and complete set of message variants for batch processing actor communication.


26-43: LGTM!

The type definitions are well-structured. The use of Arc<Mutex<Option<...>>> for the notifier and Option for cleanup resources follows Rust best practices.


45-84: LGTM!

The BatchState methods demonstrate good defensive programming with proper handling of edge cases (non-finite values, zero duration) and safe percentage calculations.


86-175: LGTM!

The BatchActor implementation follows the Ractor pattern correctly with proper lifecycle management (pre_start, post_stop) and comprehensive message handling covering all success, error, and completion paths.


177-201: LGTM!

The BatchStreamConfig implementation includes careful overflow protection and boundary checks. The sample calculation correctly converts milliseconds to samples with proper rounding.


203-310: LGTM!

The spawn_batch_task function demonstrates robust error handling at each step, proper use of spawn_blocking for CPU-bound work, and correct notification of start results. The channel count clamping addresses the previously identified issue.


312-382: LGTM!

The process_batch_stream function correctly implements concurrent shutdown signaling and stream processing with timeout protection. All success, error, and completion paths are properly handled.

apps/desktop/src/store/zustand/listener/transcript.ts (5)

4-29: LGTM!

The type updates and new state field finalWordsMaxEndMsByChannel provide per-channel tracking of persisted word boundaries, which is a solid design for incremental persistence.


35-81: LGTM!

The refactored handleFinalWords correctly implements incremental persistence by tracking the last persisted end time per channel and only persisting newly transcribed words. The index adjustments for hints are accurate.


83-120: LGTM!

The handlePartialWords function correctly maintains partial word state by merging new words with existing ones while preserving temporal ordering and adjusting hint indices appropriately.


155-172: LGTM!

The resetTranscript function correctly resets all state fields including the newly added finalWordsMaxEndMsByChannel.


178-213: LGTM!

The transformWords function correctly adapts to the StreamAlternatives type and safely accesses word data within the loop bounds.

apps/desktop/src/store/zustand/listener/general.ts (5)

1-58: LGTM!

The type refactoring to nest live session state and introduce batch processing types is well-structured. The SessionMode union type clearly distinguishes between live and batch processing states.


78-217: LGTM!

The start function enhancements properly validate the session ID, prevent conflicts with batch processing, and consistently update the nested live state throughout the event handling and error paths.


219-244: LGTM!

The stop and setMuted functions correctly update the nested live state fields.


246-286: LGTM!

The validation logic correctly prevents conflicts between live and batch sessions, and the cleanup setup properly manages listener lifecycle and persistence state.


356-372: LGTM!

The getSessionMode function correctly determines session state by checking live and batch session tracking, with appropriate fallback to "inactive".

@yujonglee yujonglee merged commit ac25cd5 into main Nov 7, 2025
3 of 4 checks passed
@yujonglee yujonglee deleted the batch-transcript-support branch November 7, 2025 06:37
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/desktop/src/components/main/body/sessions/note-input/header.tsx (1)

57-63: Remove unused dependency from callback.

The handleTabClick callback includes onRegenerate in its dependency array but doesn't use it in the function body. This can cause unnecessary re-renders.

Apply this diff:

  const handleTabClick = useCallback(() => {
    if (!isActive) {
      onClick();
    } else {
      setOpen(true);
    }
- }, [isActive, onClick, onRegenerate, setOpen]);
+ }, [isActive, onClick, setOpen]);
🧹 Nitpick comments (2)
apps/desktop/src/components/main/body/sessions/note-input/header.tsx (2)

182-189: Simplify the showProgress condition.

The showProgress logic includes isInactive even though TranscriptionProgress internally only renders when mode === "running_batch" (returning null otherwise). This causes the component to mount unnecessarily when isInactive is true but not batch processing, adding confusion.

Apply this diff to simplify:

- const showProgress = currentTab === "transcript" && (isInactive || isBatchProcessing);
+ const showProgress = currentTab === "transcript" && isBatchProcessing;
  const showEditingControls = currentTab === "transcript" && isInactive && !isBatchProcessing;

192-222: Verify the layout structure.

The restructured layout uses a flex-col wrapper (line 192) containing a single flex justify-between child (line 193). While this works correctly, the outer column container appears to serve no immediate purpose unless you're planning to add more rows in the future.

If no additional rows are planned, you could simplify to:

- <div className="flex flex-col">
-   <div className="flex justify-between items-center">
+ <div className="flex justify-between items-center">
-     <div className="flex gap-1">
+   <div className="flex gap-1">
-       {editorTabs.map((view) => {
-         // ... tabs
-       })}
-     </div>
-     {showProgress && <TranscriptionProgress sessionId={sessionId} />}
-     {showEditingControls && <EditingControls isEditing={isEditing} setIsEditing={setIsEditing} />}
-   </div>
+     {editorTabs.map((view) => {
+       // ... tabs
+     })}
+   </div>
+   {showProgress && <TranscriptionProgress sessionId={sessionId} />}
+   {showEditingControls && <EditingControls isEditing={isEditing} setIsEditing={setIsEditing} />}
- </div>
+</div>

However, if you're planning to add more content below the tabs/controls row, the current structure is fine.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b8ff02 and 984f21a.

📒 Files selected for processing (3)
  • apps/desktop/src/components/main/body/sessions/note-input/header.tsx (3 hunks)
  • apps/desktop/src/components/main/body/sessions/note-input/index.tsx (2 hunks)
  • apps/desktop/src/components/main/body/sessions/outer-header/folder.tsx (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/desktop/src/components/main/body/sessions/note-input/index.tsx
🧰 Additional context used
🧬 Code graph analysis (2)
apps/desktop/src/components/main/body/sessions/outer-header/folder.tsx (2)
packages/ui/src/components/ui/breadcrumb.tsx (1)
  • BreadcrumbList (112-112)
packages/ui/src/components/ui/button.tsx (1)
  • Button (53-53)
apps/desktop/src/components/main/body/sessions/note-input/header.tsx (5)
apps/desktop/src/store/zustand/tabs/schema.ts (1)
  • EditorView (11-11)
apps/desktop/src/contexts/listener.tsx (1)
  • useListener (33-47)
apps/desktop/src/components/main/body/sessions/note-input/transcript/progress.tsx (1)
  • TranscriptionProgress (7-38)
apps/desktop/src/components/main/body/sessions/note-input/transcript/editing-controls.tsx (1)
  • EditingControls (6-95)
apps/desktop/src/components/main/body/sessions/shared.tsx (1)
  • useHasTranscript (13-21)
🔇 Additional comments (5)
apps/desktop/src/components/main/body/sessions/outer-header/folder.tsx (2)

32-32: LGTM! Minor spacing refinement.

The addition of gap-0.5 provides consistent spacing between breadcrumb items, improving visual clarity.


63-68: LGTM! Clean button styling updates.

The ghost variant and padding removal (px-0) create a cleaner breadcrumb appearance while maintaining proper functionality and accessibility.

apps/desktop/src/components/main/body/sessions/note-input/header.tsx (3)

1-1: LGTM!

The new imports are necessary and correctly used throughout the file.

Also applies to: 17-17


163-180: LGTM!

The isInactive prop is cleanly integrated into the Header API and properly typed.


225-238: LGTM!

The shift from status-based checking to getSessionMode(sessionId) correctly implements the mode-based session logic for batch transcription support. Including both "running_active" and "running_batch" modes for transcript visibility is appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants