Explicit sample_rate in owhisper client #1651

yujonglee · 2025-11-11T10:08:26Z

No description provided.

coderabbitai · 2025-11-11T10:08:45Z

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Replaces fixed resample constants with per-file audio metadata and time-based chunking, adds sample_rate accessors, removes per-call channels from batch options, introduces ractor-supervisor spawn path for batch actors, and annotates multiple Actor implementations with #[ractor::async_trait].

Changes

| Cohort / File(s) | Summary |
|---|---|
| Workspace manifests: `Cargo.toml`, `plugins/listener/Cargo.toml` | `ractor` dependency changed to `{ version = "0.14.3" }`; added `ractor-supervisor = "0.1.9"` and `tracing`. |
| Task automation: `Taskfile.yaml` | Added `db` task that queries a SQLite DB and opens the first store value via jless. |
| Chunking & metadata (audio-utils): `crates/audio-utils/src/lib.rs` | Added `AudioMetadata` (sample_rate, channels); `audio_file_metadata()`; `metadata_from_source()`; `ChunkedAudio` now includes `frame_count` and `metadata`; `chunk_audio_file(path, chunk_ms)` now chunks by ms using metadata. |
| Audio sample rate accessors: `crates/audio/src/mic.rs`, `crates/audio/src/speaker/*` (`mod.rs`, `linux.rs`, `macos.rs`, `windows.rs`) | Added `pub fn sample_rate(&self) -> u32` across MicInput and SpeakerInput implementations; platform-specific returns/delegation. |
| owhisper interface & client: `owhisper/owhisper-interface/src/lib.rs`, `owhisper/owhisper-client/src/lib.rs`, `owhisper/owhisper-client/src/batch.rs` | `ListenParams` gains `pub sample_rate: u32` (default 16000); client and batch URL builders use per-request `sample_rate`; `decode_audio_to_linear16` now returns bytes plus sample_rate and forces mono averaging. |
| Listener plugin, actor spawn & params: `plugins/listener/src/ext.rs`, `plugins/listener/src/actors/batch.rs` | Replaced direct `Actor::spawn` with `spawn_batch_actor(args)` (supervisor-based); removed `channels` from BatchParams; derive channels/sample_rate from audio metadata and use metadata in ListenParams and duration/frame calculations. |
| Listener actor async annotations: `plugins/listener/src/actors/{listener,processor,recorder,session,source}.rs` | Added `#[ractor::async_trait]` to Actor impls; `processor.rs` converted to async Actor impl with `pre_start` and `handle`. |
| Listener events: `plugins/listener/src/events.rs` | Filter out non-finite samples (NaN/Inf) when computing mic/speaker magnitudes for SessionEvent. |
| Local STT actors: `plugins/local-stt/src/server/{external.rs,internal.rs}` | Annotated Actor impls with `#[ractor::async_trait]`; tightened log filtering in external stdout/stderr handling. |
| Desktop changes (hooks/components): `apps/desktop/src/hooks/useRunBatch.ts`, `apps/desktop/src/components/.../listen.tsx`, `apps/desktop/src/hooks/useAutoEnhance.ts`, `apps/desktop/src/components/.../shared/hooks.ts` | Removed `channels?: number` from `RunOptions` and omitted channels from runBatch calls; removed effect in `useAutoEnhance` that mutated editor based on enhanceTask.status; small refactor in `useFinalWords`. |
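
The time-based chunking described for `chunk_audio_file(path, chunk_ms)` can be sketched as below. This is only an illustration under assumptions, not the PR's implementation: the `AudioMetadata` struct here is a hypothetical stand-in carrying just the two fields named in the summary, and `samples_per_chunk`, `chunk_by_ms`, and `duration_secs` are made-up helper names operating on already-decoded interleaved samples.

```rust
/// Hypothetical stand-in for the PR's `AudioMetadata` (only the fields named in the summary).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct AudioMetadata {
    pub sample_rate: u32,
    pub channels: u8,
}

/// Number of interleaved samples that make up `chunk_ms` milliseconds of audio.
pub fn samples_per_chunk(meta: AudioMetadata, chunk_ms: u64) -> usize {
    // Frames per chunk follow from the file's own sample rate, not a fixed constant.
    let frames = (meta.sample_rate as u64 * chunk_ms) / 1000;
    // Clamp to at least one frame so the chunk size is never zero.
    let frames = frames.max(1);
    (frames * meta.channels.max(1) as u64) as usize
}

/// Split interleaved samples into chunks of roughly `chunk_ms` milliseconds each.
pub fn chunk_by_ms(samples: &[f32], meta: AudioMetadata, chunk_ms: u64) -> Vec<Vec<f32>> {
    samples
        .chunks(samples_per_chunk(meta, chunk_ms))
        .map(|chunk| chunk.to_vec())
        .collect()
}

/// Duration in seconds derived from the frame count (samples divided by channels).
pub fn duration_secs(sample_count: usize, meta: AudioMetadata) -> f64 {
    let frame_count = sample_count / meta.channels.max(1) as usize;
    frame_count as f64 / meta.sample_rate as f64
}
```

For a 44.1 kHz stereo file and `chunk_ms = 100`, each chunk would hold 4410 frames, that is 8820 interleaved samples.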

Sequence Diagram(s)

sequenceDiagram
    participant Desktop as Desktop App
    participant Ext as plugins/listener::ext
    participant Supervisor as Batch Supervisor
    participant Batch as Batch Actor
    participant Audio as audio-utils
    participant Listen as owhisper Client

    Desktop->>Ext: request start batch (path, args)
    Ext->>Audio: audio_file_metadata(path) (blocking task)
    Audio-->>Ext: AudioMetadata { sample_rate, channels }
    Ext->>Supervisor: spawn_batch_actor(BatchArgs{..., metadata})
    Supervisor-->>Batch: start actor
    Batch->>Audio: chunk_audio_file(path, chunk_ms)
    Audio-->>Batch: ChunkedAudio { chunks, sample_count, frame_count, metadata }
    Batch->>Listen: build client with metadata.sample_rate & metadata.channels
    Batch->>Listen: stream chunks for transcription

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pay special attention to:
- crates/audio-utils/src/lib.rs — correctness of sample/frame calculations, clamping, mono conversion, and returned metadata.
- plugins/listener/src/{ext.rs,actors/batch.rs} — supervisor spawn changes, removed channels field, and duration/frame_count usage.
- owhisper changes — URL parameter composition and decode_audio_to_linear16 sample rate handling and mono downmix.
- ractor downgrade and addition of ractor-supervisor — ensure spawn/Actor trait semantics remain compatible with annotated async_trait changes.
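
As a reference point when reviewing the mono downmix and linear16 handling mentioned above, here is a minimal sketch of what "mono averaging" followed by 16-bit PCM encoding typically looks like. The function names `downmix_to_mono` and `to_linear16` are hypothetical and the actual `decode_audio_to_linear16` may differ; the sketch assumes interleaved `f32` samples in the range [-1.0, 1.0].

```rust
/// Average each interleaved frame into a single mono sample.
/// `channels` is clamped to at least 1 to avoid a zero-sized chunk.
pub fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    let channels = channels.max(1);
    interleaved
        .chunks(channels)
        .map(|frame| frame.iter().sum::<f32>() / frame.len() as f32)
        .collect()
}

/// Encode mono f32 samples as little-endian signed 16-bit PCM ("linear16").
pub fn to_linear16(mono: &[f32]) -> Vec<u8> {
    mono.iter()
        .flat_map(|s| {
            // Clamp first so out-of-range input cannot wrap when converted.
            let v = (s.clamp(-1.0, 1.0) * i16::MAX as f32) as i16;
            v.to_le_bytes()
        })
        .collect()
}
```

Because the downmix changes the channel count but not the sample rate, the sample rate returned alongside the bytes should still be the source file's rate.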

Possibly related PRs

Explicit sample_rate in owhisper client #1651 — similar changes introducing per-request sample_rate, audio metadata propagation, and removing fixed resample constants.
Batch transcribe support #1638 — overlaps on batch transcription surfaces, useRunBatch changes, and listener actor refactors.
Deepgram compat v2 #1307 — related audio metadata and chunking updates that propagate sample_rate/channels through listener and client code.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 13.16% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |
| Description check | ❓ Inconclusive | No pull request description was provided by the author. | Consider adding a description explaining the motivation, scope, and impact of making sample_rate explicit across the audio pipeline. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Explicit sample_rate in owhisper client' accurately reflects the main changes in this PR, which center on making sample_rate an explicit, configurable parameter throughout the audio pipeline rather than a fixed constant. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe1b223 and 5ef7309.

📒 Files selected for processing (2)

owhisper/owhisper-client/src/batch.rs (3 hunks)
owhisper/owhisper-client/src/lib.rs (2 hunks)

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

apps/desktop/src/hooks/useAutoEnhance.ts (1)
56-64: Missing dependencies in useEffect may cause stale closure issues.

The effect uses tab and updateSessionTabState but doesn't include them in the dependency array. If tab changes (e.g., when switching sessions), the effect won't re-run and may update the wrong session's state.

Apply this diff to include the missing dependencies:
-  }, [listenerStatus, prevListenerStatus, startEnhance]);
+  }, [listenerStatus, prevListenerStatus, startEnhance, tab, updateSessionTabState]);
Alternatively, if updateSessionTabState is guaranteed to be stable (which is common for Zustand store methods), you can include just tab:
-  }, [listenerStatus, prevListenerStatus, startEnhance]);
+  }, [listenerStatus, prevListenerStatus, startEnhance, tab]);
owhisper/owhisper-client/src/lib.rs (1)
13-13: Remove unused constant.

The RESAMPLED_SAMPLE_RATE_HZ constant is no longer used after the migration to dynamic sample rate configuration and should be removed to avoid confusion.

Apply this diff:
-const RESAMPLED_SAMPLE_RATE_HZ: u32 = 16_000;
-

🧹 Nitpick comments (3)

apps/desktop/src/components/main/body/sessions/note-input/transcript/shared/hooks.ts (1)

16-20: Consider removing unnecessary intermediate variable.

The local ret variable stores the result of a simple mapping and sort operation without adding clarity or enabling reuse. Returning the computation directly from useMemo is more idiomatic for React hooks.
  return useMemo(() => {
    if (!resultTable) {
      return [];
    }

-    const ret = Object.entries(resultTable)
-      .map(([wordId, row]) => ({ ...(row as unknown as main.Word), id: wordId }))
-      .sort((a, b) => a.start_ms - b.start_ms);
-
-    return ret;
+    return Object.entries(resultTable)
+      .map(([wordId, row]) => ({ ...(row as unknown as main.Word), id: wordId }))
+      .sort((a, b) => a.start_ms - b.start_ms);
  }, [resultTable]);

plugins/local-stt/src/server/external.rs (1)

63-68: Consider refactoring log filters for better maintainability.

The log filtering logic now has four separate string checks. While functional, this could be refactored to use a pattern list for improved maintainability.

Consider this refactor:

                         if let Ok(text) = String::from_utf8(bytes) {
                             let text = text.trim();
-                            if !text.is_empty()
-                                && !text.contains("[WebSocket]")
-                                && !text.contains("Sent interim text:")
-                                && !text.contains("[TranscriptionHandler]")
-                                && !text.contains("/v1/status")
-                            {
+                            const EXCLUDED_PATTERNS: &[&str] = &[
+                                "[WebSocket]",
+                                "Sent interim text:",
+                                "[TranscriptionHandler]",
+                                "/v1/status",
+                            ];
+                            
+                            if !text.is_empty() && !EXCLUDED_PATTERNS.iter().any(|p| text.contains(p)) {
                                 tracing::info!("{}", text);
                             }
                         }

Taskfile.yaml (1)

76-83: Make the database path configurable.

The task hardcodes a user-specific path (/Users/yujonglee/...), making it unusable for other developers. Consider using an environment variable or a configurable path.

Apply this diff to make it configurable:
   db:
     env:
-      DB: /Users/yujonglee/Library/Application Support/com.hyprnote.nightly/db.sqlite
+      DB: ${DB_PATH:-~/Library/Application Support/com.hyprnote.nightly/db.sqlite}
     cmds:
       - |
           sqlite3 -json "$DB" 'SELECT store FROM main LIMIT 1;' |
           jq -r '.[0].store' |
           jless

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f7b009 and fe1b223.

⛔ Files ignored due to path filters (3)

.cursor/rules/simple.mdc is excluded by !**/.cursor/**
Cargo.lock is excluded by !**/*.lock
plugins/listener/js/bindings.gen.ts is excluded by !**/*.gen.ts

📒 Files selected for processing (25)

Cargo.toml (1 hunks)
Taskfile.yaml (1 hunks)
apps/desktop/src/components/main/body/sessions/floating/listen.tsx (1 hunks)
apps/desktop/src/components/main/body/sessions/note-input/transcript/shared/hooks.ts (1 hunks)
apps/desktop/src/hooks/useAutoEnhance.ts (1 hunks)
apps/desktop/src/hooks/useRunBatch.ts (0 hunks)
crates/audio-utils/src/lib.rs (4 hunks)
crates/audio/src/mic.rs (1 hunks)
crates/audio/src/speaker/linux.rs (1 hunks)
crates/audio/src/speaker/macos.rs (1 hunks)
crates/audio/src/speaker/mod.rs (1 hunks)
crates/audio/src/speaker/windows.rs (1 hunks)
owhisper/owhisper-client/src/lib.rs (2 hunks)
owhisper/owhisper-interface/src/lib.rs (2 hunks)
plugins/listener/Cargo.toml (1 hunks)
plugins/listener/src/actors/batch.rs (4 hunks)
plugins/listener/src/actors/listener.rs (1 hunks)
plugins/listener/src/actors/processor.rs (1 hunks)
plugins/listener/src/actors/recorder.rs (1 hunks)
plugins/listener/src/actors/session.rs (1 hunks)
plugins/listener/src/actors/source.rs (1 hunks)
plugins/listener/src/events.rs (1 hunks)
plugins/listener/src/ext.rs (4 hunks)
plugins/local-stt/src/server/external.rs (2 hunks)
plugins/local-stt/src/server/internal.rs (1 hunks)

💤 Files with no reviewable changes (1)

apps/desktop/src/hooks/useRunBatch.ts

🧰 Additional context used

🧬 Code graph analysis (10)

crates/audio/src/speaker/windows.rs (4)

crates/audio/src/mic.rs (2)

sample_rate (69-71)

sample_rate (193-195)

crates/audio/src/speaker/linux.rs (2)

sample_rate (10-12)

sample_rate (26-28)

crates/audio/src/speaker/macos.rs (2)

sample_rate (36-38)

sample_rate (94-96)

crates/audio/src/speaker/mod.rs (4)

sample_rate (46-48)

sample_rate (51-53)

sample_rate (99-101)

sample_rate (104-106)

crates/audio/src/speaker/linux.rs (4)

crates/audio/src/mic.rs (2)

sample_rate (69-71)

sample_rate (193-195)

crates/audio/src/speaker/macos.rs (2)

sample_rate (36-38)

sample_rate (94-96)

crates/audio/src/speaker/mod.rs (4)

sample_rate (46-48)

sample_rate (51-53)

sample_rate (99-101)

sample_rate (104-106)

crates/audio/src/speaker/windows.rs (2)

sample_rate (18-20)

sample_rate (65-67)

crates/audio/src/speaker/mod.rs (5)

crates/audio/src/mic.rs (2)

sample_rate (69-71)

sample_rate (193-195)

crates/audio/src/speaker/linux.rs (2)

sample_rate (10-12)

sample_rate (26-28)

crates/audio/src/speaker/macos.rs (2)

sample_rate (36-38)

sample_rate (94-96)

crates/audio/src/speaker/windows.rs (2)

sample_rate (18-20)

sample_rate (65-67)

crates/audio/src/lib.rs (1)

sample_rate (195-201)

owhisper/owhisper-interface/src/lib.rs (1)

crates/ws-utils/src/lib.rs (2)

sample_rate (117-119)

sample_rate (145-147)

apps/desktop/src/components/main/body/sessions/note-input/transcript/shared/hooks.ts (1)

apps/desktop/src/store/tinybase/schema-external.ts (1)

Word (118-118)

crates/audio/src/speaker/macos.rs (5)

crates/audio/src/mic.rs (2)

sample_rate (69-71)

sample_rate (193-195)

crates/audio/src/speaker/linux.rs (2)

sample_rate (10-12)

sample_rate (26-28)

crates/audio/src/speaker/mod.rs (4)

sample_rate (46-48)

sample_rate (51-53)

sample_rate (99-101)

sample_rate (104-106)

crates/audio/src/speaker/windows.rs (2)

sample_rate (18-20)

sample_rate (65-67)

crates/audio/src/lib.rs (1)

sample_rate (195-201)

owhisper/owhisper-client/src/lib.rs (1)

crates/ws-utils/src/lib.rs (2)

sample_rate (117-119)

sample_rate (145-147)

plugins/listener/src/ext.rs (2)

plugins/listener/src/actors/batch.rs (1)

spawn_batch_actor (92-95)

crates/audio-utils/src/lib.rs (1)

audio_file_metadata (119-124)

crates/audio/src/mic.rs (5)

crates/audio/src/speaker/linux.rs (2)

sample_rate (10-12)

sample_rate (26-28)

crates/audio/src/speaker/macos.rs (2)

sample_rate (36-38)

sample_rate (94-96)

crates/audio/src/speaker/mod.rs (4)

sample_rate (46-48)

sample_rate (51-53)

sample_rate (99-101)

sample_rate (104-106)

crates/audio/src/speaker/windows.rs (2)

sample_rate (18-20)

sample_rate (65-67)

crates/audio/src/lib.rs (1)

sample_rate (195-201)

plugins/listener/src/actors/batch.rs (1)

crates/audio-utils/src/lib.rs (1)

chunk_audio_file (185-225)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: ci (macos, macos-14)

🔇 Additional comments (22)

plugins/local-stt/src/server/internal.rs (1)

39-39: LGTM! Necessary for ractor compatibility.

The #[ractor::async_trait] attribute is correctly applied to enable async trait method dispatch for the Actor implementation.

plugins/local-stt/src/server/external.rs (1)

40-40: LGTM! Necessary for ractor compatibility.

The #[ractor::async_trait] attribute is correctly applied to enable async trait method dispatch for the Actor implementation.

apps/desktop/src/components/main/body/sessions/floating/listen.tsx (1)

203-203: Code change verified as correct and consistent across codebase.

The removal of { channels: 1 } is properly implemented. The useRunBatch hook signature has been updated to accept only a file path, and the RunOptions type no longer includes a channels parameter. Both listen.tsx (line 203) and editing-controls.tsx (line 56) now call runBatch consistently with a single argument, delegating audio metadata extraction to the plugin layer.

plugins/listener/src/actors/listener.rs (1)

57-57: LGTM: async_trait annotation added.

This annotation aligns ListenerActor with the project-wide async_trait pattern for Actor implementations, consistent with changes across other actors in this PR.

plugins/listener/src/actors/recorder.rs (1)

37-37: LGTM: async_trait annotation added.

Consistent with the project-wide Actor implementation pattern.

plugins/listener/src/actors/source.rs (1)

53-53: LGTM: async_trait annotation added.

Consistent with the project-wide Actor implementation pattern.

plugins/listener/src/actors/session.rs (1)

61-61: LGTM: async_trait annotation added.

Consistent with the project-wide Actor implementation pattern.

plugins/listener/Cargo.toml (1)

56-58: LGTM: ractor-supervisor dependency added.

This addition supports the supervisor-based actor spawning pattern introduced in this PR, consistent with the workspace-level dependency changes.

plugins/listener/src/events.rs (2)

43-43: Good defensive programming: filtering non-finite values.

The addition of .filter(|x| x.is_finite()) prevents NaN and Inf values from contributing to the maximum calculation, improving robustness of audio amplitude measurements.

51-51: Good defensive programming: filtering non-finite values.

The addition of .filter(|x| x.is_finite()) prevents NaN and Inf values from contributing to the maximum calculation, improving robustness of audio amplitude measurements.
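
A minimal sketch of the pattern these two comments describe, assuming the magnitude is a peak over absolute sample values (the function name `peak_magnitude` is hypothetical and the real code in `events.rs` may compute it differently):

```rust
/// Peak absolute amplitude of a buffer, ignoring NaN and infinite samples.
pub fn peak_magnitude(samples: &[f32]) -> f32 {
    samples
        .iter()
        .copied()
        // Without this filter a single infinite sample would become the peak.
        .filter(|x| x.is_finite())
        .map(f32::abs)
        // Start from 0.0 so an empty or fully filtered buffer reports silence.
        .fold(0.0, f32::max)
}
```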

Cargo.toml (1)

135-136: Verification complete: the ractor version downgrade is intentional and required.

The change from ractor 0.15 to 0.14.3 is necessary for compatibility with ractor-supervisor 0.1.9, which requires ractor ^0.14.3 (>= 0.14.3, < 0.15.0). The Cargo.lock shows ractor 0.14.7 resolved, which is compatible. No action needed.
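
The version range quoted above follows Cargo's caret rule for `0.x` versions. The check below is a hand-written illustration of that rule for requirements of the form `^0.MINOR.PATCH` with `MINOR > 0` only; it is not Cargo's resolver, it ignores pre-release tags, and `caret_0x_matches` is a made-up name.

```rust
/// Returns true if `version` (major, minor, patch) satisfies `^0.req_minor.req_patch`.
/// Only valid for requirements whose major is 0 and whose minor is greater than 0.
pub fn caret_0x_matches(req: (u64, u64), version: (u64, u64, u64)) -> bool {
    let (req_minor, req_patch) = req;
    let (major, minor, patch) = version;
    // For 0.x requirements the minor version acts as the breaking boundary,
    // so only patch-level upgrades within the same minor are accepted.
    major == 0 && minor == req_minor && patch >= req_patch
}
```

Under this rule `^0.14.3` accepts the resolved 0.14.7 and rejects 0.15.0, which is why the workspace had to move off ractor 0.15.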

plugins/listener/src/actors/processor.rs (1)

58-112: LGTM! Clean migration to ractor's async actor pattern.

The addition of #[ractor::async_trait] and the trait-based actor implementation correctly align with ractor's asynchronous actor model. The business logic remains unchanged, and the refactor ensures lifecycle management compatibility.

crates/audio/src/speaker/macos.rs (1)

94-96: LGTM! Exposes dynamic sample rate from macOS audio tap.

The implementation correctly retrieves the sample rate from the underlying audio tap descriptor. The use of unwrap() is consistent with the existing pattern at line 153 and is acceptable given the tap is initialized in new().

crates/audio/src/speaker/linux.rs (1)

10-12: LGTM! Reasonable stub for Linux speaker input.

The fixed sample rate of 16000 Hz is appropriate for the Linux stub implementation, aligning with the default value in the broader audio pipeline.

crates/audio/src/mic.rs (1)

69-71: LGTM! Clean accessor for microphone sample rate.

The implementation correctly exposes the device's configured sample rate from the underlying CPAL config. This enables dynamic sample rate handling across the audio pipeline.

owhisper/owhisper-client/src/lib.rs (2)

66-70: LGTM! Correctly uses dynamic sample rate for batch requests.

The change from the fixed RESAMPLED_SAMPLE_RATE_HZ constant to params.sample_rate enables per-request sample rate configuration, aligning with the PR objective.

107-114: LGTM! Dynamic sample rate correctly propagated to streaming requests.

The change ensures that streaming listen requests use the configured sample rate from params.sample_rate, maintaining consistency with the batch implementation.

owhisper/owhisper-interface/src/lib.rs (1)

140-140: LGTM! Clean API addition for explicit sample rate control.

The new sample_rate field with a sensible default of 16000 Hz enables explicit configuration while maintaining backward compatibility through the Default implementation.

Also applies to: 156-156
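
To make the `sample_rate` default and its use in the URL builders concrete, here is a rough sketch under assumptions. Only `sample_rate: u32` with a default of 16000 comes from the review; the other fields, the `append_audio_query` helper, the query parameter names, and the example host are hypothetical and chosen for illustration.

```rust
/// Simplified, hypothetical version of `ListenParams`; the real struct has other fields.
#[derive(Debug, Clone, PartialEq)]
pub struct ListenParams {
    pub model: Option<String>,
    pub channels: u8,
    pub sample_rate: u32,
}

impl Default for ListenParams {
    fn default() -> Self {
        Self {
            model: None,
            channels: 1,
            // Existing callers that rely on Default keep the previous 16 kHz behaviour.
            sample_rate: 16_000,
        }
    }
}

/// Append audio format parameters to a base URL using the per-request sample rate.
/// The parameter names here are assumptions, not confirmed by the PR.
pub fn append_audio_query(base: &str, params: &ListenParams) -> String {
    let sep = if base.contains('?') { '&' } else { '?' };
    format!(
        "{base}{sep}encoding=linear16&sample_rate={}&channels={}",
        params.sample_rate, params.channels
    )
}
```

A caller with a 44.1 kHz source would override only the one field, for example `ListenParams { sample_rate: 44_100, ..Default::default() }`.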

crates/audio/src/speaker/mod.rs (2)

45-53: LGTM! Clean cross-platform abstraction for sample rate access.

The conditional compilation correctly delegates to platform-specific implementations on macOS/Windows and returns a sentinel value (0) for unsupported platforms, consistent with the existing patterns in this module.

98-106: LGTM! AsyncSource trait correctly implements sample rate accessor.

The trait implementation maintains consistency with the struct's public method and enables stream consumers to query the sample rate dynamically.

crates/audio/src/speaker/windows.rs (2)

18-20: Fixed sample rate matches Windows audio configuration.

The hardcoded 44100 Hz matches the WaveFormat configuration at line 78. While autoconvert: true (line 83) may cause internal format conversion, returning the requested rate is appropriate for this use case.

65-67: LGTM! Consistent sample rate across Windows speaker interfaces.

The fixed return value maintains consistency with SpeakerInput and the requested audio format configuration.

plugins/listener/src/actors/batch.rs

yujonglee added 10 commits November 11, 2025 10:26

chores

a6f359d

deps

b3d83ec

slight cursorrule change

fa1e30e

add sample rate in input level

52cc55f

logging chores

1c31954

update owhisper to accept sample_rate

d250e5f

small unwrap fix

41d7c73

read audio metadata when doing batching

0afdd6f

fix compile error

b1ab98b

done

fe1b223

coderabbitai bot reviewed Nov 11, 2025

plugins/listener/src/actors/batch.rs (comment resolved)

better deepgram support for batch

5ef7309

yujonglee merged commit 0df8e3d into main Nov 11, 2025
3 of 4 checks passed

yujonglee deleted the explicit-sample-rate-owhisper branch November 11, 2025 11:39

coderabbitai bot mentioned this pull request Nov 13, 2025

feat: Linux Support #1659

Open