Skip to content

Conversation

@yujonglee
Copy link
Contributor

No description provided.

@coderabbitai
Copy link

coderabbitai bot commented Nov 11, 2025

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Walkthrough

Replaces fixed resample constants with per-file audio metadata and time-based chunking, adds sample_rate accessors, removes per-call channels from batch options, introduces ractor-supervisor spawn path for batch actors, and annotates multiple Actor implementations with #[ractor::async_trait].

Changes

Cohort / File(s) Summary
Workspace manifests
Cargo.toml, plugins/listener/Cargo.toml
ractor dependency changed to { version = "0.14.3" }; added ractor-supervisor = "0.1.9" and tracing.
Task automation
Taskfile.yaml
Added db task that queries a SQLite DB and opens the first store value via jless.
Chunking & metadata (audio-utils)
crates/audio-utils/src/lib.rs
Added AudioMetadata (sample_rate, channels); audio_file_metadata(); metadata_from_source(); ChunkedAudio now includes frame_count and metadata; chunk_audio_file(path, chunk_ms) now chunks by ms using metadata.
Audio sample rate accessors
crates/audio/src/mic.rs, crates/audio/src/speaker/* (mod.rs, linux.rs, macos.rs, windows.rs)
Added pub fn sample_rate(&self) -> u32 across MicInput and SpeakerInput implementations; platform-specific returns/delegation.
owhisper interface & client
owhisper/owhisper-interface/src/lib.rs, owhisper/owhisper-client/src/lib.rs, owhisper/owhisper-client/src/batch.rs
ListenParams gains pub sample_rate: u32 (default 16000); client and batch URL builders use per-request sample_rate; decode_audio_to_linear16 now returns bytes plus sample_rate and forces mono averaging.
Listener plugin — actor spawn & params
plugins/listener/src/ext.rs, plugins/listener/src/actors/batch.rs
Replaced direct Actor::spawn with spawn_batch_actor(args) (supervisor-based); removed channels from BatchParams; derive channels/sample_rate from audio metadata and use metadata in ListenParams and duration/frame calculations.
Listener actor async annotations
plugins/listener/src/actors/{listener,processor,recorder,session,source}.rs
Added #[ractor::async_trait] to Actor impls; processor.rs converted to async Actor impl with pre_start and handle.
Listener events
plugins/listener/src/events.rs
Filter out non-finite samples (NaN/Inf) when computing mic/speaker magnitudes for SessionEvent.
Local STT actors
plugins/local-stt/src/server/{external.rs,internal.rs}
Annotated Actor impls with #[ractor::async_trait]; tightened log filtering in external stdout/stderr handling.
Desktop changes (hooks/components)
apps/desktop/src/hooks/useRunBatch.ts, apps/desktop/src/components/.../listen.tsx, apps/desktop/src/hooks/useAutoEnhance.ts, apps/desktop/src/components/.../shared/hooks.ts
Removed channels?: number from RunOptions and omitted channels from runBatch calls; removed effect in useAutoEnhance that mutated editor based on enhanceTask.status; small refactor in useFinalWords.

Sequence Diagram(s)

sequenceDiagram
    participant Desktop as Desktop App
    participant Ext as plugins/listener::ext
    participant Supervisor as Batch Supervisor
    participant Batch as Batch Actor
    participant Audio as audio-utils
    participant Listen as owhisper Client

    Desktop->>Ext: request start batch (path, args)
    Ext->>Audio: audio_file_metadata(path) (blocking task)
    Audio-->>Ext: AudioMetadata { sample_rate, channels }
    Ext->>Supervisor: spawn_batch_actor(BatchArgs{..., metadata})
    Supervisor-->>Batch: start actor
    Batch->>Audio: chunk_audio_file(path, chunk_ms)
    Audio-->>Batch: ChunkedAudio { chunks, sample_count, frame_count, metadata }
    Batch->>Listen: build client with metadata.sample_rate & metadata.channels
    Batch->>Listen: stream chunks for transcription
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Pay special attention to:
    • crates/audio-utils/src/lib.rs — correctness of sample/frame calculations, clamping, mono conversion, and returned metadata.
    • plugins/listener/src/{ext.rs,actors/batch.rs} — supervisor spawn changes, removed channels field, and duration/frame_count usage.
    • owhisper changes — URL parameter composition and decode_audio_to_linear16 sample rate handling and mono downmix.
    • ractor downgrade and addition of ractor-supervisor — ensure spawn/Actor trait semantics remain compatible with annotated async_trait changes.

Possibly related PRs

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 13.16% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Description check ❓ Inconclusive No pull request description was provided by the author. Consider adding a description explaining the motivation, scope, and impact of making sample_rate explicit across the audio pipeline.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Explicit sample_rate in owhisper client' accurately reflects the main changes in this PR, which center on making sample_rate an explicit, configurable parameter throughout the audio pipeline rather than a fixed constant.

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe1b223 and 5ef7309.

📒 Files selected for processing (2)
  • owhisper/owhisper-client/src/batch.rs (3 hunks)
  • owhisper/owhisper-client/src/lib.rs (2 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
apps/desktop/src/hooks/useAutoEnhance.ts (1)

56-64: Missing dependencies in useEffect may cause stale closure issues.

The effect uses tab and updateSessionTabState but doesn't include them in the dependency array. If tab changes (e.g., when switching sessions), the effect won't re-run and may update the wrong session's state.

Apply this diff to include the missing dependencies:

-  }, [listenerStatus, prevListenerStatus, startEnhance]);
+  }, [listenerStatus, prevListenerStatus, startEnhance, tab, updateSessionTabState]);

Alternatively, if updateSessionTabState is guaranteed to be stable (which is common for Zustand store methods), you can include just tab:

-  }, [listenerStatus, prevListenerStatus, startEnhance]);
+  }, [listenerStatus, prevListenerStatus, startEnhance, tab]);
owhisper/owhisper-client/src/lib.rs (1)

13-13: Remove unused constant.

The RESAMPLED_SAMPLE_RATE_HZ constant is no longer used after the migration to dynamic sample rate configuration and should be removed to avoid confusion.

Apply this diff:

-const RESAMPLED_SAMPLE_RATE_HZ: u32 = 16_000;
-
🧹 Nitpick comments (3)
apps/desktop/src/components/main/body/sessions/note-input/transcript/shared/hooks.ts (1)

16-20: Consider removing unnecessary intermediate variable.

The local ret variable stores the result of a simple mapping and sort operation without adding clarity or enabling reuse. Returning the computation directly from useMemo is more idiomatic for React hooks.

  return useMemo(() => {
    if (!resultTable) {
      return [];
    }

-    const ret = Object.entries(resultTable)
-      .map(([wordId, row]) => ({ ...(row as unknown as main.Word), id: wordId }))
-      .sort((a, b) => a.start_ms - b.start_ms);
-
-    return ret;
+    return Object.entries(resultTable)
+      .map(([wordId, row]) => ({ ...(row as unknown as main.Word), id: wordId }))
+      .sort((a, b) => a.start_ms - b.start_ms);
  }, [resultTable]);
plugins/local-stt/src/server/external.rs (1)

63-68: Consider refactoring log filters for better maintainability.

The log filtering logic now has four separate string checks. While functional, this could be refactored to use a pattern list for improved maintainability.

Consider this refactor:

                         if let Ok(text) = String::from_utf8(bytes) {
                             let text = text.trim();
-                            if !text.is_empty()
-                                && !text.contains("[WebSocket]")
-                                && !text.contains("Sent interim text:")
-                                && !text.contains("[TranscriptionHandler]")
-                                && !text.contains("/v1/status")
-                            {
+                            const EXCLUDED_PATTERNS: &[&str] = &[
+                                "[WebSocket]",
+                                "Sent interim text:",
+                                "[TranscriptionHandler]",
+                                "/v1/status",
+                            ];
+                            
+                            if !text.is_empty() && !EXCLUDED_PATTERNS.iter().any(|p| text.contains(p)) {
                                 tracing::info!("{}", text);
                             }
                         }
Taskfile.yaml (1)

76-83: Make the database path configurable.

The task hardcodes a user-specific path (/Users/yujonglee/...), making it unusable for other developers. Consider using an environment variable or a configurable path.

Apply this diff to make it configurable:

   db:
     env:
-      DB: /Users/yujonglee/Library/Application Support/com.hyprnote.nightly/db.sqlite
+      DB: ${DB_PATH:-~/Library/Application Support/com.hyprnote.nightly/db.sqlite}
     cmds:
       - |
           sqlite3 -json "$DB" 'SELECT store FROM main LIMIT 1;' |
           jq -r '.[0].store' |
           jless
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f7b009 and fe1b223.

⛔ Files ignored due to path filters (3)
  • .cursor/rules/simple.mdc is excluded by !**/.cursor/**
  • Cargo.lock is excluded by !**/*.lock
  • plugins/listener/js/bindings.gen.ts is excluded by !**/*.gen.ts
📒 Files selected for processing (25)
  • Cargo.toml (1 hunks)
  • Taskfile.yaml (1 hunks)
  • apps/desktop/src/components/main/body/sessions/floating/listen.tsx (1 hunks)
  • apps/desktop/src/components/main/body/sessions/note-input/transcript/shared/hooks.ts (1 hunks)
  • apps/desktop/src/hooks/useAutoEnhance.ts (1 hunks)
  • apps/desktop/src/hooks/useRunBatch.ts (0 hunks)
  • crates/audio-utils/src/lib.rs (4 hunks)
  • crates/audio/src/mic.rs (1 hunks)
  • crates/audio/src/speaker/linux.rs (1 hunks)
  • crates/audio/src/speaker/macos.rs (1 hunks)
  • crates/audio/src/speaker/mod.rs (1 hunks)
  • crates/audio/src/speaker/windows.rs (1 hunks)
  • owhisper/owhisper-client/src/lib.rs (2 hunks)
  • owhisper/owhisper-interface/src/lib.rs (2 hunks)
  • plugins/listener/Cargo.toml (1 hunks)
  • plugins/listener/src/actors/batch.rs (4 hunks)
  • plugins/listener/src/actors/listener.rs (1 hunks)
  • plugins/listener/src/actors/processor.rs (1 hunks)
  • plugins/listener/src/actors/recorder.rs (1 hunks)
  • plugins/listener/src/actors/session.rs (1 hunks)
  • plugins/listener/src/actors/source.rs (1 hunks)
  • plugins/listener/src/events.rs (1 hunks)
  • plugins/listener/src/ext.rs (4 hunks)
  • plugins/local-stt/src/server/external.rs (2 hunks)
  • plugins/local-stt/src/server/internal.rs (1 hunks)
💤 Files with no reviewable changes (1)
  • apps/desktop/src/hooks/useRunBatch.ts
🧰 Additional context used
🧬 Code graph analysis (10)
crates/audio/src/speaker/windows.rs (4)
crates/audio/src/mic.rs (2)
  • sample_rate (69-71)
  • sample_rate (193-195)
crates/audio/src/speaker/linux.rs (2)
  • sample_rate (10-12)
  • sample_rate (26-28)
crates/audio/src/speaker/macos.rs (2)
  • sample_rate (36-38)
  • sample_rate (94-96)
crates/audio/src/speaker/mod.rs (4)
  • sample_rate (46-48)
  • sample_rate (51-53)
  • sample_rate (99-101)
  • sample_rate (104-106)
crates/audio/src/speaker/linux.rs (4)
crates/audio/src/mic.rs (2)
  • sample_rate (69-71)
  • sample_rate (193-195)
crates/audio/src/speaker/macos.rs (2)
  • sample_rate (36-38)
  • sample_rate (94-96)
crates/audio/src/speaker/mod.rs (4)
  • sample_rate (46-48)
  • sample_rate (51-53)
  • sample_rate (99-101)
  • sample_rate (104-106)
crates/audio/src/speaker/windows.rs (2)
  • sample_rate (18-20)
  • sample_rate (65-67)
crates/audio/src/speaker/mod.rs (5)
crates/audio/src/mic.rs (2)
  • sample_rate (69-71)
  • sample_rate (193-195)
crates/audio/src/speaker/linux.rs (2)
  • sample_rate (10-12)
  • sample_rate (26-28)
crates/audio/src/speaker/macos.rs (2)
  • sample_rate (36-38)
  • sample_rate (94-96)
crates/audio/src/speaker/windows.rs (2)
  • sample_rate (18-20)
  • sample_rate (65-67)
crates/audio/src/lib.rs (1)
  • sample_rate (195-201)
owhisper/owhisper-interface/src/lib.rs (1)
crates/ws-utils/src/lib.rs (2)
  • sample_rate (117-119)
  • sample_rate (145-147)
apps/desktop/src/components/main/body/sessions/note-input/transcript/shared/hooks.ts (1)
apps/desktop/src/store/tinybase/schema-external.ts (1)
  • Word (118-118)
crates/audio/src/speaker/macos.rs (5)
crates/audio/src/mic.rs (2)
  • sample_rate (69-71)
  • sample_rate (193-195)
crates/audio/src/speaker/linux.rs (2)
  • sample_rate (10-12)
  • sample_rate (26-28)
crates/audio/src/speaker/mod.rs (4)
  • sample_rate (46-48)
  • sample_rate (51-53)
  • sample_rate (99-101)
  • sample_rate (104-106)
crates/audio/src/speaker/windows.rs (2)
  • sample_rate (18-20)
  • sample_rate (65-67)
crates/audio/src/lib.rs (1)
  • sample_rate (195-201)
owhisper/owhisper-client/src/lib.rs (1)
crates/ws-utils/src/lib.rs (2)
  • sample_rate (117-119)
  • sample_rate (145-147)
plugins/listener/src/ext.rs (2)
plugins/listener/src/actors/batch.rs (1)
  • spawn_batch_actor (92-95)
crates/audio-utils/src/lib.rs (1)
  • audio_file_metadata (119-124)
crates/audio/src/mic.rs (5)
crates/audio/src/speaker/linux.rs (2)
  • sample_rate (10-12)
  • sample_rate (26-28)
crates/audio/src/speaker/macos.rs (2)
  • sample_rate (36-38)
  • sample_rate (94-96)
crates/audio/src/speaker/mod.rs (4)
  • sample_rate (46-48)
  • sample_rate (51-53)
  • sample_rate (99-101)
  • sample_rate (104-106)
crates/audio/src/speaker/windows.rs (2)
  • sample_rate (18-20)
  • sample_rate (65-67)
crates/audio/src/lib.rs (1)
  • sample_rate (195-201)
plugins/listener/src/actors/batch.rs (1)
crates/audio-utils/src/lib.rs (1)
  • chunk_audio_file (185-225)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: ci (macos, macos-14)
🔇 Additional comments (22)
plugins/local-stt/src/server/internal.rs (1)

39-39: LGTM! Necessary for ractor compatibility.

The #[ractor::async_trait] attribute is correctly applied to enable async trait method dispatch for the Actor implementation.

plugins/local-stt/src/server/external.rs (1)

40-40: LGTM! Necessary for ractor compatibility.

The #[ractor::async_trait] attribute is correctly applied to enable async trait method dispatch for the Actor implementation.

apps/desktop/src/components/main/body/sessions/floating/listen.tsx (1)

203-203: Code change verified as correct and consistent across codebase.

The removal of { channels: 1 } is properly implemented. The useRunBatch hook signature has been updated to accept only a file path, and the RunOptions type no longer includes a channels parameter. Both listen.tsx (line 203) and editing-controls.tsx (line 56) now call runBatch consistently with a single argument, delegating audio metadata extraction to the plugin layer.

plugins/listener/src/actors/listener.rs (1)

57-57: LGTM: async_trait annotation added.

This annotation aligns ListenerActor with the project-wide async_trait pattern for Actor implementations, consistent with changes across other actors in this PR.

plugins/listener/src/actors/recorder.rs (1)

37-37: LGTM: async_trait annotation added.

Consistent with the project-wide Actor implementation pattern.

plugins/listener/src/actors/source.rs (1)

53-53: LGTM: async_trait annotation added.

Consistent with the project-wide Actor implementation pattern.

plugins/listener/src/actors/session.rs (1)

61-61: LGTM: async_trait annotation added.

Consistent with the project-wide Actor implementation pattern.

plugins/listener/Cargo.toml (1)

56-58: LGTM: ractor-supervisor dependency added.

This addition supports the supervisor-based actor spawning pattern introduced in this PR, consistent with the workspace-level dependency changes.

plugins/listener/src/events.rs (2)

43-43: Good defensive programming: filtering non-finite values.

The addition of .filter(|x| x.is_finite()) prevents NaN and Inf values from contributing to the maximum calculation, improving robustness of audio amplitude measurements.


51-51: Good defensive programming: filtering non-finite values.

The addition of .filter(|x| x.is_finite()) prevents NaN and Inf values from contributing to the maximum calculation, improving robustness of audio amplitude measurements.

Cargo.toml (1)

135-136: Verification complete: the ractor version downgrade is intentional and required.

The change from ractor 0.15 to 0.14.3 is necessary for compatibility with ractor-supervisor 0.1.9, which requires ractor ^0.14.3 (>= 0.14.3, < 0.15.0). ractor-supervisor 0.1.9 requires ractor ^0.14.3. The Cargo.lock shows ractor 0.14.7 resolved, which is compatible. No action needed.

plugins/listener/src/actors/processor.rs (1)

58-112: LGTM! Clean migration to ractor's async actor pattern.

The addition of #[ractor::async_trait] and the trait-based actor implementation correctly align with ractor's asynchronous actor model. The business logic remains unchanged, and the refactor ensures lifecycle management compatibility.

crates/audio/src/speaker/macos.rs (1)

94-96: LGTM! Exposes dynamic sample rate from macOS audio tap.

The implementation correctly retrieves the sample rate from the underlying audio tap descriptor. The use of unwrap() is consistent with the existing pattern at line 153 and is acceptable given the tap is initialized in new().

crates/audio/src/speaker/linux.rs (1)

10-12: LGTM! Reasonable stub for Linux speaker input.

The fixed sample rate of 16000 Hz is appropriate for the Linux stub implementation, aligning with the default value in the broader audio pipeline.

crates/audio/src/mic.rs (1)

69-71: LGTM! Clean accessor for microphone sample rate.

The implementation correctly exposes the device's configured sample rate from the underlying CPAL config. This enables dynamic sample rate handling across the audio pipeline.

owhisper/owhisper-client/src/lib.rs (2)

66-70: LGTM! Correctly uses dynamic sample rate for batch requests.

The change from the fixed RESAMPLED_SAMPLE_RATE_HZ constant to params.sample_rate enables per-request sample rate configuration, aligning with the PR objective.


107-114: LGTM! Dynamic sample rate correctly propagated to streaming requests.

The change ensures that streaming listen requests use the configured sample rate from params.sample_rate, maintaining consistency with the batch implementation.

owhisper/owhisper-interface/src/lib.rs (1)

140-140: LGTM! Clean API addition for explicit sample rate control.

The new sample_rate field with a sensible default of 16000 Hz enables explicit configuration while maintaining backward compatibility through the Default implementation.

Also applies to: 156-156

crates/audio/src/speaker/mod.rs (2)

45-53: LGTM! Clean cross-platform abstraction for sample rate access.

The conditional compilation correctly delegates to platform-specific implementations on macOS/Windows and returns a sentinel value (0) for unsupported platforms, consistent with the existing patterns in this module.


98-106: LGTM! AsyncSource trait correctly implements sample rate accessor.

The trait implementation maintains consistency with the struct's public method and enables stream consumers to query the sample rate dynamically.

crates/audio/src/speaker/windows.rs (2)

18-20: Fixed sample rate matches Windows audio configuration.

The hardcoded 44100 Hz matches the WaveFormat configuration at line 78. While autoconvert: true (line 83) may cause internal format conversion, returning the requested rate is appropriate for this use case.


65-67: LGTM! Consistent sample rate across Windows speaker interfaces.

The fixed return value maintains consistency with SpeakerInput and the requested audio format configuration.

@yujonglee yujonglee merged commit 0df8e3d into main Nov 11, 2025
3 of 4 checks passed
@yujonglee yujonglee deleted the explicit-sample-rate-owhisper branch November 11, 2025 11:39
@coderabbitai coderabbitai bot mentioned this pull request Nov 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants