
Conversation

@yujonglee (Contributor) commented Aug 9, 2025

Summary by cubic

Added Deepgram v2 compatibility by updating the streaming interface, introducing a new AM client crate, and refactoring model and API handling across the codebase.

  • New Features

    • Added the am crate for interacting with the AG/Deepgram v2 API.
    • Updated streaming protocols to support new transcript response formats and dual-channel audio.
    • Refactored model selection and download logic to use the new WhisperModel type.
  • Refactors

    • Replaced legacy types and interfaces with unified StreamResponse and related structures.
    • Updated dependencies and internal APIs to support the new Deepgram v2 workflow.

@coderabbitai bot commented Aug 9, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Caution

Review failed

The pull request is closed.

📝 Walkthrough

This change introduces two new Rust workspace crates, hypr-am and hypr-moonshine, and adds them to the workspace. It updates the local STT plugin and desktop app to use a new WhisperModel type in place of SupportedModel, and integrates a new am_api_key field into the local STT plugin state, with corresponding workflow and environment variable updates. Several crates and plugins are refactored for improved type consistency, error handling, and streaming response structures.

Changes

• Desktop App STT Model Type Refactor
  Files: apps/desktop/src/components/settings/components/ai/stt-view.tsx, apps/desktop/src/components/toast/model-select.tsx, apps/desktop/src/components/toast/shared.tsx, apps/desktop/src/components/welcome-modal/download-progress-view.tsx, apps/desktop/src/components/welcome-modal/index.tsx, apps/desktop/src/components/welcome-modal/model-selection-view.tsx
  All usages of the STT model type are changed from SupportedModel to WhisperModel, updating imports, type annotations, function parameters, and state variables for consistency.
• Local STT Plugin Model Type and API Key
  Files: plugins/local-stt/js/bindings.gen.ts, plugins/local-stt/src/lib.rs, .github/workflows/desktop_cd.yaml
  The type alias SupportedModel is renamed to WhisperModel throughout JS bindings and command signatures, and the Word alias becomes Word2. A new am_api_key field is added to the plugin state, initialized from an environment variable (or set to None in debug builds). The workflow adds AM_API_KEY as a secret.
• Workspace and Dependency Updates
  Files: Cargo.toml, crates/am/Cargo.toml, crates/moonshine/Cargo.toml, crates/file/Cargo.toml, crates/transcribe-moonshine/Cargo.toml, plugins/local-stt/Cargo.toml
  Adds hypr-am and hypr-moonshine as workspace dependencies, updates dependencies in several crates to use workspace settings, removes or replaces dependencies in transcribe-moonshine, and adds hypr-am to the local STT plugin.
• New Crate: hypr-am
  Files: crates/am/Cargo.toml, crates/am/src/client.rs, crates/am/src/error.rs, crates/am/src/lib.rs, crates/am/src/types.rs
  Introduces the hypr-am crate with an async HTTP client, error types, request/response models, and a builder for initialization. Exports all public types and provides a basic async test.
• New Crate: hypr-moonshine
  Files: crates/moonshine/Cargo.toml, crates/moonshine/src/error.rs, crates/moonshine/src/lib.rs
  Adds the hypr-moonshine crate with error handling and module exports.
• Transcribe Moonshine Refactor
  Files: crates/transcribe-moonshine/src/error.rs, crates/transcribe-moonshine/src/lib.rs, crates/transcribe-moonshine/src/service/streaming.rs
  Replaces hypr-onnx error variants with a single Moonshine variant, removes the public export of the model module, and updates streaming logic to use a richer StreamResponse structure.
• Streaming Response Refactor
  Files: owhisper/owhisper-client/src/lib.rs, owhisper/owhisper-server/src/commands/run/realtime.rs, owhisper/owhisper-server/src/commands/run/state.rs, owhisper/owhisper-interface/src/stream.rs, crates/transcribe-moonshine/src/service/streaming.rs, crates/transcribe-whisper-local/src/service/streaming.rs, plugins/listener/src/fsm.rs
  Unifies streaming output from ListenOutputChunk to StreamResponse, updating all client, server, and plugin code to handle the new structure, including channel types, function signatures, and message serialization (a hedged sketch of the unified shape follows this list).
• File Crate and Audio Processing
  Files: crates/file/Cargo.toml, crates/file/src/lib.rs, crates/ws-utils/src/lib.rs
  Updates dependencies, changes file download logic to handle missing Content-Length headers gracefully, adds a deinterleave_audio helper, and updates audio message processing to support multi-channel audio.
• Whisper Local Model and Logging
  Files: crates/whisper-local/src/model.rs
  Adds a language field to Segment, a new accessor method, and logging of completed transcription text.
• Local STT Plugin Server Refactor and API
  Files: plugins/local-stt/src/commands.rs, plugins/local-stt/src/ext.rs, plugins/local-stt/src/lib.rs, plugins/local-stt/build.rs, plugins/local-stt/permissions/*, plugins/local-stt/src/server/*
  Removes old server commands and permissions related to server status and restart. Introduces a unified server management interface with an optional ServerType parameter for internal and external servers, adds new internal and external server modules with typed handles and startup logic, and updates plugin state to track the API key and separate internal/external server handles.
• Desktop App and Plugin Integration Updates
  Files: .github/workflows/desktop_cd.yaml, apps/desktop/src-tauri/src/ext.rs, apps/desktop/src/routes/app.tsx, plugins/connector/src/ext.rs
  Updates the CI workflow to add the AM_API_KEY secret, changes calls to start_server and stop_server to accept an optional server type parameter, refactors restart logic into explicit stop-then-start calls for the internal server, and simplifies API base retrieval using the new unified plugin API.
• Miscellaneous and Test Updates
  Files: crates/transcribe-deepgram/src/lib.rs, crates/transcribe-whisper-local/src/lib.rs, owhisper/owhisper-server/src/commands/run/mod.rs, owhisper/owhisper-server/src/main.rs, owhisper/owhisper-interface/Cargo.toml
  Removes unnecessary test code, updates argument passing, improves error handling in the main function, and adjusts dependency ordering.
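
Since StreamResponse anchors most of the review below, here is a minimal sketch of the unified shape, inferred only from the diffs quoted in this thread (a TranscriptResponse variant, channel.alternatives with transcript and words, and a type field serialized as "type" with value "Results"). The enum representation and field sets are assumptions, not the crate's actual definition:

use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
#[serde(untagged)]
pub enum StreamResponse {
    TranscriptResponse {
        #[serde(rename = "type")]
        type_field: String, // "Results" in this PR
        channel: Channel,
        metadata: Metadata,
    },
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Channel {
    pub alternatives: Vec<Alternative>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Alternative {
    pub transcript: String,
    pub words: Vec<Word>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Word {
    pub word: String,
    pub start: f64,
    pub end: f64,
    pub confidence: f64,
}

#[derive(Debug, Default, Serialize, Deserialize)]
pub struct Metadata {
    pub request_id: String,
    pub model_uuid: String,
}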

Sequence Diagram(s)

sequenceDiagram
    participant DesktopApp
    participant LocalSTTPlugin
    participant hypr-am
    participant hypr-moonshine

    DesktopApp->>LocalSTTPlugin: Set/Get Current Model (WhisperModel)
    LocalSTTPlugin->>LocalSTTPlugin: Use am_api_key for initialization
    LocalSTTPlugin->>hypr-am: (On request) Initialize or check status via HTTP
    LocalSTTPlugin->>hypr-moonshine: (If needed) Load or use moonshine models
sequenceDiagram
    participant Client
    participant Server
    participant Plugin

    Client->>Server: Send audio stream (binary, channels=1 or 2)
    Server->>Plugin: Forward/process audio data
    Plugin->>Server: Return StreamResponse (transcript, metadata, words, etc.)
    Server->>Client: Send StreamResponse (JSON)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • fastrepl/hyprnote#1102: Adds handling for a new DualAudio variant in the listener's realtime input stream and refactors the listen client to support dual audio streams, which is complementary to the model and API key changes in this PR.
  • fastrepl/hyprnote#1307: Shares related workspace dependency additions and STT model type refactor, overlapping in dependency management and type usage for the local STT plugin.
  • fastrepl/hyprnote#1191: Modifies local STT server restart logic with explicit stop/start calls, related to this PR's refactor of restart mechanism with server type handling.


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d29237 and 4649bb7.

📒 Files selected for processing (24)
  • .github/workflows/desktop_cd.yaml (1 hunks)
  • apps/desktop/src-tauri/src/ext.rs (1 hunks)
  • apps/desktop/src/components/settings/components/ai/stt-view.tsx (3 hunks)
  • apps/desktop/src/components/toast/shared.tsx (3 hunks)
  • apps/desktop/src/components/welcome-modal/download-progress-view.tsx (3 hunks)
  • apps/desktop/src/components/welcome-modal/index.tsx (4 hunks)
  • apps/desktop/src/locales/en/messages.po (3 hunks)
  • apps/desktop/src/locales/ko/messages.po (3 hunks)
  • apps/desktop/src/routes/app.tsx (1 hunks)
  • crates/am/src/client.rs (1 hunks)
  • owhisper/owhisper-interface/src/lib.rs (1 hunks)
  • plugins/connector/src/ext.rs (1 hunks)
  • plugins/local-stt/build.rs (0 hunks)
  • plugins/local-stt/js/bindings.gen.ts (2 hunks)
  • plugins/local-stt/permissions/autogenerated/reference.md (0 hunks)
  • plugins/local-stt/permissions/default.toml (0 hunks)
  • plugins/local-stt/permissions/schemas/schema.json (1 hunks)
  • plugins/local-stt/src/commands.rs (2 hunks)
  • plugins/local-stt/src/error.rs (1 hunks)
  • plugins/local-stt/src/ext.rs (2 hunks)
  • plugins/local-stt/src/lib.rs (5 hunks)
  • plugins/local-stt/src/server/external.rs (1 hunks)
  • plugins/local-stt/src/server/internal.rs (2 hunks)
  • plugins/local-stt/src/server/mod.rs (1 hunks)

@cubic-dev-ai bot left a comment

cubic analysis

8 issues found across 37 files

};

-let msg = Message::Text(serde_json::to_string(&data).unwrap().into());
+let msg = Message::Text(serde_json::to_string(&response).unwrap().into());

unwrap() will panic on serialization failure, potentially tearing down the WebSocket connection; propagate or log the error instead of panicking. (This reflects your team's feedback about avoiding panics in production code.)

Prompt for AI agents
Address the following comment on crates/transcribe-whisper-local/src/service/streaming.rs at line 253:

<comment>unwrap() will panic on serialization failure, potentially tearing down the WebSocket connection; propagate or log the error instead of panicking. (This reflects your team&#39;s feedback about avoiding panics in production code.)</comment>

<file context>
@@ -200,37 +200,57 @@ async fn process_transcription_stream(
 
                 let meta = chunk.meta();
                 let text = chunk.text().to_string();
-                let start = chunk.start() as u64;
-                let duration = chunk.duration() as u64;
-                let confidence = chunk.confidence();
+                let language = chunk.language().map(|s| s.to_string()).map(|s| vec![s]).unwrap_or_default();
+                let start_f64 = chunk.start() as f64;
+                let duration_f64 = chunk.duration() as f64;
</file context>
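
One non-panicking shape, as a sketch (Message here is a stand-in for the websocket frame type; the real code would use the axum/tungstenite type from streaming.rs):

use serde::Serialize;

// Stand-in for the websocket frame type used in streaming.rs.
enum Message {
    Text(String),
}

// Serialize without panicking; the send loop can skip a chunk on failure
// instead of tearing down the connection.
fn to_ws_message<T: Serialize>(response: &T) -> Option<Message> {
    match serde_json::to_string(response) {
        Ok(json) => Some(Message::Text(json)),
        Err(e) => {
            eprintln!("serialize_stream_response_failed: {e}");
            None
        }
    }
}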

confidence,
}],
},
metadata: Metadata::default(),

A new Metadata instance is created for every chunk, producing different request_id/model_uuid values within a single stream; these identifiers should remain constant across the connection to let clients correlate results.

Prompt for AI agents
Address the following comment on crates/transcribe-whisper-local/src/service/streaming.rs at line 249:

<comment>A new Metadata instance is created for every chunk, producing different request_id/model_uuid values within a single stream; these identifiers should remain constant across the connection to let clients correlate results.</comment>

<file context>
@@ -200,37 +200,57 @@ async fn process_transcription_stream(
 
                 let meta = chunk.meta();
                 let text = chunk.text().to_string();
-                let start = chunk.start() as u64;
-                let duration = chunk.duration() as u64;
-                let confidence = chunk.confidence();
+                let language = chunk.language().map(|s| s.to_string()).map(|s| vec![s]).unwrap_or_default();
+                let start_f64 = chunk.start() as f64;
+                let duration_f64 = chunk.duration() as f64;
</file context>
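
A sketch of the hoisting pattern, with Metadata as a stand-in for the owhisper-interface type (only the fields named in this comment are shown):

use uuid::Uuid;

#[derive(Clone, Debug)]
struct Metadata {
    request_id: String,
    model_uuid: String,
}

fn main() {
    // Created once per connection, before the per-chunk loop:
    let metadata = Metadata {
        request_id: Uuid::new_v4().to_string(),
        model_uuid: Uuid::new_v4().to_string(),
    };

    for transcript in ["hello", "world"] {
        // Every chunk reuses the same connection-scoped metadata instead of
        // calling Metadata::default() anew.
        println!("{} {} {}", metadata.request_id, metadata.model_uuid, transcript);
    }
}

In the real service this struct would be the owhisper-interface type, created before process_transcription_stream's loop.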

-while let Some(chunk) = stream.next().await {
-    let msg = Message::Text(serde_json::to_string(&chunk).unwrap().into());
+while let Some(response) = stream.next().await {
+    let msg = Message::Text(serde_json::to_string(&response).unwrap().into());

Using unwrap() on JSON serialization can panic and crash the service; handle the Result instead

Prompt for AI agents
Address the following comment on crates/transcribe-moonshine/src/service/streaming.rs at line 211:

<comment>Using unwrap() on JSON serialization can panic and crash the service; handle the Result instead</comment>

<file context>
@@ -205,10 +205,10 @@ async fn handle_dual_channel(
 
 async fn process_transcription_stream(
     mut ws_sender: futures_util::stream::SplitSink&lt;WebSocket, Message&gt;,
-    mut stream: Pin&lt;Box&lt;dyn futures_util::Stream&lt;Item = ListenOutputChunk&gt; + Send&gt;&gt;,
+    mut stream: Pin&lt;Box&lt;dyn futures_util::Stream&lt;Item = StreamResponse&gt; + Send&gt;&gt;,
 ) {
-    while let Some(chunk) = stream.next().await {
-        let msg = Message::Text(serde_json::to_string(&amp;chunk).unwrap().into());
+    while let Some(response) = stream.next().await {
</file context>


 #[tokio::test]
-#[ignore]
+// cargo test -p owhisper-client test_client_deepgram -- --nocapture

Integration test is no longer marked #[ignore], meaning it will execute during every cargo test run and fail if network connectivity or the DEEPGRAM_API_KEY env var is missing (Based on your team's feedback about keeping flaky/external-dependency tests opt-in).

Prompt for AI agents
Address the following comment on owhisper/owhisper-client/src/lib.rs at line 216:

<comment>Integration test is no longer marked `#[ignore]`, meaning it will execute during every `cargo test` run and fail if network connectivity or the DEEPGRAM_API_KEY env var is missing (Based on your team&#39;s feedback about keeping flaky/external-dependency tests opt-in).</comment>

<file context>
@@ -187,7 +213,7 @@ mod tests {
     use hypr_audio_utils::AudioFormatExt;
 
     #[tokio::test]
-    #[ignore]
+    // cargo test -p owhisper-client test_client_deepgram -- --nocapture
     async fn test_client_deepgram() {
         let audio = rodio::Decoder::new(std::io::BufReader::new(
</file context>

-run: |
-  chmod +x apps/desktop/src-tauri/binaries/stt-${{ matrix.target }}
-  ./scripts/sidecar.sh "apps/desktop/${{ env.TAURI_CONF_PATH }}" "binaries/stt"
+run: chmod +x ./apps/desktop/src-tauri/binaries/stt-${{ matrix.target }} && ./scripts/sidecar.sh "./apps/desktop/${{ env.TAURI_CONF_PATH }}" "binaries/stt"

Check warning

Code scanning / zizmor: code injection via template expansion
@coderabbitai bot left a comment

Actionable comments posted: 16

🔭 Outside diff range comments (5)
crates/file/src/lib.rs (1)

153-171: Graceful fallback without Content-Length is good; add HEAD status guard and typed headers

Nice improvement: falling back to serial when size is unknown or small. Two robustness nitpicks:

  • Treat non-success HEAD statuses like “no ranges/unknown size” and fall back to serial.
  • Use typed header constants for case safety.

Suggested changes:

-    let head_response = get_client().head(url.clone()).send().await?;
-    let total_size = get_content_length_from_headers(&head_response);
+    let head_response = get_client().head(url.clone()).send().await?;
+    let total_size = get_content_length_from_headers(&head_response);
+    let head_ok = head_response.status().is_success();
@@
-    let supports_ranges = head_response
+    let supports_ranges = head_response
         .headers()
-        .get("accept-ranges")
+        .get(reqwest::header::ACCEPT_RANGES)
         .map(|v| v.to_str().unwrap_or(""))
         .unwrap_or("")
         == "bytes";
@@
-    if !supports_ranges || total_size.unwrap_or(0) <= DEFAULT_CHUNK_SIZE {
+    if !head_ok || !supports_ranges || total_size.unwrap_or(0) <= DEFAULT_CHUNK_SIZE {
         return download_file_with_callback(url, output_path, move |progress| {
             progress_callback(progress)
         })
         .await;
     }

And add this import near the existing reqwest imports:

use reqwest::header;
plugins/listener/src/fsm.rs (1)

601-603: Stop logging API keys (secrets leak)

This logs the API key in plaintext. Remove it or redact. Secrets should never be written to logs.

-    tracing::info!(api_base = ?api_base, api_key = ?api_key, languages = ?languages, "listen_client");
+    tracing::info!(api_base = ?api_base, languages = ?languages, "listen_client");
+    // If needed for debugging, log whether a key was present without revealing it:
+    // tracing::debug!(api_key_present = api_key.as_ref().map(|k| !k.is_empty()).unwrap_or(false), "listen_client");
owhisper/owhisper-server/src/commands/run/state.rs (1)

59-85: Don’t drop transcripts when words are empty; use the provided transcript and avoid converting to Word2.

StreamResponse already carries alternatives[n].transcript. Today you return early if words is empty and also pay an unnecessary clone + conversion cost. Prefer the transcript; fall back to words only if transcript is empty.

Apply this refactor within the changed range:

-    pub fn process_chunk(&mut self, chunk: owhisper_interface::StreamResponse) {
-        let words = match chunk {
-            owhisper_interface::StreamResponse::TranscriptResponse { channel, .. } => channel
-                .alternatives
-                .first()
-                .map(|alt| {
-                    alt.words
-                        .iter()
-                        .map(|w| owhisper_interface::Word2::from(w.clone()))
-                        .collect::<Vec<_>>()
-                })
-                .unwrap_or_default(),
-            _ => {
-                return;
-            }
-        };
-
-        if words.is_empty() {
-            return;
-        }
-
-        let text = words
-            .iter()
-            .map(|w| w.text.as_str())
-            .collect::<Vec<_>>()
-            .join(" ");
+    pub fn process_chunk(&mut self, chunk: owhisper_interface::StreamResponse) {
+        let text = match chunk {
+            owhisper_interface::StreamResponse::TranscriptResponse { channel, .. } => {
+                if let Some(alt) = channel.alternatives.first() {
+                    if !alt.transcript.is_empty() {
+                        alt.transcript.clone()
+                    } else if !alt.words.is_empty() {
+                        alt.words
+                            .iter()
+                            .map(|w| w.word.as_str())
+                            .collect::<Vec<_>>()
+                            .join(" ")
+                    } else {
+                        return;
+                    }
+                } else {
+                    return;
+                }
+            }
+            _ => return,
+        };
crates/transcribe-moonshine/src/service/streaming.rs (1)

251-253: Remove unwraps on mutex lock and transcribe.

Both can fail (poisoned lock, model error). Handle gracefully to avoid panics/disconnects.

-                        let text = {
-                            let mut model_guard = model.lock().unwrap();
-                            model_guard.transcribe(chunk.samples).unwrap()
-                        };
+                        let text = {
+                            let mut model_guard = match model.lock() {
+                                Ok(g) => g,
+                                Err(e) => {
+                                    tracing::error!("moonshine_model_lock_poisoned: {}", e);
+                                    return None;
+                                }
+                            };
+                            match model_guard.transcribe(chunk.samples) {
+                                Ok(t) => t,
+                                Err(e) => {
+                                    tracing::error!("moonshine_transcribe_failed: {}", e);
+                                    return None;
+                                }
+                            }
+                        };
owhisper/owhisper-client/src/lib.rs (1)

215-241: Mark external Deepgram test as ignored (network + secret).

This requires an external API and env var; it will be flaky in CI.

-    #[tokio::test]
+    #[tokio::test]
+    #[ignore = "requires valid DEEPGRAM_API_KEY and external network"]

Alternatively, gate on the env var:

-        let client = ListenClient::builder()
+        let api_key = match std::env::var("DEEPGRAM_API_KEY") {
+            Ok(k) if !k.is_empty() => k,
+            _ => return,
+        };
+        let client = ListenClient::builder()
             .api_base("https://api.deepgram.com")
-            .api_key(std::env::var("DEEPGRAM_API_KEY").unwrap())
+            .api_key(api_key)
🧹 Nitpick comments (18)
crates/moonshine/src/lib.rs (1)

1-5: Public re-exports look good

mod + pub use is a clean, minimal surface. Consider adding a short crate-level doc comment explaining the crate’s purpose (the “why”).

apps/desktop/src/components/toast/model-select.tsx (1)

10-10: Hoist englishModels and make it const

Avoid re-allocating on each call and keep it immutable.

Example:

-import { commands as localSttCommands, type WhisperModel } from "@hypr/plugin-local-stt";
+import { commands as localSttCommands, type WhisperModel } from "@hypr/plugin-local-stt";

+const ENGLISH_MODELS = [
+  "QuantizedTinyEn",
+  "QuantizedBaseEn",
+  "QuantizedSmallEn",
+] as const satisfies readonly WhisperModel[];

 export async function showModelSelectToast(language: string) {
   const currentModel = await localSttCommands.getCurrentModel();
-  const englishModels: WhisperModel[] = ["QuantizedTinyEn", "QuantizedBaseEn", "QuantizedSmallEn"];
+  const englishModels = ENGLISH_MODELS;
owhisper/owhisper-interface/Cargo.toml (1)

21-21: uuid v4: consider feature minimization

If you don’t need default features (e.g., std), consider disabling them explicitly to keep builds lean.

For tighter control:

uuid = { workspace = true, features = ["v4"], default-features = false }

Only apply if std or other defaults aren’t required by your usage.

owhisper/owhisper-server/src/commands/run/mod.rs (1)

38-38: Avoid unnecessary clone of config.

config isn’t used after this call; cloning may be expensive. Pass it by value as before.

-    let server = Server::new(config.clone(), None);
+    let server = Server::new(config, None);
plugins/local-stt/src/lib.rs (1)

20-20: Field addition looks fine, but ensure consistent naming.

If this key is for the “hypr-am” client, consider am_api_key to match crate naming.

owhisper/owhisper-interface/src/stream.rs (1)

50-62: Default for Metadata creates UUIDs — confirm intended semantics

Generating new UUIDs in Default is fine if “fresh IDs by default” is desired. If you need deterministic defaults in tests, consider zero/empty defaults and explicit ID assignment.
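
For concreteness, a minimal sketch of such a Default (only request_id/model_uuid are known from this thread; the real struct may carry more fields):

use uuid::Uuid;

pub struct Metadata {
    pub request_id: String,
    pub model_uuid: String,
}

impl Default for Metadata {
    fn default() -> Self {
        // Fresh IDs on every call; tests needing determinism should construct
        // the struct explicitly instead of relying on Default.
        Self {
            request_id: Uuid::new_v4().to_string(),
            model_uuid: Uuid::new_v4().to_string(),
        }
    }
}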

apps/desktop/src/components/settings/components/ai/stt-view.tsx (3)

172-181: Narrow the download status typing to WhisperModel keys

Returning Record<string, boolean> loses type-safety. Tighten to Partial<Record<WhisperModel, boolean>>.

-      return {
+      const data: Partial<Record<WhisperModel, boolean>> = {
         "QuantizedTiny": statusChecks[0],
         "QuantizedTinyEn": statusChecks[1],
         "QuantizedBase": statusChecks[2],
         "QuantizedBaseEn": statusChecks[3],
         "QuantizedSmall": statusChecks[4],
         "QuantizedSmallEn": statusChecks[5],
         "QuantizedLargeTurbo": statusChecks[6],
-      } as Record<string, boolean>;
+      };
+      return data;

196-198: Type model key arrays to prevent drift

Annotate defaultModelKeys/otherModelKeys as WhisperModel[] to catch typos at compile time.

-  const defaultModelKeys = ["QuantizedTiny", "QuantizedSmall", "QuantizedLargeTurbo"];
-  const otherModelKeys = ["QuantizedTinyEn", "QuantizedBase", "QuantizedBaseEn", "QuantizedSmallEn"];
+  const defaultModelKeys: WhisperModel[] = ["QuantizedTiny", "QuantizedSmall", "QuantizedLargeTurbo"];
+  const otherModelKeys: WhisperModel[] = ["QuantizedTinyEn", "QuantizedBase", "QuantizedBaseEn", "QuantizedSmallEn"];

9-12: Remove dead/commented-out code

Large commented blocks add noise. Consider removing or guarding behind a feature flag when ready.

Also applies to: 254-317, 471-506

apps/desktop/src/components/welcome-modal/model-selection-view.tsx (1)

52-64: Optional: add basic error handling to the query

Surface failures to dev console to aid troubleshooting.

-  const supportedSTTModels = useQuery<ModelInfo[]>({
+  const supportedSTTModels = useQuery<ModelInfo[]>({
     queryKey: ["local-stt", "supported-models"],
     queryFn: async () => {
       const models = await localSttCommands.listSupportedModels();
       const downloadedModels = await Promise.all(
         models.map((model) => localSttCommands.isModelDownloaded(model)),
       );
       return models.map((model, index) => ({
         model,
         is_downloaded: downloadedModels[index],
       }));
     },
+    onError: console.error,
   });
crates/moonshine/src/error.rs (2)

1-23: Error surface is fine; consider a crate-wide Result alias

Defining a common Result improves ergonomics and consistency.

 pub enum Error {
@@
     Other(String),
 }
+
+pub type Result<T> = std::result::Result<T, Error>;

6-11: Consider not exposing hypr_onnx types in the public error API

If hypr_onnx is considered an internal detail, mapping those errors into your own variants avoids leaking dependencies into downstream public APIs.

apps/desktop/src/components/welcome-modal/index.tsx (1)

313-314: Remove redundant cast

selectedSttModel is already WhisperModel; the cast is unnecessary.

-          const sttModelExists = await localSttCommands.isModelDownloaded(selectedSttModel as WhisperModel);
+          const sttModelExists = await localSttCommands.isModelDownloaded(selectedSttModel);
crates/transcribe-whisper-local/src/service/streaming.rs (1)

220-233: Retain punctuation and annotate per-word language to improve fidelity.

Whitespace tokenization discards punctuation and per-word language. Populate punctuated_word and set language from the chunk-level field to improve downstream UX.

-                    .map(|w| Word {
-                        word: w.to_string(),
+                    .map(|w| Word {
+                        word: w.to_string(),
                         start: start_f64,
                         end: start_f64 + duration_f64,
                         confidence,
                         speaker: speaker.clone(),
-                        punctuated_word: None,
-                        language: None,
+                        punctuated_word: Some(w.to_string()),
+                        language: language.first().cloned(),
                     })
crates/ws-utils/src/lib.rs (1)

18-33: Deinterleave without intermediate allocations and document channel order.

Avoid materializing an intermediate Vec<i16> and process 4-byte frames directly. Also, encode the channel order assumption (mic first) inline.

-fn deinterleave_audio(data: &[u8]) -> (Vec<f32>, Vec<f32>) {
-    let samples: Vec<i16> = data
-        .chunks_exact(2)
-        .map(|chunk| i16::from_le_bytes([chunk[0], chunk[1]]))
-        .collect();
-
-    let mut mic = Vec::with_capacity(samples.len() / 2);
-    let mut speaker = Vec::with_capacity(samples.len() / 2);
-
-    for chunk in samples.chunks_exact(2) {
-        mic.push(chunk[0] as f32 / 32768.0);
-        speaker.push(chunk[1] as f32 / 32768.0);
-    }
-
-    (mic, speaker)
-}
+fn deinterleave_audio(data: &[u8]) -> (Vec<f32>, Vec<f32>) {
+    // Assumes 16-bit LE PCM, interleaved as [mic, speaker] per frame.
+    let frames = data.chunks_exact(4);
+    let mut mic = Vec::with_capacity(frames.len());
+    let mut speaker = Vec::with_capacity(frames.len());
+    for frame in frames {
+        let l = i16::from_le_bytes([frame[0], frame[1]]) as f32 / 32768.0;
+        let r = i16::from_le_bytes([frame[2], frame[3]]) as f32 / 32768.0;
+        mic.push(l);
+        speaker.push(r);
+    }
+    (mic, speaker)
+}
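
If the suggested version lands, a small unit test can pin down the little-endian conversion; this sketch assumes the deinterleave_audio shown above:

#[test]
fn deinterleave_one_frame() {
    // One interleaved frame, 16-bit LE PCM: mic = 1, speaker = -1.
    let data: Vec<u8> = [1i16, -1i16]
        .iter()
        .flat_map(|s| s.to_le_bytes())
        .collect();
    let (mic, speaker) = deinterleave_audio(&data);
    assert_eq!(mic, vec![1.0 / 32768.0]);
    assert_eq!(speaker, vec![-1.0 / 32768.0]);
}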
owhisper/owhisper-client/src/lib.rs (2)

147-152: Also accept Binary JSON in from_message.

Some servers may send JSON in Binary frames. Safe to support both.

-        match msg {
-            Message::Text(text) => serde_json::from_str::<Self::Output>(&text).ok(),
-            _ => None,
-        }
+        match msg {
+            Message::Text(text) => serde_json::from_str(&text).ok(),
+            Message::Binary(bin) => serde_json::from_slice(&bin).ok(),
+            _ => None,
+        }

174-179: Mirror Binary JSON support for dual client.

Same reasoning as single client.

-        match msg {
-            Message::Text(text) => serde_json::from_str::<Self::Output>(&text).ok(),
-            _ => None,
-        }
+        match msg {
+            Message::Text(text) => serde_json::from_str(&text).ok(),
+            Message::Binary(bin) => serde_json::from_slice(&bin).ok(),
+            _ => None,
+        }
owhisper/owhisper-server/src/commands/run/realtime.rs (1)

169-176: Naming nit: "chunk" now carries StreamResponse.

Consider renaming to response for clarity.

-    while let Some(chunk) = response_stream.next().await {
+    while let Some(response) = response_stream.next().await {
         if should_stop.load(std::sync::atomic::Ordering::Relaxed) {
             break;
         }
-        if transcript_tx.send(chunk).is_err() {
+        if transcript_tx.send(response).is_err() {
             break;
         }
     }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 73e6001 and 5d29237.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (38)
  • .github/workflows/desktop_cd.yaml (1 hunks)
  • Cargo.toml (2 hunks)
  • apps/desktop/src/components/settings/components/ai/stt-view.tsx (3 hunks)
  • apps/desktop/src/components/toast/model-select.tsx (1 hunks)
  • apps/desktop/src/components/toast/shared.tsx (2 hunks)
  • apps/desktop/src/components/welcome-modal/download-progress-view.tsx (2 hunks)
  • apps/desktop/src/components/welcome-modal/index.tsx (4 hunks)
  • apps/desktop/src/components/welcome-modal/model-selection-view.tsx (4 hunks)
  • crates/am/Cargo.toml (1 hunks)
  • crates/am/src/client.rs (1 hunks)
  • crates/am/src/error.rs (1 hunks)
  • crates/am/src/lib.rs (1 hunks)
  • crates/am/src/types.rs (1 hunks)
  • crates/file/Cargo.toml (1 hunks)
  • crates/file/src/lib.rs (2 hunks)
  • crates/moonshine/Cargo.toml (1 hunks)
  • crates/moonshine/src/error.rs (1 hunks)
  • crates/moonshine/src/lib.rs (1 hunks)
  • crates/transcribe-deepgram/src/lib.rs (0 hunks)
  • crates/transcribe-moonshine/Cargo.toml (1 hunks)
  • crates/transcribe-moonshine/src/error.rs (1 hunks)
  • crates/transcribe-moonshine/src/lib.rs (0 hunks)
  • crates/transcribe-moonshine/src/service/streaming.rs (4 hunks)
  • crates/transcribe-whisper-local/src/lib.rs (0 hunks)
  • crates/transcribe-whisper-local/src/service/streaming.rs (2 hunks)
  • crates/whisper-local/src/model.rs (4 hunks)
  • crates/ws-utils/src/lib.rs (3 hunks)
  • owhisper/owhisper-client/src/lib.rs (10 hunks)
  • owhisper/owhisper-interface/Cargo.toml (1 hunks)
  • owhisper/owhisper-interface/src/stream.rs (2 hunks)
  • owhisper/owhisper-server/src/commands/run/mod.rs (1 hunks)
  • owhisper/owhisper-server/src/commands/run/realtime.rs (4 hunks)
  • owhisper/owhisper-server/src/commands/run/state.rs (1 hunks)
  • owhisper/owhisper-server/src/main.rs (1 hunks)
  • plugins/listener/src/fsm.rs (1 hunks)
  • plugins/local-stt/Cargo.toml (1 hunks)
  • plugins/local-stt/js/bindings.gen.ts (2 hunks)
  • plugins/local-stt/src/lib.rs (3 hunks)
💤 Files with no reviewable changes (3)
  • crates/transcribe-moonshine/src/lib.rs
  • crates/transcribe-deepgram/src/lib.rs
  • crates/transcribe-whisper-local/src/lib.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{js,ts,tsx,rs}

⚙️ CodeRabbit Configuration File

**/*.{js,ts,tsx,rs}: 1. No error handling.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".

Files:

  • apps/desktop/src/components/toast/model-select.tsx
  • crates/moonshine/src/lib.rs
  • owhisper/owhisper-server/src/commands/run/mod.rs
  • apps/desktop/src/components/welcome-modal/download-progress-view.tsx
  • owhisper/owhisper-interface/src/stream.rs
  • plugins/local-stt/src/lib.rs
  • crates/whisper-local/src/model.rs
  • apps/desktop/src/components/settings/components/ai/stt-view.tsx
  • apps/desktop/src/components/toast/shared.tsx
  • crates/am/src/lib.rs
  • owhisper/owhisper-server/src/main.rs
  • crates/am/src/error.rs
  • apps/desktop/src/components/welcome-modal/index.tsx
  • crates/file/src/lib.rs
  • plugins/listener/src/fsm.rs
  • crates/ws-utils/src/lib.rs
  • apps/desktop/src/components/welcome-modal/model-selection-view.tsx
  • crates/moonshine/src/error.rs
  • crates/transcribe-whisper-local/src/service/streaming.rs
  • crates/transcribe-moonshine/src/error.rs
  • owhisper/owhisper-server/src/commands/run/state.rs
  • plugins/local-stt/js/bindings.gen.ts
  • crates/transcribe-moonshine/src/service/streaming.rs
  • owhisper/owhisper-client/src/lib.rs
  • crates/am/src/types.rs
  • crates/am/src/client.rs
  • owhisper/owhisper-server/src/commands/run/realtime.rs
🧬 Code Graph Analysis (6)
apps/desktop/src/components/toast/model-select.tsx (1)
plugins/local-stt/js/bindings.gen.ts (1)
  • WhisperModel (66-66)
apps/desktop/src/components/welcome-modal/download-progress-view.tsx (1)
plugins/local-stt/js/bindings.gen.ts (1)
  • WhisperModel (66-66)
apps/desktop/src/components/settings/components/ai/stt-view.tsx (1)
plugins/local-stt/js/bindings.gen.ts (1)
  • WhisperModel (66-66)
apps/desktop/src/components/toast/shared.tsx (1)
plugins/local-stt/js/bindings.gen.ts (1)
  • WhisperModel (66-66)
apps/desktop/src/components/welcome-modal/index.tsx (1)
plugins/local-stt/js/bindings.gen.ts (1)
  • WhisperModel (66-66)
plugins/local-stt/js/bindings.gen.ts (3)
plugins/local-llm/js/bindings.gen.ts (1)
  • TAURI_CHANNEL (59-59)
plugins/db/js/bindings.gen.ts (2)
  • Word2 (182-182)
  • SpeakerIdentity (178-178)
plugins/listener/js/bindings.gen.ts (1)
  • SpeakerIdentity (82-82)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: ci (macos, macos-latest)
  • GitHub Check: ci (windows, windows-latest)
🔇 Additional comments (39)
crates/file/Cargo.toml (1)

18-18: Switch to workspace hypr-s3 looks good

Aligning to workspace dep improves version cohesion. No action needed.

apps/desktop/src/components/toast/model-select.tsx (1)

3-3: Type rename to WhisperModel is consistent with bindings

Matches the plugin bindings and other UI updates. Good.

owhisper/owhisper-interface/Cargo.toml (1)

20-20: strum entry looks fine

Change appears to be reordering; no issues.

owhisper/owhisper-server/src/main.rs (1)

53-55: LGTM: clearer error handling with explicit exit code.

crates/am/Cargo.toml (1)

1-14: No changes needed for the am crate name

The workspace intentionally aliases each local crate as hypr-<crate> while keeping their intrinsic names (aec, agc, am, etc.) concise. Your discovery confirms:

  • In Cargo.toml (root):
    hypr-am = { path = "crates/am", package = "am" }
  • All other crates follow the same pattern (hypr-aec, hypr-agc, hypr-analytics, hypr-audio).
  • Plugin manifests (e.g. plugins/local-stt/Cargo.toml) also consume hypr-am = { workspace = true }.

Since this is a consistent, intentional convention, no renaming is required.

Likely an incorrect or invalid review comment.

plugins/local-stt/src/lib.rs (2)

45-45: LGTM: export WhisperModel for TS bindings.

This keeps UI/TS types in sync.


96-100: State initialization OK; picks up runtime-provided key.

After switching to std::env::var, this remains correct.
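
A minimal sketch of that initialization, assuming the AM_API_KEY variable from the workflow change and the debug-build behavior described in the walkthrough:

fn am_api_key() -> Option<String> {
    if cfg!(debug_assertions) {
        None
    } else {
        std::env::var("AM_API_KEY").ok()
    }
}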

Cargo.toml (2)

25-25: Workspace dep hypr-am added — OK

Path/package mapping follows existing convention. No issues spotted.


51-51: Dependency alignment confirmed for transcribe-moonshine
crates/transcribe-moonshine/Cargo.toml now depends on hypr-moonshine = { workspace = true } and no longer references hypr-onnx. No further changes required.

apps/desktop/src/components/welcome-modal/download-progress-view.tsx (2)

7-7: Type import updated to WhisperModel — OK

Matches plugin bindings. Consistent with the broader refactor.


21-21: Prop type switched to WhisperModel — OK

Downstream usage (downloadModel/setCurrentModel) remains compatible.

owhisper/owhisper-interface/src/stream.rs (1)

109-109: Confirm Deepgram crate’s support for v2 “Results” events

  • Cargo.toml pins the Deepgram client at deepgram = { version = "0.6.8", default-features = false }
  • A recursive grep shows three uses of type_field: "Results" and no occurrences of type_field: "transcript" in the streaming code.

Please verify that version 0.6.8 of the deepgram crate (with default-features disabled) includes the v2 streaming API, which emits "Results" events. If it does not, upgrade or adjust your feature flags so that the client recognizes the new event naming.

crates/transcribe-moonshine/src/error.rs (1)

7-7: Error consolidation to hypr_moonshine::Error — Approved

  • crates/transcribe-moonshine/src/error.rs now routes errors through hypr_moonshine::Error as intended.
  • crates/transcribe-moonshine/Cargo.toml has hypr-moonshine = { workspace = true } and no longer references hypr-onnx, so the dependency was updated correctly.
apps/desktop/src/components/toast/shared.tsx (2)

6-6: Type import updated to WhisperModel — OK

Consistent with plugin bindings and other component updates.


55-55: Toast helper param switched to WhisperModel — OK

Call sites should remain type-safe; no functional change.

apps/desktop/src/components/settings/components/ai/stt-view.tsx (2)

14-14: Type rename alignment looks good

Importing WhisperModel from the plugin matches the bindings change and keeps the UI strictly typed.


31-41: Strongly-typed metadata map is correct

Typing sttModelMetadata as Record<WhisperModel, ...> ensures key coverage at compile time. Nice.

crates/am/src/lib.rs (2)

1-3: Module layout is fine

Private modules with pub re-exports keeps the public surface tidy.


5-7: Re-exports are clear

Re-exporting client, error, and types provides a nice one-stop API.

apps/desktop/src/components/welcome-modal/model-selection-view.tsx (3)

8-8: Type import aligns with plugin updates

Switching to WhisperModel is consistent with the bindings.


48-49: Callback signature update is correct

onContinue now enforces a WhisperModel; good for upstream consumers.


50-51: Sensible default model

"QuantizedSmall" is in the supported union; initializing selection is fine.

crates/moonshine/Cargo.toml (2)

6-10: Feature forwarding looks correct; confirm upstream features exist

Forwarding cuda, coreml, and directml to hypr-onnx is clear. Please confirm hypr-onnx actually defines these features to avoid build-time surprises. If GPU backends are optional, consider documenting the feature expectations in the README.

Also applies to: 12-18


2-3: Workspace alias for moonshine is correctly configured
The root Cargo.toml defines:

hypr-moonshine = { path = "crates/moonshine", package = "moonshine" }

and crates/moonshine/Cargo.toml uses hypr-moonshine as a workspace dependency (for feature propagation). No further changes needed.

plugins/listener/src/fsm.rs (1)

476-504: Handle errors instead of unwraps when updating session and emitting events

unwrap() here can crash the app on DB issues or event bus failures. Handle errors and continue gracefully.

-                            if !words.is_empty() {
-                                let updated_words =
-                                    update_session(&app, &session.id, words).await.unwrap();
-
-                                SessionEvent::Words {
-                                    words: updated_words,
-                                }
-                                .emit(&app)
-                                .unwrap();
-                            }
+                            if !words.is_empty() {
+                                let updated_words = match update_session(&app, &session.id, words).await {
+                                    Ok(w) => w,
+                                    Err(e) => {
+                                        tracing::error!(error = ?e, "update_session_failed");
+                                        continue;
+                                    }
+                                };
+                                if let Err(e) = SessionEvent::Words { words: updated_words }.emit(&app) {
+                                    tracing::error!(error = ?e, "emit_words_event_failed");
+                                }
+                            }

Likely an incorrect or invalid review comment.

crates/whisper-local/src/model.rs (1)

162-169: Language propagation into segments looks good

Adding language to Segment and setting it during construction is consistent and non-breaking. Accessor language() returning Option<&str> is idiomatic.
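
For reference, the praised accessor shape is roughly the following sketch (the Segment fields shown are assumed, not the full struct):

pub struct Segment {
    language: Option<String>,
}

impl Segment {
    // Borrowing accessor: callers get Option<&str> without cloning.
    pub fn language(&self) -> Option<&str> {
        self.language.as_deref()
    }
}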

apps/desktop/src/components/welcome-modal/index.tsx (1)

10-11: Type migration to WhisperModel is consistent

Imports, state, mutation, and handler signatures now consistently use WhisperModel. This aligns with the plugin API. Good update.

Also applies to: 74-81, 255-260

crates/transcribe-whisper-local/src/service/streaming.rs (1)

235-251: All type_field values correctly use "Results"
A project-wide search confirmed no occurrences of "transcript" in any consumer or test—every type_field is "Results". No downstream code or tests will break.

crates/transcribe-moonshine/Cargo.toml (1)

8-10: Verify hypr-moonshine exposes cuda, coreml, and directml features

In crates/transcribe-moonshine/Cargo.toml (lines 8–10) you re-export:

• cuda = ["hypr-moonshine/cuda"]
• coreml = ["hypr-moonshine/coreml"]
• directml = ["hypr-moonshine/directml"]

Please confirm that the published hypr-moonshine crate defines features named exactly cuda, coreml, and directml. Misaligned or missing feature names will cause build failures.

crates/ws-utils/src/lib.rs (1)

159-170: Confirm mono-to-stereo fallback in split_dual_audio_sources

In crates/ws-utils/src/lib.rs (lines 159–167), we unconditionally duplicate a single-channel Samples into both mic_tx and speaker_tx. Downstream, both transcribe-moonshine (src/service/streaming.rs 189–199) and transcribe-whisper-local (src/service/streaming.rs 163–177) treat the “mic” and “speaker” streams separately (e.g. for VAD/diarization and speaker tagging). Duplicating mono audio may lead to identical VAD chunks and duplicate transcripts across channels.

• Verify that sending the same samples to both channels is intended
• Confirm this aligns with your diarization, UI labeling, and UX requirements

plugins/local-stt/js/bindings.gen.ts (2)

64-68: RecordedProcessingEvent now uses Word2 — aligned with other plugins.

Matches plugins/db/js/bindings.gen.ts Word2 shape.


19-36: WhisperModel Rename Verified
All plugin-local-stt bindings and your desktop UI call sites have been updated to use WhisperModel—no leftover references to the old SupportedModel.

• plugin-local-stt/js/bindings.gen.ts exclusively uses WhisperModel
• apps/desktop imports and mutations reference only WhisperModel
  • SupportedModel remains in plugin-local-llm (unrelated)

crates/transcribe-moonshine/src/service/streaming.rs (1)

279-296: type_field and languages fields verified — no action needed.
The type_field: "Results" string is used consistently by both producers (crates/transcribe-moonshine, crates/transcribe-whisper-local) and the owhisper-interface consumer. The languages vector remains empty by design, as there is currently no request parameter for languages.

owhisper/owhisper-client/src/lib.rs (1)

76-87: Query param channels aligns dual/single modes.

Including channels=1|2 is consistent with Deepgram handshake and your server. LGTM.
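
For illustration, the handshake boils down to something like this sketch (listen_url is a hypothetical helper, not the builder's actual API):

fn listen_url(api_base: &str, dual: bool) -> String {
    let channels = if dual { 2 } else { 1 };
    format!("{api_base}/listen?channels={channels}")
}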

crates/am/src/types.rs (2)

3-13: Type definitions look solid.

Serde strategies and option skipping are consistent with typical HTTP JSON contracts. LGTM.

Also applies to: 24-50, 63-82


52-61: Confirm server wire format for ComputeUnits

I searched the crates/am codebase and only found the enum definition (lines 52–61 in crates/am/src/types.rs); there are no other references to the serialized tokens. Please verify that the server’s API actually expects the lowercase values "cpuandgpu" and "cpuandneuralengine" (rather than, e.g., "cpuAndGpu"/"cpuAndNeuralEngine"). You might:

  • Consult the server’s API documentation or schema
  • Add a serde serialization round-trip test to lock down the exact wire format
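
A round-trip sketch of that test; the variant set and the rename_all = "lowercase" strategy are assumptions matching the quoted tokens, not the actual definition in crates/am/src/types.rs:

#[cfg(test)]
mod wire_format {
    use serde::{Deserialize, Serialize};

    // Local stand-in mirroring the enum under discussion.
    #[derive(Debug, PartialEq, Serialize, Deserialize)]
    #[serde(rename_all = "lowercase")]
    enum ComputeUnits {
        Cpu,
        CpuAndGpu,
        CpuAndNeuralEngine,
    }

    #[test]
    fn compute_units_round_trip() {
        let json = serde_json::to_string(&ComputeUnits::CpuAndNeuralEngine).unwrap();
        assert_eq!(json, r#""cpuandneuralengine""#);
        let back: ComputeUnits = serde_json::from_str(&json).unwrap();
        assert_eq!(back, ComputeUnits::CpuAndNeuralEngine);
    }
}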
owhisper/owhisper-server/src/commands/run/realtime.rs (2)

25-27: Channel type switch to StreamResponse is consistent.

Matches client and streaming service changes. LGTM.


202-204: TUI consumer matches new type.

state.process_chunk(chunk) consuming StreamResponse aligns with upstream changes. LGTM.

crates/am/src/client.rs (1)

26-35: HTTP/JSON error handling is generally solid.

Consistent use of reqwest and serde; ergonomic constructors. LGTM.

Also applies to: 81-90, 103-112

Comment on lines +85 to 86
const metadata = sttModelMetadata[model as WhisperModel];
if (!metadata) {

🛠️ Refactor suggestion

Remove casts by typing ModelInfo.model as WhisperModel

Typing ModelInfo.model eliminates the need for casts at usage sites.

-interface ModelInfo {
-  model: string;
-  is_downloaded: boolean;
-}
+interface ModelInfo {
+  model: WhisperModel;
+  is_downloaded: boolean;
+}
@@
-              const metadata = sttModelMetadata[model as WhisperModel];
+              const metadata = sttModelMetadata[model];
@@
-                      onClick={() => setSelectedModel(model as WhisperModel)}
+                      onClick={() => setSelectedModel(model)}

Also applies to: 102-103

🤖 Prompt for AI Agents
In apps/desktop/src/components/welcome-modal/model-selection-view.tsx around
lines 85-86 and 102-103, remove the explicit type casts of model to WhisperModel
by updating the type definition of ModelInfo.model to be WhisperModel. This
change will ensure that the model property is already correctly typed,
eliminating the need for casting at usage sites and improving type safety and
code clarity.

Comment on lines +181 to 183
tracing::info!(text = ?full_text, "transcribe_completed");
self.dynamic_prompt = full_text;
}

🛠️ Refactor suggestion

⚠️ Potential issue

Avoid logging full transcriptions (PII risk)

Logging the entire transcript can leak sensitive data. Prefer logging length or a hash, or gate behind debug builds.

-        if !full_text.is_empty() {
-            tracing::info!(text = ?full_text, "transcribe_completed");
-            self.dynamic_prompt = full_text;
-        }
+        if !full_text.is_empty() {
+            #[cfg(debug_assertions)]
+            tracing::debug!(chars = full_text.len(), "transcribe_completed");
+            self.dynamic_prompt = full_text;
+        }
🤖 Prompt for AI Agents
In crates/whisper-local/src/model.rs around lines 181 to 183, avoid logging the
full transcription text to prevent potential exposure of sensitive information.
Instead of logging the entire transcript, modify the tracing::info! call to log
either the length of the transcript, a hash of the text, or conditionally log
the full text only in debug builds. This change will reduce the risk of leaking
personally identifiable information while maintaining useful logging.

Comment on lines +6 to +27
fn interleave_audio(mic: &[u8], speaker: &[u8]) -> Vec<u8> {
    let mic_samples: Vec<i16> = mic
        .chunks_exact(2)
        .map(|chunk| i16::from_le_bytes([chunk[0], chunk[1]]))
        .collect();
    let speaker_samples: Vec<i16> = speaker
        .chunks_exact(2)
        .map(|chunk| i16::from_le_bytes([chunk[0], chunk[1]]))
        .collect();

    let max_len = mic_samples.len().max(speaker_samples.len());
    let mut interleaved = Vec::with_capacity(max_len * 2 * 2);

    for i in 0..max_len {
        let mic_sample = mic_samples.get(i).copied().unwrap_or(0);
        let speaker_sample = speaker_samples.get(i).copied().unwrap_or(0);
        interleaved.extend_from_slice(&mic_sample.to_le_bytes());
        interleaved.extend_from_slice(&speaker_sample.to_le_bytes());
    }

    interleaved
}

🛠️ Refactor suggestion

Handle odd-length buffers in interleave_audio.

chunks_exact(2) silently drops trailing bytes. Pad the last byte to avoid sample truncation.

-    let mic_samples: Vec<i16> = mic
-        .chunks_exact(2)
-        .map(|chunk| i16::from_le_bytes([chunk[0], chunk[1]]))
-        .collect();
+    let mic_samples: Vec<i16> = mic
+        .chunks(2)
+        .map(|chunk| {
+            let bytes = if chunk.len() == 2 { [chunk[0], chunk[1]] } else { [chunk[0], 0] };
+            i16::from_le_bytes(bytes)
+        })
+        .collect();
-    let speaker_samples: Vec<i16> = speaker
-        .chunks_exact(2)
-        .map(|chunk| i16::from_le_bytes([chunk[0], chunk[1]]))
-        .collect();
+    let speaker_samples: Vec<i16> = speaker
+        .chunks(2)
+        .map(|chunk| {
+            let bytes = if chunk.len() == 2 { [chunk[0], chunk[1]] } else { [chunk[0], 0] };
+            i16::from_le_bytes(bytes)
+        })
+        .collect();
🤖 Prompt for AI Agents
In owhisper/owhisper-client/src/lib.rs lines 6 to 27, the interleave_audio
function uses chunks_exact(2) which drops the last byte if the input buffer
length is odd, causing sample truncation. Modify the code to handle odd-length
buffers by padding the last byte with zero before converting to i16, ensuring no
audio data is lost. Implement this padding for both mic and speaker buffers
before processing.
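
A small test could lock in the padding behavior; this sketch assumes the fixed interleave_audio from the diff above:

#[test]
fn interleave_pads_odd_tail() {
    // Three mic bytes: one full sample plus a dangling byte the fixed
    // version pads with zero. The speaker buffer holds a single sample.
    let mic = [0x01, 0x00, 0x7f];
    let speaker = [0x02, 0x00];
    let out = interleave_audio(&mic, &speaker);
    // Frame 0: (mic=1, speaker=2); frame 1: (mic=127 from the padded tail, speaker=0).
    assert_eq!(out, vec![0x01, 0x00, 0x02, 0x00, 0x7f, 0x00, 0x00, 0x00]);
}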

Comment on lines +242 to +274
#[tokio::test]
// cargo test -p owhisper-client test_client_ag -- --nocapture
async fn test_client_ag() {
    let audio_1 = rodio::Decoder::new(std::io::BufReader::new(
        std::fs::File::open(hypr_data::english_1::AUDIO_PATH).unwrap(),
    ))
    .unwrap()
    .to_i16_le_chunks(16000, 512);

    let audio_2 = rodio::Decoder::new(std::io::BufReader::new(
        std::fs::File::open(hypr_data::english_1::AUDIO_PATH).unwrap(),
    ))
    .unwrap()
    .to_i16_le_chunks(16000, 512);

    let client = ListenClient::builder()
        .api_base("ws://localhost:50060")
        .api_key("".to_string())
        .params(owhisper_interface::ListenParams {
            model: Some("tiny.en".to_string()),
            languages: vec![hypr_language::ISO639::En.into()],
            ..Default::default()
        })
        .build_dual();

    let stream = client.from_realtime_audio(audio_1, audio_2).await.unwrap();
    futures_util::pin_mut!(stream);

    while let Some(result) = stream.next().await {
        println!("{:?}", result);
    }
}


🛠️ Refactor suggestion

Mark local server test as ignored.

Depends on a locally running server on ws://localhost:50060. Ignore to keep CI green.

-    #[tokio::test]
+    #[tokio::test]
+    #[ignore = "requires local server on ws://localhost:50060"]
🤖 Prompt for AI Agents
In owhisper/owhisper-client/src/lib.rs around lines 242 to 274, the test
function test_client_ag depends on a local server running at
ws://localhost:50060, which can cause CI failures. Add the #[ignore] attribute
above the #[tokio::test] annotation to mark this test as ignored by default,
preventing it from running during automated CI builds.

@yujonglee merged commit de8f1eb into main Aug 9, 2025 (6 checks passed)
@yujonglee deleted the deepgram-compat-v2 branch August 9, 2025 23:32
This was referenced Aug 11, 2025
@coderabbitai bot mentioned this pull request Sep 16, 2025