-
Notifications
You must be signed in to change notification settings - Fork 416
Audio file upload groundwork #1078
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughThis change introduces audio recording upload support in the transcript view, enabling users to select and process audio files via the UI. It adds a new backend command ( Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant UI (Transcript View)
participant Tauri Dialog
participant Local STT Plugin (Frontend)
participant Local STT Plugin (Backend)
participant Audio Processor
User->>UI (Transcript View): Click "Upload recording"
UI (Transcript View)->>Tauri Dialog: Open file selector (audio files)
Tauri Dialog-->>UI (Transcript View): Return selected file path
UI (Transcript View)->>Local STT Plugin (Frontend): Call processRecorded(audioPath)
Local STT Plugin (Frontend)->>Local STT Plugin (Backend): Invoke process_recorded(audioPath)
Local STT Plugin (Backend)->>Audio Processor: Decode and resample audio
Audio Processor-->>Local STT Plugin (Backend): Return processed audio data
Local STT Plugin (Backend)-->>Local STT Plugin (Frontend): Return success
Local STT Plugin (Frontend)-->>UI (Transcript View): Update transcript with results
Assessment against linked issues
Assessment against linked issues: Out-of-scope changes
Possibly related PRs
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Clippy (1.86.0)error: failed to load source for dependency Caused by: Caused by: Caused by: ✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (18)
apps/desktop/src/components/right-panel/views/transcript-view.tsx(3 hunks)crates/audio/Cargo.toml(1 hunks)crates/chunker/Cargo.toml(1 hunks)plugins/local-stt/Cargo.toml(1 hunks)plugins/local-stt/build.rs(1 hunks)plugins/local-stt/js/bindings.gen.ts(2 hunks)plugins/local-stt/permissions/autogenerated/commands/process_recorded.toml(1 hunks)plugins/local-stt/permissions/autogenerated/reference.md(2 hunks)plugins/local-stt/permissions/default.toml(1 hunks)plugins/local-stt/permissions/schemas/schema.json(2 hunks)plugins/local-stt/src/commands.rs(1 hunks)plugins/local-stt/src/ext.rs(3 hunks)plugins/local-stt/src/lib.rs(2 hunks)plugins/task/src/commands.rs(2 hunks)plugins/task/src/ctx.rs(1 hunks)plugins/task/src/error.rs(1 hunks)plugins/task/src/ext.rs(2 hunks)plugins/task/src/lib.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{js,ts,tsx,rs}`: 1. No error handling. 2. No unused imports, variables, or functions. 3. For comments, keep it minimal. It should be about "Why", not "What".
**/*.{js,ts,tsx,rs}: 1. No error handling.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".
⚙️ Source: CodeRabbit Configuration File
List of files the instruction was applied to:
plugins/task/src/lib.rsplugins/local-stt/build.rsplugins/task/src/error.rsplugins/task/src/commands.rsplugins/local-stt/src/lib.rsplugins/task/src/ctx.rsplugins/local-stt/src/commands.rsapps/desktop/src/components/right-panel/views/transcript-view.tsxplugins/local-stt/js/bindings.gen.tsplugins/task/src/ext.rsplugins/local-stt/src/ext.rs
🧬 Code Graph Analysis (1)
plugins/task/src/ctx.rs (1)
plugins/task/js/bindings.gen.ts (1)
TaskStatus(29-29)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: ci
- GitHub Check: ci (windows, windows-latest)
- GitHub Check: ci (macos, macos-latest)
🔇 Additional comments (23)
plugins/task/src/error.rs (1)
7-8: LGTM! Clean error variant addition.The new
TaskNotFounderror variant follows the established pattern and provides a clear, descriptive error message.plugins/task/src/commands.rs (2)
9-10: LGTM! Improved error consistency.Using the centralized
TaskNotFounderror variant instead of a generic string improves error handling consistency across the codebase.
19-20: LGTM! Consistent error handling.The updated error handling aligns with the
get_taskfunction and provides consistent error messages throughout the plugin.plugins/task/src/ctx.rs (1)
63-69: LGTM! Well-designed task lifecycle methods.The new
completeandfailmethods provide clear, explicit APIs for task lifecycle management. They follow the established pattern of delegating to the privateupdate_statusmethod and maintain consistency with the existing codebase.plugins/task/src/ext.rs (2)
11-11: LGTM! Improved method naming.The rename from
spawn_tasktospawn_task_blockingbetter communicates the blocking nature of the operation, improving API clarity.
25-25: LGTM! Consistent method implementation.The implementation update aligns with the trait definition and maintains the same functionality with improved naming.
crates/audio/Cargo.toml (1)
20-20: LGTM: Consistent feature simplification across crates.The removal of the "vorbis" feature from
rodioaligns with the broader standardization effort visible across multiple crates in this PR.plugins/local-stt/permissions/default.toml (1)
14-14: LGTM: Required permission for new audio processing command.The addition of "allow-process-recorded" permission is necessary for the new audio upload functionality and follows the existing naming convention.
plugins/local-stt/build.rs (1)
13-13: LGTM: Proper command registration for new functionality.The addition of "process_recorded" to the COMMANDS array correctly registers the new command for the Tauri plugin system and matches the corresponding permission.
plugins/local-stt/Cargo.toml (2)
60-60: LGTM: Updated rodio features for enhanced audio decoding.The addition of "symphonia" and "symphonia-all" features to
rodiosupports the new synchronous audio decoding functionality and aligns with the overall audio handling refactor.
61-61: rubato version is current and secure
Version 0.16.2 is not yanked and has no known security advisories. The latest on crates.io is 1.0.0-preview.0 (a prerelease). Only upgrade if you need prerelease features—0.16.2 remains the latest stable release.crates/chunker/Cargo.toml (1)
13-13: LGTM: Consistent feature simplification across workspace.The removal of the "wav" feature from
rodiois consistent with similar changes in other crates, contributing to the overall standardization of audio handling dependencies.plugins/local-stt/permissions/autogenerated/commands/process_recorded.toml (1)
1-14: LGTM! Permission configuration is properly structured.The TOML configuration correctly defines allow/deny permissions for the
process_recordedcommand, following the established pattern of other permission files in the system.plugins/local-stt/permissions/autogenerated/reference.md (2)
17-17: LGTM! Default permissions list updated correctly.The new
allow-process-recordedpermission is properly added to the default permission list.
265-287: LGTM! Permission table entries are well-formatted.The new allow/deny permission entries for
process_recordedare properly documented with clear descriptions and consistent formatting.plugins/local-stt/src/lib.rs (2)
48-48: LGTM! Command registration is properly implemented.The new
process_recordedcommand is correctly registered in the specta builder alongside other commands.
168-168: LGTM! Test path updated for better organization.The additional "stt" subdirectory in the test path appears to be an organizational improvement for model file storage.
plugins/local-stt/js/bindings.gen.ts (2)
45-48: LGTM! New command binding is properly implemented.The
processRecordedfunction correctly follows the established pattern for Tauri command bindings with proper parameter typing and return type.
67-67: LGTM! Event type updated for progress tracking.The change from "inactive" to "progress" in
RecordedProcessingEventaligns with the intended functionality for tracking processing progress.apps/desktop/src/components/right-panel/views/transcript-view.tsx (1)
222-238: Add error handling and remove console.logThe function has several issues that need to be addressed:
- Remove the console.log statement (line 233) - it violates the minimal comments guideline
- Add error handling for both the file dialog and processing operations
- Add user feedback during processing
Apply this diff to fix the issues:
- const handleUploadRecording = () => { - openFile({ + const handleUploadRecording = async () => { + try { + const file = await openFile({ multiple: false, directory: false, filters: [ { name: "supported_audio", extensions: ["mp3", "wav", "m4a", "mp4", "webm", "flac"], }, ], - }).then((file) => { - console.log(file); + }); if (file) { - localSttCommands.processRecorded(file); + await localSttCommands.processRecorded(file); } - }); + } catch (error) { + // Handle error appropriately - show toast, notification, etc. + } };Consider also adding a loading state to prevent multiple simultaneous uploads and provide visual feedback to the user.
Likely an incorrect or invalid review comment.
plugins/local-stt/permissions/schemas/schema.json (1)
405-416: Permission definitions look goodThe new permission entries for
process_recordedcommand follow the established pattern consistently.Also applies to: 454-457
plugins/local-stt/src/ext.rs (2)
256-300: Well-implemented audio resampling functionThe resampling implementation is well-designed with:
- Early return optimization for similar sample rates
- Proper multi-channel handling with deinterleaving/reinterleaving
- High-quality sinc interpolation parameters
Good architectural choice to extract this into a separate function.
278-279: Fix unwrap() in resampler creationThe resampler creation uses unwrap() which could panic.
Apply this diff:
- let mut resampler = - SincFixedIn::<f32>::new(to_rate / from_rate, 2.0, params, 1024, channels).unwrap(); + let mut resampler = + SincFixedIn::<f32>::new(to_rate / from_rate, 2.0, params, 1024, channels) + .map_err(|e| crate::Error::ResamplerInitFailed(e.to_string()))?;Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🔭 Outside diff range comments (1)
plugins/local-stt/src/ext.rs (1)
215-215: Add error handling for transcription.The
transcribe()method can fail and should have proper error handling instead of unwrap().Apply this diff:
- let whisper_segments = model.transcribe(&audio_f32).unwrap(); + let whisper_segments = model.transcribe(&audio_f32) + .map_err(|e| crate::Error::TranscriptionFailed(e.to_string()))?;
♻️ Duplicate comments (3)
plugins/local-stt/src/ext.rs (3)
183-186: Replace unwrap() calls with proper error handling.The unwrap() calls on file operations will cause panics if the file doesn't exist or can't be decoded. This was flagged in previous reviews but hasn't been addressed.
201-201: Fix additional unwrap() calls.More unwrap() calls that need proper error handling, as flagged in previous reviews.
206-207: Fix segmenter unwrap() calls.Segmenter initialization and processing unwrap() calls that were flagged in previous reviews.
🧹 Nitpick comments (1)
plugins/task/src/ext.rs (1)
9-12: Rename the generic parameter for clarity.The generic parameter
Futis misleading since it no longer represents a Future type. Consider renaming it to a more generic name likeTorRto reflect that it's simply the return type of the closure.- fn spawn_task_blocking<F, Fut>(&self, exec: F) -> String + fn spawn_task_blocking<F, T>(&self, exec: F) -> String where - F: FnOnce(TaskCtx<R>) -> Fut + Send + 'static, - Fut: Send + 'static; + F: FnOnce(TaskCtx<R>) -> T + Send + 'static, + T: Send + 'static;
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
apps/desktop/src/components/right-panel/views/transcript-view.tsx(3 hunks)plugins/local-stt/js/bindings.gen.ts(2 hunks)plugins/local-stt/src/commands.rs(2 hunks)plugins/local-stt/src/events.rs(2 hunks)plugins/local-stt/src/ext.rs(6 hunks)plugins/local-stt/src/lib.rs(2 hunks)plugins/task/src/ctx.rs(2 hunks)plugins/task/src/ext.rs(4 hunks)plugins/task/src/store.rs(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- plugins/task/src/store.rs
🚧 Files skipped from review as they are similar to previous changes (4)
- plugins/local-stt/src/lib.rs
- apps/desktop/src/components/right-panel/views/transcript-view.tsx
- plugins/task/src/ctx.rs
- plugins/local-stt/src/commands.rs
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{js,ts,tsx,rs}`: 1. No error handling. 2. No unused imports, variables, or functions. 3. For comments, keep it minimal. It should be about "Why", not "What".
**/*.{js,ts,tsx,rs}: 1. No error handling.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".
⚙️ Source: CodeRabbit Configuration File
List of files the instruction was applied to:
plugins/local-stt/src/events.rsplugins/local-stt/js/bindings.gen.tsplugins/task/src/ext.rsplugins/local-stt/src/ext.rs
🧬 Code Graph Analysis (2)
plugins/local-stt/js/bindings.gen.ts (2)
plugins/db/js/bindings.gen.ts (2)
Word(170-170)SpeakerIdentity(166-166)plugins/listener/js/bindings.gen.ts (2)
Word(77-77)SpeakerIdentity(76-76)
plugins/task/src/ext.rs (1)
plugins/task/src/ctx.rs (2)
id(35-37)new(19-27)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: ci
- GitHub Check: ci (macos, macos-latest)
- GitHub Check: ci (windows, windows-latest)
🔇 Additional comments (9)
plugins/task/src/ext.rs (3)
29-29: LGTM! TaskCtx creation aligns with the simplified constructor.The TaskCtx::new call is consistent with the updated constructor signature that defaults total to 1.
41-41: LGTM! Hardcoded total is consistent with the simplified task system.The hardcoded total value of 1 aligns with the TaskCtx constructor changes and the removal of the total_steps parameter.
51-57: Confirm blocking thread usage and concurrency boundsThe
spawn_task_blockingimplementation inplugins/task/src/ext.rs(lines 51–57) correctly offloads work to Tauri’s blocking thread pool. In your use ofapp.spawn_task_blocking(e.g. inplugins/local-stt/src/commands.rs), each closure is synchronous (returns no future) and ignores its result per guidelines.• Verify that any long-running or CPU-intensive work (e.g. audio processing in
local-stt) is appropriate for the blocking pool.
• Confirm that the expected number of concurrent tasks won’t exhaust Tokio’s blocking thread pool or degrade performance.No code changes required; please review your task concurrency and durations.
plugins/local-stt/src/events.rs (2)
4-4: LGTM: Debug trait addition is useful for development.Adding the
Debugtrait to the common derives macro will help with debugging and logging of events.
13-13: LGTM: Word field addition supports progress reporting.The new
wordfield in theProgressevent enables detailed progress reporting during audio transcription, which aligns with the PR's audio upload functionality.plugins/local-stt/js/bindings.gen.ts (2)
45-48: LGTM: New command binding looks correct.The
processRecordedcommand binding correctly maps to the backendprocess_recordedcommand and accepts the expectedaudioPathparameter.
67-70: LGTM: Type definitions are consistent with backend.The updated type definitions for
RecordedProcessingEvent,SpeakerIdentity, andWordare consistent with the backend Rust types and match the structure seen in other plugin bindings.plugins/local-stt/src/ext.rs (2)
267-311: LGTM: Resampling logic is well-implemented.The
resample_audiofunction correctly handles multi-channel audio by splitting channels, resampling each channel separately, and then interleaving the results. The early return for similar sample rates is a good optimization.
231-235: LGTM: Progress reporting implementation is correct.The progress callback correctly reports the current word count and includes the word details, which will enable proper UI updates during transcription.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
plugins/local-stt/src/ext.rs (2)
183-186: Replace unwrap() calls with proper error handlingMultiple unwrap() calls that will cause panics if operations fail.
199-199: Fix additional unwrap() callsModel path conversion and segmenter operations need proper error handling.
Also applies to: 204-205
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
crates/audio-utils/Cargo.toml(1 hunks)crates/audio-utils/src/error.rs(1 hunks)crates/audio-utils/src/lib.rs(2 hunks)plugins/local-stt/Cargo.toml(1 hunks)plugins/local-stt/src/ext.rs(5 hunks)
✅ Files skipped from review due to trivial changes (2)
- plugins/local-stt/Cargo.toml
- crates/audio-utils/src/error.rs
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{js,ts,tsx,rs}`: 1. No error handling. 2. No unused imports, variables, or functions. 3. For comments, keep it minimal. It should be about "Why", not "What".
**/*.{js,ts,tsx,rs}: 1. No error handling.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".
⚙️ Source: CodeRabbit Configuration File
List of files the instruction was applied to:
crates/audio-utils/src/lib.rsplugins/local-stt/src/ext.rs
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: ci (macos, macos-latest)
- GitHub Check: ci (windows, windows-latest)
- GitHub Check: ci
🔇 Additional comments (2)
crates/audio-utils/Cargo.toml (1)
10-13: Dependencies align well with the new audio processing functionality.The addition of
thiserrorfor error handling,rodiofor audio decoding, andrubatofor resampling are appropriate choices for the audio utilities crate.crates/audio-utils/src/lib.rs (1)
49-98: Well-implemented audio resampling with proper error handling.The function correctly handles multi-channel audio by deinterleaving/reinterleaving samples and includes a smart optimization to skip resampling when rates are nearly identical. The error propagation using
?is appropriate.
Resolves #849