Skip to content

feat: Add opt-in History audio attachments with playback (addresses infiniV/VoiceFlow#2)#5

Merged
infiniV merged 2 commits into
infiniV:mainfrom
EffortlessMetrics:upstream/history-audio
Dec 17, 2025
Merged

feat: Add opt-in History audio attachments with playback (addresses infiniV/VoiceFlow#2)#5
infiniV merged 2 commits into
infiniV:mainfrom
EffortlessMetrics:upstream/history-audio

Conversation

@EffortlessSteven
Copy link
Copy Markdown
Contributor

@EffortlessSteven EffortlessSteven commented Dec 17, 2025

Addresses #2.

  • Adds saveAudioToHistory setting (default: OFF) so VoiceFlow keeps its “RAM-only by default” behavior.
  • When enabled, after a successful transcription we persist that session’s audio as WAV (16kHz mono, 16-bit PCM) to ~/.VoiceFlow/audio/history_{id}.wav using temp-write + atomic replace, then attach per-row metadata on history.
  • Extends get_history to always return has_audio; include_audio_meta=true returns duration/size/mime when requested.
  • Adds get_history_audio(history_id) RPC returning { mime, base64, fileName?, sizeBytes?, durationMs? }; frontend plays via Blob URL and revokes URLs on close/unmount.

Privacy / lifecycle / safety

  • Audio save is best-effort and does not block transcript persistence (if audio write fails, the text history still saves).
  • Deleting a History entry, retention cleanup, and “Reset All Data” remove associated audio files.
  • Read/delete paths are guarded to stay under ~/.VoiceFlow/audio (prevents path traversal / escape).

Testing

  • pnpm tsc --noEmit
  • uv run -p .venv pytest src-pyloid/tests/

Summary by CodeRabbit

  • New Features
    • Audio history support: Save and attach audio recordings to history entries for future reference.
    • Audio playback: Play back recorded audio directly from history entries with visual indicators.
    • Audio storage setting: New toggle in Settings to enable or disable automatic audio storage with history.
    • Audio badges: Visual markers display on history entries that contain audio attachments.

✏️ Tip: You can customize this high-level summary in your review settings.

Copilot AI review requested due to automatic review settings December 17, 2025 17:24
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Dec 17, 2025

Walkthrough

This PR implements a complete audio attachment feature for the application. Users can enable audio persistence in settings, and when enabled, dictation audio is automatically saved to history alongside transcription metadata. The feature includes backend storage with metadata tracking, database schema migration, new RPC endpoints for audio retrieval, and frontend UI for playback with base64-encoded audio streaming and cleanup.

Changes

Cohort / File(s) Summary
Backend Audio Storage & Persistence
src-pyloid/app_controller.py, src-pyloid/services/database.py
Added _save_audio_attachment() method to serialize WAV audio with metadata; introduced get_history_audio() RPC endpoint; extended get_history() to support include_audio_meta flag; database schema migration adds audio columns (audio_relpath, audio_duration_ms, audio_size_bytes, audio_mime); new update_history_audio() and get_history_entry() methods; audio file cleanup on history deletion and reset.
Settings Infrastructure
src-pyloid/services/settings.py
Added new save_audio_to_history: bool field to Settings dataclass; updated update_settings() to accept and persist audio persistence flag; refactored to keyword-only parameters with Optional[...] types.
Server RPC Layer
src-pyloid/server.py
Extended get_history() signature with include_audio_meta parameter; added new get_history_audio() RPC endpoint; wired saveAudioToHistory setting into update_settings() call.
Frontend Settings UI
src/components/SettingsTab.tsx, src/lib/types.ts
Added new saveAudioToHistory: boolean field to Settings interface; introduced Audio History toggle card in settings with FileAudio icon and explanatory description.
Frontend API Layer
src/lib/api.ts
Added include_audio_meta parameter to getHistory() call; introduced new getHistoryAudio() method returning base64-encoded audio with metadata (mime, fileName, sizeBytes, durationMs).
Frontend Audio Utilities
src/lib/audio.ts
New module exporting audio helper functions: INVALID_AUDIO_PAYLOAD constant, isInvalidAudioPayload() error detector, base64ToBlobUrl() converter with error handling, revokeUrl() cleanup utility.
Frontend History Playback UI
src/components/HistoryPage.tsx, src/components/HomePage.tsx
Added audio playback support with state management (showPlayer, audioUrl, audioMeta, loadingAudioFor); implemented handlePlayAudio() to fetch audio and open modal player; UI displays Audio badges on entries with audio and Play buttons for triggering playback; integrated HTML5 audio element in Dialog-based modal; includes blob URL cleanup on unmount.

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant Frontend as Frontend (React)
    participant Server as Server (RPC)
    participant Controller as AppController
    participant DB as Database
    participant FileSystem as File System

    rect rgb(100, 150, 200)
    Note over User,FileSystem: Audio Save Flow (on successful transcription)
    User->>Frontend: Record/Paste Audio
    Frontend->>Server: transcribe()
    Server->>Controller: transcribe()
    Controller->>DB: add_history(text)
    DB->>DB: save history entry
    DB-->>Controller: history_id

    alt saveAudioToHistory enabled
        Controller->>Controller: _save_audio_attachment(audio)
        Controller->>FileSystem: write WAV file
        FileSystem-->>Controller: audio_relpath, metadata
        Controller->>DB: update_history_audio(metadata)
        DB->>DB: update audio fields
    end
    
    Controller-->>Server: success response
    Server-->>Frontend: transcription + has_audio flag
    Frontend->>Frontend: display history entry with Audio badge
    end

    rect rgb(150, 180, 100)
    Note over User,FileSystem: Audio Playback Flow
    User->>Frontend: Click Play Audio button
    Frontend->>Server: get_history_audio(history_id)
    Server->>Controller: get_history_audio(history_id)
    Controller->>DB: get_history_entry(history_id)
    DB->>DB: fetch audio metadata
    DB-->>Controller: audio_relpath, metadata
    Controller->>FileSystem: read WAV file
    FileSystem-->>Controller: audio bytes
    Controller->>Controller: encode to base64
    Controller-->>Server: base64 + mime + metadata
    Server-->>Frontend: audio payload
    Frontend->>Frontend: base64ToBlobUrl()
    Frontend->>Frontend: display modal with HTML5 player
    User->>Frontend: listen to audio
    Frontend->>Frontend: revokeUrl() on close
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Audio file persistence logic in app_controller.py and database.py: WAV serialization, metadata tracking, and path safety checks require careful verification
  • Database migration and schema changes: new columns and fallback handling in _ensure_history_audio_columns() need validation
  • State management complexity in React components: multiple audio-related state variables and cleanup effects across HomePage.tsx and HistoryPage.tsx
  • Base64 encoding/decoding with error handling in base64ToBlobUrl() and _save_audio_attachment()
  • File cleanup logic: deletion of audio files during history purge and reset operations must be robust and safe

Possibly related issues

  • Audio attachment persistence feature request aligns with the described opt-in "save dictation audio to History" functionality, including WAV storage, metadata tracking, get_history_audio RPC endpoint, base64 playback, and file cleanup on deletion/reset.

Poem

🎵 A rabbit hops through history's hall,
Recording voice for one and all,
With WAV files tucked in database deep,
Audio playback—memories to keep! 🎧
No setting left unchecked, no byte unplayed,
Our audio dreams are finally made! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 40.91% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main feature added: opt-in audio attachments for history with playback capability, with a reference to the associated issue.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (6)
src-pyloid/services/settings.py (1)

1-4: Unused Literal import.

The Literal type is imported but not used in this file. Consider removing it to keep imports clean.

-from typing import Literal, Optional
+from typing import Optional
src/components/HomePage.tsx (1)

74-97: Potential redundant URL revocation in error handler.

At line 78, you revoke the previous audioUrl before fetching new audio. If an error occurs, line 90 attempts to revoke audioUrl again using the closure value (which is the old URL, already revoked). This is harmless but redundant.

Consider simplifying the error handler since the cleanup effect at lines 45-47 will handle revocation when audioUrl is set to null:

     } catch (error) {
       console.error("Failed to load audio recording:", error);
       toast.error(isInvalidAudioPayload(error) ? "Audio file is corrupted" : "Audio file not found");
-      revokeUrl(audioUrl);
       setAudioUrl(null);
       setShowPlayer(false);
       setAudioMeta(null);
     } finally {
src-pyloid/services/database.py (1)

157-163: Use explicit Optional[str] for the search parameter.

Per PEP 484, using str = None implies an implicit Optional which is prohibited. This aligns with the static analysis hint.

     def get_history(
         self,
         limit: int = 100,
         offset: int = 0,
-        search: str = None,
+        search: Optional[str] = None,
         include_audio_meta: bool = False,
     ) -> list:
src-pyloid/app_controller.py (3)

257-258: Use explicit Optional[str] for the search parameter.

Same as in database.py, use explicit Optional[str] for type correctness.

-    def get_history(self, limit: int = 100, offset: int = 0, search: str = None, include_audio_meta: bool = False) -> list:
+    def get_history(self, limit: int = 100, offset: int = 0, search: Optional[str] = None, include_audio_meta: bool = False) -> list:

353-360: Use exception chaining for clarity.

When re-raising an exception inside an except block, use raise ... from None to explicitly suppress the chain or raise ... from exc to preserve it. This makes the intent clear.

         try:
             audio_path.relative_to(audio_root)
         except ValueError:
-            raise FileNotFoundError("Audio path is invalid")
+            raise FileNotFoundError("Audio path is invalid") from None

335-341: Consider adding a cache invalidation method to SettingsService.

Directly accessing self.settings_service._cache breaks encapsulation. Consider adding a public method like invalidate_cache() to SettingsService.

In settings.py:

def invalidate_cache(self) -> None:
    """Clear the settings cache."""
    self._cache = None

Then in app_controller.py:

-        self.settings_service._cache = None
+        self.settings_service.invalidate_cache()
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aae9cb7 and 63a7f6e.

📒 Files selected for processing (10)
  • src-pyloid/app_controller.py (9 hunks)
  • src-pyloid/server.py (4 hunks)
  • src-pyloid/services/database.py (7 hunks)
  • src-pyloid/services/settings.py (3 hunks)
  • src/components/HistoryPage.tsx (5 hunks)
  • src/components/HomePage.tsx (7 hunks)
  • src/components/SettingsTab.tsx (4 hunks)
  • src/lib/api.ts (2 hunks)
  • src/lib/audio.ts (1 hunks)
  • src/lib/types.ts (2 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.{ts,tsx}: Use pyloid-js RPC for frontend-backend communication to call Python methods exposed by the PyloidRPC server
Use @/ path alias for imports in frontend TypeScript files to reference the src/ directory
Use hash-based routing for frontend navigation between /popup, /onboarding, and /dashboard pages
Use React 18 with TypeScript for frontend development with Vite as the build tool
Define TypeScript interfaces for all data structures in lib/types.ts (Settings, HistoryEntry, Stats, Options)

Files:

  • src/lib/types.ts
  • src/lib/audio.ts
  • src/components/HistoryPage.tsx
  • src/components/HomePage.tsx
  • src/components/SettingsTab.tsx
  • src/lib/api.ts
src/components/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Use shadcn/ui (New York style) components with Tailwind CSS v4 for UI development, adding components via npx shadcn@latest add <component>

Files:

  • src/components/HistoryPage.tsx
  • src/components/HomePage.tsx
  • src/components/SettingsTab.tsx
src-pyloid/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src-pyloid/**/*.py: Backend sends events to popup window via popup_window.invoke('popup-state', {...}) to notify frontend of state changes
Use singleton controller pattern: access the app controller via get_controller() for orchestrating all services
Run model loading and transcription operations in daemon background threads to avoid blocking the main thread

Files:

  • src-pyloid/services/database.py
  • src-pyloid/services/settings.py
  • src-pyloid/app_controller.py
  • src-pyloid/server.py
src-pyloid/services/database.py

📄 CodeRabbit inference engine (CLAUDE.md)

Store application data (settings and history) in SQLite database at ~/.VoiceFlow/VoiceFlow.db

Files:

  • src-pyloid/services/database.py
src-pyloid/server.py

📄 CodeRabbit inference engine (CLAUDE.md)

Expose Python backend methods via PyloidRPC server to be called by the frontend (get_settings, update_settings, get_history, etc.)

Files:

  • src-pyloid/server.py
🧠 Learnings (6)
📚 Learning: 2025-12-17T13:57:39.689Z
Learnt from: CR
Repo: infiniV/VoiceFlow PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T13:57:39.689Z
Learning: Applies to src/**/*.{ts,tsx} : Define TypeScript interfaces for all data structures in `lib/types.ts` (Settings, HistoryEntry, Stats, Options)

Applied to files:

  • src/lib/types.ts
📚 Learning: 2025-12-17T13:57:39.689Z
Learnt from: CR
Repo: infiniV/VoiceFlow PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T13:57:39.689Z
Learning: Applies to src-pyloid/services/database.py : Store application data (settings and history) in SQLite database at `~/.VoiceFlow/VoiceFlow.db`

Applied to files:

  • src-pyloid/services/database.py
  • src-pyloid/app_controller.py
📚 Learning: 2025-12-17T13:57:39.689Z
Learnt from: CR
Repo: infiniV/VoiceFlow PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T13:57:39.689Z
Learning: Recording flow should follow the sequence: hotkey activate -> start recording -> hotkey release -> stop recording -> transcribe -> paste at cursor -> save to history -> return to idle state

Applied to files:

  • src-pyloid/app_controller.py
📚 Learning: 2025-12-17T13:57:39.689Z
Learnt from: CR
Repo: infiniV/VoiceFlow PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T13:57:39.689Z
Learning: Applies to src-pyloid/server.py : Expose Python backend methods via `PyloidRPC` server to be called by the frontend (`get_settings`, `update_settings`, `get_history`, etc.)

Applied to files:

  • src-pyloid/server.py
  • src/lib/api.ts
📚 Learning: 2025-12-17T13:57:39.689Z
Learnt from: CR
Repo: infiniV/VoiceFlow PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T13:57:39.689Z
Learning: Applies to src-pyloid/**/*.py : Use singleton controller pattern: access the app controller via `get_controller()` for orchestrating all services

Applied to files:

  • src-pyloid/server.py
📚 Learning: 2025-12-17T13:57:39.689Z
Learnt from: CR
Repo: infiniV/VoiceFlow PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-17T13:57:39.689Z
Learning: Applies to src/**/*.{ts,tsx} : Use `pyloid-js` RPC for frontend-backend communication to call Python methods exposed by the PyloidRPC server

Applied to files:

  • src/lib/api.ts
🧬 Code graph analysis (6)
src/components/HistoryPage.tsx (3)
src/lib/types.ts (1)
  • HistoryEntry (12-23)
src/lib/api.ts (1)
  • api (10-100)
src/lib/audio.ts (3)
  • revokeUrl (19-23)
  • base64ToBlobUrl (7-17)
  • isInvalidAudioPayload (3-5)
src/components/HomePage.tsx (6)
src/lib/types.ts (1)
  • HistoryEntry (12-23)
src/lib/api.ts (1)
  • api (10-100)
src/lib/audio.ts (3)
  • revokeUrl (19-23)
  • base64ToBlobUrl (7-17)
  • isInvalidAudioPayload (3-5)
src/components/ui/dialog.tsx (4)
  • Dialog (131-131)
  • DialogContent (133-133)
  • DialogHeader (136-136)
  • DialogTitle (139-139)
src/components/ui/badge.tsx (1)
  • Badge (46-46)
src/components/ui/button.tsx (1)
  • Button (64-64)
src-pyloid/services/database.py (3)
src-pyloid/services/logger.py (2)
  • debug (104-106)
  • debug (244-246)
src-pyloid/app_controller.py (2)
  • get_history (257-258)
  • delete_history (260-261)
src-pyloid/server.py (2)
  • get_history (99-101)
  • delete_history (118-121)
src/components/SettingsTab.tsx (2)
src/components/ui/label.tsx (1)
  • Label (24-24)
src/components/ui/switch.tsx (1)
  • Switch (31-31)
src-pyloid/server.py (2)
src-pyloid/app_controller.py (3)
  • get_history (257-258)
  • get_controller (418-422)
  • get_history_audio (343-368)
src-pyloid/services/database.py (1)
  • get_history (157-206)
src/lib/api.ts (1)
src/lib/types.ts (1)
  • HistoryEntry (12-23)
🪛 Biome (2.1.2)
src/components/HistoryPage.tsx

[error] 266-266: expected ) but instead found open

Remove open

(parse)


[error] 267-267: Expected a property, a shorthand property, a getter, a setter, or a method but instead found '('.

Expected a property, a shorthand property, a getter, a setter, or a method here.

(parse)


[error] 267-267: Expected a function body but instead found '=>'.

Expected a function body here.

(parse)


[error] 268-268: expected , but instead found setShowPlayer

Remove setShowPlayer

(parse)


[error] 268-268: Expected a function body but instead found ';'.

Expected a function body here.

(parse)


[error] 274-279: Expected a statement but instead found '}
>


Audio Recording
'.

Expected a statement here.

(parse)


[error] 298-298: Expected an expression but instead found '<'.

Expected an expression here.

(parse)


[error] 266-266: A global variable should not be reassigned.

Assigning to a global variable can override essential functionality.

(lint/suspicious/noGlobalAssign)

src/components/HomePage.tsx

[error] 242-242: expected ) but instead found open

Remove open

(parse)


[error] 243-243: Expected a property, a shorthand property, a getter, a setter, or a method but instead found '('.

Expected a property, a shorthand property, a getter, a setter, or a method here.

(parse)


[error] 243-243: Expected a function body but instead found '=>'.

Expected a function body here.

(parse)


[error] 244-244: expected , but instead found setShowPlayer

Remove setShowPlayer

(parse)


[error] 244-244: Expected a function body but instead found ';'.

Expected a function body here.

(parse)


[error] 250-255: Expected a statement but instead found '}
>


Audio Recording
'.

Expected a statement here.

(parse)


[error] 274-274: Expected an expression but instead found '<'.

Expected an expression here.

(parse)


[error] 276-277: Expected a statement but instead found '}'.

Expected a statement here.

(parse)


[error] 242-242: A global variable should not be reassigned.

Assigning to a global variable can override essential functionality.

(lint/suspicious/noGlobalAssign)

🪛 Ruff (0.14.8)
src-pyloid/services/database.py

161-161: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


322-322: Do not catch blind exception: Exception

(BLE001)


363-363: Do not catch blind exception: Exception

(BLE001)

src-pyloid/app_controller.py

90-90: Do not catch blind exception: Exception

(BLE001)


195-195: Do not catch blind exception: Exception

(BLE001)


257-257: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


316-316: Consider moving this statement to an else block

(TRY300)


317-317: Do not catch blind exception: Exception

(BLE001)


326-326: Starting a process without a shell

(S606)


327-327: Do not catch blind exception: Exception

(BLE001)


347-347: Avoid specifying long messages outside the exception class

(TRY003)


356-356: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


356-356: Avoid specifying long messages outside the exception class

(TRY003)


359-359: Avoid specifying long messages outside the exception class

(TRY003)

src-pyloid/server.py

99-99: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)

🔇 Additional comments (14)
src/lib/types.ts (1)

9-9: LGTM!

The new saveAudioToHistory setting and audio metadata fields on HistoryEntry are well-typed. The optional markers (?) with nullable unions (| null) correctly model the backend behavior where audio may or may not be present.

Also applies to: 18-22

src-pyloid/server.py (1)

104-107: LGTM!

The new get_history_audio endpoint correctly delegates to the controller, which handles path validation and file access. The implementation follows the established singleton pattern per coding guidelines.

src/lib/api.ts (1)

27-38: LGTM!

The updated getHistory signature with include_audio_meta and the new getHistoryAudio method correctly use pyloid-js RPC calls with proper parameter mapping (camelCase to snake_case). The return types align with the backend response structure.

src/components/SettingsTab.tsx (1)

339-364: LGTM!

The new "History Audio" settings card follows the established BentoSettingCard pattern and correctly wires the toggle to saveAudioToHistory. The accessible label association (htmlFor matching id) and privacy-focused description text are well done.

src/lib/audio.ts (1)

1-23: LGTM!

Clean utility module with well-structured error handling. The INVALID_AUDIO_PAYLOAD constant enables consistent error identification across components, and revokeUrl safely handles null/undefined inputs. The implementation uses standard Web APIs correctly.

src/components/HistoryPage.tsx (2)

53-55: Audio cleanup logic is well implemented.

The useEffect cleanup on line 54 correctly revokes object URLs on unmount, and handlePlayAudio properly revokes the previous URL before creating a new one (line 86). The error handling appropriately distinguishes between corrupted audio and missing files using isInvalidAudioPayload.

Also applies to: 82-105


181-255: LGTM (pending JSX structure fix).

The audio badge and dual play buttons (header hover + footer visible) provide good UX. The loading state via loadingAudioFor prevents double-click issues, and the badge appropriately indicates audio availability.

src-pyloid/services/settings.py (1)

62-93: Clean implementation of keyword-only arguments and optional parameters.

The refactor to use keyword-only arguments (*) with Optional types improves API safety by preventing positional argument mistakes. The pattern of checking if param is not None before persisting is correct and consistent.

src/components/HomePage.tsx (1)

279-366: Well-structured audio playback integration in HistoryCard.

The component correctly:

  • Displays an audio badge when has_audio is true
  • Provides play buttons with loading states
  • Disables buttons during loading to prevent double-clicks

The dual play buttons (header on hover, footer always visible) provide good discoverability.

src-pyloid/services/database.py (3)

351-364: Good path traversal protection.

The relative_to check correctly prevents deletion of files outside the audio directory. Consider logging a warning when path traversal is detected for security auditing:

Optional enhancement:

             try:
                 path.relative_to(audio_root)
             except ValueError:
+                debug(f"Path traversal attempt blocked: {relpath}")
                 return

244-265: Correct lifecycle management for retention cleanup.

The implementation properly collects audio paths before deleting database records, ensuring audio files are cleaned up during retention enforcement. The best-effort file deletion approach is appropriate.


301-323: Complete data reset including audio files.

The reset logic correctly clears the database first, then removes all audio files with robust error handling.

src-pyloid/app_controller.py (2)

171-186: Best-effort audio saving with proper error isolation.

The implementation correctly saves the transcript first (ensuring it's never lost), then attempts audio attachment as a secondary operation. Catching specific exceptions (OSError, wave.Error, sqlite3.Error, ValueError) is more precise than a bare Exception. Based on learnings, this follows the recording flow pattern correctly.


370-411: Solid audio persistence implementation with atomic writes.

The implementation correctly:

  • Uses temp file + atomic replace to prevent partial writes
  • Handles various numpy dtypes (float, int16, other)
  • Clips values to prevent overflow
  • Uses consistent audio parameters from audio_service (CHANNELS=1, SAMPLE_RATE=16000)

The CHANNELS and SAMPLE_RATE constants are verified as accessible class attributes on AudioService defined in src-pyloid/services/audio.py.

Comment thread src-pyloid/server.py
Comment on lines +99 to +101
async def get_history(limit: int = 100, offset: int = 0, search: str = None, include_audio_meta: bool = False):
controller = get_controller()
return controller.get_history(limit, offset, search)
return controller.get_history(limit, offset, search, include_audio_meta)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix implicit Optional type hint.

The search parameter uses str = None which violates PEP 484. For consistency with the rest of the file (lines 50-57), use explicit Optional[str].

 @server.method()
-async def get_history(limit: int = 100, offset: int = 0, search: str = None, include_audio_meta: bool = False):
+async def get_history(limit: int = 100, offset: int = 0, search: Optional[str] = None, include_audio_meta: bool = False):
     controller = get_controller()
     return controller.get_history(limit, offset, search, include_audio_meta)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async def get_history(limit: int = 100, offset: int = 0, search: str = None, include_audio_meta: bool = False):
controller = get_controller()
return controller.get_history(limit, offset, search)
return controller.get_history(limit, offset, search, include_audio_meta)
async def get_history(limit: int = 100, offset: int = 0, search: Optional[str] = None, include_audio_meta: bool = False):
controller = get_controller()
return controller.get_history(limit, offset, search, include_audio_meta)
🧰 Tools
🪛 Ruff (0.14.8)

99-99: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)

🤖 Prompt for AI Agents
In src-pyloid/server.py around lines 99 to 101, the get_history signature uses
`search: str = None` which implicitly uses Optional; change the annotation to
`search: Optional[str]` and ensure `Optional` is imported from `typing` at the
module top (add `from typing import Optional` if missing) so the default None is
type-correct and consistent with lines 50-57.

Comment on lines 263 to +299
</div>

<Dialog
open={showPlayer}
onOpenChange={(open) => {
setShowPlayer(open);
if (!open) {
revokeUrl(audioUrl);
setAudioUrl(null);
setAudioMeta(null);
}
}}
>
<DialogContent className="max-w-xl">
<DialogHeader>
<DialogTitle>Audio Recording</DialogTitle>
</DialogHeader>
{audioUrl ? (
<div className="space-y-3">
<div className="flex items-center justify-between text-sm text-muted-foreground">
<div className="flex items-center gap-2">
<FileAudio className="w-4 h-4 text-primary" />
<span>{audioMeta?.fileName || "history_audio.wav"}</span>
</div>
{durationMs ? <span>{Math.round(durationMs / 1000)}s</span> : null}
</div>
{/* biome-ignore lint/a11y/useMediaCaption: transcript text is already displayed in the history card */}
<audio controls autoPlay className="w-full">
<source src={audioUrl} type={audioMeta?.mime || "audio/wav"} />
Your browser does not support audio playback.
</audio>
</div>
) : (
<p className="text-sm text-muted-foreground">No audio loaded.</p>
)}
</DialogContent>
</Dialog>
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Dialog component is outside the main wrapper div, causing parse errors.

The Dialog component (lines 265-299) is placed after the closing </div> of the main container (line 263). This creates invalid JSX structure where the Dialog is a sibling to the root div rather than inside the return statement's fragment or wrapper. The static analysis correctly flags multiple parse errors here.

Wrap both elements in a Fragment or move the Dialog inside the main container:

       </div>
     </div>
-  );
-}
-
-      <Dialog
+
+      <Dialog
         open={showPlayer}
         onOpenChange={(open) => {
           setShowPlayer(open);
           if (!open) {
             revokeUrl(audioUrl);
             setAudioUrl(null);
             setAudioMeta(null);
           }
         }}
       >
-      <DialogContent className="max-w-xl">
+        <DialogContent className="max-w-xl">
         <DialogHeader>
           <DialogTitle>Audio Recording</DialogTitle>
         </DialogHeader>
         {audioUrl ? (
           <div className="space-y-3">
             <div className="flex items-center justify-between text-sm text-muted-foreground">
               <div className="flex items-center gap-2">
                 <FileAudio className="w-4 h-4 text-primary" />
                 <span>{audioMeta?.fileName || "history_audio.wav"}</span>
               </div>
               {durationMs ? <span>{Math.round(durationMs / 1000)}s</span> : null}
             </div>
             {/* biome-ignore lint/a11y/useMediaCaption: transcript text is already displayed in the history card */}
             <audio controls autoPlay className="w-full">
               <source src={audioUrl} type={audioMeta?.mime || "audio/wav"} />
               Your browser does not support audio playback.
             </audio>
           </div>
         ) : (
           <p className="text-sm text-muted-foreground">No audio loaded.</p>
         )}
-      </DialogContent>
-    </Dialog>
+        </DialogContent>
+      </Dialog>
+    </>
   );
 }

Also add the Fragment wrapper at the start of the return:

   return (
+    <>
     <div className="min-h-screen w-full bg-background/50 relative overflow-x-hidden">

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Biome (2.1.2)

[error] 266-266: expected ) but instead found open

Remove open

(parse)


[error] 267-267: Expected a property, a shorthand property, a getter, a setter, or a method but instead found '('.

Expected a property, a shorthand property, a getter, a setter, or a method here.

(parse)


[error] 267-267: Expected a function body but instead found '=>'.

Expected a function body here.

(parse)


[error] 268-268: expected , but instead found setShowPlayer

Remove setShowPlayer

(parse)


[error] 268-268: Expected a function body but instead found ';'.

Expected a function body here.

(parse)


[error] 274-279: Expected a statement but instead found '}
>


Audio Recording
'.

Expected a statement here.

(parse)


[error] 298-298: Expected an expression but instead found '<'.

Expected an expression here.

(parse)


[error] 266-266: A global variable should not be reassigned.

Assigning to a global variable can override essential functionality.

(lint/suspicious/noGlobalAssign)

🤖 Prompt for AI Agents
In src/components/HistoryPage.tsx around lines 263 to 299, the Dialog component
is placed after the closing </div> of the main container which makes the JSX
invalid; fix by wrapping the returned elements in a Fragment (or move the Dialog
inside the main container) so both the main div and the Dialog are children of
the same fragment/parent, update the return to start with <> before the main div
and end with </> after the Dialog, and remove the stray closing div placement so
the JSX parses correctly.

Comment on lines 239 to 276
</div>

<Dialog
open={showPlayer}
onOpenChange={(open) => {
setShowPlayer(open);
if (!open) {
revokeUrl(audioUrl);
setAudioUrl(null);
setAudioMeta(null);
}
}}
>
<DialogContent className="max-w-xl">
<DialogHeader>
<DialogTitle>Audio Recording</DialogTitle>
</DialogHeader>
{audioUrl ? (
<div className="space-y-3">
<div className="flex items-center justify-between text-sm text-muted-foreground">
<div className="flex items-center gap-2">
<FileAudio className="w-4 h-4 text-primary" />
<span>{audioMeta?.fileName || "history_audio.wav"}</span>
</div>
{durationMs ? <span>{Math.round(durationMs / 1000)}s</span> : null}
</div>
{/* biome-ignore lint/a11y/useMediaCaption: transcript text is already displayed in the history card */}
<audio controls autoPlay className="w-full">
<source src={audioUrl} type={audioMeta?.mime || "audio/wav"} />
Your browser does not support audio playback.
</audio>
</div>
) : (
<p className="text-sm text-muted-foreground">No audio loaded.</p>
)}
</DialogContent>
</Dialog>
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical syntax error: Dialog component is outside the return statement.

The Dialog component at lines 241-275 is placed after the closing </div> of the main return block (line 239), causing parse errors. The Dialog must be inside the return statement.

Wrap the entire return content in a React Fragment and include the Dialog inside:

   return (
+    <>
     <div className="min-h-screen w-full bg-background/50 relative overflow-x-hidden">
       {/* ... existing content ... */}
     </div>
-  );
-}
-
-      <Dialog
+      <Dialog
         open={showPlayer}
         onOpenChange={(open) => {
           setShowPlayer(open);
           if (!open) {
             revokeUrl(audioUrl);
             setAudioUrl(null);
             setAudioMeta(null);
           }
         }}
       >
-      <DialogContent className="max-w-xl">
+        <DialogContent className="max-w-xl">
           {/* ... dialog content ... */}
-      </DialogContent>
-    </Dialog>
+        </DialogContent>
+      </Dialog>
+    </>
   );
 }

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Biome (2.1.2)

[error] 242-242: expected ) but instead found open

Remove open

(parse)


[error] 243-243: Expected a property, a shorthand property, a getter, a setter, or a method but instead found '('.

Expected a property, a shorthand property, a getter, a setter, or a method here.

(parse)


[error] 243-243: Expected a function body but instead found '=>'.

Expected a function body here.

(parse)


[error] 244-244: expected , but instead found setShowPlayer

Remove setShowPlayer

(parse)


[error] 244-244: Expected a function body but instead found ';'.

Expected a function body here.

(parse)


[error] 250-255: Expected a statement but instead found '}
>


Audio Recording
'.

Expected a statement here.

(parse)


[error] 274-274: Expected an expression but instead found '<'.

Expected an expression here.

(parse)


[error] 242-242: A global variable should not be reassigned.

Assigning to a global variable can override essential functionality.

(lint/suspicious/noGlobalAssign)

🤖 Prompt for AI Agents
In src/components/HomePage.tsx around lines 239 to 276 the Dialog JSX is placed
after the closing </div> of the component's return block, causing a syntax/parse
error; fix by wrapping the existing returned JSX in a React Fragment (<>...</>)
or a single parent element so the Dialog is included inside the return, i.e.,
move the Dialog block so it sits within the fragment/parent alongside the rest
of the return content and ensure imports (React) remain correct.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds opt-in audio attachment functionality to the VoiceFlow history feature, maintaining the application's RAM-only default behavior while allowing users to persist audio recordings alongside transcriptions. The implementation includes comprehensive path security, atomic file writes, and proper cleanup mechanisms.

Key Changes:

  • Adds a saveAudioToHistory user setting (default: OFF) to enable audio persistence
  • Implements secure audio storage as 16kHz mono WAV files in ~/.VoiceFlow/audio/ with path traversal protection
  • Extends the RPC API with get_history_audio endpoint and audio metadata support in get_history

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/lib/types.ts Adds saveAudioToHistory setting and audio metadata fields to HistoryEntry interface
src/lib/audio.ts New utility module for base64-to-blob conversion and URL cleanup
src/lib/api.ts Adds getHistoryAudio RPC method and include_audio_meta parameter to getHistory
src/components/SettingsTab.tsx Adds UI toggle for audio history setting with explanatory text
src/components/HomePage.tsx Integrates audio playback UI with badge indicators, play buttons, and dialog player
src/components/HistoryPage.tsx Mirrors HomePage audio playback features for consistency
src-pyloid/services/settings.py Adds save_audio_to_history setting with keyword-only parameter pattern
src-pyloid/services/database.py Implements schema migration, audio metadata storage, and cleanup in deletion/retention
src-pyloid/server.py Exposes get_history_audio RPC endpoint and updates get_history signature
src-pyloid/app_controller.py Core audio persistence logic with atomic writes, path security, and WAV encoding

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +343 to +411
def get_history_audio(self, history_id: int) -> dict:
"""Fetch audio attachment for a history entry as base64."""
entry = self.db.get_history_entry(history_id)
if not entry or not entry.get("audio_relpath"):
raise FileNotFoundError("No audio stored for this history item")

data_dir = self.db.db_path.parent
audio_root = (data_dir / "audio").resolve()
audio_path = (data_dir / entry["audio_relpath"]).resolve()

try:
audio_path.relative_to(audio_root)
except ValueError:
raise FileNotFoundError("Audio path is invalid")

if not audio_path.exists():
raise FileNotFoundError("Audio file missing on disk")

data = audio_path.read_bytes()
return {
"base64": base64.b64encode(data).decode("utf-8"),
"mime": entry.get("audio_mime") or "audio/wav",
"fileName": audio_path.name,
"sizeBytes": audio_path.stat().st_size,
"durationMs": entry.get("audio_duration_ms"),
}

def _save_audio_attachment(self, history_id: int, audio: np.ndarray) -> AudioAttachmentMeta:
"""Persist recorded audio as WAV and return metadata for DB update."""
# Ensure audio directory exists
audio_dir = self.db.db_path.parent / "audio"
audio_dir.mkdir(parents=True, exist_ok=True)

relpath = Path("audio") / f"history_{history_id}.wav"
output_path = self.db.db_path.parent / relpath
tmp_path = output_path.with_suffix(".wav.tmp")

# Normalize audio into flat int16 PCM
audio_array = np.asarray(audio)
if audio_array.ndim > 1:
audio_array = audio_array.reshape(-1)

if np.issubdtype(audio_array.dtype, np.floating):
audio_clipped = np.clip(audio_array, -1.0, 1.0)
audio_int16 = (audio_clipped * 32767).astype(np.int16)
elif audio_array.dtype == np.int16:
audio_int16 = audio_array
else:
# Fallback: clip to int16 range
audio_clipped = np.clip(audio_array, -32768, 32767)
audio_int16 = audio_clipped.astype(np.int16)

with wave.open(str(tmp_path), "wb") as wf:
wf.setnchannels(self.audio_service.CHANNELS)
wf.setsampwidth(2) # 16-bit PCM
wf.setframerate(self.audio_service.SAMPLE_RATE)
wf.writeframes(audio_int16.tobytes())

tmp_path.replace(output_path)

duration_ms = int((len(audio_int16) / float(self.audio_service.SAMPLE_RATE)) * 1000)
size_bytes = output_path.stat().st_size

return {
"audio_relpath": relpath.as_posix(),
"audio_duration_ms": duration_ms,
"audio_size_bytes": size_bytes,
"audio_mime": "audio/wav",
}
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new audio history feature lacks test coverage. While this feature adds significant new functionality including file I/O, database updates, and path security checks, no tests are included to verify the behavior of the new methods like get_history_audio, _save_audio_attachment, update_history_audio, or the audio file cleanup in delete_history and reset_all_data. Consider adding tests that verify audio is saved correctly when the setting is enabled, that it's not saved when disabled, that path traversal is prevented, and that cleanup operations properly remove audio files.

Copilot uses AI. Check for mistakes.
Comment thread src/lib/api.ts
return rpc.call("get_history", { limit, offset, search, include_audio_meta });
},

async getHistoryAudio(historyId: number): Promise<{ base64: string; mime: string; fileName?: string; sizeBytes?: number; durationMs?: number }> {
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type for getHistoryAudio uses an inline object type that should be extracted to a named interface in types.ts for better reusability and consistency with other API methods. Consider creating an AudioAttachment or HistoryAudio interface.

Copilot uses AI. Check for mistakes.
for name, col_type in columns.items():
if name not in existing:
try:
cursor.execute(f"ALTER TABLE history ADD COLUMN {name} {col_type}")
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ALTER TABLE statement uses an f-string to interpolate the column name and type. While this code is safe because the column names come from a hardcoded dictionary rather than user input, this pattern could be problematic if replicated elsewhere with untrusted input. Consider adding a comment explaining why this is safe, or use a safer pattern like validating the column name against a whitelist before interpolation.

Copilot uses AI. Check for mistakes.
Comment thread src/lib/audio.ts
Comment on lines +1 to +23
export const INVALID_AUDIO_PAYLOAD = "Invalid audio payload";

export function isInvalidAudioPayload(err: unknown): boolean {
return err instanceof Error && err.message === INVALID_AUDIO_PAYLOAD;
}

export function base64ToBlobUrl(base64: string, mime: string): string {
try {
const byteCharacters = atob(base64);
const byteArray = Uint8Array.from(byteCharacters, (c) => c.charCodeAt(0));
const blob = new Blob([byteArray], { type: mime });
return URL.createObjectURL(blob);
} catch (err) {
console.error("Failed to create audio blob URL (invalid payload):", err);
throw new Error(INVALID_AUDIO_PAYLOAD);
}
}

export function revokeUrl(url?: string | null) {
if (url) {
URL.revokeObjectURL(url);
}
}
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The public functions in this module (base64ToBlobUrl, revokeUrl, isInvalidAudioPayload) lack documentation comments explaining their purpose, parameters, and return values. Adding JSDoc comments would improve maintainability and help other developers understand how to use these utility functions correctly.

Copilot uses AI. Check for mistakes.
@@ -1,9 +1,6 @@
from dataclasses import dataclass
from typing import Literal
from typing import Literal, Optional
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'Literal' is not used.

Suggested change
from typing import Literal, Optional
from typing import Optional

Copilot uses AI. Check for mistakes.
@infiniV infiniV merged commit 22f10ee into infiniV:main Dec 17, 2025
8 checks passed
@infiniV
Copy link
Copy Markdown
Owner

infiniV commented Dec 17, 2025

Thanks for this contribution! The audio history feature is a great addition.

Just a heads up - there was a small JSX syntax issue where the Dialog component was placed as a sibling to the main div without a React fragment wrapper. This caused esbuild to fail during production builds (vite build), though it worked fine in dev mode since Vite's dev server is more lenient.

Fixed in commit 9d049ee by wrapping the adjacent elements in <>...</> fragments and added DialogDescription for accessibility compliance.

@coderabbitai coderabbitai Bot mentioned this pull request Dec 17, 2025
@EffortlessSteven
Copy link
Copy Markdown
Contributor Author

Yeah, missed it until the Coderabbit.

If you want, can send a follow-up to keep typing/imports consistent:

# src-pyloid/server.py
from typing import Optional

@server.method()
async def get_history(
    limit: int = 100,
    offset: int = 0,
    search: Optional[str] = None,
    include_audio_meta: bool = False,
):
    controller = get_controller()
    return controller.get_history(limit, offset, search, include_audio_meta)
# src-pyloid/services/settings.py
from dataclasses import dataclass
from typing import Optional
from .database import DatabaseService

@infiniV
Copy link
Copy Markdown
Owner

infiniV commented Dec 18, 2025

Sorry got busy yesterday - no worries on the follow-up, I'll handle those small type hint fixes myself. Thanks! Do suggest any other features you think are required, raise them in the issue section.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants