feat(gui): Wire Audio/STT Pipeline to Tauri Shell by Coldaine · Pull Request #398 · Coldaine/ColdVox

Coldaine · 2026-04-13T13:24:12Z

Stacked on #397. This PR removes the demo driver logic and connects the Tauri v2 GUI shell directly to the coldvox-app runtime. It wires the Always-On Push-to-Transcribe mode and translates STT events into frontend snapshots.

Implements the Portable Agentic Evidence Standard for ColdVox:
- docs/reviews/portable_standard_critique.md: Philosophy document explaining why tautological unit tests are insufficient and the case for empirical evidence-based PR review.
- docs/reviews/reviewer_driven_evidence.md: Workflow strategy describing how the reviewer-driven evidence process works, evidence tiers (1-5), and semantic drift detection patterns.
- docs/plans/agentic-evidence-preview.md: System architecture spec for the shadow mode assessor: permissions, git diff strategy, token budget, failure modes, and Phase 2 considerations.
- .github/prompts/evidence-assessor.md: The hardened CoT prompt that Gemini executes in CI. Includes explicit anti-hallucination constraints, structured output format, and ColdVox-specific ground truths (Moonshine fragile, Parakeet not ready, stubs dead).
- .github/workflows/agentic-evidence-preview.yml: GitHub Actions workflow triggering on PR events. Uses fetch-depth: 0 for correct git diff, truncates diffs at 2000 lines, composes the full prompt with pre-gathered context, runs gemini-cli in non-interactive mode, and pipes the report to GITHUB_STEP_SUMMARY.

Shadow mode: never blocks merges. GEMINI_API_KEY secret must be configured in repo settings.

…nces

Root cleanup:
- Delete CLAUDE.md, GEMINI.md (byte-identical copies of AGENTS.md)
- Delete root junk: plugins.json, pr_365_details.json, test_enigo_live.rs
- Archive 6 root reports to docs/archive/root/

Dead backend code:
- Replace WHISPER_MODEL_PATH with STT_MODEL_PATH in types.rs and tests
- Fix integration tests: whisper -> moonshine as preferred plugin
- Update doc comments in plugin.rs, plugin_types.rs
- Delete crates/app/plugins.json (had preferred_plugin: whisper)
- Remove stale faster-whisper comment from Cargo.toml

Dead reference fixes:
- Replace ALL 20+ windows-multi-agent-recovery.md refs with current-status.md
- Remove 'absolute truth' language from agent rules
- Fix AGENTS.md pointer to nonexistent CI/policy.md -> CI/architecture.md
- Fix README.md: remove CLAUDE.md reference
- Update drive-project.prompt.md, gui-design-overview, todo.md

Doc pruning:
- Delete 15 empty/expired docs (stubs, chat transcripts, past-retention)
- Archive 8 stale docs (Linux-only, org-wide, superseded)
- Fix stt-overview.md: remove Whisper from Supported Backends
- Fix aud-user-config-design.md: Moonshine is PyO3 not pure Rust
- Fix fdn-testing-guide.md: add Parakeet validation warning

Agent instruction restructure:
- Sync AGENTS.md from .github/copilot-instructions.md (full content)
- Update ensure_agent_hardlinks.sh: source is now copilot-instructions.md
- Update check_markdown_placement.py: CLAUDE.md -> AGENTS.md
- Update standards.md: remove CLAUDE.md/GEMINI.md references

Dead vendor/scripts:
- Delete vendor/vosk/ (stubs to dead Linux runner cache)
- Delete scripts: setup-vosk-cache.sh, verify_vosk_model.sh, ensure_venv.sh, start-headless.sh

- SttRemoteAuthSettings: use #[derive(Default)] instead of manual impl (clippy::derivable_impls error in CI)
- deny.toml: add RUSTSEC ignores for unmaintained transitive deps from Tauri (gtk3-rs, fxhash, unic-*, proc-macro-error; all from wry/tauri GUI layer, no safe upgrade available, no security impact)
- docs/index.md: regenerate after doc cleanup changed file count/structure

Previous implementation failed because:
1. The 'Gather PR context' step failed with bash string substitution bugs
2. \{PLACEHOLDER\} patterns in bash expansion don't match {PLACEHOLDER} tokens
3. Large PRs caused the diff to be unavailable or truncated incorrectly

New approach:
- Use gemini-cli --approval-mode=yolo to give the agent autonomous tools
- Agent reads its instructions from the prompt file directly
- Agent runs git diff, reads files, and explores the repo itself
- No more brittle bash string replacement for prompt composition
- Combines two steps into one to avoid compose/run split failures
- Still uses gemini-2.0-flash (fixes model name/docs mismatch)
- Agent writes report to /tmp/report.md which is always checked

Addresses Copilot reviewer comments on bash substitution bugs and model name mismatch between workflow comments and actual --model flag.

Delete .github/agents/ (project-driver, researcher, implementer, tester) and .github/prompts/drive-project.prompt.md; these were prompt-only specs with no automation hooks. They added complexity without value.

Add docs/visuals/ with two interactive HTML dashboards:
- agentic-workflow-dashboard.html: provenance, wiring, prompt anatomy
- ci-reviewer-dashboard.html: activation cadence, prompt, implications

…rthstar (complex only)

- Add complexity scorer (pure bash, no API) that counts Rust file changes
  - complex: >10 crates/ files → triggers Northstar Reviewer
  - moderate: 1-10 crates/ OR any workflow change
  - simple: docs/config only
- Evidence Assessor now always runs using gemini-2.5-flash (was gemini-2.0-flash which is invalid)
- Northstar Reviewer added, runs only on complex PRs, uses gemini-2.5-pro
- Fix file access: instructions written into workspace dir (not /tmp/) so Gemini CLI can read them
- Add northstar-alignment-reviewer.md prompt
- Add _ci_*.md and _tmp_*.md patterns to .gitignore (agent working files)
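The three tiers above amount to a small classification rule. The sketch below restates it in Rust purely for illustration; the scorer in this PR is bash and is not shown here. The `Complexity` enum, the `classify` function, and the path prefixes `crates/` and `.github/workflows/` used to detect crate and workflow changes are assumptions, not code from the repository.

```rust
/// Complexity tiers named in the commit message (type name is hypothetical).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Complexity {
    Simple,
    Moderate,
    Complex,
}

/// Classify a PR from the list of changed file paths.
/// Assumption: "crates/ files" means any path under `crates/`, and a
/// "workflow change" means any path under `.github/workflows/`.
pub fn classify(changed_paths: &[&str]) -> Complexity {
    let crate_files = changed_paths
        .iter()
        .filter(|p| p.starts_with("crates/"))
        .count();
    let touches_workflow = changed_paths
        .iter()
        .any(|p| p.starts_with(".github/workflows/"));

    if crate_files > 10 {
        // More than ten crate files: the tier that triggers the Northstar Reviewer.
        Complexity::Complex
    } else if crate_files >= 1 || touches_workflow {
        Complexity::Moderate
    } else {
        // Docs/config only.
        Complexity::Simple
    }
}
```

Because the rule depends only on the changed paths, it needs no API call, which matches the "pure bash, no API" description.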

…ort paths

- Remove _ci_evidence_*.md and _ci_northstar_*.md from .gitignore. Gemini CLI uses .gitignore as a security boundary and refuses to read or write any gitignored file. CI temp files must be ungitignored.
- Fix evidence-assessor.md: write to _ci_evidence_report.md (not /tmp/report.md)
- Fix northstar-alignment-reviewer.md: write to _ci_northstar_report.md

The _tmp_*.md wildcard pattern is retained for other temp files.

Switch models to validated stable versions:
- Evidence Assessor: gemini-3-flash-preview -> gemini-2.5-flash
- Northstar Reviewer: gemini-3.1-pro-preview-customtools -> gemini-2.5-pro

Add settings.json auth step before each gemini-cli invocation to prevent OAuth browser prompt in headless CI runners. GEMINI_API_KEY env var alone is not sufficient to skip interactive auth selection.

Update comments and Step Summary labels to match actual models.

…i CLI" This reverts commit be490c4.

Adds the AlwaysOnPushToTranscribe activation mode and maintains a 2-second rolling audio buffer in the STT processor, so the start of an utterance is not clipped by the mechanical delay of pressing the hotkey.
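A rolling buffer of this kind can be sketched with a bounded `VecDeque`. This is a minimal illustration, not the STT processor's actual code: the `RollingBuffer` type, its method names, and the `i16` sample type are hypothetical, and the sample rate is left as a parameter because the commit message does not state it.

```rust
use std::collections::VecDeque;

/// Fixed-capacity buffer that keeps only the most recent samples
/// (hypothetical type, for illustration only).
pub struct RollingBuffer {
    samples: VecDeque<i16>,
    capacity: usize,
}

impl RollingBuffer {
    /// Retains `sample_rate_hz * seconds` samples, e.g. 2 seconds of audio.
    pub fn new(sample_rate_hz: usize, seconds: usize) -> Self {
        let capacity = sample_rate_hz * seconds;
        Self {
            samples: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    /// Append a frame, discarding the oldest samples once the buffer is full.
    pub fn push_frame(&mut self, frame: &[i16]) {
        if self.capacity == 0 {
            return; // nothing can be retained
        }
        for &sample in frame {
            if self.samples.len() == self.capacity {
                self.samples.pop_front();
            }
            self.samples.push_back(sample);
        }
    }

    /// Copy out the buffered audio, oldest sample first, so it can be
    /// prepended to the audio captured after the hotkey press.
    pub fn snapshot(&self) -> Vec<i16> {
        self.samples.iter().copied().collect()
    }
}
```

With the buffer always filling in the background, the transcriber can start from `snapshot()` rather than from the moment the hotkey event arrives.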

Replaces demo driver with coldvox-app runtime. Connects Always-On Push-to-Transcribe mode and wires STT events to React frontend.

coderabbitai · 2026-04-13T13:24:21Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.



chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50484b4f1f


chatgpt-codex-connector · 2026-04-13T13:29:00Z

 pub struct OverlayModel {
-    demo_token: u64,
    snapshot: OverlaySnapshot,


Reintroduce overlay state fields still used by methods

This change removes demo_token from OverlayModel, but methods in the same file (stop, clear, and reject_command) still write to self.demo_token, and the state module still carries demo-era references. In this commit state, coldvox-gui no longer compiles, so the new Tauri pipeline wiring cannot be built or run.


chatgpt-codex-connector · 2026-04-13T13:29:00Z

+        resampler_quality: coldvox_audio::ResamplerQuality::Balanced,
+        stt_selection: Some(coldvox_stt::plugin::PluginSelectionConfig::default()),


Declare crates referenced in start_pipeline

start_pipeline uses coldvox_audio::ResamplerQuality and coldvox_stt::plugin::PluginSelectionConfig directly, but crates/coldvox-gui/src-tauri/Cargo.toml in this commit does not declare coldvox-audio or coldvox-stt as direct dependencies. Because Rust does not expose transitive deps by crate name, these paths are unresolved and the crate fails to compile.


Copilot

Pull request overview

This PR replaces the GUI’s demo-driver wiring with a real integration between the Tauri v2 shell and the coldvox-app runtime, translating runtime STT events into overlay snapshots for the frontend.

Changes:

Renames frontend bridge/hook commands from demo-driver to pipeline start/stop.
Adds Tauri commands to start/stop the coldvox-app runtime and forward TranscriptionEvents to the UI.
Removes the Rust-side demo script and introduces a simplified overlay model update API (partial/final/status).

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
crates/coldvox-gui/src/lib/overlayBridge.ts	Updates invoked Tauri command names to `start_pipeline` / `stop_pipeline`.
crates/coldvox-gui/src/hooks/useOverlayShell.ts	Renames hook actions to `startPipeline` / `stopPipeline` and wires them through the command runner.
crates/coldvox-gui/src/App.tsx	Connects `useOverlayShell` pipeline actions into the `OverlayShell` component.
crates/coldvox-gui/src-tauri/src/state.rs	Refactors overlay model away from demo-token flow toward status/partial/final update helpers.
crates/coldvox-gui/src-tauri/src/lib.rs	Implements async Tauri commands to start/stop ColdVox runtime and emit overlay updates from STT events.
crates/coldvox-gui/src-tauri/src/demo.rs	Removes the demo script driver.
crates/coldvox-gui/src-tauri/Cargo.toml	Adds `coldvox-app`, `tokio`, and `parking_lot` dependencies/features for runtime integration.
Cargo.lock	Lockfile updates for the new dependencies.

Copilot · 2026-04-13T13:31:19Z

 use crate::contract::{OverlaySnapshot, OverlayStatus};
 use crate::demo::DemoStep;

 #[derive(Debug, Default)]
 pub struct OverlayModel {
-    demo_token: u64,
    snapshot: OverlaySnapshot,
 }


OverlayModel no longer has demo_token and the demo module has been removed, but this file still imports crate::demo::DemoStep and later methods/tests still reference self.demo_token, start_demo, and apply_demo_step. As-is this won’t compile; either fully remove the remaining demo-related code/tests or reintroduce the missing pieces consistently.

Copilot · 2026-04-13T13:31:20Z

+    let mut handle_guard = runtime.app_handle.lock().await;
+    if handle_guard.is_some() {
+        return Err("Pipeline already running".to_string());
+    }
+
+    let opts = AppRuntimeOptions {
+        activation_mode: ActivationMode::AlwaysOnPushToTranscribe,
+        resampler_quality: coldvox_audio::ResamplerQuality::Balanced,
+        stt_selection: Some(coldvox_stt::plugin::PluginSelectionConfig::default()),
+        enable_device_monitor: true,
+        ..Default::default()
+    };
+
+    let mut coldvox_app = app_runtime::start(opts).await
+        .map_err(|e| format!("Failed to start ColdVox runner: {}", e))?;


runtime.app_handle.lock().await is held across the potentially slow app_runtime::start(opts).await. Holding an async mutex guard across an .await can block other commands and is a deadlock risk if future code needs this lock during startup; consider taking/setting a temporary state then dropping the guard before awaiting startup.
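One way to read this suggestion is a "reserve, release, then start" pattern. The sketch below shows it with `std::sync::Mutex` and a synchronous closure so it stays self-contained; the PR itself uses an async mutex, and the `PipelineState` enum and the `slow_start` parameter are hypothetical names introduced only for this illustration.

```rust
use std::sync::Mutex;

/// Hypothetical lifecycle state; `Starting` reserves the slot while startup runs.
#[derive(Debug, PartialEq)]
pub enum PipelineState<H> {
    Idle,
    Starting,
    Running(H),
}

/// Start the pipeline without holding the lock during the slow startup step.
pub fn start_pipeline<H>(
    state: &Mutex<PipelineState<H>>,
    slow_start: impl FnOnce() -> Result<H, String>,
) -> Result<(), String> {
    {
        let mut guard = state.lock().unwrap();
        if !matches!(*guard, PipelineState::Idle) {
            return Err("Pipeline already running".to_string());
        }
        // Reserve the slot so a concurrent start is rejected.
        *guard = PipelineState::Starting;
    } // guard dropped here, before the slow step

    match slow_start() {
        Ok(handle) => {
            *state.lock().unwrap() = PipelineState::Running(handle);
            Ok(())
        }
        Err(e) => {
            // Roll back the reservation so a later start can retry.
            *state.lock().unwrap() = PipelineState::Idle;
            Err(format!("Failed to start ColdVox runner: {e}"))
        }
    }
}
```

The lock is only ever held for a state check or a state write, so other commands that need it are not blocked for the duration of startup.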

Copilot · 2026-04-13T13:31:20Z

+    // Spawn STT event listener
+    tokio::spawn(async move {
+        while let Some(event) = stt_rx.recv().await {
+            let snapshot = {
+                let mut model = model_clone.lock();
+                match event {
+                    TranscriptionEvent::Partial { text, .. } => model.update_partial(text),
+                    TranscriptionEvent::Final { text, .. } => model.update_final(text),
+                    TranscriptionEvent::Error { message, .. } => {
+                        model.set_status(OverlayStatus::Error, message)
+                    }
+                }
+            };
+            let _ = emit_and_resize(&app_clone, &window_clone, &snapshot, "stt-update");
+        }
+    });


The spawned STT listener task isn’t tracked or cancelled on stop_pipeline. That means it may still emit updates after a stop (draining buffered events), and a quick stop/start cycle can result in multiple listener tasks concurrently mutating the same model/window. Consider storing the JoinHandle (or a cancellation token / generation counter) in OverlayRuntime and aborting/invalidating it on stop/restart.
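Of the options mentioned, the generation counter is the simplest to show without a runtime dependency. The sketch below uses std threads and channels rather than tokio tasks; `Generation`, `spawn_listener`, and the `Vec<String>` sink standing in for the overlay model are all hypothetical.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::Receiver;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Shared counter; bumping it invalidates every listener spawned earlier.
#[derive(Default)]
pub struct Generation(AtomicU64);

impl Generation {
    /// Called on start: returns the generation the new listener belongs to.
    pub fn begin(&self) -> u64 {
        self.0.fetch_add(1, Ordering::SeqCst) + 1
    }

    /// Called on stop: any listener from an earlier `begin` becomes stale.
    pub fn invalidate(&self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }

    pub fn is_current(&self, generation: u64) -> bool {
        self.0.load(Ordering::SeqCst) == generation
    }
}

/// Spawn a listener that applies events only while its generation is current.
/// `sink` stands in for the overlay model in this sketch.
pub fn spawn_listener(
    rx: Receiver<String>,
    generation: Arc<Generation>,
    my_generation: u64,
    sink: Arc<Mutex<Vec<String>>>,
) -> JoinHandle<()> {
    thread::spawn(move || {
        for event in rx {
            if !generation.is_current(my_generation) {
                break; // stale listener: exit without touching the model
            }
            sink.lock().unwrap().push(event);
        }
    })
}
```

Note that the check and the update are separate steps here, so a stop that lands between them can still let one event through; performing the check while holding the model lock, or aborting the stored `JoinHandle`, would close that gap.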

Copilot · 2026-04-13T13:31:20Z

+    let opts = AppRuntimeOptions {
+        activation_mode: ActivationMode::AlwaysOnPushToTranscribe,
+        resampler_quality: coldvox_audio::ResamplerQuality::Balanced,
+        stt_selection: Some(coldvox_stt::plugin::PluginSelectionConfig::default()),


AppRuntimeOptions.stt_selection is set to Some(PluginSelectionConfig::default()), which causes SttPluginManager::set_selection_config to persist that default config to disk and enable default GC/metrics. If the intent is to respect an existing user config, consider loading/passing the on-disk config (or adding a way to apply selection without saving) rather than always overwriting with defaults.

Suggested change

stt_selection: Some(coldvox_stt::plugin::PluginSelectionConfig::default()),

Copilot · 2026-04-13T13:31:21Z

+    let mut handle_guard = runtime.app_handle.lock().await;
+    if let Some(handle) = handle_guard.take() {
+        Arc::new(handle).shutdown().await;
+        let snapshot = runtime.with_model(|model| {
+            model.reset_to_idle("Pipeline stopped.".to_string())
+        });
+        emit_and_resize(&app, &window, &snapshot, "pipeline-stopped")


stop_pipeline holds the app_handle async-mutex guard while awaiting shutdown(). It’s safer to take the handle (or clone an Arc) and drop the guard before awaiting, so other commands aren’t blocked and to avoid lock-across-await patterns.
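The "take the handle, then drop the guard" idea can be shown in a few lines. This is a synchronous sketch with `std::sync::Mutex`; the `Handle` type, its `shutdown` method, and the "Pipeline not running" error are placeholders invented for the example, not the coldvox-app API.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the runtime handle.
pub struct Handle {
    pub id: u32,
}

impl Handle {
    /// Placeholder for a slow shutdown; returns the id so callers can observe it.
    pub fn shutdown(self) -> u32 {
        self.id
    }
}

/// Stop the pipeline while holding the lock only long enough to take the handle.
pub fn stop_pipeline(slot: &Mutex<Option<Handle>>) -> Result<u32, String> {
    // The temporary guard is dropped at the end of this statement, so the
    // slow shutdown below runs without the lock held.
    let taken = slot.lock().unwrap().take();
    match taken {
        Some(handle) => Ok(handle.shutdown()),
        None => Err("Pipeline not running".to_string()),
    }
}
```

Binding the result to `taken` first matters: writing `match slot.lock().unwrap().take() { ... }` directly would keep the guard alive for the whole `match`, including the shutdown call.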

Copilot · 2026-04-13T13:31:21Z

 use std::{
-    sync::{Arc, Mutex},
-    thread,
+    sync::Arc,
    time::Duration,
 };

-use contract::{OverlayEvent, OverlaySnapshot, OVERLAY_EVENT_NAME};
-use demo::demo_script;
+use contract::{OverlayEvent, OverlaySnapshot, OVERLAY_EVENT_NAME, OverlayStatus};
 use state::OverlayModel;
 use tauri::{AppHandle, Emitter, Manager, State, WebviewWindow};
+use coldvox_app::runtime::{self as app_runtime, AppHandle as ColdVoxHandle, AppRuntimeOptions, ActivationMode};
+use coldvox_app::stt::TranscriptionEvent;
+use coldvox_audio::ResamplerQuality;
+use coldvox_stt::plugin::PluginSelectionConfig;
+use tokio::sync::Mutex as AsyncMutex;


There are several now-unused imports (Duration, ResamplerQuality, PluginSelectionConfig) after removing the demo driver and switching to fully-qualified paths. Please remove or use them to keep the Tauri shell crate warning-free.

Copilot · 2026-04-13T13:31:21Z

  return {
    snapshot,
    setExpanded: (expanded: boolean) => runCommand(() => setOverlayExpanded(expanded)),
-    startDemo: () => runCommand(startDemoDriver),
+    startPipeline: () => runCommand(startPipeline),
    togglePause: () => runCommand(togglePauseState),
-    stopDemo: () => runCommand(stopDemoDriver),
+    stopPipeline: () => runCommand(stopPipeline),
    clearTranscript: () => runCommand(clearOverlayTranscript),
    openSettings: () => runCommand(openSettingsPlaceholder),
  };


This hook API rename (startDemo/stopDemo → startPipeline/stopPipeline) requires updating remaining call sites/mocks. For example, crates/coldvox-gui/src/hooks/useOverlayShell.test.tsx still imports/mocks startDemoDriver and calls startDemo(), which will break tests/build until updated.

Copilot · 2026-04-13T13:31:21Z

      <OverlayShell
        snapshot={snapshot}
        onSetExpanded={setExpanded}
-        onStartDemo={startDemo}
+        onStartDemo={startPipeline}
        onTogglePause={togglePause}
-        onStop={stopDemo}
+        onStop={stopPipeline}
        onClear={clearTranscript}


OverlayShell still exposes onStartDemo / renders a "Run demo" control, but App now wires that to startPipeline. This naming/label mismatch makes the UI and component contract misleading; consider renaming the prop/control to reflect pipeline start (or keep the demo terminology end-to-end if it’s still intended to be a demo).

Coldaine · 2026-04-16T11:02:49Z

Superseded by retargeted #400 which has been merged to tauri-base.

Coldaine added 13 commits March 31, 2026 14:49

docs: remove windows multi-agent recovery plan

94f94aa

Revert "fix(ci): use stable models, add headless auth setup for Gemin…

c16e3f1

…i CLI" This reverts commit be490c4.

fix(ci): restore validated Gemini preview models

8a7d4ad

feat(stt): always-on push-to-transcribe with rolling buffer

4fd3525

Adds the AlwaysOnPushToTranscribe activation mode and maintains a 2-second rolling audio buffer in the STT processor, so the start of an utterance is not clipped by the mechanical delay of pressing the hotkey.

feat(gui): wire real audio/stt pipeline to tauri shell

50484b4

Replaces demo driver with coldvox-app runtime. Connects Always-On Push-to-Transcribe mode and wires STT events to React frontend.

Copilot AI review requested due to automatic review settings April 13, 2026 13:24

Copilot started reviewing on behalf of Coldaine April 13, 2026 13:25 View session

chatgpt-codex-connector bot reviewed Apr 13, 2026

View reviewed changes

Copilot AI reviewed Apr 13, 2026

View reviewed changes

Coldaine mentioned this pull request Apr 16, 2026

feat(gui): Wire Audio/STT Pipeline to Tauri Shell #400

Merged

Coldaine force-pushed the feat/always-on-push-to-transcribe branch from 254707d to 6a77c38 Compare April 16, 2026 10:55

Coldaine closed this Apr 16, 2026
