feat(gui): Wire Audio/STT Pipeline to Tauri Shell #398

Closed
Coldaine wants to merge 13 commits into feat/always-on-push-to-transcribe from feat/tauri-pipeline-wiring

Conversation

@Coldaine
Owner

Stacked on #397. This PR removes the demo driver logic and connects the Tauri v2 GUI shell directly to the coldvox-app runtime. It wires the Always-On Push-to-Transcribe mode and translates STT events into frontend snapshots.

Coldaine added 13 commits March 31, 2026 14:49
Implements the Portable Agentic Evidence Standard for ColdVox:

- docs/reviews/portable_standard_critique.md: Philosophy document
  explaining why tautological unit tests are insufficient and the
  case for empirical evidence-based PR review.

- docs/reviews/reviewer_driven_evidence.md: Workflow strategy
  describing how the reviewer-driven evidence process works,
  evidence tiers (1-5), and semantic drift detection patterns.

- docs/plans/agentic-evidence-preview.md: System architecture spec
  for the shadow mode assessor: permissions, git diff strategy,
  token budget, failure modes, and Phase 2 considerations.

- .github/prompts/evidence-assessor.md: The hardened CoT prompt
  that Gemini executes in CI. Includes explicit anti-hallucination
  constraints, structured output format, and ColdVox-specific
  ground truths (Moonshine fragile, Parakeet not ready, stubs dead).

- .github/workflows/agentic-evidence-preview.yml: GitHub Actions
  workflow triggering on PR events. Uses fetch-depth: 0 for correct
  git diff, truncates diffs at 2000 lines, composes the full prompt
  with pre-gathered context, runs gemini-cli in non-interactive mode,
  and pipes the report to GITHUB_STEP_SUMMARY. Shadow mode: never
  blocks merges.

GEMINI_API_KEY secret must be configured in repo settings.
…nces

Root cleanup:
- Delete CLAUDE.md, GEMINI.md (byte-identical copies of AGENTS.md)
- Delete root junk: plugins.json, pr_365_details.json, test_enigo_live.rs
- Archive 6 root reports to docs/archive/root/

Dead backend code:
- Replace WHISPER_MODEL_PATH with STT_MODEL_PATH in types.rs and tests
- Fix integration tests: whisper -> moonshine as preferred plugin
- Update doc comments in plugin.rs, plugin_types.rs
- Delete crates/app/plugins.json (had preferred_plugin: whisper)
- Remove stale faster-whisper comment from Cargo.toml

Dead reference fixes:
- Replace ALL 20+ windows-multi-agent-recovery.md refs with current-status.md
- Remove 'absolute truth' language from agent rules
- Fix AGENTS.md pointer to nonexistent CI/policy.md -> CI/architecture.md
- Fix README.md: remove CLAUDE.md reference
- Update drive-project.prompt.md, gui-design-overview, todo.md

Doc pruning:
- Delete 15 empty/expired docs (stubs, chat transcripts, past-retention)
- Archive 8 stale docs (Linux-only, org-wide, superseded)
- Fix stt-overview.md: remove Whisper from Supported Backends
- Fix aud-user-config-design.md: Moonshine is PyO3 not pure Rust
- Fix fdn-testing-guide.md: add Parakeet validation warning

Agent instruction restructure:
- Sync AGENTS.md from .github/copilot-instructions.md (full content)
- Update ensure_agent_hardlinks.sh: source is now copilot-instructions.md
- Update check_markdown_placement.py: CLAUDE.md -> AGENTS.md
- Update standards.md: remove CLAUDE.md/GEMINI.md references

Dead vendor/scripts:
- Delete vendor/vosk/ (stubs to dead Linux runner cache)
- Delete scripts: setup-vosk-cache.sh, verify_vosk_model.sh, ensure_venv.sh, start-headless.sh
- SttRemoteAuthSettings: use #[derive(Default)] instead of manual impl
  (clippy::derivable_impls error in CI)
- deny.toml: add RUSTSEC ignores for unmaintained transitive deps from Tauri
  (gtk3-rs, fxhash, unic-*, proc-macro-error - all from wry/tauri GUI layer,
  no safe upgrade available, no security impact)
- docs/index.md: regenerate after doc cleanup changed file count/structure
Previous implementation failed because:
1. The 'Gather PR context' step failed with bash string substitution bugs
2. \{PLACEHOLDER\} patterns in bash expansion don't match {PLACEHOLDER} tokens
3. Large PRs caused the diff to be unavailable or truncated incorrectly

New approach:
- Use gemini-cli --approval-mode=yolo to give the agent autonomous tools
- Agent reads its instructions from the prompt file directly
- Agent runs git diff, reads files, and explores the repo itself
- No more brittle bash string replacement for prompt composition
- Combines two steps into one to avoid compose/run split failures
- Still uses gemini-2.0-flash (fixes model name/docs mismatch)
- Agent writes report to /tmp/report.md which is always checked

Addresses Copilot reviewer comments on bash substitution bugs and
model name mismatch between workflow comments and actual --model flag.
Delete .github/agents/ (project-driver, researcher, implementer, tester)
and .github/prompts/drive-project.prompt.md — these were prompt-only
specs with no automation hooks. They added complexity without value.

Add docs/visuals/ with two interactive HTML dashboards:
- agentic-workflow-dashboard.html: provenance, wiring, prompt anatomy
- ci-reviewer-dashboard.html: activation cadence, prompt, implications
…rthstar (complex only)

- Add complexity scorer (pure bash, no API) that counts Rust file changes
  - complex: >10 crates/ files → triggers Northstar Reviewer
  - moderate: 1-10 crates/ OR any workflow change
  - simple: docs/config only
- Evidence Assessor now always runs using gemini-2.5-flash (was gemini-2.0-flash which is invalid)
- Northstar Reviewer added, runs only on complex PRs, uses gemini-2.5-pro
- Fix file access: instructions written into workspace dir (not /tmp/) so Gemini CLI can read them
- Add northstar-alignment-reviewer.md prompt
- Add _ci_*.md and _tmp_*.md patterns to .gitignore (agent working files)
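The complexity tiers in the first bullet of this commit message can be sketched as below. The commit states the real scorer is pure bash; this Rust version is only an illustration of the thresholds. The path prefixes `crates/` and `.github/workflows/`, the `Complexity` enum, and the `score` function are assumptions made for the example, as is the choice that the ">10 files" rule takes precedence over the workflow rule.

```rust
/// Hypothetical tiers mirroring the commit message: simple, moderate, complex.
#[derive(Debug, PartialEq, Eq)]
pub enum Complexity {
    Simple,
    Moderate,
    Complex,
}

/// Classify a PR from its changed paths (assumed layout, not the real scorer).
pub fn score(changed_paths: &[&str]) -> Complexity {
    // Count files under crates/ (assumed to be what "crates/ files" means).
    let crate_files = changed_paths
        .iter()
        .filter(|p| p.starts_with("crates/"))
        .count();
    // Any change under .github/workflows/ is treated as a workflow change.
    let touches_workflow = changed_paths
        .iter()
        .any(|p| p.starts_with(".github/workflows/"));

    if crate_files > 10 {
        Complexity::Complex // would trigger the Northstar Reviewer
    } else if crate_files >= 1 || touches_workflow {
        Complexity::Moderate
    } else {
        Complexity::Simple // docs/config only
    }
}
```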
…ort paths

- Remove _ci_evidence_*.md and _ci_northstar_*.md from .gitignore
  Gemini CLI uses .gitignore as a security boundary and refuses to
  read or write any gitignored file. CI temp files must be ungitignored.
- Fix evidence-assessor.md: write to _ci_evidence_report.md (not /tmp/report.md)
- Fix northstar-alignment-reviewer.md: write to _ci_northstar_report.md

The _tmp_*.md wildcard pattern is retained for other temp files.
Switch models to validated stable versions:
- Evidence Assessor: gemini-3-flash-preview -> gemini-2.5-flash
- Northstar Reviewer: gemini-3.1-pro-preview-customtools -> gemini-2.5-pro

Add settings.json auth step before each gemini-cli invocation to
prevent OAuth browser prompt in headless CI runners. GEMINI_API_KEY
env var alone is not sufficient to skip interactive auth selection.

Update comments and Step Summary labels to match actual models.
Adds the AlwaysOnPushToTranscribe mode and maintains a 2-second rolling audio buffer in the STT processor. This prevents the start of an utterance from being clipped by the mechanical delay of pressing the hotkey.
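A rolling pre-roll buffer of the kind this commit describes could look roughly like the sketch below. It is not the ColdVox implementation: the `RollingAudioBuffer` type, the `i16` sample format, and the way capacity is derived from a sample rate are all assumptions for illustration. The idea is that audio is always being recorded into a fixed window, so when the hotkey fires the most recent samples are already available.

```rust
use std::collections::VecDeque;

/// Hypothetical fixed-window buffer that keeps only the most recent samples.
pub struct RollingAudioBuffer {
    samples: VecDeque<i16>,
    capacity: usize,
}

impl RollingAudioBuffer {
    /// Capacity in samples is sample rate times window length in seconds.
    pub fn new(sample_rate_hz: usize, window_secs: usize) -> Self {
        let capacity = sample_rate_hz * window_secs;
        Self {
            samples: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    /// Append a frame, evicting the oldest samples once the window is full.
    pub fn push_frame(&mut self, frame: &[i16]) {
        if self.capacity == 0 {
            return; // a zero-length window stores nothing
        }
        for &sample in frame {
            if self.samples.len() == self.capacity {
                self.samples.pop_front();
            }
            self.samples.push_back(sample);
        }
    }

    /// Copy out the buffered pre-roll, oldest sample first.
    pub fn snapshot(&self) -> Vec<i16> {
        self.samples.iter().copied().collect()
    }

    /// Number of samples currently held.
    pub fn len(&self) -> usize {
        self.samples.len()
    }
}
```

With an assumed 16 kHz mono stream, `RollingAudioBuffer::new(16_000, 2)` would hold the last two seconds; on hotkey press the caller would prepend `snapshot()` to the live audio before handing it to the STT plugin.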
Replaces demo driver with coldvox-app runtime. Connects Always-On Push-to-Transcribe mode and wires STT events to React frontend.
Copilot AI review requested due to automatic review settings April 13, 2026 13:24
@coderabbitai

coderabbitai bot commented Apr 13, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 461fb82e-0cb4-446f-b744-964be7861985

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50484b4f1f

Comment on lines 5 to 6
  pub struct OverlayModel {
-     demo_token: u64,
      snapshot: OverlaySnapshot,

P0: Reintroduce overlay state fields still used by methods

This change removes demo_token from OverlayModel, but methods in the same file (stop, clear, and reject_command) still write to self.demo_token, and the state module still carries demo-era references. In this commit state, coldvox-gui no longer compiles, so the new Tauri pipeline wiring cannot be built or run.


Comment on lines +98 to +99
resampler_quality: coldvox_audio::ResamplerQuality::Balanced,
stt_selection: Some(coldvox_stt::plugin::PluginSelectionConfig::default()),

P0: Declare crates referenced in start_pipeline

start_pipeline uses coldvox_audio::ResamplerQuality and coldvox_stt::plugin::PluginSelectionConfig directly, but crates/coldvox-gui/src-tauri/Cargo.toml in this commit does not declare coldvox-audio or coldvox-stt as direct dependencies. Because Rust does not expose transitive deps by crate name, these paths are unresolved and the crate fails to compile.


Copilot AI left a comment

Pull request overview

This PR replaces the GUI’s demo-driver wiring with a real integration between the Tauri v2 shell and the coldvox-app runtime, translating runtime STT events into overlay snapshots for the frontend.

Changes:

  • Renames frontend bridge/hook commands from demo-driver to pipeline start/stop.
  • Adds Tauri commands to start/stop the coldvox-app runtime and forward TranscriptionEvents to the UI.
  • Removes the Rust-side demo script and introduces a simplified overlay model update API (partial/final/status).

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| crates/coldvox-gui/src/lib/overlayBridge.ts | Updates invoked Tauri command names to start_pipeline / stop_pipeline. |
| crates/coldvox-gui/src/hooks/useOverlayShell.ts | Renames hook actions to startPipeline / stopPipeline and wires them through the command runner. |
| crates/coldvox-gui/src/App.tsx | Connects useOverlayShell pipeline actions into the OverlayShell component. |
| crates/coldvox-gui/src-tauri/src/state.rs | Refactors overlay model away from demo-token flow toward status/partial/final update helpers. |
| crates/coldvox-gui/src-tauri/src/lib.rs | Implements async Tauri commands to start/stop ColdVox runtime and emit overlay updates from STT events. |
| crates/coldvox-gui/src-tauri/src/demo.rs | Removes the demo script driver. |
| crates/coldvox-gui/src-tauri/Cargo.toml | Adds coldvox-app, tokio, and parking_lot dependencies/features for runtime integration. |
| Cargo.lock | Lockfile updates for the new dependencies. |

Comment on lines 1 to 7
  use crate::contract::{OverlaySnapshot, OverlayStatus};
  use crate::demo::DemoStep;

  #[derive(Debug, Default)]
  pub struct OverlayModel {
-     demo_token: u64,
      snapshot: OverlaySnapshot,
  }
Copilot AI Apr 13, 2026

OverlayModel no longer has demo_token and the demo module has been removed, but this file still imports crate::demo::DemoStep and later methods/tests still reference self.demo_token, start_demo, and apply_demo_step. As-is this won’t compile; either fully remove the remaining demo-related code/tests or reintroduce the missing pieces consistently.

Comment on lines +91 to +105
let mut handle_guard = runtime.app_handle.lock().await;
if handle_guard.is_some() {
    return Err("Pipeline already running".to_string());
}

let opts = AppRuntimeOptions {
    activation_mode: ActivationMode::AlwaysOnPushToTranscribe,
    resampler_quality: coldvox_audio::ResamplerQuality::Balanced,
    stt_selection: Some(coldvox_stt::plugin::PluginSelectionConfig::default()),
    enable_device_monitor: true,
    ..Default::default()
};

let mut coldvox_app = app_runtime::start(opts).await
    .map_err(|e| format!("Failed to start ColdVox runner: {}", e))?;
Copilot AI Apr 13, 2026

runtime.app_handle.lock().await is held across the potentially slow app_runtime::start(opts).await. Holding an async mutex guard across an .await can block other commands and is a deadlock risk if future code needs this lock during startup; consider taking/setting a temporary state then dropping the guard before awaiting startup.
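The "temporary state, then drop the guard" suggestion can be sketched as a three-phase start. To keep the example self-contained it uses `std::sync::Mutex` and a blocking closure as a stand-in for the async `app_runtime::start(opts).await`; the `PipelineSlot` enum and `start_once` function are hypothetical names, not part of coldvox-gui. The same shape should carry over to an async mutex, with the slow call awaited between the two short critical sections.

```rust
use std::sync::Mutex;

/// Hypothetical slot state: `Starting` reserves the slot while no lock is held.
#[derive(Debug, PartialEq)]
pub enum PipelineSlot<H> {
    Idle,
    Starting,
    Running(H),
}

/// Start at most one pipeline without holding the lock during slow startup.
pub fn start_once<H, F>(slot: &Mutex<PipelineSlot<H>>, slow_start: F) -> Result<(), String>
where
    F: FnOnce() -> Result<H, String>,
{
    // Phase 1: claim the slot under the lock, then release it immediately.
    {
        let mut guard = slot.lock().map_err(|_| "slot lock poisoned".to_string())?;
        if !matches!(*guard, PipelineSlot::Idle) {
            return Err("Pipeline already running".to_string());
        }
        *guard = PipelineSlot::Starting;
    } // guard dropped here

    // Phase 2: slow startup runs with no lock held.
    let outcome = slow_start();

    // Phase 3: re-lock briefly to commit the handle or roll back to Idle.
    let mut guard = slot.lock().map_err(|_| "slot lock poisoned".to_string())?;
    match outcome {
        Ok(handle) => {
            *guard = PipelineSlot::Running(handle);
            Ok(())
        }
        Err(e) => {
            *guard = PipelineSlot::Idle;
            Err(format!("Failed to start ColdVox runner: {}", e))
        }
    }
}
```

A second caller arriving during startup sees `Starting` and gets the "already running" error instead of waiting on the lock, which is one possible policy; queuing or cancelling would need a different design.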

Comment on lines +114 to +129
// Spawn STT event listener
tokio::spawn(async move {
    while let Some(event) = stt_rx.recv().await {
        let snapshot = {
            let mut model = model_clone.lock();
            match event {
                TranscriptionEvent::Partial { text, .. } => model.update_partial(text),
                TranscriptionEvent::Final { text, .. } => model.update_final(text),
                TranscriptionEvent::Error { message, .. } => {
                    model.set_status(OverlayStatus::Error, message)
                }
            }
        };
        let _ = emit_and_resize(&app_clone, &window_clone, &snapshot, "stt-update");
    }
});
Copilot AI Apr 13, 2026

The spawned STT listener task isn’t tracked or cancelled on stop_pipeline. That means it may still emit updates after a stop (draining buffered events), and a quick stop/start cycle can result in multiple listener tasks concurrently mutating the same model/window. Consider storing the JoinHandle (or a cancellation token / generation counter) in OverlayRuntime and aborting/invalidating it on stop/restart.
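Of the options named in this comment, the generation counter is the easiest to show without extra dependencies. The sketch below uses `std::sync::mpsc` and a plain function in place of the tokio channel and spawned task; `ListenerGate`, `run_listener`, and the `String` event type are hypothetical stand-ins. Each listener is handed a token when it starts and stops applying events as soon as the token is no longer current.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::Receiver;
use std::sync::Mutex;

/// Hypothetical gate: each start bumps the generation and issues a token.
pub struct ListenerGate {
    generation: AtomicU64,
}

impl ListenerGate {
    pub fn new() -> Self {
        Self { generation: AtomicU64::new(0) }
    }

    /// Called on start: returns the token the new listener must hold.
    pub fn begin(&self) -> u64 {
        self.generation.fetch_add(1, Ordering::SeqCst) + 1
    }

    /// Called on stop: any listener holding an older token becomes stale.
    pub fn invalidate(&self) {
        self.generation.fetch_add(1, Ordering::SeqCst);
    }

    pub fn is_current(&self, token: u64) -> bool {
        self.generation.load(Ordering::SeqCst) == token
    }
}

/// Drain events, applying each one only while the token is still current.
pub fn run_listener(
    gate: &ListenerGate,
    token: u64,
    rx: Receiver<String>,
    sink: &Mutex<Vec<String>>,
) {
    for event in rx {
        if !gate.is_current(token) {
            break; // stopped or restarted: drop buffered events
        }
        sink.lock().unwrap().push(event);
    }
}
```

Compared with aborting a stored `JoinHandle`, this approach lets a stale listener exit on its own at the next event, at the cost of one atomic load per event.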

let opts = AppRuntimeOptions {
activation_mode: ActivationMode::AlwaysOnPushToTranscribe,
resampler_quality: coldvox_audio::ResamplerQuality::Balanced,
stt_selection: Some(coldvox_stt::plugin::PluginSelectionConfig::default()),
Copilot AI Apr 13, 2026

AppRuntimeOptions.stt_selection is set to Some(PluginSelectionConfig::default()), which causes SttPluginManager::set_selection_config to persist that default config to disk and enable default GC/metrics. If the intent is to respect an existing user config, consider loading/passing the on-disk config (or adding a way to apply selection without saving) rather than always overwriting with defaults.

Suggested change
stt_selection: Some(coldvox_stt::plugin::PluginSelectionConfig::default()),

Comment on lines +146 to +152
let mut handle_guard = runtime.app_handle.lock().await;
if let Some(handle) = handle_guard.take() {
    Arc::new(handle).shutdown().await;
    let snapshot = runtime.with_model(|model| {
        model.reset_to_idle("Pipeline stopped.".to_string())
    });
    emit_and_resize(&app, &window, &snapshot, "pipeline-stopped")
Copilot AI Apr 13, 2026

stop_pipeline holds the app_handle async-mutex guard while awaiting shutdown(). It’s safer to take the handle (or clone an Arc) and drop the guard before awaiting, so other commands aren’t blocked and to avoid lock-across-await patterns.
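A minimal sketch of "take the handle, drop the guard, then shut down" is shown here with `std::sync::Mutex` and a blocking closure standing in for the async `shutdown().await`. The `stop_pipeline` signature below is hypothetical and generic over the handle type; it is not the Tauri command from the PR.

```rust
use std::sync::Mutex;

/// Take the handle out under the lock, then run shutdown with the lock released.
/// Returns Ok(true) if a pipeline was running, Ok(false) if there was nothing to stop.
pub fn stop_pipeline<H>(
    slot: &Mutex<Option<H>>,
    shutdown: impl FnOnce(H),
) -> Result<bool, String> {
    // The guard is a temporary, so it is dropped at the end of this statement.
    let taken = slot
        .lock()
        .map_err(|_| "slot lock poisoned".to_string())?
        .take();

    match taken {
        Some(handle) => {
            shutdown(handle); // slow work happens with no lock held
            Ok(true)
        }
        None => Ok(false),
    }
}
```

Because the slot is already `None` when shutdown begins, a concurrent start is free to proceed; whether that is desirable depends on whether the old runtime must fully release the audio device first.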

Comment on lines 5 to +17
use std::{
sync::{Arc, Mutex},
thread,
sync::Arc,
time::Duration,
};

use contract::{OverlayEvent, OverlaySnapshot, OVERLAY_EVENT_NAME};
use demo::demo_script;
use contract::{OverlayEvent, OverlaySnapshot, OVERLAY_EVENT_NAME, OverlayStatus};
use state::OverlayModel;
use tauri::{AppHandle, Emitter, Manager, State, WebviewWindow};
use coldvox_app::runtime::{self as app_runtime, AppHandle as ColdVoxHandle, AppRuntimeOptions, ActivationMode};
use coldvox_app::stt::TranscriptionEvent;
use coldvox_audio::ResamplerQuality;
use coldvox_stt::plugin::PluginSelectionConfig;
use tokio::sync::Mutex as AsyncMutex;
Copilot AI Apr 13, 2026

There are several now-unused imports (Duration, ResamplerQuality, PluginSelectionConfig) after removing the demo driver and switching to fully-qualified paths. Please remove or use them to keep the Tauri shell crate warning-free.

Comment on lines 85 to 93
  return {
    snapshot,
    setExpanded: (expanded: boolean) => runCommand(() => setOverlayExpanded(expanded)),
-   startDemo: () => runCommand(startDemoDriver),
+   startPipeline: () => runCommand(startPipeline),
    togglePause: () => runCommand(togglePauseState),
-   stopDemo: () => runCommand(stopDemoDriver),
+   stopPipeline: () => runCommand(stopPipeline),
    clearTranscript: () => runCommand(clearOverlayTranscript),
    openSettings: () => runCommand(openSettingsPlaceholder),
  };
Copilot AI Apr 13, 2026

This hook API rename (startDemo/stopDemo → startPipeline/stopPipeline) requires updating remaining call sites/mocks. For example, crates/coldvox-gui/src/hooks/useOverlayShell.test.tsx still imports/mocks startDemoDriver and calls startDemo(), which will break tests/build until updated.

Comment on lines 17 to 23
  <OverlayShell
    snapshot={snapshot}
    onSetExpanded={setExpanded}
-   onStartDemo={startDemo}
+   onStartDemo={startPipeline}
    onTogglePause={togglePause}
-   onStop={stopDemo}
+   onStop={stopPipeline}
    onClear={clearTranscript}
Copilot AI Apr 13, 2026

OverlayShell still exposes onStartDemo / renders a "Run demo" control, but App now wires that to startPipeline. This naming/label mismatch makes the UI and component contract misleading; consider renaming the prop/control to reflect pipeline start (or keep the demo terminology end-to-end if it’s still intended to be a demo).

@Coldaine
Owner Author

Superseded by retargeted #400 which has been merged to tauri-base.

@Coldaine Coldaine closed this Apr 16, 2026