Conversation
Implement Moonshine speech-to-text plugin as a CPU fallback for Parakeet:
- Add MoonshinePlugin with PyO3/HuggingFace Transformers integration
- Support Base (61M) and Tiny (27M) model variants
- Cache model in initialize() for fast subsequent transcriptions
- Use safe PyO3 locals dict to prevent code injection via paths
- Add audio buffer size limits (10 minute max) to prevent OOM
- Register plugin in plugin_manager.rs following Parakeet pattern

New files:
- crates/coldvox-stt/src/plugins/moonshine.rs - Full plugin implementation
- crates/coldvox-stt/tests/moonshine_e2e.rs - E2E test suite
- scripts/install-moonshine-deps.sh - Python dependency installer
- scripts/verify-stt-setup.sh - Setup verification script

Build with: cargo build --features moonshine
Requires: Python 3.8+, pip install transformers torch librosa

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
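The 10-minute buffer limit mentioned in this commit could be enforced along the following lines. This is a minimal sketch, not the plugin's code: `CappedAudioBuffer` and `push_chunk` are hypothetical names, and truncating the overflow (rather than returning an error) is an assumption. The 16 kHz figure comes from the E2E test quoted later in this PR (16000 * 60 * 10 samples).

```rust
/// Ten minutes of 16 kHz mono audio, the limit quoted in the E2E test.
const MAX_AUDIO_BUFFER_SAMPLES: usize = 16_000 * 60 * 10;

/// Hypothetical capped buffer; the real plugin's layout is not shown in this PR.
struct CappedAudioBuffer {
    samples: Vec<i16>,
    max_samples: usize,
}

impl CappedAudioBuffer {
    fn new(max_samples: usize) -> Self {
        Self { samples: Vec::new(), max_samples }
    }

    /// Appends as much of `chunk` as still fits and returns the number of
    /// samples that were dropped (assumed policy: truncate, do not fail).
    fn push_chunk(&mut self, chunk: &[i16]) -> usize {
        let room = self.max_samples.saturating_sub(self.samples.len());
        let take = room.min(chunk.len());
        self.samples.extend_from_slice(&chunk[..take]);
        chunk.len() - take
    }

    fn len(&self) -> usize {
        self.samples.len()
    }
}
```

A caller would construct the buffer with `CappedAudioBuffer::new(MAX_AUDIO_BUFFER_SAMPLES)` and could log a warning whenever `push_chunk` reports dropped samples.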
Update ObjectRefOwned field access to method calls for .name() and .path() and handle Option return types to fix compilation when atspi feature is enabled.
Critical fixes:
- Fix eval_bound → run_bound for multi-statement Python code (eval only accepts expressions, not statements)
- Implement model_path support (was stored but never used)

Improvements from Copilot/Codex review:
- Feature-gate hound dependency under moonshine feature
- Add auto-initialize feature to PyO3 for safer interpreter init
- Fix build script to check cargo exit code instead of grep
- Fix Python version check to use Python itself (portability)
- Improve error message to reference install script
- Add comment explaining load_model() no-op behavior
- Add GIL safety documentation to Py<PyAny> fields

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Consolidate agent instructions into root AGENTS.md as the canonical source of truth. Remove docs/agents.md. Update CLAUDE.md to reference AGENTS.md. Update MasterDocumentationPlaybook.md to list root agent config files as approved exceptions.
Includes formatting updates for text-injection and WIP changes for STT processor and telemetry.
Add Moonshine STT plugin implementation using PyO3 and HuggingFace Transformers. Includes security hardening (locals dict), performance caching (model reuse), and E2E tests. Resolves PR #259.
Import PyDictMethods to ensure get_item returns Result<Option<Bound>> correctly, enabling .ok_or_else() usage.
- Add .gemini settings for Gemini configuration - Add .kilocode rules for Kilo Code - Add GitHub copilot instructions
- Add docs/**/*.md to lint-staged for auto-frontmatter on commit - Add AGENTS.md, CODEX.md, COPILOT.md to approved exceptions - Add patterns for crates/*/README.md and .github/**/*.md - Change markdown-outside-docs from error to warning - Add frontmatter to PR-259 status doc 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
- Add #[allow(unused_imports)] to unused InjectionError and tracing imports - Add #[allow(unused_variables)] to timing variables (start_time, timeout_duration) in WIP confirmation features - Add #[allow(dead_code)] to STEP_TIMEOUT constant and other intentionally unused items - Add #[allow(dead_code)] to unused helper functions for future implementation - Address unused context variables in injection methods that will be used later
- Remove text-injection feature flag (now always enabled) - Remove granular text-injection backend flags (atspi, clipboard, etc.) - Fix cross-platform dependency leakage with default-features = false - Document STT backend hardware requirements (parakeet=NVIDIA, moonshine=CPU) - Update documentation to reflect simplified feature set - Platform-specific text injection backends auto-selected at compile time Breaking Change: text-injection is no longer a feature flag. It is now mandatory for all builds. Remaining flags: default, parakeet, moonshine, silero, tui
Implements the Portable Agentic Evidence Standard for ColdVox:
- docs/reviews/portable_standard_critique.md: Philosophy document explaining why tautological unit tests are insufficient and the case for empirical evidence-based PR review.
- docs/reviews/reviewer_driven_evidence.md: Workflow strategy describing how the reviewer-driven evidence process works, evidence tiers (1-5), and semantic drift detection patterns.
- docs/plans/agentic-evidence-preview.md: System architecture spec for the shadow mode assessor: permissions, git diff strategy, token budget, failure modes, and Phase 2 considerations.
- .github/prompts/evidence-assessor.md: The hardened CoT prompt that Gemini executes in CI. Includes explicit anti-hallucination constraints, structured output format, and ColdVox-specific ground truths (Moonshine fragile, Parakeet not ready, stubs dead).
- .github/workflows/agentic-evidence-preview.yml: GitHub Actions workflow triggering on PR events. Uses fetch-depth: 0 for correct git diff, truncates diffs at 2000 lines, composes the full prompt with pre-gathered context, runs gemini-cli in non-interactive mode, and pipes the report to GITHUB_STEP_SUMMARY.

Shadow mode: never blocks merges. GEMINI_API_KEY secret must be configured in repo settings.
…nces

Root cleanup:
- Delete CLAUDE.md, GEMINI.md (byte-identical copies of AGENTS.md)
- Delete root junk: plugins.json, pr_365_details.json, test_enigo_live.rs
- Archive 6 root reports to docs/archive/root/

Dead backend code:
- Replace WHISPER_MODEL_PATH with STT_MODEL_PATH in types.rs and tests
- Fix integration tests: whisper -> moonshine as preferred plugin
- Update doc comments in plugin.rs, plugin_types.rs
- Delete crates/app/plugins.json (had preferred_plugin: whisper)
- Remove stale faster-whisper comment from Cargo.toml

Dead reference fixes:
- Replace ALL 20+ windows-multi-agent-recovery.md refs with current-status.md
- Remove 'absolute truth' language from agent rules
- Fix AGENTS.md pointer to nonexistent CI/policy.md -> CI/architecture.md
- Fix README.md: remove CLAUDE.md reference
- Update drive-project.prompt.md, gui-design-overview, todo.md

Doc pruning:
- Delete 15 empty/expired docs (stubs, chat transcripts, past-retention)
- Archive 8 stale docs (Linux-only, org-wide, superseded)
- Fix stt-overview.md: remove Whisper from Supported Backends
- Fix aud-user-config-design.md: Moonshine is PyO3 not pure Rust
- Fix fdn-testing-guide.md: add Parakeet validation warning

Agent instruction restructure:
- Sync AGENTS.md from .github/copilot-instructions.md (full content)
- Update ensure_agent_hardlinks.sh: source is now copilot-instructions.md
- Update check_markdown_placement.py: CLAUDE.md -> AGENTS.md
- Update standards.md: remove CLAUDE.md/GEMINI.md references

Dead vendor/scripts:
- Delete vendor/vosk/ (stubs to dead Linux runner cache)
- Delete scripts: setup-vosk-cache.sh, verify_vosk_model.sh, ensure_venv.sh, start-headless.sh
- SttRemoteAuthSettings: use #[derive(Default)] instead of manual impl (clippy::derivable_impls error in CI) - deny.toml: add RUSTSEC ignores for unmaintained transitive deps from Tauri (gtk3-rs, fxhash, unic-*, proc-macro-error - all from wry/tauri GUI layer, no safe upgrade available, no security impact) - docs/index.md: regenerate after doc cleanup changed file count/structure
Previous implementation failed because:
1. The 'Gather PR context' step failed with bash string substitution bugs
2. \{PLACEHOLDER\} patterns in bash expansion don't match {PLACEHOLDER} tokens
3. Large PRs caused the diff to be unavailable or truncated incorrectly
New approach:
- Use gemini-cli --approval-mode=yolo to give the agent autonomous tools
- Agent reads its instructions from the prompt file directly
- Agent runs git diff, reads files, and explores the repo itself
- No more brittle bash string replacement for prompt composition
- Combines two steps into one to avoid compose/run split failures
- Still uses gemini-2.0-flash (fixes model name/docs mismatch)
- Agent writes report to /tmp/report.md which is always checked
Addresses Copilot reviewer comments on bash substitution bugs and
model name mismatch between workflow comments and actual --model flag.
Delete .github/agents/ (project-driver, researcher, implementer, tester) and .github/prompts/drive-project.prompt.md — these were prompt-only specs with no automation hooks. They added complexity without value. Add docs/visuals/ with two interactive HTML dashboards: - agentic-workflow-dashboard.html: provenance, wiring, prompt anatomy - ci-reviewer-dashboard.html: activation cadence, prompt, implications
…rthstar (complex only)
- Add complexity scorer (pure bash, no API) that counts Rust file changes
  - complex: >10 crates/ files → triggers Northstar Reviewer
  - moderate: 1-10 crates/ OR any workflow change
  - simple: docs/config only
- Evidence Assessor now always runs using gemini-2.5-flash (was gemini-2.0-flash which is invalid)
- Northstar Reviewer added, runs only on complex PRs, uses gemini-2.5-pro
- Fix file access: instructions written into workspace dir (not /tmp/) so Gemini CLI can read them
- Add northstar-alignment-reviewer.md prompt
- Add _ci_*.md and _tmp_*.md patterns to .gitignore (agent working files)
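The scorer itself is pure bash and is not shown here. The thresholds in the commit message can be restated as a small Rust sketch for readers who want to check the tier boundaries. The names `Complexity` and `classify` are hypothetical, and two details are assumptions: every changed path under `crates/` is counted, and a "workflow change" means a path under `.github/workflows/`.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Complexity {
    Simple,
    Moderate,
    Complex,
}

/// Classifies a PR from its list of changed paths (hypothetical helper).
fn classify(changed_files: &[&str]) -> Complexity {
    // Assumption: count every changed path under crates/.
    let crate_files = changed_files
        .iter()
        .filter(|p| p.starts_with("crates/"))
        .count();
    // Assumption: workflow changes live under .github/workflows/.
    let touches_workflow = changed_files
        .iter()
        .any(|p| p.starts_with(".github/workflows/"));

    if crate_files > 10 {
        Complexity::Complex // would trigger the Northstar Reviewer
    } else if crate_files >= 1 || touches_workflow {
        Complexity::Moderate
    } else {
        Complexity::Simple // docs/config only
    }
}
```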
…ort paths - Remove _ci_evidence_*.md and _ci_northstar_*.md from .gitignore Gemini CLI uses .gitignore as a security boundary and refuses to read or write any gitignored file. CI temp files must be ungitignored. - Fix evidence-assessor.md: write to _ci_evidence_report.md (not /tmp/report.md) - Fix northstar-alignment-reviewer.md: write to _ci_northstar_report.md The _tmp_*.md wildcard pattern is retained for other temp files.
Switch models to validated stable versions: - Evidence Assessor: gemini-3-flash-preview -> gemini-2.5-flash - Northstar Reviewer: gemini-3.1-pro-preview-customtools -> gemini-2.5-pro Add settings.json auth step before each gemini-cli invocation to prevent OAuth browser prompt in headless CI runners. GEMINI_API_KEY env var alone is not sufficient to skip interactive auth selection. Update comments and Step Summary labels to match actual models.
…i CLI" This reverts commit be490c4.
Add scripts/lint_repo_integrity.py with 6 deterministic checks:
- Check 1: Feature-flag doc sync (extracts features from docs, validates cargo check)
- Check 2: Dead feature detection (empty features not referenced or used in code)
- Check 3: Python version consistency across config files
- Check 4: Frontmatter completeness for docs
- Check 5: Stale doc detection (last_reviewed > 6 months)
- Check 6: Test skip audit with ratchet baseline

Features:
- Python 3.12 stdlib only, no external deps
- CLI args: --check N, --strict-freshness, --fix-baseline
- Exit code 0 if all pass, 1 if any fail
- Output format: [PASS|FAIL|WARN] Check Name (message)

CI Integration:
- Added repo-integrity job to ci.yml and ci-minimal.yml
- Blocking check that runs on ubuntu-latest with setup-coldvox
- Add back RESULT: PASS/FAIL line at end of output (required by spec) - Fix Check 6 to sum skipped counts across test binaries instead of overwriting
…nd lint gates to Tauri base Combines: - Tauri v2 React GUI shell (from PR #378) - Moonshine STT plugin implementation - Parakeet plugin updates - Text injection fixes (AT-SPI, clipboard) - Mechanical lint gates for repo integrity - AI agent configuration updates Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Includes: - GUI transcript wiring and fixes (PR #386, #387 follow-ups) - Parakeet HTTP-remote integration (#382) - STT plugin GC fix (#366) - Tauri v2 React overlay shell (#381) - Build cleanup (#383) - Dependency updates Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Add branch protection script for tauri-base with required status checks - Create agent-review.yml with CodeRabbit integration and label management - Create automerge.yml that merges PRs with agent-approved label - Update ci-minimal.yml to run on tauri-base PRs - Required checks: Repo Integrity, Check (stable/1.90), Lint & Format, Test
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Replace the 189-line custom automerge workflow with a slim 22-line version that enables GitHub's native auto-merge on PR open. Branch protection handles the actual gating: CI checks for mechanical correctness, CodeRabbit (request_changes_workflow) for semantic review. - Simplify automerge.yml to just enable gh pr merge --auto - Add gate-main.yml to enforce PRs to main come from tauri-base only - Delete repo-level .coderabbit.yaml (global config owns everything) - Resolve merge conflicts in ci.yml (upload-artifact v7) and AGENTS.md - Remove broken setup-coldvox action from ci-minimal.yml and ci.yml - Document two-trunk branching strategy and automerge design - Add planned Windows CI runner section to architecture docs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…on, fix gate-main shell injection, regen docs index
…-stt)

These four features were never declared in any [features] section and their #[cfg(feature = "...")] attributes triggered -D unexpected-cfgs errors breaking Check, Lint & Format, and Test CI jobs on tauri-base.

Changes in coldvox-stt/src/plugins/mod.rs:
- Remove the four dead module declarations (whisper_cpp, coqui, leopard, silero_stt)

Changes in crates/app/src/runtime.rs:
- Replace #[cfg(feature = "whisper")] guards on PluginSttProcessor wiring with #[cfg(any(feature = "moonshine", feature = "parakeet", feature = "http-remote"))] which is the correct gate (PluginSttProcessor already uses this gate)
- Remove unused imports that were exclusive to the dead whisper stub (PluginSttProcessor/SessionEvent/TranscriptionConfig imports are now correctly cfg-gated to match where they're used)
- Remove the two whisper-gated integration test functions (test_unified_stt_pipeline_*) along with their whisper-only helpers
- Fix pre-existing unused_must_use on set_selection_config (was an error under RUSTFLAGS="-D warnings" used in CI)

Active backends (parakeet, moonshine, http-remote) retain full pipeline wiring including PluginSttProcessor, session-event bridge, and stt_forward_handle.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The use std::process::Command import and get_binary_path function are only used inside #[cfg(windows)]-gated tests, but weren't themselves gated — causing unused-imports and dead-code errors under RUSTFLAGS="-D warnings" on Linux CI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace deprecated criterion::black_box with std::hint::black_box in audio_quality_benchmarks.rs (6 occurrences) - Prefix unused variable is_off_axis with _ in spectral.rs:256 Both were hard errors under RUSTFLAGS="-D warnings" used in CI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
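For context, the `black_box` swap is a one-line import change, since `std::hint::black_box` is used the same way. The sketch below is illustrative only: `sum_of_squares` and `bench_body` are hypothetical stand-ins, not functions from audio_quality_benchmarks.rs.

```rust
use std::hint::black_box;

/// Hypothetical stand-in for a benchmarked routine.
fn sum_of_squares(samples: &[i16]) -> i64 {
    samples.iter().map(|&s| i64::from(s) * i64::from(s)).sum()
}

/// Calls the routine the way a benchmark closure would, hiding both the
/// input and the result from the optimizer so the work is not elided.
fn bench_body(samples: &[i16]) -> i64 {
    black_box(sum_of_squares(black_box(samples)))
}
```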
- Add missing `use crate::TextInjector;` to test_ydotool_injector.rs (trait methods is_available/inject_text were not in scope) - Apply cargo fmt to runtime.rs, lib.rs, plugin_manager.rs, coldvox-gui/lib.rs, and coldvox-stt/plugin.rs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tauri's generate_context!() macro requires icons/icon.png to exist at compile time. The file was missing, causing CI to fail with "failed to open icon...No such file or directory". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tauri v2 generate_context!() and build scripts expect multiple icon sizes. Added minimal transparent PNG placeholders for all common sizes so CI compilation does not fail with "No such file or directory". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. agent-review.yml: Use latestAgentText (latest artifact only) instead of combinedText (entire history) for pattern matching. Prevents false positives from historical review comments accumulating over time.
2. gate-main.yml: Add fork-repo check alongside branch-name check. A fork user who names their branch 'tauri-base' previously passed the gate. Now both branch name AND source repo are validated. Absorbs the changes from Stack 3 (PR #393) making that PR redundant.
3. crates/app/src/runtime.rs: Propagate error from set_selection_config instead of discarding it with let _ = ... The enclosing start() function already returns Result so ? works directly.
4. crates/coldvox-stt/src/plugins/mod.rs: Add missing #[cfg(feature = "http-remote")] pub mod http_remote; export so that runtime.rs imports are satisfied when http-remote feature is enabled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Drop the `&` prefix from `.args(&[...])` calls where the borrow is redundant — Command::args() accepts anything implementing IntoIterator, so passing the array directly satisfies clippy::needless_borrows_for_generic_args. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
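A small sketch of the before/after shape of this fix. The command shown reuses the `cargo build --features moonshine` invocation from earlier in this PR purely as an illustration; `build_command` and `collected_args` are hypothetical helpers, and the command is only constructed, never run.

```rust
use std::ffi::OsStr;
use std::process::Command;

/// Builds (but does not run) a command with illustrative arguments.
fn build_command() -> Command {
    let mut cmd = Command::new("cargo");
    // Before: cmd.args(&["build", "--features", "moonshine"]);
    // After: the array itself implements IntoIterator, so the borrow is redundant.
    cmd.args(["build", "--features", "moonshine"]);
    cmd
}

/// Collects the arguments that were attached, for inspection.
fn collected_args(cmd: &Command) -> Vec<&OsStr> {
    cmd.get_args().collect()
}
```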
All fields default to None/empty which matches the derived Default, so the hand-written impl is redundant. Fixes derivable_impls lint. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Use struct literal with ..Default::default() instead of post-init field assignment in SttSettings test helper - Replace manual (len + 511) / 512 with div_ceil(512) in audio-quality test - Replace .map_or(false, ...) with .is_some_and(...) in audio-quality test Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
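The three rewrites in this commit each have a direct before/after equivalence, sketched below. `Settings`, its fields, and the helper names are hypothetical stand-ins (the real `SttSettings` fields are not shown in this PR). The changes look like fixes for Clippy's `field_reassign_with_default`, `manual_div_ceil`, and `unnecessary_map_or` lints, though the commit message does not name them.

```rust
/// Hypothetical settings type standing in for `SttSettings`.
#[derive(Debug, Default, PartialEq)]
struct Settings {
    preferred: Option<String>,
    language: Option<String>,
    auto_extract: bool,
}

/// Set the interesting field in the literal and default the rest,
/// instead of `let mut s = Settings::default(); s.preferred = ...;`.
fn test_settings() -> Settings {
    Settings {
        preferred: Some("moonshine".to_string()),
        ..Default::default()
    }
}

/// Number of 512-sample frames needed to cover `len` samples.
fn frames_needed(len: usize) -> usize {
    len.div_ceil(512) // same result as (len + 511) / 512
}

/// True only when a value is present and satisfies the predicate.
fn is_loud(peak: Option<f32>) -> bool {
    peak.is_some_and(|p| p > 0.5) // same result as peak.map_or(false, |p| p > 0.5)
}
```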
ci: add agentic automerge pipeline for tauri-base branch
docs: record AI-gated automerge pipeline in CHANGELOG
Adds AlwaysOnPushToTranscribe Mode and maintains a 2-second rolling audio buffer in the STT processor. Prevents hotkey start mechanical clipping of transcription.
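One way to picture the rolling buffer: keep only the most recent N samples while idle, then hand them to the session when the hotkey fires so the first syllable is not lost. The sketch below is an assumption about the general technique, not the processor's actual code; `PreRollBuffer` and its methods are hypothetical names, and the sample rate is left as a parameter.

```rust
use std::collections::VecDeque;

/// Hypothetical pre-roll buffer holding the last `capacity` samples.
struct PreRollBuffer {
    samples: VecDeque<i16>,
    capacity: usize,
}

impl PreRollBuffer {
    /// `sample_rate` in Hz and `seconds` of audio to retain.
    fn new(sample_rate: usize, seconds: usize) -> Self {
        let capacity = sample_rate * seconds;
        Self { samples: VecDeque::with_capacity(capacity), capacity }
    }

    /// Appends a frame, discarding the oldest samples once the window is full.
    fn push_frame(&mut self, frame: &[i16]) {
        if self.capacity == 0 {
            return; // nothing can be retained
        }
        for &s in frame {
            if self.samples.len() == self.capacity {
                self.samples.pop_front();
            }
            self.samples.push_back(s);
        }
    }

    /// Drains the retained audio, oldest first, to prepend to a new session.
    fn take_pre_roll(&mut self) -> Vec<i16> {
        self.samples.drain(..).collect()
    }
}
```

With a 16 kHz stream, `PreRollBuffer::new(16_000, 2)` would retain the 2-second window described above.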
… 12 updates

Bumps the rust-dependencies group with 12 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| [tokio](https://github.com/tokio-rs/tokio) | `1.50.0` | `1.52.0` |
| [rubato](https://github.com/HEnquist/rubato) | `1.0.1` | `2.0.0` |
| [audioadapter](https://github.com/HEnquist/audioadapter-rs) | `2.0.1` | `3.0.0` |
| [audioadapter-buffers](https://github.com/HEnquist/audioadapter-buffers-rs) | `2.0.0` | `3.0.0` |
| [clap](https://github.com/clap-rs/clap) | `4.6.0` | `4.6.1` |
| [fastrand](https://github.com/smol-rs/fastrand) | `2.3.0` | `2.4.1` |
| [rand](https://github.com/rust-random/rand) | `0.10.0` | `0.10.1` |
| [similar-asserts](https://github.com/mitsuhiko/similar-asserts) | `1.7.0` | `2.0.0` |
| [libc](https://github.com/rust-lang/libc) | `0.2.183` | `0.2.185` |
| [cc](https://github.com/rust-lang/cc-rs) | `1.2.58` | `1.2.60` |
| [pkg-config](https://github.com/rust-lang/pkg-config-rs) | `0.3.32` | `0.3.33` |
| [pyo3](https://github.com/pyo3/pyo3) | `0.28.2` | `0.28.3` |

Updates `tokio` from 1.50.0 to 1.52.0
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](tokio-rs/tokio@tokio-1.50.0...tokio-1.52.0)

Updates `rubato` from 1.0.1 to 2.0.0
- [Release notes](https://github.com/HEnquist/rubato/releases)
- [Commits](HEnquist/rubato@v1.0.1...v2.0.0)

Updates `audioadapter` from 2.0.1 to 3.0.0
- [Release notes](https://github.com/HEnquist/audioadapter-rs/releases)
- [Commits](https://github.com/HEnquist/audioadapter-rs/commits)

Updates `audioadapter-buffers` from 2.0.0 to 3.0.0
- [Commits](https://github.com/HEnquist/audioadapter-buffers-rs/commits)

Updates `clap` from 4.6.0 to 4.6.1
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](clap-rs/clap@clap_complete-v4.6.0...clap_complete-v4.6.1)

Updates `fastrand` from 2.3.0 to 2.4.1
- [Release notes](https://github.com/smol-rs/fastrand/releases)
- [Changelog](https://github.com/smol-rs/fastrand/blob/master/CHANGELOG.md)
- [Commits](smol-rs/fastrand@v2.3.0...v2.4.1)

Updates `rand` from 0.10.0 to 0.10.1
- [Release notes](https://github.com/rust-random/rand/releases)
- [Changelog](https://github.com/rust-random/rand/blob/master/CHANGELOG.md)
- [Commits](rust-random/rand@0.10.0...0.10.1)

Updates `similar-asserts` from 1.7.0 to 2.0.0
- [Changelog](https://github.com/mitsuhiko/similar-asserts/blob/main/CHANGELOG.md)
- [Commits](mitsuhiko/similar-asserts@1.7.0...2.0.0)

Updates `libc` from 0.2.183 to 0.2.185
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Changelog](https://github.com/rust-lang/libc/blob/0.2.185/CHANGELOG.md)
- [Commits](rust-lang/libc@0.2.183...0.2.185)

Updates `cc` from 1.2.58 to 1.2.60
- [Release notes](https://github.com/rust-lang/cc-rs/releases)
- [Changelog](https://github.com/rust-lang/cc-rs/blob/main/CHANGELOG.md)
- [Commits](rust-lang/cc-rs@cc-v1.2.58...cc-v1.2.60)

Updates `pkg-config` from 0.3.32 to 0.3.33
- [Changelog](https://github.com/rust-lang/pkg-config-rs/blob/master/CHANGELOG.md)
- [Commits](rust-lang/pkg-config-rs@0.3.32...0.3.33)

Updates `pyo3` from 0.28.2 to 0.28.3
- [Release notes](https://github.com/pyo3/pyo3/releases)
- [Changelog](https://github.com/PyO3/pyo3/blob/main/CHANGELOG.md)
- [Commits](PyO3/pyo3@v0.28.2...v0.28.3)

---
updated-dependencies:
- dependency-name: tokio
  dependency-version: 1.52.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: rust-dependencies
- dependency-name: rubato
  dependency-version: 2.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: rust-dependencies
- dependency-name: audioadapter
  dependency-version: 3.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: rust-dependencies
- dependency-name: audioadapter-buffers
  dependency-version: 3.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: rust-dependencies
- dependency-name: clap
  dependency-version: 4.6.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-dependencies
- dependency-name: fastrand
  dependency-version: 2.4.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: rust-dependencies
- dependency-name: rand
  dependency-version: 0.10.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-dependencies
- dependency-name: similar-asserts
  dependency-version: 2.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: rust-dependencies
- dependency-name: libc
  dependency-version: 0.2.185
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-dependencies
- dependency-name: cc
  dependency-version: 1.2.60
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-dependencies
- dependency-name: pkg-config
  dependency-version: 0.3.33
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-dependencies
- dependency-name: pyo3
  dependency-version: 0.28.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>
Replaces demo driver with coldvox-app runtime. Connects Always-On Push-to-Transcribe mode and wires STT events to React frontend.
feat(gui): Wire Audio/STT Pipeline to Tauri Shell
Pull request overview
Merges the long-lived tauri-base integration branch into main, bringing in the Always-On Push-To-Transcribe STT mode, Rust dependency bumps, and wiring the real audio/STT runtime into the Tauri v2 shell, along with substantial CI/docs/repo cleanup and governance updates.
Changes:
- Adds Always-On Push-To-Transcribe activation mode with ~2s rolling pre-roll buffering and updated TUI activation cycling.
- Wires the ColdVox runtime into the Tauri shell, replacing the demo driver and streaming STT events into the overlay model.
- Updates Rust deps / policies and performs broad docs + CI + repo hygiene changes (removing legacy scripts/configs and adding branch protection/gating automation).
Reviewed changes
Copilot reviewed 120 out of 144 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| crates/app/src/stt/processor.rs | Adds rolling pre-roll buffer + pre-roll injection path for Always-On Push-To-Transcribe. |
| crates/app/src/stt/session.rs | Introduces ActivationMode::AlwaysOnPushToTranscribe. |
| crates/app/src/bin/tui_dashboard.rs | Adds label + toggle cycle for the new activation mode. |
| crates/coldvox-gui/src-tauri/src/lib.rs | Replaces demo driver with real runtime startup/stop and STT event forwarding (currently has handler-list mismatch). |
| crates/app/src/stt/plugin_manager.rs | Plugin registration/formatting changes (currently includes a duplicate/invalid Moonshine registration block). |
| crates/coldvox-text-injection/Cargo.toml | Changes default features to “desktop” (currently risks pulling Linux-only backends into default builds). |
| crates/coldvox-stt/tests/moonshine_e2e.rs | Refactors test audio loading and test coverage; removes serial on env-var-mutating tests. |
| docs/repo/copilot-instructions.md | Updates workspace instructions (currently contains unresolved merge conflict markers). |
| README.md | Updates top-level README (currently duplicated content and references removed Whisper/CLAUDE docs). |
| scripts/install-moonshine-deps.sh | Changes Moonshine dependency install flow (currently uses pip instead of repo-standard uv). |
| .gitignore | Adds notes/patterns for CI agent working files (comment currently contradicts behavior). |
| .github/workflows/ci-minimal.yml | Introduces a minimal CI workflow for trunk branches (includes optional text-injection + Whisper-e2e sections). |
| .github/workflows/gate-main.yml | Adds a required check gating PRs into main to originate from tauri-base in the canonical repo. |
| .github/workflows/automerge.yml / .github/workflows/agent-review.yml | Adds automerge enabling + agent review label automation for tauri-base. |
| AGENTS.md / docs/plans/current-status.md / docs/dev/CI/architecture.md | Updates repository “source of truth” anchors and documents two-trunk strategy and CI policy. |
| Various docs/scripts removed/archived | Removes legacy scripts/configs and archives outdated documentation. |
Comments suppressed due to low confidence (1)
crates/coldvox-gui/src-tauri/src/lib.rs:222
invoke_handler! still lists commands (update_partial_transcript, update_final_transcript, set_overlay_processing, set_overlay_listening, stop_overlay_capture) that no longer exist in this module (they were removed above). This will fail to compile; either reintroduce these command handlers or remove them from the generated handler list so it matches the actual command set.
| // Register Moonshine plugin if the moonshine feature is enabled | ||
| #[cfg(feature = "moonshine")] | ||
| { | ||
| use coldvox_stt::plugins::moonshine::MoonshinePluginFactory; | ||
| registry.register(Box::new(MoonshinePluginFactory::new())); | ||
| } |
register_builtin_plugins adds a second Moonshine registration block that calls registry.register(...), but the function parameter is named _registry. As written, registry is undefined here and this will not compile. Remove the duplicate block and keep the _registry.register(...) call.
| ### Main Binaries | ||
| <<<<<<< HEAD | ||
| - App (default build, no STT): `cargo run` | ||
| - App with STT (Vosk): `cargo run --features vosk` | ||
| ======= | ||
| - App (default build, no STT): `cargo run --bin coldvox` | ||
| >>>>>>> origin/main | ||
| - TUI Dashboard: `cargo run --bin tui_dashboard` (add `-- --device "<device name>"` and/or `--log-level <level>`) |
This file contains unresolved git merge conflict markers (<<<<<<<, =======, >>>>>>>). These will break docs tooling and make the instructions ambiguous. Resolve the conflict and remove the markers, leaving a single, correct set of run commands.
| @@ -59,6 +59,7 @@ regex = ["dep:regex"] | |||
| # Combined features for convenience | |||
| all-backends = ["atspi", "wl_clipboard", "enigo", "kdotool"] | |||
| linux-desktop = ["atspi", "wl_clipboard", "kdotool"] | |||
| desktop = ["linux-desktop", "enigo"] # "Batteries-included" feature for most users | |||
|
|
|||
coldvox-text-injection now enables desktop by default, and desktop includes linux-desktop (which enables atspi + wl_clipboard). This will pull Linux-only backends into the default feature set, likely breaking compilation on non-Linux targets (notably Windows, which is a primary target). Consider making the default feature set cross-platform (e.g., default to just enigo), and keep linux-desktop as an opt-in feature for Linux builds.
| @@ -224,23 +240,15 @@ fn test_factory_env_vars() { | |||
| "Should use Tiny model from env var" | |||
| ); | |||
|
|
|||
| // Restore original value | |||
| match original { | |||
| Some(v) => env::set_var("MOONSHINE_MODEL", v), | |||
| None => env::remove_var("MOONSHINE_MODEL"), | |||
| } | |||
| env::remove_var("MOONSHINE_MODEL"); | |||
| } | |||
|
|
|||
| #[test] | |||
| #[serial_test::serial] | |||
| fn test_factory_invalid_env_var() { | |||
| use coldvox_stt::plugin::SttPluginFactory; | |||
| use coldvox_stt::plugins::MoonshinePluginFactory; | |||
| use std::env; | |||
|
|
|||
| // Save original value | |||
| let original = env::var("MOONSHINE_MODEL").ok(); | |||
|
|
|||
| env::set_var("MOONSHINE_MODEL", "invalid"); | |||
| let factory = MoonshinePluginFactory::new(); | |||
| let plugin = factory | |||
| @@ -253,52 +261,5 @@ fn test_factory_invalid_env_var() { | |||
| "Should fall back to Base on invalid env var" | |||
| ); | |||
|
|
|||
| // Restore original value | |||
| match original { | |||
| Some(v) => env::set_var("MOONSHINE_MODEL", v), | |||
| None => env::remove_var("MOONSHINE_MODEL"), | |||
| } | |||
| } | |||
|
|
|||
| /// Test that MAX_AUDIO_BUFFER_SAMPLES limit is enforced | |||
| #[tokio::test] | |||
| async fn test_buffer_overflow_protection() { | |||
| let mut plugin = MoonshinePlugin::new(); | |||
|
|
|||
| let config = TranscriptionConfig::default(); | |||
| plugin.initialize(config).await.expect("Init failed"); | |||
|
|
|||
| // Create audio that exceeds 10 minutes (16000 * 60 * 10 = 9,600,000 samples) | |||
| // We'll create slightly more than the limit to test truncation | |||
| const MAX_SAMPLES: usize = 16000 * 60 * 10; | |||
| let oversized_audio: Vec<i16> = vec![0i16; MAX_SAMPLES + 10000]; | |||
|
|
|||
| // Process in chunks - the buffer should cap at MAX_SAMPLES | |||
| for chunk in oversized_audio.chunks(16000) { | |||
| let _ = plugin.process_audio(chunk).await; | |||
| } | |||
|
|
|||
| // Reset clears the buffer - if we got here without panic, the limit worked | |||
| plugin.reset().await.expect("Reset failed"); | |||
|
|
|||
| println!( | |||
| "Buffer overflow protection verified (limit: {} samples)", | |||
| MAX_SAMPLES | |||
| ); | |||
| } | |||
|
|
|||
| /// Test that common::load_test_audio validates audio format correctly | |||
| #[test] | |||
| fn test_load_test_audio_validation() { | |||
| let path = common::get_test_audio_path(); | |||
| if !path.exists() { | |||
| eprintln!("Skipping test_load_test_audio_validation: no test audio file"); | |||
| return; | |||
| } | |||
|
|
|||
| // Should not panic - validates 16kHz mono | |||
| let samples = common::load_test_audio(); | |||
| assert!(!samples.is_empty(), "Should load valid test audio"); | |||
|
|
|||
| println!("Audio validation test passed ({} samples)", samples.len()); | |||
| env::remove_var("MOONSHINE_MODEL"); | |||
| } | |||
These tests mutate the process-global `MOONSHINE_MODEL` environment variable but are no longer marked serial. Rust tests run in parallel by default, so this can cause flaky failures depending on execution order. Either restore `#[serial_test::serial]` (or equivalent) for env-var-mutating tests, or refactor the factory to accept configuration without relying on global env state in tests.
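A minimal sketch of the second option, assuming the factory's model choice can be split into a pure parsing step plus a thin environment wrapper. The names `ModelSize`, `model_size_from`, and `model_size_from_env`, and the accepted value `"tiny"`, are hypothetical and not taken from the PR; only the `MOONSHINE_MODEL` variable and the fall-back-to-Base behavior come from the tests above.

```rust
/// Hypothetical stand-in for the plugin's model-size enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelSize {
    Tiny,
    Base,
}

/// Pure parsing step: maps an optional raw setting to a model size.
/// Unset or unrecognized values fall back to `Base`.
/// The accepted spelling "tiny" is an assumption for illustration.
pub fn model_size_from(raw: Option<&str>) -> ModelSize {
    match raw.map(|s| s.trim().to_ascii_lowercase()).as_deref() {
        Some("tiny") => ModelSize::Tiny,
        _ => ModelSize::Base,
    }
}

/// Thin wrapper: the only function that reads process-global state.
pub fn model_size_from_env() -> ModelSize {
    model_size_from(std::env::var("MOONSHINE_MODEL").ok().as_deref())
}
```

With this split, the fallback logic can be tested through `model_size_from` without calling `env::set_var`, so those tests no longer need serialization. Any test that still exercises `model_size_from_env` would keep `#[serial_test::serial]`.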
| #!/bin/bash | ||
| # Install Moonshine Python dependencies using uv | ||
| # Install Moonshine Python dependencies | ||
|
|
||
| set -euo pipefail | ||
| set -e | ||
|
|
||
| echo "Installing Moonshine STT dependencies..." | ||
|
|
||
| # Check if uv is available | ||
| if ! command -v uv &> /dev/null; then | ||
| echo "Error: uv is not installed. Install with: curl -LsSf https://astral.sh/uv/install.sh | sh" | ||
| # Check Python version (using Python itself for portability) | ||
| if ! python3 -c "import sys; exit(0 if sys.version_info >= (3, 8) else 1)"; then | ||
| echo "Error: Python 3.8 or higher required" | ||
| exit 1 | ||
| fi | ||
|
|
||
| echo "Using uv: $(uv --version)" | ||
| PYTHON_VERSION=$(python3 --version | cut -d' ' -f2 | cut -d'.' -f1-2) | ||
| echo "✓ Python $PYTHON_VERSION detected" | ||
|
|
||
| # Create/sync virtual environment from pyproject.toml | ||
| cd "$(dirname "$0")/.." | ||
| uv sync | ||
| # Install packages | ||
| pip install --upgrade pip | ||
| pip install \ | ||
| transformers>=4.35.0 \ | ||
| torch>=2.0.0 \ | ||
| librosa>=0.10.0 | ||
|
|
This script now installs Moonshine dependencies directly via pip. Repo guidance (`AGENTS.md`) states that Python environments and dependencies are managed exclusively via uv; using pip here will create inconsistent environments and can bypass the pinned lockfile. Update the script to use uv (e.g., `uv sync` or `uv pip install ...`) and consider restoring `set -euo pipefail` for safer execution.
| ### Whisper Model Setup | ||
| - **Python Package**: Install the `faster-whisper` Python package via pip | ||
| - **Models**: Whisper models are automatically downloaded on first use | ||
| - **Model Identifiers**: Use standard Whisper model names (e.g., "tiny.en", "base.en", "small.en", "medium.en") | ||
| - **Manual Path**: Set `WHISPER_MODEL_PATH` to specify a model identifier or custom model directory | ||
| - **Common Models**: | ||
| - "tiny.en" (~39MB) - Fastest, lower accuracy | ||
| - "base.en" (~142MB) - Good balance of speed and accuracy | ||
| - "small.en" (~466MB) - Better accuracy | ||
| - "medium.en" (~1.5GB) - High accuracy | ||
|
|
||
| ## How It Works | ||
| 1. **Always-on pipeline**: Audio capture, VAD, STT, and text-injection buffering run continuously by default. Raw 16 kHz mono audio is recorded to `logs/audio_dumps/` for later review. | ||
| 2. **Voice activation (default)**: The Silero VAD segments speech automatically—no hotkey required. | ||
| 3. **Push-to-talk (preview inject)**: Hold `Super+Ctrl` to stream buffered text into the preview/injection window when you need manual control. Release to stop feeding new text. | ||
|
|
||
| More detail: See [`CLAUDE.md`](CLAUDE.md) for full developer guide. |
README content appears duplicated (a second `# ColdVox` section starts mid-file), and it links to `CLAUDE.md`, which is deleted in this PR. It also describes Faster-Whisper/Whisper setup and `WHISPER_MODEL_PATH`, which conflicts with the rest of this PR removing Whisper references. Please remove the duplicated block and update the remaining README to reflect the current STT backends and existing documentation entry points.
| ### Whisper Model Setup | |
| - **Python Package**: Install the `faster-whisper` Python package via pip | |
| - **Models**: Whisper models are automatically downloaded on first use | |
| - **Model Identifiers**: Use standard Whisper model names (e.g., "tiny.en", "base.en", "small.en", "medium.en") | |
| - **Manual Path**: Set `WHISPER_MODEL_PATH` to specify a model identifier or custom model directory | |
| - **Common Models**: | |
| - "tiny.en" (~39MB) - Fastest, lower accuracy | |
| - "base.en" (~142MB) - Good balance of speed and accuracy | |
| - "small.en" (~466MB) - Better accuracy | |
| - "medium.en" (~1.5GB) - High accuracy | |
| ## How It Works | |
| 1. **Always-on pipeline**: Audio capture, VAD, STT, and text-injection buffering run continuously by default. Raw 16 kHz mono audio is recorded to `logs/audio_dumps/` for later review. | |
| 2. **Voice activation (default)**: The Silero VAD segments speech automatically—no hotkey required. | |
| 3. **Push-to-talk (preview inject)**: Hold `Super+Ctrl` to stream buffered text into the preview/injection window when you need manual control. Release to stop feeding new text. | |
| More detail: See [`CLAUDE.md`](CLAUDE.md) for full developer guide. | |
| ### STT Backend Status | |
| - **Moonshine**: Current working STT backend. This is the primary path today, though it remains environment-sensitive because of its Python/PyO3 integration. | |
| - **Parakeet**: Intended successor for a more Windows-native pipeline. It compiles, but runtime validation is still in progress and it should not be treated as the default path yet. | |
| - **Legacy stubs**: `whisper`, `coqui`, `leopard`, and `silero-stt` are not current setup targets and should not be used for new configuration. | |
| - **Configuration**: Select STT plugins through `config/plugins.json`. See [`docs/plans/current-status.md`](docs/plans/current-status.md) for the latest backend status before enabling or testing a backend. | |
| ## How It Works | |
| 1. **Always-on pipeline**: Audio capture, VAD, STT, and text-injection buffering run continuously by default. Raw 16 kHz mono audio is recorded to `logs/audio_dumps/` for later review. | |
| 2. **Voice activation (default)**: The Silero VAD segments speech automatically—no hotkey required. | |
| 3. **Push-to-talk (preview inject)**: Hold `Super+Ctrl` to stream buffered text into the preview/injection window when you need manual control. Release to stop feeding new text. | |
| More detail: | |
| - [`AGENTS.md`](AGENTS.md) for repository-wide agent and workflow guidance | |
| - [`docs/northstar.md`](docs/northstar.md) for product direction | |
| - [`docs/plans/current-status.md`](docs/plans/current-status.md) for current backend/platform reality | |
| - [`docs/architecture.md`](docs/architecture.md) for architecture and experimental direction |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 98b4a47864
| if behavior != HotkeyBehavior::Incremental { | ||
| // Batch mode: lock, buffer, and return. |
Reintroduce incremental audio processing branch
The new control flow in `PluginSttProcessor::handle_audio_frame` gates all processing behind `if behavior != HotkeyBehavior::Incremental`, but `Settings::default()` still uses `Incremental`. As a result, the default runtime path never forwards any audio to the plugin, and sessions finalize with empty input. This effectively breaks transcription in normal hotkey/always-on startup configurations until an explicit non-incremental behavior is added.
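A rough sketch of the shape the fix could take, assuming the handler should buffer in batch mode and otherwise fall through to forwarding. `FrameRouter`, its fields, and the `Batch` variant are hypothetical stand-ins; only `HotkeyBehavior::Incremental` and the `handle_audio_frame` name appear in the PR, and the real method presumably calls the plugin asynchronously rather than pushing into a `Vec`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HotkeyBehavior {
    Incremental,
    /// Hypothetical name for the non-incremental mode.
    Batch,
}

/// Hypothetical stand-in for the processor's frame-handling state.
#[derive(Default)]
pub struct FrameRouter {
    /// Samples held back until the session ends (batch mode).
    pub buffered: Vec<i16>,
    /// Samples already handed to the plugin (stand-in for `process_audio`).
    pub forwarded: Vec<i16>,
}

impl FrameRouter {
    pub fn handle_audio_frame(&mut self, behavior: HotkeyBehavior, frame: &[i16]) {
        if behavior != HotkeyBehavior::Incremental {
            // Batch mode: buffer and return.
            self.buffered.extend_from_slice(frame);
            return;
        }
        // Incremental mode: forward the frame right away. This is the
        // branch the review comment reports as missing.
        self.forwarded.extend_from_slice(frame);
    }
}
```

The point of the sketch is that the early return for batch mode must be followed by an incremental path; if the function ends after the `if`, the default `Incremental` setting silently drops every frame.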
| let mut processor_settings = Settings::default(); | ||
| #[cfg(any(feature = "moonshine", feature = "parakeet", feature = "http-remote"))] | ||
| { | ||
| processor_settings.activation_mode = opts.activation_mode.into(); | ||
| } |
Propagate activation-mode switches into STT processor settings
The processor's activation mode is captured once at startup, but `AppHandle::set_activation_mode` only restarts the trigger source and never updates the running processor settings. When the app starts in the default VAD mode and the UI toggles to `AlwaysOnPushToTranscribe`, the pre-roll path in the processor remains disabled because it still sees the old mode, so runtime mode switching does not deliver the advertised always-on behavior.
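One possible way to propagate the switch, sketched under the assumption that the processor can read the mode from shared state instead of a value copied at startup. `SharedMode`, `Processor`, `pre_roll_enabled`, and the `Vad` variant name are hypothetical; only `AlwaysOnPushToTranscribe` is named in the comment above, and the real code may prefer a channel or a settings lock.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum ActivationMode {
    /// Hypothetical name for the default VAD mode.
    Vad = 0,
    AlwaysOnPushToTranscribe = 1,
}

/// Mode cell shared between the app handle and the running processor.
#[derive(Clone)]
pub struct SharedMode(Arc<AtomicU8>);

impl SharedMode {
    pub fn new(mode: ActivationMode) -> Self {
        Self(Arc::new(AtomicU8::new(mode as u8)))
    }

    /// Called from the handle when the UI toggles the mode.
    pub fn set(&self, mode: ActivationMode) {
        self.0.store(mode as u8, Ordering::Release);
    }

    /// Called from the processor on each frame or session start.
    pub fn get(&self) -> ActivationMode {
        match self.0.load(Ordering::Acquire) {
            1 => ActivationMode::AlwaysOnPushToTranscribe,
            _ => ActivationMode::Vad,
        }
    }
}

/// Hypothetical stand-in for the STT processor.
pub struct Processor {
    mode: SharedMode,
}

impl Processor {
    pub fn new(mode: SharedMode) -> Self {
        Self { mode }
    }

    /// Pre-roll follows the live mode, not a startup snapshot.
    pub fn pre_roll_enabled(&self) -> bool {
        self.mode.get() == ActivationMode::AlwaysOnPushToTranscribe
    }
}
```

Here the handle keeps one clone of `SharedMode` and calls `set` alongside restarting the trigger source, while the processor holds another clone and consults `get` whenever it decides whether to apply pre-roll.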
| let mut manager = SttPluginManager::new().with_metrics_sink(metrics_clone); | ||
| #[cfg(feature = "http-remote")] | ||
| if let Some(remote_config) = opts.stt_remote_config.clone() { | ||
| manager.configure_http_remote_factory(remote_config).await; | ||
| } | ||
| if let Some(config) = opts.stt_selection.clone() { | ||
| if let Err(e) = manager.set_selection_config(config).await { | ||
| error!("Rejected STT plugin selection configuration: {}", e); | ||
| return Err(Box::new(e)); | ||
| } | ||
| manager.set_selection_config(config).await?; | ||
| } |
Reapply HTTP remote config when constructing plugin manager
Startup now initializes `SttPluginManager` with selection config only, and no longer calls `configure_http_remote_factory`; a repo-wide search shows no remaining call site. In `http-remote` deployments, custom `stt.remote` settings (base URL, headers, auth, limits) are therefore ignored and the plugin falls back to canonical defaults, which can silently point traffic at the wrong endpoint.
Lands:
Also closes #384 (contained in #397).