fix(audio): quiet TCC-denied retry spam + stop ffmpeg-missing panics #3612

Merged: louis030195 merged 1 commit into main from fix/sentry-tcc-quiet-and-ffmpeg-panic on May 26, 2026

@louis030195
Collaborator

what

Two unrelated audio-engine Sentry fixes, both small.

SCREENPIPE-CLI-S8 — TCC-denied retry-loop spam

device_monitor.rs:1131-1148 runs `start_device` every 2s. The error handler already swallows "already running", "not found", and SCK "callback never fired" transients, but NOT "declined TCCs" — so every tick on a user who hasn't granted mic/screen permission hits `error!` and ships to Sentry.

  • before: 4 users × ~50 events/wk of identical "declined TCCs" noise on current CLI builds
  • after: warn-and-continue, same pattern as the existing "callback never fired" branch — pick up the moment the user grants permission
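
The warn-and-continue branch described above amounts to classifying the error message before choosing a log level. A minimal sketch, assuming substring matching on the message text: `Severity`, `EXPECTED_FAILURES`, and `classify_start_error` are hypothetical names for illustration, not identifiers from device_monitor.rs.

```rust
/// Hypothetical log level chosen for a failed `start_device` attempt.
#[derive(Debug, PartialEq, Eq)]
enum Severity {
    /// Expected, user-recoverable condition: log locally and retry next tick.
    Warn,
    /// Anything else: still reported as an error.
    Error,
}

// Substrings taken from the messages named in this PR description.
// How the real handler matches them is an assumption.
const EXPECTED_FAILURES: [&str; 5] = [
    "already running",
    "not found",
    "callback never fired",
    "declined TCCs",
    "Screen recording permission denied",
];

/// Returns `Warn` when the message contains any known expected failure,
/// otherwise `Error`.
fn classify_start_error(message: &str) -> Severity {
    if EXPECTED_FAILURES.iter().any(|needle| message.contains(needle)) {
        Severity::Warn
    } else {
        Severity::Error
    }
}
```

In the 2s retry loop, a `Warn` result would be logged with `warn!` and the loop would continue, so the device starts on the first tick after the user grants permission.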

SCREENPIPE-CLI-T0 + CLI-T5 — ffmpeg-missing panic (Linux)

ffmpeg.rs::encode_single_audio panicked via `.expect("Failed to spawn FFmpeg process")` when ffmpeg wasn't in PATH (Linux build doesn't bundle it). Same with `find_ffmpeg_path().unwrap()`, `output_path.to_str().unwrap()`, `stdin.take().expect()`, `wait_with_output().unwrap()` — five panic sites in 30 lines.

  • before: missing ffmpeg → worker thread panic → audio recording dies silently
  • after: all five return `Result` with anyhow context that points at the install fix (`apt install ffmpeg` / `brew install ffmpeg`)
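
The conversion from panics to `Result` can be sketched as follows. This is an illustration under stated assumptions, not the code in ffmpeg.rs: `encode_with` is a hypothetical function, the ffmpeg arguments are placeholders, and plain `String` errors stand in for the anyhow context used in the PR so that the sketch needs only the standard library.

```rust
use std::io::Write;
use std::path::Path;
use std::process::{Command, Stdio};

/// Hypothetical stand-in for the encode step: pipe `pcm` into ffmpeg and
/// return an `Err` at each of the five former panic sites.
fn encode_with(ffmpeg: Option<&Path>, pcm: &[u8], output_path: &Path) -> Result<(), String> {
    const HINT: &str = "install it with `apt install ffmpeg` or `brew install ffmpeg`";

    // Was: find_ffmpeg_path().unwrap()
    let ffmpeg = ffmpeg.ok_or_else(|| format!("ffmpeg not found in PATH; {}", HINT))?;

    // Was: output_path.to_str().unwrap()
    let out = output_path
        .to_str()
        .ok_or_else(|| format!("output path {:?} is not valid UTF-8", output_path))?;

    // Was: .expect("Failed to spawn FFmpeg process")
    // The argument list is illustrative only.
    let mut child = Command::new(ffmpeg)
        .args(["-y", "-f", "f32le", "-ar", "16000", "-ac", "1", "-i", "pipe:0", out])
        .stdin(Stdio::piped())
        .stdout(Stdio::null())
        .stderr(Stdio::piped())
        .spawn()
        .map_err(|e| format!("failed to spawn {:?}: {}; {}", ffmpeg, e, HINT))?;

    // Was: stdin.take().expect(...)
    let mut stdin = child
        .stdin
        .take()
        .ok_or_else(|| "ffmpeg stdin was not captured".to_string())?;
    stdin
        .write_all(pcm)
        .map_err(|e| format!("failed to write audio to ffmpeg: {}", e))?;
    drop(stdin); // close the pipe so ffmpeg sees end of input

    // Was: wait_with_output().unwrap()
    let output = child
        .wait_with_output()
        .map_err(|e| format!("failed to wait for ffmpeg: {}", e))?;
    if !output.status.success() {
        return Err(format!(
            "ffmpeg exited with {}: {}",
            output.status,
            String::from_utf8_lossy(&output.stderr)
        ));
    }
    Ok(())
}
```

With this shape, a missing binary surfaces as an error value the audio worker can log and recover from, and the message itself tells the user how to install ffmpeg.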

test plan

  • `cargo check -p screenpipe-audio` clean
  • manual: deny mic permission on macOS, watch logs — should see one warn per tick, no Sentry events
  • manual: `PATH= screenpipe record` on Linux — should error gracefully, not panic

🤖 Generated with Claude Code

Two unrelated Sentry-noise fixes in the audio engine, both small.

SCREENPIPE-CLI-S8 (4 users / 198 events on current builds):
device_monitor.rs runs `start_device` every 2s. The error handler
already swallows "already running", "not found", and SCK "callback
never fired" transients, but NOT "declined TCCs" — so every tick on a
user who hasn't granted mic/screen permission hits `error!` and ships
to Sentry. Add the same warn-and-continue branch for TCC denials and
the older "Screen recording permission denied" wording.

SCREENPIPE-CLI-T0 + CLI-T5 (2 users, Linux):
ffmpeg.rs::encode_single_audio panicked via
`.expect("Failed to spawn FFmpeg process")` when ffmpeg wasn't in PATH
(Linux build doesn't bundle it). Same with find_ffmpeg_path().unwrap(),
output_path.to_str().unwrap(), stdin.take().expect(), and
wait_with_output().unwrap() — none should crash the audio worker.
Convert all five to `?` with anyhow context that points at the install
fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
louis030195 merged commit 1711ae1 into main on May 26, 2026
16 of 23 checks passed
@github-actions
Contributor

Diarization eval results

Source: crates/screenpipe-audio-eval/evals/ · VoxConverse dev (CC-BY-4.0) + composed workday templates + screenpipe-shaped LibriSpeech fixtures

| fixture | DER | VAD FA | VAD FN | boundary err (s) | continuity | predicted / true spk |
| --- | --- | --- | --- | --- | --- | --- |
| interrupted_meeting | 0.186 | 0.01 | 0.063 | 20.286 | 0.833 | 9 / 5 |
| long_silence_day | 0.437 | 0.011 | 0.145 | 11.46 | 0.7 | 14 / 10 |
| screenpipe_meeting_rapid_handoffs | 0.241 | 0.196 | 0.099 | 2.305 | 1 | 5 / 3 |
| screenpipe_background_24_7_day | 0.315 | 0.025 | 0.159 | 2.203 | 1 | 4 / 3 |
| screenpipe_short_backchannels | 0.561 | 0.915 | 0.064 | 0.488 | n/a | 3 / 3 |
| screenpipe_mic_system_echo_leakage | 0.275 | 0.198 | 0.084 | 3.045 | 0.667 | 5 / 3 |
| screenpipe_overlap_crosstalk | 0.254 | 0.84 | 0.042 | 0.667 | n/a | 3 / 3 |
| abjxc | 0.016 | 0.098 | 0.002 | 1.151 | n/a | 2 / 1 |
| bxpwa | 0.111 | 0.453 | 0.029 | 20.793 | 0.714 | 8 / 5 |
| dhorc | 0.143 | 0.461 | 0.034 | 3.681 | 1 | 5 / 4 |

DER, VAD FA, VAD FN, boundary err: lower is better. Continuity: higher is better, 1.0 = same hyp cluster across all silence gaps. Composed workday rows and screenpipe_* rows exercise screenpipe-shaped usage: meetings, background gaps, backchannels, echo leakage, and crosstalk. Raw VoxConverse rows score broadcast-quality stems for comparison. See crates/screenpipe-audio-eval/evals/README.md for methodology.

Pipeline replay matrix

Source: generated screenpipe_* fixtures materialized into temp screenpipe SQLite DBs, then read back through search_audio. This catches storage/search regressions that pure DER scoring misses.

| scenarios | passed | failed | skipped | avg background DER | avg background speaker err | Deepgram |
| --- | --- | --- | --- | --- | --- | --- |
| 41 | 40 | 0 | 1 | 0.329 | 0.183 | skip |

The no-secret CI matrix runs local diarization under Parakeet/Whisper engine labels across live/background and mic/system device profiles. Real Deepgram/screenpipe-cloud smoke can be run locally with --deepgram required when credentials are present.

Transcription quality

Source: LibriSpeech test-clean (CC-BY-4.0) · per-model utterance cap · normalized lowercased word-level Levenshtein

| model | utterances | WER | CER | throughput (samples/s) |
| --- | --- | --- | --- | --- |
| tiny | 50 | 0.085 | 0.033 | 55160 |
| whisper-large-v3-turbo-quantized | 20 | 0.042 | 0.009 | 1831 |
| parakeet | 50 | 0.04 | 0.026 | 237679 |

WER + CER on read-aloud speech. Per-model utterance caps keep wall time bounded — tiny/parakeet at 50, the heavier large-v3-turbo-quantized at 20. See README for normalization rules.
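
For readers unfamiliar with the metric, word-level Levenshtein WER can be sketched as below. This is not the eval crate's implementation: `word_error_rate` is a hypothetical name, and the normalization here (lowercasing plus whitespace splitting only) is an assumption, since the actual rules live in the README.

```rust
/// Word error rate: word-level Levenshtein distance divided by the number
/// of reference words. Normalization is an assumption (lowercase and split
/// on whitespace); the eval README defines the real rules.
fn word_error_rate(reference: &str, hypothesis: &str) -> f64 {
    let r: Vec<String> = reference.split_whitespace().map(|w| w.to_lowercase()).collect();
    let h: Vec<String> = hypothesis.split_whitespace().map(|w| w.to_lowercase()).collect();

    // Convention chosen for this sketch: an empty reference scores 0.0
    // against an empty hypothesis and 1.0 otherwise.
    if r.is_empty() {
        return if h.is_empty() { 0.0 } else { 1.0 };
    }

    // Rolling-row dynamic program: prev[j] is the edit distance between the
    // first i-1 reference words and the first j hypothesis words.
    let mut prev: Vec<usize> = (0..=h.len()).collect();
    for i in 1..=r.len() {
        // curr[0] = i: deleting i reference words yields an empty hypothesis.
        let mut curr = vec![i; h.len() + 1];
        for j in 1..=h.len() {
            let substitution = prev[j - 1] + usize::from(r[i - 1] != h[j - 1]);
            let deletion = prev[j] + 1;
            let insertion = curr[j - 1] + 1;
            curr[j] = substitution.min(deletion).min(insertion);
        }
        prev = curr;
    }
    prev[h.len()] as f64 / r.len() as f64
}
```

For example, scoring the hypothesis "the cat sit on mat" against the reference "the cat sat on the mat" counts one substitution and one deletion over six reference words, giving 2/6. CER follows the same recurrence over characters instead of words.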
