fix: macOS audio device recovery - sleep/wake detection, health monitoring, periodic restart #2226
Closed
Limitless2023 wants to merge 1 commit into screenpipe:main
…oring, periodic restart

Addresses screenpipe#1626 - Mac audio devices randomly stopping after extended periods.

Root causes identified and fixed:
1. device_manager.stop_device() early return bug when is_running was already false (set by error callback), causing stale streams to persist
2. No proactive health monitoring for streams alive but not producing data
3. No macOS sleep/wake detection to restart invalidated CoreAudio/SCK sessions
4. No periodic restart safety net for long-running sessions

Changes:
- Fix stop_device() to always clean up streams regardless of is_running state
- Add per-device health monitoring using existing capture timestamps
- Add macOS sleep/wake detection via time-gap heuristic (no new dependencies)
- Add periodic forced restart every 4 hours on macOS as safety net
- All macOS-specific code gated behind #[cfg(target_os = "macos")]
Collaborator
I need clear proof that this solves a problem, e.g. a screen recording of before (reproducing the issue) and after (issue solved).
Author
Thanks for the review! Unfortunately, this bug requires specific hardware conditions (macOS sleep/wake cycle over extended periods) to reliably reproduce, making it difficult to provide the screen recording proof requested. Closing this PR for now. If anyone can reproduce this consistently and wants to verify the fix, feel free to reopen or build on this work.
Problem
Fixes #1626 — Mac audio devices (display audio / microphone) randomly stop recording after extended periods (typically overnight sleep or every ~48h). This is macOS-only; Windows runs for weeks without issue.
Root Cause Analysis
After deep analysis of the audio pipeline (screenpipe-audio), I identified four contributing factors:
1. device_manager.stop_device() early-return bug: When macOS CoreAudio fires an error callback with "device is no longer valid", it sets is_running to false via a weak reference. Later, when the device monitor calls cleanup_stale_device → stop_device, the !self.is_running() check returns early without removing the stale stream from the streams DashMap. This leaves zombie state that can interfere with subsequent restart attempts.
2. No proactive health monitoring: The device monitor only checked if recording tasks had finished (is_finished()). A stream could be technically "alive" (task running, cpal stream object exists) but producing zero audio data, a common failure mode when macOS ScreenCaptureKit sessions silently invalidate after sleep/wake cycles.
3. No macOS sleep/wake detection: After the Mac sleeps and wakes, CoreAudio and ScreenCaptureKit sessions can become invalid. The existing 30-second timeout in run_record_and_transcribe would eventually detect this, but recovery was unreliable due to bug 1 above, and there was no proactive restart.
4. No periodic restart safety net: For edge cases where streams silently degrade over very long periods (days), there was no mechanism to periodically refresh them.
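The early-return bug in the first factor can be modeled in a few lines. This is a simplified sketch, not the code from device_manager.rs: the DeviceManager and Stream types, the method names, and the single shared is_running flag are all illustrative assumptions, and a plain HashMap stands in for the DashMap so the example needs no external crates.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical stand-in for a capture stream handle.
pub struct Stream;

/// Simplified model of the device manager (assumption: one flag, one map).
pub struct DeviceManager {
    is_running: AtomicBool,
    streams: HashMap<String, Stream>,
}

impl DeviceManager {
    pub fn new() -> Self {
        Self { is_running: AtomicBool::new(false), streams: HashMap::new() }
    }

    pub fn start_device(&mut self, name: &str) {
        self.streams.insert(name.to_string(), Stream);
        self.is_running.store(true, Ordering::SeqCst);
    }

    /// Models the error callback: it flips the flag but leaves the map alone.
    pub fn on_stream_error(&self) {
        self.is_running.store(false, Ordering::SeqCst);
    }

    pub fn has_stream(&self, name: &str) -> bool {
        self.streams.contains_key(name)
    }

    /// Buggy shape: bails out as soon as the flag is false,
    /// so a stream left behind by the error callback is never removed.
    pub fn stop_device_buggy(&mut self, name: &str) -> bool {
        if !self.is_running.load(Ordering::SeqCst) {
            return false;
        }
        self.is_running.store(false, Ordering::SeqCst);
        self.streams.remove(name);
        true
    }

    /// Fixed shape: returns early only when the flag is false AND
    /// there is no stream entry left to clean up.
    pub fn stop_device_fixed(&mut self, name: &str) -> bool {
        let running = self.is_running.load(Ordering::SeqCst);
        if !running && !self.streams.contains_key(name) {
            return false;
        }
        self.is_running.store(false, Ordering::SeqCst);
        self.streams.remove(name);
        true
    }
}
```

In this model, calling on_stream_error() and then stop_device_buggy() leaves the entry in the map, while stop_device_fixed() removes it.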
Solution
1. Fix stop_device() cleanup logic (device_manager.rs)
Changed the early-return condition from checking only is_running to checking both is_running and stream existence. Now stop_device() always cleans up streams even when the error callback has already set is_running = false.
2. Health monitoring via capture timestamps (device_monitor.rs)
Leverages the existing DEVICE_AUDIO_CAPTURES per-device timestamps (already updated in run_record_and_transcribe). The device monitor now checks these timestamps every 2 seconds.
3. macOS sleep/wake detection (macos_sleep.rs)
New module using a lightweight time-gap heuristic: a background thread sleeps for 2s intervals; if the wall-clock gap between iterations exceeds 10s, a sleep/wake cycle is inferred. Zero new dependencies, no IOKit FFI needed.
On wake detection, the device monitor:
4. Periodic forced restart (macOS only)
Every 4 hours, all audio streams are stopped and restarted as a safety net.
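The periodic restart and the timestamp-based health check from item 2 can be sketched as one decision function that the monitor calls on every tick. Everything here is illustrative: MonitorState, RestartReason, and the stale_after threshold are hypothetical names, the HashMap stands in for the DEVICE_AUDIO_CAPTURES timestamps, and the staleness threshold is left as a parameter because the PR text does not state one.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// The 4-hour safety-net interval from the description.
pub const PERIODIC_RESTART_INTERVAL: Duration = Duration::from_secs(4 * 60 * 60);

#[derive(Debug, PartialEq, Eq)]
pub enum RestartReason {
    StaleCapture,
    PeriodicInterval,
}

/// Hypothetical monitor state; `last_capture` models per-device capture timestamps.
pub struct MonitorState {
    last_capture: HashMap<String, Instant>,
    last_full_restart: Instant,
    stale_after: Duration, // assumed threshold, not taken from the PR
}

impl MonitorState {
    pub fn new(now: Instant, stale_after: Duration) -> Self {
        Self { last_capture: HashMap::new(), last_full_restart: now, stale_after }
    }

    /// Called whenever a device delivers audio.
    pub fn record_capture(&mut self, device: &str, now: Instant) {
        self.last_capture.insert(device.to_string(), now);
    }

    /// Returns the devices that should be restarted at `now`, sorted by name.
    /// Restarted devices get their timestamp reset so they have a grace period.
    pub fn check(&mut self, now: Instant) -> Vec<(String, RestartReason)> {
        let periodic =
            now.saturating_duration_since(self.last_full_restart) >= PERIODIC_RESTART_INTERVAL;
        let mut due = Vec::new();
        for (device, last) in self.last_capture.iter_mut() {
            let reason = if periodic {
                Some(RestartReason::PeriodicInterval)
            } else if now.saturating_duration_since(*last) > self.stale_after {
                Some(RestartReason::StaleCapture)
            } else {
                None
            };
            if let Some(reason) = reason {
                *last = now;
                due.push((device.clone(), reason));
            }
        }
        if periodic {
            self.last_full_restart = now;
        }
        due.sort_by(|a, b| a.0.cmp(&b.0));
        due
    }
}
```

Passing now in as an argument keeps the decision logic testable without waiting for real time to pass; the caller would invoke check(Instant::now()) from its 2-second loop and restart whatever it returns.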
Key Design Decisions
- macOS-specific code is gated behind #[cfg(target_os = "macos")]: zero impact on Windows/Linux
- Health monitoring reuses the existing per-device timestamps (DEVICE_AUDIO_CAPTURES): no additional overhead
Files Changed
- crates/screenpipe-audio/src/device/device_manager.rs: fix stop_device() to always clean up streams
- crates/screenpipe-audio/src/audio_manager/device_monitor.rs
- crates/screenpipe-audio/src/core/macos_sleep.rs
- crates/screenpipe-audio/src/core/mod.rs: macos_sleep module
Testing
- stop_device fix is testable via existing core_tests.rs infrastructure
- warn! logs when forcing restart
/claim #1626