fix: macOS audio device recovery: sleep/wake detection, health monitoring, periodic restart #2226

Closed
Limitless2023 wants to merge 1 commit into screenpipe:main from Limitless2023:fix/macos-audio-device-recovery

Conversation

@Limitless2023

Problem

Fixes #1626 — Mac audio devices (display audio / microphone) randomly stop recording after extended periods (typically overnight sleep or every ~48h). This is macOS-only; Windows runs for weeks without issue.

Root Cause Analysis

After deep analysis of the audio pipeline (screenpipe-audio), I identified four contributing factors:

  1. device_manager.stop_device() early-return bug: When macOS CoreAudio fires an error callback with "device is no longer valid", it sets is_running to false via a weak reference. Later, when the device monitor calls cleanup_stale_device → stop_device, the !self.is_running() check returns early without removing the stale stream from the streams DashMap. This leaves zombie state that can interfere with subsequent restart attempts.

  2. No proactive health monitoring: The device monitor only checked if recording tasks had finished (is_finished()). A stream could be technically "alive" (task running, cpal stream object exists) but producing zero audio data, a common failure mode when macOS ScreenCaptureKit sessions silently invalidate after sleep/wake cycles.

  3. No macOS sleep/wake detection: After the Mac sleeps and wakes, CoreAudio and ScreenCaptureKit sessions can become invalid. The existing 30-second timeout in run_record_and_transcribe would eventually detect this, but recovery was unreliable due to bug #1 above, and there was no proactive restart.

  4. No periodic restart safety net: For edge cases where streams silently degrade over very long periods (days), there was no mechanism to periodically refresh them.

Solution

1. Fix stop_device() cleanup logic (device_manager.rs)

Changed the early-return condition from checking only is_running to checking both is_running and stream existence. Now stop_device() always cleans up streams even when the error callback has already set is_running = false.
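
To illustrate the cleanup rule described above, here is a minimal sketch, not the actual device_manager.rs code. It assumes a simplified manager that keeps streams in a Mutex<HashMap> (the real code uses a DashMap); the Stream struct, mark_errored, and stream_count are hypothetical names introduced only for this example.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Hypothetical stand-in for a cpal / ScreenCaptureKit stream handle.
pub struct Stream {
    pub is_running: AtomicBool,
}

/// Simplified device manager (assumption: the real one stores streams in a DashMap).
pub struct DeviceManager {
    streams: Mutex<HashMap<String, Stream>>,
}

impl DeviceManager {
    pub fn new() -> Self {
        Self { streams: Mutex::new(HashMap::new()) }
    }

    pub fn start_device(&self, name: &str) {
        self.streams.lock().unwrap().insert(
            name.to_string(),
            Stream { is_running: AtomicBool::new(true) },
        );
    }

    /// Simulates the error callback: flips the flag but leaves the map entry behind.
    pub fn mark_errored(&self, name: &str) {
        if let Some(stream) = self.streams.lock().unwrap().get(name) {
            stream.is_running.store(false, Ordering::SeqCst);
        }
    }

    /// Removes the stream entry even if is_running is already false.
    /// Returns true if an entry was removed, false if there was nothing to clean up.
    pub fn stop_device(&self, name: &str) -> bool {
        match self.streams.lock().unwrap().remove(name) {
            Some(stream) => {
                stream.is_running.store(false, Ordering::SeqCst);
                true
            }
            None => false,
        }
    }

    pub fn stream_count(&self) -> usize {
        self.streams.lock().unwrap().len()
    }
}
```

In this sketch the only early exit is "no stream entry exists", so a stream whose flag was cleared by the error callback is still removed from the map.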

2. Health monitoring via capture timestamps (device_monitor.rs)

Leverages the existing DEVICE_AUDIO_CAPTURES per-device timestamps (already updated in run_record_and_transcribe). The device monitor now checks these timestamps every 2 seconds:

  • Input devices (microphone): force-restart if no data for 60s (should always produce frames)
  • Output devices (display audio/SCK): force-restart if no data for 120s (longer threshold since silence is normal when nothing is playing)
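
The staleness check above can be sketched as follows. This is an illustration under assumptions: it models the last-capture time as a std Instant, whereas the actual representation inside DEVICE_AUDIO_CAPTURES is not shown in this PR description; DeviceKind, stale_threshold, and needs_restart are hypothetical names.

```rust
use std::time::{Duration, Instant};

/// Hypothetical device classification for this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DeviceKind {
    Input,  // microphone
    Output, // display audio / SCK
}

/// Thresholds taken from the PR description: 60s for input, 120s for output.
pub fn stale_threshold(kind: DeviceKind) -> Duration {
    match kind {
        DeviceKind::Input => Duration::from_secs(60),
        DeviceKind::Output => Duration::from_secs(120),
    }
}

/// True when the device has produced no data for longer than its threshold.
pub fn needs_restart(kind: DeviceKind, last_capture: Instant, now: Instant) -> bool {
    // saturating_duration_since yields zero if `now` is earlier than `last_capture`.
    now.saturating_duration_since(last_capture) > stale_threshold(kind)
}
```

With a 90-second gap, for example, an input device would be flagged for restart while an output device would not.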

3. macOS sleep/wake detection (macos_sleep.rs)

New module using a lightweight time-gap heuristic: a background thread sleeps for 2s intervals; if the wall-clock gap between iterations exceeds 10s, a sleep/wake cycle is inferred. Zero new dependencies — no IOKit FFI needed.
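
A rough sketch of the time-gap heuristic, assuming wall-clock time is read through std::time::SystemTime and wake events are delivered over a std mpsc channel. The function names and the channel-based notification are assumptions for illustration, not necessarily how macos_sleep.rs is structured.

```rust
use std::sync::mpsc::Sender;
use std::thread::{self, JoinHandle};
use std::time::{Duration, SystemTime};

pub const POLL_INTERVAL: Duration = Duration::from_secs(2);
pub const WAKE_GAP: Duration = Duration::from_secs(10);

/// True when the wall-clock gap between two polls is large enough to infer a sleep/wake cycle.
pub fn gap_indicates_wake(prev: SystemTime, now: SystemTime) -> bool {
    match now.duration_since(prev) {
        Ok(gap) => gap > WAKE_GAP,
        // The wall clock moved backwards (e.g. a clock adjustment): not treated as a wake.
        Err(_) => false,
    }
}

/// Spawns a background thread that sends `()` each time a wake is inferred.
/// The thread exits once the receiving side of the channel is dropped.
pub fn spawn_wake_detector(tx: Sender<()>) -> JoinHandle<()> {
    thread::spawn(move || {
        let mut prev = SystemTime::now();
        loop {
            thread::sleep(POLL_INTERVAL);
            let now = SystemTime::now();
            if gap_indicates_wake(prev, now) && tx.send(()).is_err() {
                break;
            }
            prev = now;
        }
    })
}
```

One trade-off of a heuristic like this is that a large manual clock change would also look like a wake; in this design that costs one extra stream restart.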

On wake detection, the device monitor:

  1. Waits 3 seconds for macOS audio subsystem to re-initialize
  2. Stops and restarts all enabled audio devices
  3. Failed devices are added to the retry queue
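
The three wake-recovery steps above could look roughly like this. The AudioControl trait and restart_after_wake function are hypothetical and exist only to make the sketch self-contained; the real device monitor is async and talks to the device manager directly.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical abstraction over "stop a device" / "start a device".
pub trait AudioControl {
    fn stop(&mut self, device: &str);
    fn start(&mut self, device: &str) -> Result<(), String>;
}

/// Waits for the audio subsystem to settle, restarts every enabled device,
/// and returns the devices that failed to start so they can be retried later.
pub fn restart_after_wake<C: AudioControl>(
    ctl: &mut C,
    enabled: &[String],
    settle: Duration,
) -> Vec<String> {
    // Step 1: give macOS time to re-initialize audio (3s in the PR description).
    thread::sleep(settle);

    let mut retry_queue = Vec::new();
    for device in enabled {
        // Step 2: stop, then start, each enabled device.
        ctl.stop(device);
        if ctl.start(device).is_err() {
            // Step 3: remember failures for the retry queue.
            retry_queue.push(device.clone());
        }
    }
    retry_queue
}
```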

4. Periodic forced restart (macOS only)

Every 4 hours, all audio streams are stopped and restarted as a safety net. This interval is:

  • Short enough to catch silent degradation before users notice gaps
  • Long enough to avoid unnecessary churn (stream restart takes <1s)
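
A small sketch of how the 4-hour timer might be tracked inside the monitor loop. PeriodicRestart and its methods are hypothetical names; only the 4-hour interval comes from this PR.

```rust
use std::time::{Duration, Instant};

/// 4 hours, as described above.
pub const RESTART_INTERVAL: Duration = Duration::from_secs(4 * 60 * 60);

/// Tracks when the last forced restart happened.
pub struct PeriodicRestart {
    last: Instant,
    interval: Duration,
}

impl PeriodicRestart {
    pub fn new(now: Instant, interval: Duration) -> Self {
        Self { last: now, interval }
    }

    /// Returns true once the interval has elapsed, and resets the timer when it does.
    pub fn due(&mut self, now: Instant) -> bool {
        if now.saturating_duration_since(self.last) >= self.interval {
            self.last = now;
            true
        } else {
            false
        }
    }
}
```

The monitor loop would call due() on each tick and stop and restart all streams whenever it returns true.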

Key Design Decisions

  • All macOS-specific code gated behind #[cfg(target_os = "macos")] — zero impact on Windows/Linux
  • No new dependencies — sleep/wake detection uses pure std, health monitoring uses existing infrastructure
  • Leverages existing per-device capture tracking (DEVICE_AUDIO_CAPTURES) — no additional overhead
  • Conservative thresholds — input 60s, output 120s, periodic 4h — tuned to minimize false positives
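
As a brief illustration of the gating pattern mentioned in the first bullet, a function can be compiled in two variants so that non-macOS builds never contain the macOS-only behavior. The function name periodic_restart_enabled is hypothetical.

```rust
/// macOS builds: the periodic restart safety net is active.
#[cfg(target_os = "macos")]
pub fn periodic_restart_enabled() -> bool {
    true
}

/// All other targets: the feature is compiled out and reports as disabled.
#[cfg(not(target_os = "macos"))]
pub fn periodic_restart_enabled() -> bool {
    false
}
```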

Files Changed

| File | Change |
| --- | --- |
| crates/screenpipe-audio/src/device/device_manager.rs | Fix stop_device() to always clean up streams |
| crates/screenpipe-audio/src/audio_manager/device_monitor.rs | Health monitoring, sleep/wake restart, periodic restart |
| crates/screenpipe-audio/src/core/macos_sleep.rs | New: sleep/wake detection via time-gap heuristic |
| crates/screenpipe-audio/src/core/mod.rs | Register macos_sleep module |

Testing

  • Manually traced the recovery logic through all code paths
  • The stop_device fix is testable via existing core_tests.rs infrastructure
  • Sleep/wake detection can be verified by closing MacBook lid and reopening
  • Health monitoring triggers visible warn! logs when forcing restart

/claim #1626

…oring, periodic restart

Addresses screenpipe#1626 - Mac audio devices randomly stopping after extended periods.

Root causes identified and fixed:
1. device_manager.stop_device() early return bug when is_running was already
   false (set by error callback), causing stale streams to persist
2. No proactive health monitoring for streams alive but not producing data
3. No macOS sleep/wake detection to restart invalidated CoreAudio/SCK sessions
4. No periodic restart safety net for long-running sessions

Changes:
- Fix stop_device() to always clean up streams regardless of is_running state
- Add per-device health monitoring using existing capture timestamps
- Add macOS sleep/wake detection via time-gap heuristic (no new dependencies)
- Add periodic forced restart every 4 hours on macOS as safety net
- All macOS-specific code gated behind #[cfg(target_os = "macos")]
@louis030195 added the bug and high priority labels on Feb 15, 2026
@louis030195
Collaborator

I need clear proof that this solves a problem, e.g. a screen recording of before (reproducing the issue) and after (issue solved).

@Limitless2023
Author

Thanks for the review! Unfortunately, this bug requires specific hardware conditions (macOS sleep/wake cycle over extended periods) to reliably reproduce, making it difficult to provide the screen recording proof requested. Closing this PR for now — if anyone can reproduce this consistently and wants to verify the fix, feel free to reopen or build on this work.

@louis030195 added the medium priority and question labels on Feb 20, 2026

Labels

bug, high priority, medium priority, question

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix audio device randomly stopping sometimes

2 participants