fix(diagnostics): stop CEC probe from dumping cores on Pi 5 #2957

Merged
vpetersson merged 1 commit into master from fix/cec-probe-core-dump-pi5
May 31, 2026

@vpetersson
Contributor

Issues Fixed

Not tied to a tracked issue. On Raspberry Pi 5 (and any board without a usable CEC adapter), the celery container would slowly fill the SD card with core dumps and eventually crash-loop once the disk hit 100%.

Description

The get_display_power celery-beat task (every 5 min) runs the CEC probe as a python -c subprocess (_CEC_QUERY_SCRIPT in src/anthias_server/lib/diagnostics.py). On a Pi 5 there's no usable CEC adapter, so cec.init() succeeds but tv.is_on() raises IOError → the probe reports Unknown. Then, during interpreter teardown, libcec's adapter thread aborts (FATAL: exception not rethrown, SIGABRT), dumping a ~38 MB core every run. The script's try/except can't catch it because it's a C++/pthread teardown abort, not a Python exception.

Fix: once the answer is on stdout, flush() and os._exit(0) to skip the Python/libcec teardown that aborts. Applied to both _CEC_QUERY_SCRIPT and _CEC_SET_SCRIPT. The reported value is unchanged — the probe already has its answer before teardown.

Verified on a live Pi 5: dispatching the task 3× through the worker produces no cores (previously one per run) and still reports Unknown. ruff check/ruff format --check clean; all 36 test_diagnostics.py tests pass.

Checklist

  • I have performed a self-review of my own code.
  • New and existing unit tests pass locally and on CI with my changes.
  • I have done an end-to-end test for Raspberry Pi devices.
  • I have tested my changes for x86 devices.
  • I have added documentation for the changes I have made (when necessary).

🤖 Generated with Claude Code

The get_display_power celery-beat task (every 5 min) runs the CEC
probe as a `python -c` subprocess. On boards without a usable CEC
adapter (e.g. Raspberry Pi 5) libcec's adapter thread aborts during
interpreter teardown ("FATAL: exception not rethrown", SIGABRT),
dumping a ~38 MB core each run that eventually fills the disk and
crash-loops celery. The script's try/except can't catch it — it's a
C++/pthread teardown abort, not a Python exception.

- Write the result, flush stdout, then os._exit(0) to skip the
  Python/libcec teardown that aborts.
- Apply to both _CEC_QUERY_SCRIPT and _CEC_SET_SCRIPT.

Verified on a live Pi 5: 3 task runs produce no cores, value still
reported as "Unknown".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@vpetersson vpetersson requested a review from a team as a code owner May 31, 2026 10:21
@vpetersson vpetersson self-assigned this May 31, 2026
@vpetersson vpetersson merged commit 4cfa8d4 into master May 31, 2026
9 checks passed