Fix biometrics: cbor2 C extension silently skips records#135
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughAdded a byte-accurate CBOR outer-record reader and integrated it into RAW file followers to read raw data bytes, track per-file read offset (_last_pos), count consecutive read failures, and implement skip/recovery logic plus proper file-handle cleanup on shutdown. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
modules/piezo-processor/main.py (1)
269-299:⚠️ Potential issue | 🟠 MajorMissing file handle cleanup on shutdown.
When
_shutdown.is_set()becomes true, the while loop exits at line 271 butself._fileis never closed. The sleep-detector module has this cleanup (lines 377-380) but it's missing here, causing a resource leak.🐛 Add file cleanup after the while loop
except Exception as e: log.warning("Error reading RAW record: %s", e) self._file.seek(self._last_pos) time.sleep(1) + + # Clean up file handle on shutdown + if self._file: + self._file.close() + self._file = None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/piezo-processor/main.py` around lines 269 - 299, The read_records method exits its while loop when _shutdown.is_set() becomes true but never closes self._file; add cleanup after the loop in read_records to check if self._file is truthy and open, then call self._file.close() and set self._file = None (mirroring the cleanup in the sleep-detector logic) to avoid leaking the file handle; ensure this runs regardless of how the loop exits (normal shutdown or exception).
🧹 Nitpick comments (1)
modules/piezo-processor/main.py (1)
177-247: Consider extracting shared CBOR parser to common module.This
_read_raw_recordfunction is identical to the one inmodules/sleep-detector/main.py. While module isolation can justify some duplication, having two copies of this low-level parser means:
- Bug fixes need to be applied twice
- Implementations could drift over time
Consider extracting to a shared utility module (e.g.,
modules/common/cbor_utils.py) if modules share dependencies. Otherwise, this is acceptable for now.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/piezo-processor/main.py` around lines 177 - 247, Duplicate low-level CBOR parser _read_raw_record exists in multiple modules; extract it into a single shared utility (e.g., a module named cbor_utils exposing read_raw_record or _read_raw_record) and update both callers to import and use that shared function; ensure the extracted function preserves current behavior and exceptions, add a small unit test or docstring, and remove the duplicate implementation from modules/sleep-detector/main.py and modules/piezo-processor/main.py so future fixes are applied in one place.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modules/piezo-processor/main.py`:
- Around line 285-299: The loop reading RAW records can get stuck retrying
forever if corrupt data repeatedly keeps file position at self._last_pos; add a
small consecutive-failure counter (e.g., self._consecutive_failures) that
increments when an exception occurs and file.tell() == self._last_pos and resets
to 0 when progress is made (data_bytes not None or file position advances); if
the counter exceeds a small threshold (e.g., 5), log a clear error via
log.warning, advance the file past the stuck byte(s) (e.g., seek(self._last_pos
+ 1) or seek to a safe offset), reset the counter, and continue so the generator
in this reader (the block that calls _read_raw_record, updates self._last_pos,
and yields inner) won't enter an infinite retry loop.
In `@modules/sleep-detector/main.py`:
- Around line 362-375: The read loop in the generator that uses
_read_raw_record/self._file/self._last_pos can spin forever on persistent
corrupt data; add a retry counter (e.g., corrupt_retry and a MAX_CORRUPT_RETRIES
constant) inside the try/except around cbor2.loads and _read_raw_record,
increment on each non-EOF failure, and after N failures advance the file
position past the bad record (or skip one byte/record) and reset the counter
while logging a warning; also replace the broad except Exception with an
explicit tuple of exceptions like (ValueError, cbor2.CBORDecodeError, OSError)
so only parsing/IO errors are retried and other errors propagate.
---
Outside diff comments:
In `@modules/piezo-processor/main.py`:
- Around line 269-299: The read_records method exits its while loop when
_shutdown.is_set() becomes true but never closes self._file; add cleanup after
the loop in read_records to check if self._file is truthy and open, then call
self._file.close() and set self._file = None (mirroring the cleanup in the
sleep-detector logic) to avoid leaking the file handle; ensure this runs
regardless of how the loop exits (normal shutdown or exception).
---
Nitpick comments:
In `@modules/piezo-processor/main.py`:
- Around line 177-247: Duplicate low-level CBOR parser _read_raw_record exists
in multiple modules; extract it into a single shared utility (e.g., a module
named cbor_utils exposing read_raw_record or _read_raw_record) and update both
callers to import and use that shared function; ensure the extracted function
preserves current behavior and exceptions, add a small unit test or docstring,
and remove the duplicate implementation from modules/sleep-detector/main.py and
modules/piezo-processor/main.py so future fixes are applied in one place.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2cc6b07d-ab0f-4746-9989-2d85038fc8ab
📒 Files selected for processing (2)
modules/piezo-processor/main.pymodules/sleep-detector/main.py
The cbor2 C extension (_cbor2) reads files in internal 4096-byte chunks,
advancing f.tell() by 4096 regardless of actual record size. Since RAW
records are 17-5000 bytes, cbor2.load(f) silently skips most records.
On Pod 5, piezo-dual records are ~2700 bytes so nearly every other record
was lost, severely degrading vitals accuracy.
Additionally, Pod 5 firmware writes empty placeholder records (data=b'')
as sequence markers. cbor2.loads(b'') raises CBORDecodeEOF (a subclass of
EOFError), which the read loop caught as end-of-file, terminating reads
mid-file with valid data remaining.
Fix: Replace cbor2.load(f) with _read_raw_record(f) that manually parses
the outer {seq, data} CBOR wrapper byte-by-byte, keeping f.tell() accurate.
Empty placeholders return None and are skipped. On read errors, seek back
to last known good position instead of breaking the loop.
Applied to both piezo-processor and sleep-detector modules.
Ref: throwaway31265/free-sleep#46
…covery - Extract _read_raw_record into modules/common/cbor_raw.py to eliminate duplication between piezo-processor and sleep-detector - Add file handle cleanup on shutdown to piezo-processor (was already present in sleep-detector) - Add consecutive-failure counter to both modules: after 5 failures at the same file position, skip forward 1 byte to resync past corrupt data instead of retrying forever - Narrow exception handler from bare `except Exception` to `except (ValueError, cbor2.CBORDecodeError, OSError)` so only parsing/IO errors are retried Original CBOR fix ported from throwaway31265/free-sleep#46 by @seanpasino — thank you! Co-Authored-By: seanpasino <seanpasino@users.noreply.github.com>
fadc715 to
1ee7edb
Compare
There was a problem hiding this comment.
Pull request overview
Fixes RAW CBOR ingestion reliability in the biometrics pipeline by replacing cbor2.load(f) with a shared, position-accurate outer-record reader and improving resilience to empty placeholder records and decode failures in both piezo-processor and sleep-detector.
Changes:
- Add
modules/common/cbor_raw.pywithread_raw_record()to parse the outer{seq, data}wrapper without the_cbor2file-chunking behavior. - Update
piezo-processorandsleep-detectorRAW followers to useread_raw_record(), skip placeholder records, and recover from repeated decode failures by advancing offset. - Add per-follower state for
_last_posand consecutive-failure tracking.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| modules/sleep-detector/main.py | Switch RAW parsing to read_raw_record() and add failure recovery/offset skipping. |
| modules/piezo-processor/main.py | Same RAW parsing change + adds cleanup and failure recovery/offset skipping. |
| modules/common/cbor_raw.py | New shared outer-record CBOR parser to keep f.tell() accurate and handle placeholders. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| # "seq" key — text(3) "seq" | ||
| if f.read(4) != b'\x63\x73\x65\x71': | ||
| raise ValueError('Expected seq key') | ||
|
|
||
| # seq value — uint32 (0x1a) or uint64 (0x1b) | ||
| hdr = f.read(1) | ||
| if not hdr: | ||
| raise EOFError | ||
| if hdr[0] == 0x1a: | ||
| seq_bytes = f.read(4) | ||
| if len(seq_bytes) < 4: | ||
| raise EOFError | ||
| elif hdr[0] == 0x1b: | ||
| seq_bytes = f.read(8) | ||
| if len(seq_bytes) < 8: | ||
| raise EOFError | ||
| else: | ||
| raise ValueError('Unexpected seq encoding: 0x%02x' % hdr[0]) | ||
|
|
||
| # "data" key — text(4) "data" | ||
| if f.read(5) != b'\x64\x64\x61\x74\x61': | ||
| raise ValueError('Expected data key') | ||
|
|
||
| # data value — byte string length | ||
| bs = f.read(1) | ||
| if not bs: | ||
| raise EOFError | ||
| ai = bs[0] & 0x1f | ||
| if ai <= 23: | ||
| length = ai | ||
| elif ai == 24: | ||
| lb = f.read(1) | ||
| if not lb: | ||
| raise EOFError | ||
| length = lb[0] | ||
| elif ai == 25: | ||
| lb = f.read(2) | ||
| if len(lb) < 2: | ||
| raise EOFError | ||
| length = struct.unpack('>H', lb)[0] | ||
| elif ai == 26: | ||
| lb = f.read(4) | ||
| if len(lb) < 4: | ||
| raise EOFError | ||
| length = struct.unpack('>I', lb)[0] | ||
| else: | ||
| raise ValueError('Unsupported length encoding: %d' % ai) | ||
|
|
||
| data = f.read(length) | ||
| if len(data) < length: | ||
| raise EOFError |
| # seq value — uint32 (0x1a) or uint64 (0x1b) | ||
| hdr = f.read(1) | ||
| if not hdr: | ||
| raise EOFError | ||
| if hdr[0] == 0x1a: | ||
| seq_bytes = f.read(4) | ||
| if len(seq_bytes) < 4: | ||
| raise EOFError | ||
| elif hdr[0] == 0x1b: | ||
| seq_bytes = f.read(8) | ||
| if len(seq_bytes) < 8: | ||
| raise EOFError | ||
| else: | ||
| raise ValueError('Unexpected seq encoding: 0x%02x' % hdr[0]) |
| self._last_pos = self._file.tell() | ||
| self._consecutive_failures = 0 | ||
| yield inner | ||
| except EOFError: |
| self._consecutive_failures = 0 | ||
| yield inner | ||
| except EOFError: | ||
| # No new data yet — poll |
| # data value — byte string length | ||
| bs = f.read(1) | ||
| if not bs: | ||
| raise EOFError |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modules/piezo-processor/main.py`:
- Around line 224-231: The EOFError handler currently swallows
cbor2.CBORDecodeEOF (a subclass of EOFError) and prevents the recovery branch
from running; update the exception handling so that the recovery handler
includes cbor2.CBORDecodeEOF (i.e., add cbor2.CBORDecodeEOF to the tuple caught
by the except that currently lists (ValueError, cbor2.CBORDecodeError, OSError))
while keeping the simple EOFError sleep/poll branch for genuine file-EOF cases;
locate the block around the yield of inner that updates self._last_pos and
self._consecutive_failures to apply this change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b1a412c5-8582-43a0-99c0-b42e4d9a192e
📒 Files selected for processing (3)
modules/common/cbor_raw.pymodules/piezo-processor/main.pymodules/sleep-detector/main.py
✅ Files skipped from review due to trivial changes (1)
- modules/common/cbor_raw.py
| inner = cbor2.loads(data_bytes) | ||
| self._last_pos = self._file.tell() | ||
| self._consecutive_failures = 0 | ||
| yield inner | ||
| except EOFError: | ||
| # No new data yet — poll | ||
| time.sleep(0.01) | ||
| except Exception as e: | ||
| log.warning("Error reading RAW record: %s", e) | ||
| time.sleep(1) | ||
| except (ValueError, cbor2.CBORDecodeError, OSError) as e: |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In cbor2 5.6.5, what is the exception hierarchy for CBORDecodeEOF? Is it a subclass of both CBORDecodeError and EOFError?
💡 Result:
In cbor2 5.6.5, CBORDecodeEOF is an EOFError subclass, intended for streaming use cases. (cbor2.readthedocs.io)
It is not documented (or described in the changelog) as a subclass of CBORDecodeError, so you should treat it as a separate exception from the CBOR decode-error family (i.e., catching CBORDecodeError won’t necessarily catch CBORDecodeEOF). (cbor2.readthedocs.io)
Citations:
- 1: https://cbor2.readthedocs.io/en/latest/versionhistory.html?utm_source=openai
- 2: https://cbor2.readthedocs.io/en/stable/api.html
🏁 Script executed:
cat -n modules/piezo-processor/main.py | sed -n '220,245p'Repository: sleepypod/core
Length of output: 1483
The except EOFError handler will catch CBORDecodeEOF before error recovery runs.
Line 228's except EOFError will intercept cbor2.CBORDecodeEOF (which is a subclass of EOFError in cbor2 5.6.5) before the recovery handler on Line 231 can run. This allows malformed payloads to stall the reader at the same offset.
The suggested fix of reordering the handlers doesn't resolve this because CBORDecodeEOF is not a subclass of CBORDecodeError, so it won't be caught by the (ValueError, cbor2.CBORDecodeError, OSError) handler regardless of order.
Instead, explicitly include cbor2.CBORDecodeEOF in the error recovery branch:
Suggested fix
- except EOFError:
- # No new data yet — poll
- time.sleep(0.01)
- except (ValueError, cbor2.CBORDecodeError, OSError) as e:
+ except (ValueError, cbor2.CBORDecodeError, cbor2.CBORDecodeEOF, OSError) as e:
self._consecutive_failures += 1
if self._consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
log.warning("Skipping past corrupt data at offset %d after %d failures: %s",
self._last_pos, self._consecutive_failures, e)
self._last_pos += 1
self._consecutive_failures = 0
else:
log.debug("Error reading RAW record (attempt %d): %s",
self._consecutive_failures, e)
self._file.seek(self._last_pos)
time.sleep(0.1)
+ except EOFError:
+ # No new data yet — poll
+ time.sleep(0.01)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/piezo-processor/main.py` around lines 224 - 231, The EOFError handler
currently swallows cbor2.CBORDecodeEOF (a subclass of EOFError) and prevents the
recovery branch from running; update the exception handling so that the recovery
handler includes cbor2.CBORDecodeEOF (i.e., add cbor2.CBORDecodeEOF to the tuple
caught by the except that currently lists (ValueError, cbor2.CBORDecodeError,
OSError)) while keeping the simple EOFError sleep/poll branch for genuine
file-EOF cases; locate the block around the yield of inner that updates
self._last_pos and self._consecutive_failures to apply this change.
Fixes from dual-agent (Optimizer + Skeptic) adversarial code review: Parser hardening (cbor_raw.py): - Validate data value major type is byte string (mt=2) - Accept all CBOR uint encodings for seq (inline through uint64) - Detect partial key reads as EOFError, not ValueError - Add length sanity cap (1MB) to prevent corrupt-length OOM - Add uint64 (ai=27) support for data byte-string length - Document protocol contract (key order, seq encoding) Error recovery (raw_follower.py — extracted from both modules): - Seek back to _last_pos on EOFError (prevents partial-record loss) - Reset _consecutive_failures on EOFError (prevents false corruption) - Swap except clause order: catch CBORDecodeError before EOFError (CBORDecodeEOF is a subclass of both — wrong branch caused infinite retry on corrupt inner records) - Guard _find_latest against FileNotFoundError race (glob vs stat) Architecture: - Extract RawFileFollower to modules/common/raw_follower.py (DRY) - Add modules/common/__init__.py for explicit package - Configurable poll_interval replaces hardcoded sleep values Pre-existing fixes: - Fix report_health connection leak (try/finally around conn.close) Co-Authored-By: seanpasino <seanpasino@users.noreply.github.com>
Adversarial Code Review — Findings & FixesRan dual-agent adversarial review (Optimizer + Skeptic). All findings addressed in Fixed
Architecture changes
|
|
🎉 This PR is included in version 1.1.0 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
Summary
f.tell()past valid records. Pod 5 piezo records are ~2700 bytes, so nearly every other record was silently dropped — severely degrading HR/HRV/breathing accuracy.data=b''markers.cbor2.loads(b'')raisesCBORDecodeEOF(subclass ofEOFError), which the loop caught as end-of-file, stopping reads mid-file with megabytes of valid data remaining.Fix applied to both
piezo-processorandsleep-detectormodules. Ported from free-sleep#46.Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Improvements
New Features