Skip to content

cherry-pick: fix: skip corrupt audit entries during iteration instead of throwingv5.0 (conflict)#581

Merged
kriszyp merged 2 commits into
v5.0from
cherry-pick/v5.0/pr-577
May 18, 2026
Merged

cherry-pick: fix: skip corrupt audit entries during iteration instead of throwingv5.0 (conflict)#581
kriszyp merged 2 commits into
v5.0from
cherry-pick/v5.0/pr-577

Conversation

@github-actions
Copy link
Copy Markdown
Contributor

Resolve cherry-pick conflicts

PR #577 (fix: skip corrupt audit entries during iteration instead of throwing) conflicted when cherry-picking onto v5.0.

Conflict markers (<<<<<<< / ======= / >>>>>>>) have been committed to this branch. Please check out the branch, resolve all conflicts, push, and merge.

Source PR: #577
Commit(s): b84fbbd9f6714cd7d77a2663a26c3f77d3fc85e3 6b6192cb8711b8a3e956823801fb8ce58321c0cc

Devin-Holland and others added 2 commits May 18, 2026 21:33
A corrupt audit entry encountered during rocks/lmdb iteration was throwing
RangeError out of the for-of consumer in an async context, landing as
uncaughtException on a later tick. This stalled outgoing replication for the
affected (peer, db) pair until the process restarted — and because the bad
offset was deterministic, every restart re-hit the same throw during catch-up.

Three failure points are addressed:

- RocksTransactionLogStore.getRange's `.map()` callback now wraps the entire
  per-entry decode (the rocks prelude reads plus readAuditEntry) in
  try/catch. On any decode error it yields a corrupt-marker record with the
  timestamp preserved, so iteration advances past the bad entry. The throw
  in the field trace (`decoder.getUint32(0)` at RocksTransactionLogStore.ts:294)
  was originating here.

- readAuditEntry validates the recordId/username length fields against the
  buffer's byteLength before advancing decoder.position. A corrupt 0xff-prefixed
  uint32 length would otherwise push position hundreds of megabytes past the
  end and produce an opaque RangeError from a later readFloat64.

- The lazy `recordId` and `user` getters on the returned record live outside
  readAuditEntry's outer try/catch. They now catch and log, returning undefined
  instead of letting a deferred decode failure escape on property access.

- The catch in readAuditEntry returns a structurally complete sentinel
  (corrupt: true, all fields undefined, getValue/getBinaryValue/getBinaryRecordId
  present) rather than `{}`, so downstream consumers that call methods on the
  record don't NPE after a header decode failure.

- keyEncoder.readKey checks buffer length before calling getFloat64; a
  truncated key buffer now logs and returns NaN instead of throwing up through
  lmdb-js's iterator.

Downstream consumers already skip records with undefined `tableId`/`type`
(via the existing classifyAuditEntryForReplay guard and the `tableId !== tableId`
guards in Table.ts / transactionBroadcast.ts), so the sentinel surface keeps
working with current consumer code.

Unit tests cover: oversized recordId length, mid-header length-field mutation,
lazy getter access on a corrupt body, and round-trip of a valid entry to lock
in that the new guards don't regress the happy path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…upt flag

Review follow-ups on the corrupt audit entry fix:

- Tests now check the actual skip signals consumers branch on
  (tableId === undefined / type === undefined) instead of an internal
  `corrupt: true` flag that no consumer reads. Drop the flag from the
  sentinel and the AuditRecord type to avoid speculative API surface.

- Add direct unit test for transactionKeyEncoder.readKey returning NaN
  on a truncated float64 key buffer, plus a happy-path round-trip to lock
  in the bounds check didn't regress real timestamp-range key decoding.

- Add unit test exercising the rocks-prelude catch in
  RocksTransactionLogStore.getRange's map callback by stubbing the
  rootStore + nodeLogs and yielding a TransactionEntry whose data buffer
  is too short for getUint32(0) — the exact path from the field stack
  trace. Asserts iteration completes, the sentinel surfaces with
  undefined tableId/type, and the log timestamp is preserved on the
  sentinel so lastTxnTime advances.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot requested a review from a team as a code owner May 18, 2026 21:33
@kriszyp kriszyp merged commit 63b0b55 into v5.0 May 18, 2026
2 checks passed
@kriszyp kriszyp deleted the cherry-pick/v5.0/pr-577 branch May 18, 2026 21:42
kriszyp added a commit that referenced this pull request May 19, 2026
PR #581 ("skip corrupt audit entries during iteration") landed with
unresolved <<<<<<< / >>>>>>> markers in resources/auditStore.ts and
resources/RocksTransactionLogStore.ts, breaking the corrupt-entry tests
in unitTests/resources/auditLog.test.js.

Resolved by keeping the cherry-picked behavior on both sides:
- auditStore.ts: return createCorruptAuditSentinel(...) from the catch
  (instead of `{}`) so consumers can call getValue / access getters
  without NPEs; union the AuditRecord type fields (structureVersion +
  getBinaryRecordId).
- RocksTransactionLogStore.ts: wrap the per-entry rocks prelude decode
  in try/catch so a corrupt 4-16 byte prefix yields a sentinel record
  with timestamp preserved instead of throwing through iterable.map().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kriszyp added a commit that referenced this pull request May 19, 2026
PR #581 ("skip corrupt audit entries during iteration") landed with
unresolved <<<<<<< / >>>>>>> markers in resources/auditStore.ts and
resources/RocksTransactionLogStore.ts, breaking the corrupt-entry tests
in unitTests/resources/auditLog.test.js.

Resolved by keeping the cherry-picked behavior on both sides:
- auditStore.ts: return createCorruptAuditSentinel(...) from the catch
  (instead of `{}`) so consumers can call getValue / access getters
  without NPEs; union the AuditRecord type fields (structureVersion +
  getBinaryRecordId).
- RocksTransactionLogStore.ts: wrap the per-entry rocks prelude decode
  in try/catch so a corrupt 4-16 byte prefix yields a sentinel record
  with timestamp preserved instead of throwing through iterable.map().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants