Skip to content

fix(mnCount): keep scheduler alive on repo throws, reject oversized seed ts (Codex PR16 round 2)#17

Merged
sidhujag merged 1 commit intomainfrom
mncount-codex-p2-round2
Apr 23, 2026
Merged

fix(mnCount): keep scheduler alive on repo throws, reject oversized seed ts (Codex PR16 round 2)#17
sidhujag merged 1 commit intomainfrom
mncount-codex-p2-round2

Conversation

@sidhujag
Copy link
Copy Markdown
Member

Summary

Two P2 issues Codex flagged on PR #16's b79c18e, both real:

1. services/mnCountLogger.js — escaped rejection can kill the scheduler

repo.getLatestDate() sat outside runAndReschedule()'s try/catch, and the setTimeout callback fire-and-forgot runAndReschedule() without attaching .catch. A transient SQLite read failure there would reject the returned promise, escape as an unhandled rejection, and silently kill the logger's ability to reschedule future writes until the next process restart — exactly the failure mode the "keep sampling every UTC day" design is meant to prevent.

Fix is defense-in-depth:

  • Wrap the entire runAndReschedule body (pre-flight repo read + sample path) in one outer try/catch. Any failure is treated as a tick failure: log + exponential backoff clamped to this UTC day. The scheduler loop stays alive.
  • Extract scheduleBackoffRetry() so the happy-path catch and the last-resort scheduler catch share the same retry math (can't drift out of sync on future edits).
  • Attach a belt-and-braces .catch on the setTimeout callback that re-arms even if runAndReschedule ever starts rejecting again after a future refactor.
  • msUntilNextMidnightUtc is itself wrapped so a hypothetical throw in the time-math path can't kill the scheduler either.

2. lib/mnCountSeed.js — oversized ts aborts the whole seed

parseSeedCsv validated that ts is finite and positive but passed it straight into new Date(ts).toISOString(). ECMAScript only defines Date as valid within ±8.64e15 ms of the epoch; a finite-but-oversized ts would throw RangeError. Because the seed runs inside a single db.transaction(), a single bad row anywhere in the ~2900-row history would abort the transaction and prevent any history from loading — worse than partial-seed since even isEmpty() would stay true and every subsequent boot would retry the same broken file.

Fix: introduce JS_DATE_MAX_ABS_MS (the spec's 8.64e15) and add ts > JS_DATE_MAX_ABS_MS to the existing skip-and-warn path. A malformed row is now a one-line warn in the log, not a global loss of history.

Tests

3 new, full suite 910/910:

  • scheduler survives a synchronous throw in repo.getLatestDate() — wires a brittle repo whose getLatestDate throws once, asserts the logger logs mncount_tick_failed, schedules a same-day retry, and on retry actually writes the row. Proves the scheduler is still alive after the throw.
  • parseSeedCsv skips oversized ts — feeds 9e15 as ts, asserts no throw, asserts the good rows around it still parse, asserts the skip was logged at warn.
  • seedMasternodeCount rolls forward past an oversized ts row — end-to-end check that the txn does not roll back over a single bad row.

Test plan

  • npx jest — 910/910 passing locally (up from 907)
  • Wait for Codex review this time before merging
  • On merge: pull + pm2 restart on staging, verify /mnCount still serves the full history and no mncount_scheduler_invariant entries appear

Why a follow-up PR rather than amend-and-force-push

PR #16 is already merged. These fixes are defensive hardening on top of it; a clean follow-up keeps the git history legible and preserves the audit trail of the Codex flags and their resolution.

Made with Cursor

…eed ts (Codex PR16 round 2)

Two P2 issues Codex raised on b79c18e, both real:

1. services/mnCountLogger.js — repo.getLatestDate() sat outside
   runAndReschedule()'s try/catch, and the setTimeout callback fired
   runAndReschedule() without attaching .catch. A transient SQLite
   read failure there would reject the returned promise, escape as
   an unhandled rejection, and silently kill the logger's ability
   to reschedule future writes until the next process restart —
   exactly the failure mode the "keep sampling every UTC day" design
   is meant to prevent.

   Fix is defense-in-depth:
     * Wrap the entire runAndReschedule body (pre-flight repo read
       + sample path) in one try/catch. Any failure is treated as a
       tick failure: log + exponential backoff clamped to this UTC
       day. The scheduler loop stays alive.
     * Extract scheduleBackoffRetry() so the happy-path catch and
       the last-resort scheduler.catch share the same retry math
       (can't drift out of sync on future edits).
     * Attach a belt-and-braces .catch on the setTimeout callback
       that re-arms even if runAndReschedule ever starts rejecting
       again after a future refactor.
     * msUntilNextMidnightUtc is itself wrapped so a hypothetical
       throw in the time-math path can't kill the scheduler either.

2. lib/mnCountSeed.js — parseSeedCsv validated that ts is finite
   and positive but passed it straight into new Date(ts).toISOString().
   ECMAScript only defines Date as valid within ±8.64e15 ms of the
   epoch; a finite-but-oversized ts would throw RangeError. Because
   the seed runs inside a single db.transaction(), a single bad row
   anywhere in the ~2900-row history would abort the transaction and
   prevent any history from loading — worse than partial-seed since
   even isEmpty() would stay true and every subsequent boot would
   retry the same broken file.

   Fix: introduce JS_DATE_MAX_ABS_MS (the spec's 8.64e15) and add
   ts > JS_DATE_MAX_ABS_MS to the existing skip-and-warn path. A
   malformed row is now a one-line warn in the log, not a global
   loss of history.

Tests added (3):
  * scheduler survives a synchronous throw in repo.getLatestDate():
    wires a brittle repo whose getLatestDate throws once, asserts
    the logger logs mncount_tick_failed, schedules a same-day retry,
    and on retry actually writes the row. Proves the scheduler is
    still alive after the throw.
  * parseSeedCsv skips oversized ts: feeds 9e15 as ts, asserts no
    throw, asserts the good rows around it still parse, asserts
    the skip was logged at warn.
  * seedMasternodeCount rolls forward past an oversized ts row:
    end-to-end check that the txn does not roll back over a single
    bad row.

Full suite: 910/910 (was 907/907).

Made-with: Cursor
@sidhujag sidhujag merged commit bc8f780 into main Apr 23, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant