fix(seed-utils): payloadBytes>0 fallback for runSeed recordCount auto-detect by koala73 · Pull Request #3087 · koala73/worldmonitor

koala73 · 2026-04-14T09:10:14Z

Why this PR?

Health dashboard 2026-04-14 08:44 UTC reported 18 EMPTY_DATA CRITs in /api/health. After cross-referencing 12 Railway seeder logs, 16 of them are phantoms: the seeders ran on schedule, wrote payloads to Redis, and emitted `Verified: data present in Redis` in the same line that reported `recordCount:0`.

Smoking-gun signature (BLS-Series example):
```
[BLS-Series] {"event":"seed_complete","recordCount":0,"durationMs":3925,"payloadBytes":6093}
[BLS-Series] Verified: data present in Redis
```

VPD-Tracker is the most striking: 3 MB of payload, count reported as 0.
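
The signature above (a `seed_complete` line with `recordCount:0` next to a non-zero `payloadBytes`) can be checked mechanically when going through seeder logs. The following is a minimal sketch, assuming the log lines keep the `[Name] {json}` layout shown in the example; `findPhantomEmpties` is a hypothetical helper name and is not part of the repository.

```javascript
// Hypothetical helper: flags seed_complete lines that report recordCount 0
// while payloadBytes is greater than 0 (the phantom EMPTY_DATA signature).
function findPhantomEmpties(logLines) {
  const phantoms = [];
  for (const line of logLines) {
    // Expect "[Seeder-Name] {...json...}"; other lines are skipped.
    const match = line.match(/^\[([^\]]+)\]\s+(\{.*\})\s*$/);
    if (!match) continue;
    let entry;
    try {
      entry = JSON.parse(match[2]);
    } catch {
      continue; // not valid JSON, ignore the line
    }
    if (entry.event === 'seed_complete' && entry.recordCount === 0 && entry.payloadBytes > 0) {
      phantoms.push({ seeder: match[1], payloadBytes: entry.payloadBytes });
    }
  }
  return phantoms;
}

const sampleLines = [
  '[BLS-Series] {"event":"seed_complete","recordCount":0,"durationMs":3925,"payloadBytes":6093}',
  '[BLS-Series] Verified: data present in Redis',
];

console.log(findPhantomEmpties(sampleLines));
// [ { seeder: 'BLS-Series', payloadBytes: 6093 } ]
```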

Root cause

`scripts/_seed-utils.mjs` `runSeed()` auto-detects `recordCount` from a hardcoded list of payload shapes:

```js
Array.isArray(data) ? data.length
: (topicArticleCount
?? data?.predictions?.length
?? data?.events?.length ?? data?.earthquakes?.length ?? data?.outages?.length
?? data?.fireDetections?.length ?? data?.anomalies?.length ?? data?.threats?.length
?? data?.quotes?.length ?? data?.stablecoins?.length
?? data?.cables?.length ?? 0);
```

If a seeder publishes a custom shape (`{score, inputs}` for fear-greed, `{geopolitical, tech}` for prediction-markets, `{primaryTitle, ...}` per-topic for insights, etc.) AND doesn't pass `opts.recordCount`, the chain falls through to 0. seed-meta is written with `{fetchedAt, recordCount: 0}`. health.js reads this and flips to EMPTY_DATA.
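
To make the fall-through concrete, here is a small sketch that wraps the pre-fix chain in a helper and runs it on a known shape and on a custom shape. `autoDetectBeforeFix` is a hypothetical name, `topicArticleCount` is passed as a parameter purely for illustration, and the payload values are made up.

```javascript
// The pre-fix auto-detect chain from the snippet above, wrapped in a helper.
// `autoDetectBeforeFix` is a hypothetical name used only for this illustration.
function autoDetectBeforeFix(data, topicArticleCount) {
  return Array.isArray(data) ? data.length
    : (topicArticleCount
      ?? data?.predictions?.length
      ?? data?.events?.length ?? data?.earthquakes?.length ?? data?.outages?.length
      ?? data?.fireDetections?.length ?? data?.anomalies?.length ?? data?.threats?.length
      ?? data?.quotes?.length ?? data?.stablecoins?.length
      ?? data?.cables?.length ?? 0);
}

const knownShape = { events: [{ id: 1 }, { id: 2 }] };
// Fear-greed-like custom shape with made-up values.
const customShape = { score: 42, inputs: { vix: 18.2 } };

console.log(autoDetectBeforeFix(knownShape));  // 2
console.log(autoDetectBeforeFix(customShape)); // 0, although the payload is populated
```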

Audit: of 19 failing-health seeders, only 2 pass `opts.recordCount` to `runSeed` (`seed-spr-policies`, `seed-owid-energy-mix`). The other 17 rely on auto-detect.

Fix

Add a final `payloadBytes > 0 → 1` fallback to the resolution chain. When triggered, `console.warn` names the seeder so the author can add an explicit `opts.recordCount` for accurate dashboards.

Also extracted the resolution logic into a pure exported `computeRecordCount()` function so it can be unit-tested without a real Redis connection.

Resolution order (unchanged for existing callers):

1. `opts.recordCount` (function or number): explicit declaration wins
2. Auto-detect from known shape
3. NEW: `payloadBytes > 0` → 1 + warn
4. 0

Explicit `opts.recordCount: 0` still wins (test covers it) — for cases like `seed-owid-energy-mix` which deliberately reports 0.
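
The resolution order can be sketched roughly as follows. This is an approximation reconstructed from the description in this PR, not the code in `scripts/_seed-utils.mjs`: the function name `computeRecordCountSketch`, the single-object parameter, and the argument passed to `onPhantomFallback` are all assumptions.

```javascript
// Sketch only: the real computeRecordCount() signature may differ.
function computeRecordCountSketch({ opts = {}, data, payloadBytes = 0, topicArticleCount, onPhantomFallback } = {}) {
  // 1. Explicit declaration wins, including an explicit 0.
  if (opts.recordCount != null) {
    return typeof opts.recordCount === 'function' ? opts.recordCount(data) : opts.recordCount;
  }

  // 2. Auto-detect from known shapes; stays undefined when nothing matches.
  const detectedFromShape = Array.isArray(data) ? data.length
    : (topicArticleCount
      ?? data?.predictions?.length
      ?? data?.events?.length ?? data?.earthquakes?.length ?? data?.outages?.length
      ?? data?.fireDetections?.length ?? data?.anomalies?.length ?? data?.threats?.length
      ?? data?.quotes?.length ?? data?.stablecoins?.length
      ?? data?.cables?.length);
  if (detectedFromShape != null) return detectedFromShape;

  // 3. Unknown shape but a non-empty payload: report 1 and notify the caller.
  if (payloadBytes > 0) {
    onPhantomFallback?.(payloadBytes);
    return 1;
  }

  // 4. Nothing detected and nothing written.
  return 0;
}

const warn = (bytes) => console.warn(`falling back to 1 (payloadBytes=${bytes})`);
const custom = { score: 42, inputs: {} }; // made-up custom shape

console.log(computeRecordCountSketch({ data: custom, payloadBytes: 6093, onPhantomFallback: warn })); // 1
console.log(computeRecordCountSketch({ opts: { recordCount: 0 }, data: custom, payloadBytes: 6093 })); // 0
console.log(computeRecordCountSketch({ data: custom, payloadBytes: 0 })); // 0
```

Passing the warning in as a callback keeps the function free of side effects, which is what allows it to be tested without Redis or a logger.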

Files

- `scripts/_seed-utils.mjs`: extract `computeRecordCount()`, wire fallback into `runSeed`
- `tests/seed-utils.test.mjs`: 11 new test cases

Effect

- Clears 16 phantom CRITs on the next bundle cycle (one cron tick per affected seeder).
- Per-seeder `console.warn` will surface in logs so we know which seeders still need explicit `opts.recordCount` for accurate dashboards.
- One genuine intermittent (`unrestEvents`, ACLED quiet periods) is unchanged; that hits the SKIPPED-validation path which deliberately writes count=0.
- `goldExtended` and `sprPolicies` are NOT covered by this PR; those are real bugs (dead `.then()` block in seed-commodity-quotes; missing Railway runner). Separate PRs incoming.

Testing

- `node --test tests/seed-utils.test.mjs` → 18/18 (11 new + 7 existing)
- `node --test tests/seed-utils-empty-data-failure.test.mjs` → 2/2
- `npm run typecheck` → clean

Post-Deploy Monitoring & Validation

- Logs: Watch the next 1-2 bundle cycles on Railway (`seed-bundle-macro`, `seed-bundle-health`, `seed-bundle-energy-sources`, `seed-bundle-ecb-eu`, plus standalone services). Seeders that previously logged `recordCount:0, payloadBytes:>0` will now log `recordCount:1` AND a one-time `[recordCount] auto-detect did not match a known shape (payloadBytes=N); falling back to 1. Add opts.recordCount to : for accurate health metrics.` warning.
- Health endpoint: `curl -sL https://worldmonitor.app/api/health | jq '.summary'`. The `crit` count should drop from 18 to ~5 within 1 hour (only the genuine cases remain: `unrestEvents` intermittent + `goldExtended`/`sprPolicies` until separate PRs land).
- Failure signal / rollback: if a seeder that previously reported a meaningful recordCount now reports 1 (regression), check whether its known-shape detection broke. Revert is one-line. No data is at risk; this only affects the metadata write.
- Validation window: 1 hour post-deploy.
- Owner: @koala73

…-detect

Phantom EMPTY_DATA in /api/health: 16 of 21 failing health checks were caused by seeders publishing custom payload shapes without passing opts.recordCount. The auto-detect chain in runSeed only matches a hardcoded list of shapes; anything else falls through to recordCount=0 and triggers EMPTY_DATA in /api/health even though the payload is fully populated and verified in Redis.

Smoking-gun log signature from Railway 2026-04-14:

    [BLS-Series] recordCount:0, payloadBytes:6093, Verified: data present
    [VPD-Tracker] recordCount:0, payloadBytes:3068853, Verified: data present
    [Disease-Outbreaks] recordCount:0, payloadBytes:92684, Verified: data present

Fix:

- Extract recordCount logic into pure exported computeRecordCount() for unit testability.
- Add payloadBytes>0 → 1 fallback at the end of the resolution chain. When triggered, console.warn names the seeder so the author can add an explicit opts.recordCount for accurate dashboards.
- Resolution order unchanged for existing callers: opts.recordCount wins, then known-shape auto-detect, then the new payloadBytes fallback, then 0. Explicit opts.recordCount=0 still wins (test covers it).

Effect: clears 16 phantom CRITs on the next bundle cycle. Per-seeder warns will surface in logs so we can add accurate opts.recordCount in follow-up.

Tests: 11 new computeRecordCount cases (opts precedence, auto-detect shapes, fallback behavior, no-spurious-warn, explicit-zero precedence). seed-utils.test.mjs 18/18 + seed-utils-empty-data-failure.test.mjs 2/2 + typecheck clean.

vercel · 2026-04-14T09:10:19Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
worldmonitor	Ready	Preview, Comment	Apr 14, 2026 9:27am

greptile-apps · 2026-04-14T09:14:45Z

Greptile Summary

This PR fixes 16 phantom EMPTY_DATA health alerts by adding a payloadBytes > 0 → 1 fallback to runSeed()'s recordCount resolution and extracting that logic into a pure, unit-testable computeRecordCount() function. The root cause (seeders with custom data shapes falling through to recordCount: 0) and the fix are clearly diagnosed, the resolution order is preserved for all existing callers, and the console.warn on fallback gives authors a clear signal to add an explicit opts.recordCount.

Confidence Score: 5/5

Safe to merge — fix is correct, well-scoped, and all remaining findings are minor style/test suggestions.
No P0 or P1 issues found. The resolution chain preserves all existing behavior (explicit 0 wins, known shapes still detected correctly), the fallback is a provably-safe metadata-only write, and 18/18 tests pass. Both findings are P2: a harmless test function mutation and a missing edge-case test.
No files require special attention.

Important Files Changed

Filename	Overview
scripts/_seed-utils.mjs	Extracts `computeRecordCount()` as a pure exported function and wires a `payloadBytes>0 → 1` fallback into `runSeed()`; logic is correct and the `onPhantomFallback` callback pattern keeps the function testable without Redis
tests/seed-utils.test.mjs	11 new `computeRecordCount` test cases covering explicit opts, shape auto-detect, fallback, and explicit-zero precedence; missing coverage for the empty-known-shape + payloadBytes>0 edge case (empty array should NOT trigger the fallback)

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[computeRecordCount called] --> B{opts.recordCount != null?}
    B -- Yes --> C{typeof === 'function'?}
    C -- Yes --> D[return opts.recordCount data]
    C -- No --> E[return opts.recordCount]
    B -- No --> F{Array.isArray data?}
    F -- Yes --> G[detectedFromShape = data.length]
    F -- No --> H{topicArticleCount ?? predictions.length\n?? events.length ?? earthquakes.length\n?? outages.length ?? fireDetections.length\n?? anomalies.length ?? threats.length\n?? quotes.length ?? stablecoins.length\n?? cables.length}
    H -- resolved --> I[detectedFromShape = value]
    H -- all undefined --> J[detectedFromShape = undefined]
    G --> K{detectedFromShape != null?}
    I --> K
    K -- Yes --> L[return detectedFromShape]
    K -- No / undefined --> M{payloadBytes > 0?}
    J --> M
    M -- Yes --> N[onPhantomFallback warn\nreturn 1]
    M -- No --> O[return 0]

    style N fill:#f9c,stroke:#c66
    style O fill:#fcc,stroke:#c66
    style D fill:#cfc,stroke:#6c6
    style E fill:#cfc,stroke:#6c6
    style L fill:#cfc,stroke:#6c6

Reviews (1): Last reviewed commit: "fix(seed-utils): payloadBytes>0 fallback..."

greptile-apps · 2026-04-14T09:14:48Z

+    );
+  });
+
+  it.each = undefined; // node:test doesn't have it.each; explicit cases below


Unnecessary mutation of imported test function

`it.each = undefined` assigns a property directly onto the `it` function object imported from `node:test`. Because an ES module import is a live binding to the exported function (not a copy), the assignment mutates the actual function object and could, technically, affect any code in this module that inspects `it.each`. A comment alone expresses the intent without touching the runner:

Suggested change

it.each = undefined; // node:test doesn't have it.each; explicit cases below

// Note: node:test does not provide it.each — explicit cases below

… empty-known-shape edge case

Greptile review on PR #3087 caught two minor test issues:

1. `it.each = undefined` mutated the imported `it` function (ES module live binding). Replaced with a plain comment.
2. Missing edge case: `data: { events: [] }` with payloadBytes > 0 should NOT trigger the payloadBytes fallback because detectedFromShape resolves to a real 0 (not undefined). Without this guard, a future regression could collapse the !=null check and silently mask genuine empty upstream cycles as "1 record". Test added.

Tests: 19/19 (was 18). No production code change.
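
The empty-known-shape edge case comes down to the difference between a `!= null` check and a truthiness check on the detected count. A minimal standalone sketch follows; the variable names are illustrative and the truthiness variant is a hypothetical regression, not code from the repository.

```javascript
const data = { events: [] };                       // known shape, genuinely empty
const payloadBytes = JSON.stringify(data).length;  // 13, the serialized payload is not empty

const detectedFromShape = data?.events?.length;    // 0, not undefined

// Guard as described in the review: a detected 0 is a real answer.
const withNullCheck = detectedFromShape != null
  ? detectedFromShape
  : (payloadBytes > 0 ? 1 : 0);

// Hypothetical regression: truthiness treats 0 like "nothing detected".
const withTruthyCheck = detectedFromShape
  ? detectedFromShape
  : (payloadBytes > 0 ? 1 : 0);

console.log(withNullCheck);   // 0, the empty cycle is reported as empty
console.log(withTruthyCheck); // 1, the empty cycle is masked as "1 record"
```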

koala73 mentioned this pull request Apr 14, 2026

fix(commodity-quotes): move .then() block to opts.afterPublish — resurrect 3 dead Redis writes #3088

Merged

greptile-apps Bot reviewed Apr 14, 2026


koala73 mentioned this pull request Apr 14, 2026

fix(spr-policies): wire seed-spr-policies into seed-bundle-energy-sources #3089

Merged

vercel Bot deployed to Preview April 14, 2026 09:27 View deployment

koala73 merged commit 5610368 into main Apr 14, 2026
9 checks passed

koala73 deleted the fix/runseed-recordcount-fallback branch April 14, 2026 09:28

koala73 mentioned this pull request Apr 14, 2026

fix(seed-forecasts): pipeline timeout 10s→45s + BATCH_SIZE 10→5 #3090

Merged

koala73 merged 2 commits into main from fix/runseed-recordcount-fallback
