
feat(seed-contract): PR 2a — runSeed envelope dual-write + 91 seeders migrated #3097

Merged
koala73 merged 12 commits into main from feat/seed-contract-pr2-migrate
Apr 15, 2026

koala73 (Owner) commented Apr 15, 2026

Summary

Full Unified Seed Contract PR 2 — writers, readers, bundles, direct writers, all in dual-write mode. Builds on PR #3095 (foundation).

What's in this PR (8 commits)

  1. runSeed contract mode + 91 seeders migrated — every scripts/seed-*.mjs exports declareRecords and writes {_seed, data} envelope to canonical key alongside legacy seed-meta:*.
  2. Internal reader unwrap: readSeedSnapshot, verifySeedKey, redisGet, 18 seeders' local helpers, seed-forecasts pipeline batch, seed-energy-spine redisMget. Addresses both P1 findings from review.
  3. External reader unwrap: server/_shared/redis.ts (getRawJson, getCachedJson, getCachedJsonBatch, inherited by cachedFetchJson), api/_upstash-json.js (readJsonFromUpstash → covers all api/mcp.ts tool responses), api/bootstrap.js batch reader.
  4. 3 more server-side readers unwrap: resilience/v1/_shared.ts, get-resilience-ranking.ts, and infrastructure/v1/_shared.ts:mgetJson.
  5. consumer-prices-core/src/jobs/publish.ts — 5 canonical keys (overview, movers 7d/30d, freshness, categories 7d/30d/90d, retailer-spread, basket-series) now produce envelopes.
  6. scripts/ais-relay.cjs — 32 canonical-key write sites produce envelopes via envelopeWrite(key, data, ttl, meta) helper.
  7. 15 seed-bundle-*.mjs — 54 sections across 12 bundles declare canonicalKey alongside seedMetaKey. _bundle-runner.mjs prefers canonicalKey, gates on envelope _seed.fetchedAt directly.
  8. Tests — new tests/seed-utils-envelope-reads.test.mjs (7 cases); updated tests/cross-source-signals-regulatory.test.mjs + tests/ucdp-seed-resilience.test.mjs for static-analysis pattern changes.

Verification

  • npm run test:data: 5307/5307 pass
  • npm run typecheck:all — clean
  • node --check clean on every modified script
  • Contract conformance: 84/86 seeders full descriptor (2 pre-existing soft-warns)
  • Post-sweep grep for unwrapped JSON.parse(redis_result) in api/server — no regressions remain
  • Staging dry-run: deploy 2-3 seeders to Railway cron, verify envelope shape + legacy seed-meta both populated

Dual-write invariant (PR 2 phase)

Every seeder now writes:

  • Envelope to canonical data key (new, source of truth going forward)
  • Legacy seed-meta:<domain>:<resource> (preserved, keeps api/health.js + any unmigrated reader happy)

Every reader (getCachedJson, readJsonFromUpstash, readSeedSnapshot, etc.) now unwraps envelopes automatically — legacy bare-shape values pass through unchanged.
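
A minimal sketch of the envelope shape and the pass-through unwrap described above. The function names mirror the buildEnvelope and unwrapEnvelope helpers named elsewhere in this PR, but the bodies are assumptions written for illustration, not the code in scripts/_seed-envelope-source.mjs. The _seed fields shown are the ones mentioned in this thread; the sample payload is made up.

```javascript
// Hypothetical re-implementation for illustration only.
function buildEnvelope(data, meta) {
  return {
    _seed: {
      fetchedAt: meta.fetchedAt,
      recordCount: meta.recordCount,
      sourceVersion: meta.sourceVersion || '',
      schemaVersion: meta.schemaVersion || 1,
      state: meta.state,
    },
    data,
  };
}

// Envelopes are split into metadata and payload; anything else passes through as data.
function unwrapEnvelope(raw) {
  const isEnvelope =
    raw !== null &&
    typeof raw === 'object' &&
    !Array.isArray(raw) &&
    raw._seed !== null &&
    typeof raw._seed === 'object' &&
    'data' in raw;
  return isEnvelope ? { _seed: raw._seed, data: raw.data } : { _seed: null, data: raw };
}

const wrapped = buildEnvelope(
  { countries: ['AE'] },
  { fetchedAt: 1776000000000, recordCount: 1, state: 'OK' },
);
const legacy = { countries: ['AE'] };

console.log(unwrapEnvelope(wrapped).data); // the bare payload
console.log(unwrapEnvelope(legacy).data); // the same object, untouched
```

Because the unwrap is a no-op on bare shapes, a reader written this way can be deployed before or after the matching writer migrates.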

What PR 3 does (not in this PR)

  • Stop writing legacy seed-meta:* keys.
  • Collapse api/health.js 5-registry system (SEED_META, ON_DEMAND_KEYS, EMPTY_DATA_OK_KEYS, STANDALONE_KEYS, BOOTSTRAP_KEYS, CASCADE_GROUPS) to single SEEDS registry. Deferred because it requires switching health from STRLEN to GET on ~100 data keys (operational risk: doubles Upstash payload per check). Cleaner once legacy is gone.
  • Remove fallback branches in readers.
  • Delete writeSeedMeta / writeFreshnessMetadata helpers.

Test plan

  • Unit tests pass
  • Staging deploy & verify /api/health reports no regression
  • Spot-check envelope shape: economic:fsi-eu:v1, climate:zone-normals:v1, wildfire:fires:v1, market:commodities:v1:..., consumer-prices:overview:ae
  • Verify api/bootstrap + api/mcp responses contain no _seed field

Plan: docs/plans/2026-04-14-002-fix-runseed-zero-record-lockout-plan.md

… migrated

Opt-in contract path in runSeed: when opts.declareRecords is provided, write
{_seed, data} envelope to the canonical key alongside legacy seed-meta:*
(dual-write). State machine: OK / OK_ZERO / RETRY with zeroIsValid opt.
declareRecords throws or returns non-integer → hard fail (contract violation).
extraKeys[*] support per-key declareRecords; each extra key writes its own
envelope. Legacy seeders (no declareRecords) entirely unchanged.
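
The state machine above can be sketched as two small functions. This is an illustration under stated assumptions: resolveRecordCount is named in this PR, but the body here is a guess; resolveContractState is a hypothetical name; and the sample declareRecords is the one quoted from seed-aviation.mjs in the review.

```javascript
// Hypothetical sketch of the contract-mode decision, not the code in _seed-utils.mjs.
function resolveRecordCount(declareRecords, data) {
  const count = declareRecords(data); // a throw here propagates as a hard fail
  if (!Number.isInteger(count)) {
    throw new Error(`contract violation: declareRecords returned ${String(count)}`);
  }
  return count;
}

function resolveContractState(count, zeroIsValid) {
  if (count > 0) return 'OK';
  // Zero records: either a legitimate empty result or a signal to retry later.
  return zeroIsValid ? 'OK_ZERO' : 'RETRY';
}

const declareRecords = (data) => data?.summaries?.length ?? 0;

console.log(resolveContractState(resolveRecordCount(declareRecords, { summaries: [{}, {}] }), false)); // OK
console.log(resolveContractState(resolveRecordCount(declareRecords, {}), true)); // OK_ZERO
console.log(resolveContractState(resolveRecordCount(declareRecords, {}), false)); // RETRY
```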

Migrated all 91 scripts/seed-*.mjs to contract mode. Each exports
declareRecords returning the canonical record count, and passes
schemaVersion: 1 + maxStaleMin (matched to api/health.js SEED_META, or 2.5x
interval where no registry entry exists). Contract conformance reports 84/86
seeders with full descriptor (2 pre-existing warnings).

Legacy seed-meta keys still written so unmigrated readers keep working;
follow-up slices flip health.js + readers to envelope-first.

Tests: 61/61 PR 1 tests still pass.

Next slices for PR 2:
- api/health.js registry collapse + 15 seed-bundle-*.mjs canonicalKey wiring
- reader migration (mcp, resilience, aviation, displacement, regional-snapshot)
- direct writers — ais-relay.cjs, consumer-prices-core publish.ts
- public-boundary stripSeedEnvelope + test migration

Plan: docs/plans/2026-04-14-002-fix-runseed-zero-record-lockout-plan.md

vercel bot commented Apr 15, 2026

The latest updates on your projects.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| worldmonitor | Ready | Preview, Comment | Apr 15, 2026 4:56am |

greptile-apps bot (Contributor) commented Apr 15, 2026

Greptile Summary

This PR wires the contract-mode dual-write path into runSeed and migrates 91 seeders to export declareRecords plus the full descriptor fields (schemaVersion, sourceVersion, maxStaleMin). The approach is safe for legacy readers because the envelope write is additive alongside the existing seed-meta:* key.

  • P1 — extra-key envelope mislabels empty writes as 'OK': in _seed-utils.mjs line 917, state: ekCount > 0 ? 'OK' : (zeroIsValid ? 'OK_ZERO' : 'OK') has 'OK' where it should be 'RETRY'. Any extra key that returns 0 records (and zeroIsValid is false) will be written to Redis with state: 'OK', causing downstream health readers to treat it as healthy once envelope reads are enabled in PR 2c.
  • P2 — 39 of the 91 migrated seeders are missing the isMain guard: they export declareRecords but call runSeed() unconditionally at module level. The conformance test reads files as text today so it doesn't trigger, but any future test that imports one of these modules will fire runSeed() against Redis and kill the test process via process.exit().

Confidence Score: 4/5

Safe to merge for the dual-write path itself; the P1 extra-key state bug should be fixed before envelope reads are enabled in PR 2c.

One P1 bug exists: extra-key envelopes mislabel 0-record writes as state 'OK' instead of 'RETRY'. This is latent today because no reader inspects _seed.state yet (PR 2c), but it will silently corrupt health signals once readers migrate. The P2 isMain gap affects 39 seeders and is a test-safety risk rather than a production correctness issue now.

scripts/_seed-utils.mjs — the ekEnvelope state ternary at line 917 is the only blocking fix needed.

Important Files Changed

| Filename | Overview |
| --- | --- |
| scripts/_seed-utils.mjs | Core runSeed engine with contract mode dual-write state machine; contains P1 bug where extra-key envelope uses 'OK' instead of 'RETRY' when ekCount=0 and zeroIsValid=false |
| scripts/_seed-contract.mjs | validateDescriptor and resolveRecordCount implementations look correct; soft-warn behavior during PR 2 is intentional and well-documented |
| scripts/_seed-envelope-source.mjs | buildEnvelope / unwrapEnvelope / stripSeedEnvelope are clean and match the spec; no issues |
| scripts/seed-fsi-eu.mjs | Reference implementation of contract migration with proper isMain guard and full descriptor fields; no issues |
| scripts/seed-aviation.mjs | Exports declareRecords but calls runSeed() unconditionally at module level — missing isMain guard that ~39 other migrated seeders also lack |
| scripts/seed-jodi-gas.mjs | Uses extraKeys in contract mode — the LNG vulnerability extra key would be subject to the P1 ekEnvelope state bug if it ever returns 0 records |
| scripts/seed-aaii-sentiment.mjs | Correctly uses isMain guard, exports declareRecords, and carries both legacy recordCount and new declareRecords during dual-write transition period |
| scripts/seed-climate-news.mjs | Missing isMain guard — calls runSeed() at module level while exporting declareRecords |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[runSeed with declareRecords] --> B{contractMode?}
    B -- No --> C[Legacy: bare data + seed-meta write]
    B -- Yes --> D[resolveRecordCount for canonical key]
    D --> E{count gt 0?}
    E -- Yes --> F[contractState = OK]
    E -- No --> G{zeroIsValid?}
    G -- Yes --> H[contractState = OK-ZERO]
    G -- No --> I[contractState = RETRY - skip publish, extend TTL, exit 0]
    F --> J[atomicPublish with envelope wrapper]
    H --> J
    J --> K[writeFreshnessMetadata - dual-write seed-meta]
    K --> L{extraKeys present?}
    L -- Yes --> M[resolveRecordCount per extra key]
    M --> N{ekCount gt 0?}
    N -- Yes --> O[ekState = OK]
    N -- No --> P{zeroIsValid?}
    P -- Yes --> Q[ekState = OK-ZERO]
    P -- No --> R[ekState = OK BUG should be RETRY]
    O --> S[writeExtraKey with envelope]
    Q --> S
    R --> S
    I --> T[exit 0]

Reviews (1): Last reviewed commit: "feat(seed-contract): PR 2a — runSeed env..."

Comment thread scripts/_seed-utils.mjs
recordCount: ekCount,
sourceVersion: sourceVersion || '',
schemaVersion: schemaVersion || 1,
state: ekCount > 0 ? 'OK' : (zeroIsValid ? 'OK_ZERO' : 'OK'),

P1 Extra-key envelope reports 'OK' instead of 'RETRY' on zero-record write

When ekCount === 0 and zeroIsValid is false, the final branch evaluates to 'OK' — the same value used for a healthy write. Any future reader that inspects _seed.state on the extra-key envelope will incorrectly treat the key as healthy, masking the empty-data condition that the RETRY state is meant to signal. The canonical key correctly aborts publishing on RETRY; the extra-key path has no such guard and always writes — and now mislabels that write.

Suggested change
- state: ekCount > 0 ? 'OK' : (zeroIsValid ? 'OK_ZERO' : 'OK'),
+ state: ekCount > 0 ? 'OK' : (zeroIsValid ? 'OK_ZERO' : 'RETRY'),

Comment thread scripts/seed-aviation.mjs
Comment on lines +267 to 271
export function declareRecords(data) {
return data?.summaries?.length ?? 0;
}

runSeed('aviation', 'ops-news', OPS_CACHE_KEY, fetchAll, {

P2 Missing isMain guard on seeders that export declareRecords

seed-aviation.mjs exports declareRecords (for contract conformance) but calls runSeed() unconditionally at the module level. The comment in seed-fsi-eu.mjs explicitly marks this guard as "required for scripts that export AND call runSeed at top level — prevents runSeed() from firing when this module is imported in tests or CI." The conformance test reads files as text so it doesn't trigger this today, but any future test that does import { declareRecords } from './seed-aviation.mjs' will immediately fire runSeed(), attempting a Redis write and calling process.exit(), killing the test runner.

The same pattern appears in ~39 other seeders in this PR (e.g. seed-climate-news.mjs, seed-conflict-intel.mjs, seed-bigmac.mjs …). seed-fsi-eu.mjs is the reference implementation:

const isMain = process.argv[1] && import.meta.url.endsWith(process.argv[1].replace(/^file:\/\//, ''));
if (isMain) {
  runSeed(...).catch(...);
}
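
As an illustration of what the guard checks, here is a sketch written as a pure function so it can be exercised without spawning a process. isMainModule is a hypothetical name, and the exact-match comparison is a stricter variant of the suffix match quoted above, not the repository's implementation. It assumes an absolute POSIX path in process.argv[1] and does not resolve symlinks.

```javascript
// Hypothetical helper: true when the module at `metaUrl` is the script Node was started with.
// Assumes `argv1` is an absolute POSIX path, as process.argv[1] normally is on Linux/macOS.
function isMainModule(metaUrl, argv1) {
  if (!argv1) return false;
  const modulePath = decodeURIComponent(new URL(metaUrl).pathname);
  return modulePath === argv1;
}

// In a seeder this would be called as isMainModule(import.meta.url, process.argv[1]).
console.log(isMainModule('file:///app/scripts/seed-aviation.mjs', '/app/scripts/seed-aviation.mjs')); // true
console.log(isMainModule('file:///app/scripts/seed-aviation.mjs', '/app/tests/runner.mjs')); // false
```

With the runSeed call wrapped in such a check, a test can import declareRecords from a seeder without triggering a Redis write or process.exit().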

koala73 added 3 commits April 15, 2026 07:40
After PR 2a enveloped 91 canonical keys as {_seed, data}, every script-side
reader that returned the raw parsed JSON started silently handing callers the
envelope instead of the bare payload. WoW baselines (bigmac, grocery-basket,
fear-greed) saw undefined .countries / .composite; seed-climate-anomalies saw
undefined .normals from climate:zone-normals:v1; seed-thermal-escalation saw
undefined .fireDetections from wildfire:fires:v1; seed-forecasts' ~40-key
pipeline batch returned envelopes for every input.

Fix: route every script-side reader through unwrapEnvelope(...).data. Legacy
bare-shape values pass through unchanged (unwrapEnvelope returns
{_seed: null, data: raw} for any non-envelope shape).

Changed:
- scripts/_seed-utils.mjs: import unwrapEnvelope; redisGet, readSeedSnapshot,
  verifySeedKey all unwrap. Exported new readCanonicalValue() helper for
  cross-seed consumers.
- 18 seed-*.mjs scripts with local redisGet-style helpers or inline fetch
  patched to unwrap via the envelope source module (subagent sweep).
- scripts/seed-forecasts.mjs pipeline batch: parse() unwraps each result.
- scripts/seed-energy-spine.mjs redisMget: unwraps each result.
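
A sketch of the batch-reader pattern these changes describe: parse each raw result, then unwrap. parseBatch and unwrapData are hypothetical names, and `results` stands in for the array of strings (or null on a miss) that a Redis MGET-style call returns. The key names come from this PR; the payload values are invented.

```javascript
// Hypothetical helpers for illustration; not the code in seed-forecasts.mjs or seed-energy-spine.mjs.
function unwrapData(raw) {
  const isEnvelope =
    raw !== null &&
    typeof raw === 'object' &&
    !Array.isArray(raw) &&
    raw._seed !== null &&
    typeof raw._seed === 'object' &&
    'data' in raw;
  return isEnvelope ? raw.data : raw;
}

function parseBatch(keys, results) {
  const out = new Map();
  keys.forEach((key, i) => {
    const text = results[i];
    if (text == null) {
      out.set(key, null); // cache miss
      return;
    }
    try {
      out.set(key, unwrapData(JSON.parse(text)));
    } catch {
      out.set(key, null); // malformed JSON is treated like a miss in this sketch
    }
  });
  return out;
}

const keys = ['wildfire:fires:v1', 'legacy:bare:v1', 'missing:key:v1'];
const results = [
  JSON.stringify({ _seed: { recordCount: 2, state: 'OK' }, data: { fireDetections: [1, 2] } }),
  JSON.stringify({ normals: [10, 11] }),
  null,
];
const parsed = parseBatch(keys, results);
console.log(parsed.get('wildfire:fires:v1').fireDetections.length); // 2
```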

Tests:
- tests/seed-utils-envelope-reads.test.mjs: 7 new cases covering envelope
  + legacy + null paths for readSeedSnapshot and verifySeedKey.
- Full seed suite: 67/67 pass (was 61, +6 new).

Addresses both of the user's P1 findings on PR #3097.

Every RPC and public-boundary reader now automatically strips _seed from
contract-mode canonical keys. Legacy bare-shape values pass through unchanged
(unwrapEnvelope no-ops on non-envelope shapes).

Changed helpers (one-place fix — unblocks ~60 call sites):
- server/_shared/redis.ts: getRawJson, getCachedJson, getCachedJsonBatch
  unwrap by default. cachedFetchJson inherits via getCachedJson.
- api/_upstash-json.js: readJsonFromUpstash unwraps (covers api/mcp.ts
  tool responses + all its canonical-key reads).
- api/bootstrap.js: getCachedJsonBatch unwraps (public-boundary —
  clients never see envelope metadata).

Left intentionally unchanged:
- api/health.js / api/seed-health.js: read only seed-meta:* keys which
  remain bare-shape during dual-write. unwrapEnvelope already imported at
  the meta-read boundary (PR 1) as a defensive no-op.

Tests: 67/67 seed tests pass. typecheck + typecheck:api clean.

This is the blast-radius fix the PR #3097 review called out — external
readers that would otherwise see {_seed, data} after the writer side
migrated.

cross-source-signals-regulatory.test.mjs loads scripts/seed-cross-source-signals.mjs
via vm.runInContext, which cannot parse ESM `export` syntax. PR 2a added
`export function declareRecords` to every seeder, which broke this test's
static-analysis approach.

Fix: strip the `export` keyword from the declareRecords line in the
preprocessed source string so the function body still evaluates as a plain
declaration.
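
The preprocessing step can be sketched as follows. This is an assumption-laden illustration: the real test evaluates the source with vm.runInContext, while `new Function` is used here only so the sketch needs no imports (both reject ESM `export` syntax with a SyntaxError). The regex, the helper names, and the `signals` field are hypothetical.

```javascript
// Stand-in seeder source containing the export that PR 2a added.
const source = [
  'export function declareRecords(data) {',
  '  return data?.signals?.length ?? 0;',
  '}',
].join('\n');

// Drop only the `export` keyword in front of the declareRecords declaration.
function stripDeclareRecordsExport(src) {
  return src.replace(/^export\s+(function\s+declareRecords\b)/m, '$1');
}

// Evaluate as a plain script body and hand back the declared function.
function evaluate(src) {
  return new Function(`${src}\nreturn declareRecords;`)();
}

let rawFails = false;
try {
  evaluate(source);
} catch (err) {
  rawFails = err instanceof SyntaxError;
}

const declareRecords = evaluate(stripDeclareRecordsExport(source));
console.log(rawFails, declareRecords({ signals: [1, 2, 3] })); // true 3
```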

Full test:data suite: 5307/5307 pass. typecheck + typecheck:api clean.

koala73 added 4 commits April 15, 2026 07:50
Wrap the 5 canonical keys written by consumer-prices-core/src/jobs/publish.ts
(overview, movers:7d/30d, freshness, categories:7d/30d/90d, retailer-spread,
basket-series) in {_seed, data} envelopes. Legacy seed-meta:<key> writes
preserved for dual-write.

Inlined a buildEnvelope helper (10 lines) rather than taking a cross-package
dependency — consumer-prices-core is a standalone npm package. Documented the
four-file parity contract (mjs source, ts mirror, js edge mirror, this copy).

Contract fields: sourceVersion='consumer-prices-core-publish-v1', schemaVersion=1,
state='OK' (recordCount>0) or 'OK_ZERO' (legitimate zero).

Typecheck: no new errors in publish.ts.

Found during final audit:

- server/worldmonitor/resilience/v1/_shared.ts: resilience score reader
  parsed cached GetResilienceScoreResponse raw. Contract-mode seed-resilience-scores
  now envelopes those keys.
- server/worldmonitor/resilience/v1/get-resilience-ranking.ts: p05/p95
  interval lookup parsed raw from seed-resilience-scores' extra-key path.
- server/worldmonitor/infrastructure/v1/_shared.ts: mgetJson() used for
  count-source keys (wildfire:fires:v1, news:insights:v1) which are both
  contract-mode now.

All three now unwrap via server/_shared/seed-envelope. Legacy shapes pass
through unchanged.

Typecheck clean.

32 canonical-key write sites in scripts/ais-relay.cjs now produce {_seed, data}
envelopes. Inlined buildEnvelope() (CJS module can't require ESM source) +
envelopeWrite(key, data, ttlSeconds, meta) wrapper. Enveloped keys span market
bootstrap, aviation, cyber-threats, theater-posture, weather-alerts, economic
spending/fred/worldbank, tech-events, corridor-risk, usni-fleet, shipping-stress,
social:reddit, wsb-tickers, pizzint, product-catalog, chokepoint transits,
ucdp-events, satellites, oref.

Left bare (not seeded data keys): seed-meta:* (dual-write legacy),
classifyCacheKey LLM cache, notam:prev-closed-state internal state,
wm:notif:scan-dedup flags.

Updated tests/ucdp-seed-resilience.test.mjs regex to accept both upstashSet
(pre-contract) and envelopeWrite (post-contract) call patterns.

54 bundle sections across 12 files now declare canonicalKey alongside the
existing seedMetaKey. _bundle-runner.mjs (from PR 1) prefers canonicalKey
when both are present — gates section runs on envelope._seed.fetchedAt
read directly from the data key, eliminating the meta-outlives-data class
of bugs.
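
A sketch of why gating on the data key avoids the meta-outlives-data case. Everything here is hypothetical apart from the canonicalKey, seedMetaKey, and _seed.fetchedAt names: the section shape, the intervalMs field, and the rule "run when the last fetch is older than the interval" are assumptions, not the logic in _bundle-runner.mjs. Values are passed in already parsed so the function stays pure.

```javascript
// values: { canonical, meta } hold parsed JSON read from Redis, or null on a miss.
function lastFetchedAt(section, values) {
  if (section.canonicalKey) {
    // Contract path: freshness comes from the envelope stored with the data itself.
    return values.canonical?._seed?.fetchedAt ?? null;
  }
  // Legacy path: freshness comes from the separate seed-meta key.
  return values.meta?.fetchedAt ?? null;
}

function shouldRunSection(section, values, now) {
  const fetchedAt = lastFetchedAt(section, values);
  if (fetchedAt === null) return true; // nothing usable stored yet
  return now - fetchedAt >= section.intervalMs;
}

const now = 1_000_000;
const freshMeta = { fetchedAt: now - 1_000, recordCount: 5 };

// Data key expired but seed-meta still looks fresh.
const legacySection = { seedMetaKey: 'seed-meta:climate:zone-normals', intervalMs: 60_000 };
const contractSection = { ...legacySection, canonicalKey: 'climate:zone-normals:v1' };

console.log(shouldRunSection(legacySection, { canonical: null, meta: freshMeta }, now)); // false
console.log(shouldRunSection(contractSection, { canonical: null, meta: freshMeta }, now)); // true
```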

Files touched:
- climate (5), derived-signals (2), ecb-eu (3), energy-sources (6),
  health (2), imf-extended (4), macro (10), market-backup (9),
  portwatch (4), relay-backup (2), resilience-recovery (5), static-ref (2)

Skipped (14 sections, 3 whole bundles): multi-key writers, dynamic
templated keys (displacement year-scoped), or non-runSeed orchestrators
(regional brief cron, resilience-scores' 222-country publish, validation/
benchmark scripts). These continue to use seedMetaKey or their own gate.

seedMetaKey preserved everywhere — dual-write. _bundle-runner.mjs falls
back to legacy when canonicalKey is absent.

All 15 bundles pass node --check. test:data: 5307/5307. typecheck:all: clean.
…mismatches + envelope leaks

Addresses both P1 findings and the extra-key seed-meta leak surfaced in review:

1. runSeed helper-level invariant: seed-meta:* keys NEVER envelope.
   scripts/_seed-utils.mjs exports shouldEnvelopeKey(key) — returns false for
   any key starting with 'seed-meta:'. Both atomicPublish (canonical) and
   writeExtraKey (extras) gate the envelope wrap through this helper. Fixes
   seed-iea-oil-stocks' ANALYSIS_META_EXTRA_KEY silently getting enveloped,
   which broke health.js parsing the value as bare {fetchedAt, recordCount}.
   Also defends against any future manual writeExtraKey(..., envelopeMeta)
   call that happens to target a seed-meta:* key.

2. seed-token-panels canonical + extras fixed.
   publishTransform returns data.defi (the defi panel itself, shape {tokens}).
   Old declareRecords counted data.defi.tokens + data.ai.tokens + data.other.tokens
   on the transformed payload → 0 → RETRY path → canonical market:defi-tokens:v1
   never wrote, and because runSeed returned before the extraKeys loop,
   market:ai-tokens:v1 + market:other-tokens:v1 stayed stale too.
   New: declareRecords counts data.tokens on the transformed shape. AI_KEY +
   OTHER_KEY extras reuse the same function (transforms return structurally
   identical panels). Added isMain guard so test imports don't fire runSeed.

3. api/product-catalog.js cached reader unwraps envelope.
   ais-relay.cjs now envelopes product-catalog:v2 via envelopeWrite(). The
   edge reader did raw JSON.parse(result) and returned {_seed, data} to
   clients, breaking the cached path. Fix: import unwrapEnvelope from
   ./_seed-envelope.js, apply after JSON.parse. One site — :238-241 is
   downstream of getFromCache(), so the single reader fix covers both.

4. Regression lock tests/seed-contract-transform-regressions.test.mjs (11 cases):
   - shouldEnvelopeKey invariant: seed-meta:* false, canonical true
   - Token-panels declareRecords works on transformed shape (canonical + both extras)
   - Explicit repro of pre-fix buggy signature returning 0 — guards against revert
   - resolveRecordCount accepts 0, rejects non-integer
   - Product-catalog envelope unwrap returns bare shape; legacy passes through
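
Two of the fixes above lend themselves to a short sketch. shouldEnvelopeKey is named in the commit, but the body below is inferred from its description. The token-panel objects and the two declareRecords variants are hypothetical stand-ins that reproduce the shape mismatch: counting on the pre-transform shape yields 0 once publishTransform has narrowed the payload to a single panel.

```javascript
// Inferred from the description: seed-meta:* keys are never enveloped.
function shouldEnvelopeKey(key) {
  return !key.startsWith('seed-meta:');
}

// Hypothetical stand-ins for seed-token-panels.
const fetched = {
  defi: { tokens: [{ price: 1 }] },
  ai: { tokens: [{ price: 2 }] },
  other: { tokens: [] },
};
const publishTransform = (data) => data.defi; // canonical key stores only the defi panel

// Pre-fix: counts on the untransformed shape, so it sees nothing after the transform.
const oldDeclareRecords = (data) =>
  (data?.defi?.tokens?.length ?? 0) +
  (data?.ai?.tokens?.length ?? 0) +
  (data?.other?.tokens?.length ?? 0);

// Post-fix: counts on the transformed { tokens } shape.
const newDeclareRecords = (data) => data?.tokens?.length ?? 0;

const published = publishTransform(fetched);
console.log(oldDeclareRecords(published), newDeclareRecords(published)); // 0 1
console.log(shouldEnvelopeKey('seed-meta:energy:iea-oil-stocks'), shouldEnvelopeKey('market:defi-tokens:v1')); // false true
```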

Verification:
- npm run test:data → 5318/5318 pass (was 5307 — 11 new regressions)
- npm run typecheck:all → clean
- node --check on every modified script

iea-oil-stocks canonical declareRecords was NOT broken (user confirmed during
review — buildIndex preserves .members); only its ANALYSIS_META_EXTRA_KEY
was affected, now covered generically by commit 1's helper invariant.

koala73 added 2 commits April 15, 2026 08:34
…ansform shape

Review finding: fixing declareRecords wasn't sufficient — atomicPublish() runs
validateFn(publishData) on the transformed payload too. seed-token-panels'
validate() checked data.defi/.ai/.other on the transformed {tokens} shape,
returned false, and runSeed took the early skipped-write branch (before even
reaching the declareRecords RETRY logic). Net effect: same as before the
declareRecords fix — canonical + both extras stayed stale.

Fix: validate() now checks the canonical defi panel directly (Array.isArray
(data?.tokens) && has at least one t.price > 0). AI/OTHER panels are validated
implicitly by their own extraKey declareRecords on write.
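
A sketch of a validate function matching the check described above (an array of tokens with at least one positive price). The body is an assumption based on that description, not the code in seed-token-panels.

```javascript
// Hypothetical validate for the transformed { tokens } panel shape.
function validate(data) {
  return Array.isArray(data?.tokens) && data.tokens.some((t) => t?.price > 0);
}

console.log(validate({ tokens: [{ price: 0 }, { price: 1.25 }] })); // true
console.log(validate({ tokens: [{ price: 0 }] })); // false
console.log(validate({ defi: { tokens: [{ price: 1 }] } })); // false
```

The last call passes the pre-transform shape and is rejected, so a validator and the payload it inspects must agree on which side of publishTransform they sit.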

Audited the other 9 seeders with publishTransform (bls-series, bis-extended,
bis-data, gdelt-intel, trade-flows, iea-oil-stocks, jodi-gas, sanctions-pressure,
forecasts): all validateFn's correctly target the post-transform shape. Only
token-panels regressed.

Added 4 regression tests (tests/seed-contract-transform-regressions.test.mjs):
- validate accepts transformed panel with priced tokens
- validate rejects all-zero-price tokens
- validate rejects empty/missing tokens
- Explicit pre-fix repro (buggy old signature fails on transformed shape)

Verification:
- npm run test:data → 5322/5322 pass (was 5318; +4 new)
- npm run typecheck:all → clean
- node --check clean

Single machine-readable gate for 'is PR #3097 working in production'.
Replaces the curl/jq ritual with one authenticated edge call that returns
HTTP 200 ok:true or 503 + failing check list.

What it validates:
- 8 canonical keys have {_seed, data} envelopes with required data fields
  and minRecords floors (fsi-eu, zone-normals, 3 token panels + minRecords
  guard against token-panels RETRY regression, product-catalog, wildfire,
  earthquakes).
- 2 seed-meta:* keys remain BARE (shouldEnvelopeKey invariant; guards
  against iea-oil-stocks ANALYSIS_META_EXTRA_KEY-class regressions).
- /api/product-catalog + /api/bootstrap responses contain no '_seed' leak.

Auth: x-probe-secret header must match RELAY_SHARED_SECRET (reuses existing
Vercel↔Railway internal trust boundary).

Probe logic is exported (checkProbe, checkPublicBoundary, DEFAULT_PROBES) for
hermetic testing. tests/seed-contract-probe.test.mjs covers every branch:
envelope pass/fail on field/records/shape, bare pass/fail on shape/field,
missing/malformed JSON, Redis non-2xx, boundary seed-leak detection,
DEFAULT_PROBES sanity (seed-meta invariant present, token-panels minRecords
guard present).
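
A sketch of what a per-key probe check might look like. checkProbe and the probe fields shape, dataHas, and minRecords appear in this PR; the body, the reason strings, and the use of dataHas for bare keys are assumptions. The example also shows why a probe without minRecords cannot catch a zero-record envelope.

```javascript
// Hypothetical probe check over the raw string read from Redis (or null on a miss).
function checkProbe(probe, rawText) {
  if (rawText == null) return { pass: false, reason: 'missing' };
  let value;
  try {
    value = JSON.parse(rawText);
  } catch {
    return { pass: false, reason: 'malformed-json' };
  }
  const isObject = value !== null && typeof value === 'object';
  const hasSeed = isObject && value._seed != null && 'data' in value;
  if (probe.shape === 'envelope' && !hasSeed) return { pass: false, reason: 'expected-envelope' };
  if (probe.shape === 'bare' && hasSeed) return { pass: false, reason: 'expected-bare' };

  const payload = hasSeed ? value.data : value;
  for (const field of probe.dataHas ?? []) {
    const present = payload !== null && typeof payload === 'object' && field in payload;
    if (!present) return { pass: false, reason: `missing-field:${field}` };
  }
  // recordCount is only inspected when the probe sets a floor.
  if (hasSeed && probe.minRecords != null && !(value._seed.recordCount >= probe.minRecords)) {
    return { pass: false, reason: `records<${probe.minRecords}` };
  }
  return { pass: true };
}

const emptyPanel = JSON.stringify({ _seed: { recordCount: 0, state: 'OK' }, data: { tokens: [] } });
console.log(checkProbe({ shape: 'envelope', dataHas: ['tokens'] }, emptyPanel).pass); // true
console.log(checkProbe({ shape: 'envelope', dataHas: ['tokens'], minRecords: 1 }, emptyPanel).pass); // false
```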

Usage:
  curl -H "x-probe-secret: $RELAY_SHARED_SECRET" \
       https://api.worldmonitor.app/api/seed-contract-probe

PR 3 will extend the probe with a stricter mode that asserts seed-meta:*
keys are GONE (not just bare) once legacy dual-write is removed.

Verification:
- tests/seed-contract-probe.test.mjs → 15/15 pass
- npm run test:data → 5338/5338 (was 5322; +16 new incl. conformance)
- npm run typecheck:all → clean
…th source header

Review P2 findings: the probe's stated guards were weaker than advertised.

1. market:ai-tokens:v1 + market:other-tokens:v1 probes claimed to guard the
   token-panels extra-key RETRY regression but only checked shape='envelope'
   + dataHas:['tokens']. If an extra-key declareRecords regressed to 0, both
   probes would still pass because checkProbe() only inspects _seed.recordCount
   when minRecords is set. Now both enforce minRecords: 1.

2. /api/product-catalog boundary check only asserted no '_seed' leak — which
   is also true for the static fallback path. A broken cached reader
   (getFromCache returning null or throwing) could serve fallback silently
   and still pass this probe. Now:
   - api/product-catalog.js emits X-Product-Catalog-Source: cache|dodo|fallback
     on the response (the json() helper gained an optional source param wired
     to each of the three branches).
   - checkPublicBoundary declaratively requires that header's value match
     'cache' for /api/product-catalog, so a fallback-serve fails the probe
     with reason 'source:fallback!=cache' or 'source:missing!=cache'.
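
A sketch of the boundary check with the source-header requirement. checkPublicBoundary, requireSourceHeader, and the two reason strings come from the commit text; the config shape ({ name, value }), the plain response object, and the ordering of the status check relative to the leak check are assumptions.

```javascript
// Hypothetical check over an already-fetched response:
// { status, headers (lower-cased names), body (string) }.
function checkPublicBoundary(check, response) {
  if (response.status !== 200) return { pass: false, reason: `status:${response.status}` };
  // The leak check runs before the source check so a leak is never masked.
  if (response.body.includes('"_seed"')) return { pass: false, reason: 'seed-leak' };
  if (check.requireSourceHeader) {
    const { name, value } = check.requireSourceHeader;
    const actual = response.headers[name] ?? 'missing';
    if (actual !== value) return { pass: false, reason: `source:${actual}!=${value}` };
  }
  return { pass: true };
}

const catalogCheck = {
  path: '/api/product-catalog',
  requireSourceHeader: { name: 'x-product-catalog-source', value: 'cache' },
};

const fromFallback = {
  status: 200,
  headers: { 'x-product-catalog-source': 'fallback' },
  body: '{"products":[]}',
};
console.log(checkPublicBoundary(catalogCheck, fromFallback).reason); // source:fallback!=cache
```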

Test updates (tests/seed-contract-probe.test.mjs):
- Boundary check reworked to use a BOUNDARY_CHECKS config with optional
  requireSourceHeader per endpoint.
- New cases: served-from-cache passes, served-from-fallback fails with source
  mismatch, missing header fails, seed-leak still takes precedence, bad
  status fails.
- Token-panels sanity test now asserts minRecords≥1 on all 3 panels.

Verification:
- tests/seed-contract-probe.test.mjs → 17/17 pass (was 15, +2 net)
- npm run test:data → 5340/5340
- npm run typecheck:all → clean

@koala73 koala73 merged commit 0445983 into main Apr 15, 2026
10 checks passed
@koala73 koala73 deleted the feat/seed-contract-pr2-migrate branch April 15, 2026 05:16

koala73 added a commit that referenced this pull request Apr 16, 2026
…n Railway IP

Railway logs.1776312819911.log showed seed-climate-zone-normals failing
every batch with HTTP 429 from Open-Meteo's free-tier per-IP throttle
(2026-04-16). The seeder retried with 2/4/8/16s backoff but exhausted
without ever falling back to the project's Decodo proxy infrastructure
that other rate-limited sources (FRED, IMF) already use.

Open-Meteo throttles by source IP. Railway containers share IP pools and
get 429 storms whenever zone-normals fires (monthly cron — high churn
when it runs). Result: PR #3097's bake clock for climate:zone-normals:v1
couldn't start, because the seeder couldn't write the contract envelope
even when manually triggered.

Fix: after direct retries exhaust, _open-meteo-archive.mjs falls back to
httpsProxyFetchRaw (Decodo) — same pattern as fredFetchJson and
imfFetchJson in _seed-utils.mjs. Skips silently if no proxy is configured
(preserves existing behavior in non-Railway envs).

Added tests/open-meteo-proxy-fallback.test.mjs (4 cases):
- 429 with no proxy → throws after exhausting retries (pre-fix behavior preserved)
- 200 OK → returns parsed batch without touching proxy path
- batch size mismatch → throws even on 200
- Non-retryable 500 → break out, attempt proxy, throw exhausted (no extra
  direct retry — matches new control flow)

Verification: npm run test:data → 5359/5359, +4 new. node --check clean.

Same pattern can be applied to any other helper that fetches Open-Meteo
(grep 'open-meteo' scripts/) if more 429s show up.

koala73 added a commit that referenced this pull request Apr 16, 2026
…n Railway IP (#3118)

* fix(seed-climate-zone-normals): proxy fallback when Open-Meteo 429s on Railway IP

* fix: proxy fallback runs on thrown direct errors + actually-exercised tests

Addresses two PR #3118 review findings.

P1: catch block did 'throw err' on the final direct attempt, silently
bypassing the proxy fallback for thrown-error cases (timeout, ECONNRESET,
DNS failures). Only non-OK HTTP responses reached the proxy path. Fix:
record the error in lastDirectError and 'break' so control falls through
to the proxy fallback regardless of whether the direct path failed via
thrown error or non-OK status.

Also: include lastDirectError context in the final 'retries exhausted'
message + Error.cause so on-call can see what triggered the fallback
attempt (was: opaque 'retries exhausted').

P2: tests didn't exercise the actual proxy path. Refactored helper to
accept _proxyResolver and _proxyFetcher opt overrides (production
defaults to real resolveProxy/httpsProxyFetchRaw from _seed-utils.mjs;
tests inject mocks). Added 4 new cases:

- 429 + proxy succeeds → returns proxy data
- thrown fetch error on final retry → proxy fallback runs (P1 regression
  guard with explicit assertion: directCalls=2, proxyCalls=1)
- 429 + proxy ALSO fails → throws exhausted, original HTTP 429 in
  message + cause chain
- Proxy returns wrong batch size → caught + warns + throws exhausted
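
The control flow described in this commit can be sketched with injected fetchers. This is an illustration only: fetchWithProxyFallback, directFetch, proxyFetch, and maxAttempts are hypothetical names (loosely echoing the _proxyFetcher override), backoff delays are omitted, and the response shape ({ ok, status, data }) is assumed.

```javascript
// Hypothetical sketch, not the code in _open-meteo-archive.mjs.
async function fetchWithProxyFallback({ directFetch, proxyFetch, maxAttempts = 2 }) {
  let lastDirectError = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      const res = await directFetch();
      if (res.ok) return res.data;
      lastDirectError = new Error(`HTTP ${res.status}`);
      if (res.status !== 429) break; // non-retryable status: no further direct attempts
    } catch (err) {
      // Record instead of rethrowing, so a thrown error on the last attempt
      // still falls through to the proxy path below.
      lastDirectError = err;
    }
  }
  if (proxyFetch) {
    try {
      return await proxyFetch();
    } catch (proxyErr) {
      console.warn(`proxy fallback failed: ${proxyErr.message}`);
    }
  }
  throw new Error(`retries exhausted (last direct error: ${lastDirectError?.message})`, {
    cause: lastDirectError,
  });
}

let directCalls = 0;
let proxyCalls = 0;
fetchWithProxyFallback({
  directFetch: async () => {
    directCalls += 1;
    throw new Error('ECONNRESET');
  },
  proxyFetch: async () => {
    proxyCalls += 1;
    return { batch: [1] };
  },
}).then(() => console.log(directCalls, proxyCalls)); // 2 1
```

Attaching the last direct error as Error.cause keeps the trigger for the fallback visible when both paths fail.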

Verification:
- tests/open-meteo-proxy-fallback.test.mjs: 8/8 pass (4 added)
- npm run test:data: 5363/5363 pass (+4 from prior 5359)
- node --check clean

koala73 added a commit that referenced this pull request Apr 16, 2026
…le.relay

Root cause of chokepointFlows STALE_SEED (1911min stale, maxStaleMin=720):
since 2026-04-14 (PR #3097/#3101 landing), scripts/_seed-utils.mjs imports
_seed-envelope-source.mjs and _seed-contract.mjs. Dockerfile.relay COPY'd
_seed-utils.mjs but NOT its new transitive dependencies, so every execFile
invocation of seed-chokepoint-flows.mjs, seed-climate-news.mjs, and
seed-ember-electricity.mjs crashed at import with ERR_MODULE_NOT_FOUND.
The ais-relay loop kept firing every 6h but each child died instantly —
no visible error because execFile only surfaces child stderr to the
parent relay's log stream.

Local repro: node scripts/seed-chokepoint-flows.mjs runs fine in 3.6s
and writes 7 records. Same command inside the relay container would
throw at the import line because the file doesn't exist.

Fix:
1. Add COPY scripts/_seed-envelope-source.mjs and
   COPY scripts/_seed-contract.mjs to Dockerfile.relay.
2. Add a static guard test (tests/dockerfile-relay-imports.test.mjs)
   that BFS's the transitive-import graph from every COPY'd entrypoint
   and fails if any reached scripts/*.mjs|cjs isn't also COPY'd. This
   would have caught the original regression.
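
The idea behind the guard test can be sketched over an in-memory file map. This is a simplified illustration, not the test in tests/dockerfile-relay-imports.test.mjs: findUncopied is a hypothetical name, the body of collectRelativeImports is a guess, all files are assumed to live in one flat directory, and only static ESM imports of './x' paths are matched.

```javascript
// Extract './x' specifiers from static ESM imports (with or without a `from` clause).
function collectRelativeImports(source) {
  const re = /\bimport\s+(?:[^'"]*?\s+from\s+)?['"](\.\/[^'"]+)['"]/g;
  return [...source.matchAll(re)].map((m) => m[1].slice(2));
}

// Breadth-first walk from every copied file; report reachable files that were not copied.
function findUncopied(files, copied) {
  const missing = new Set();
  const seen = new Set(copied);
  const queue = [...copied];
  while (queue.length > 0) {
    const file = queue.shift();
    const source = files[file];
    if (source === undefined) continue;
    for (const dep of collectRelativeImports(source)) {
      if (!copied.includes(dep)) missing.add(dep);
      if (!seen.has(dep)) {
        seen.add(dep);
        queue.push(dep);
      }
    }
  }
  return [...missing].sort();
}

const files = {
  'seed-chokepoint-flows.mjs': "import { runSeed } from './_seed-utils.mjs';",
  '_seed-utils.mjs': [
    "import { unwrapEnvelope } from './_seed-envelope-source.mjs';",
    "import { validateDescriptor } from './_seed-contract.mjs';",
  ].join('\n'),
  '_seed-envelope-source.mjs': '',
  '_seed-contract.mjs': '',
};
const copied = ['seed-chokepoint-flows.mjs', '_seed-utils.mjs'];
console.log(findUncopied(files, copied)); // [ '_seed-contract.mjs', '_seed-envelope-source.mjs' ]
```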

Matches feedback_dockerfile_relay_explicit_copy.md — we now have a test
enforcing it.

koala73 added a commit that referenced this pull request Apr 16, 2026
…kepointFlows stale 32h (#3132)

* fix(relay): COPY _seed-envelope-source + _seed-contract into Dockerfile.relay

* fix(test): scanner also covers require() and createRequire(...)(...) — greptile P2

Review finding on PR #3132: collectRelativeImports only matched ESM
import/export syntax, so require('./x.cjs') in ais-relay.cjs and
createRequire(import.meta.url)('./x.cjs') in _seed-utils.mjs were
invisible to the guard. No active bug (_proxy-utils.cjs is already
COPY'd) but a future createRequire pointing at a new uncopied helper
would slip through.

Two regexes now cover both forms:
- cjsRe: direct require('./x') — with a non-identifier lookbehind so
  'thisrequire(' or 'foorequire(' can't match.
- createRequireRe: createRequire(...)('./x') chained-call — the outer
  call is applied to createRequire's return value, not to a 'require('
  token, so the first regex misses it on its own.
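
A sketch of two patterns in the spirit of cjsRe and createRequireRe. The variable names come from the commit message, but the regex bodies and the collectRequires helper are assumptions; the real patterns may differ.

```javascript
// Direct require('./x'); the lookbehind rejects identifiers that merely end in "require".
const cjsRe = /(?<![\w$])require\(\s*['"](\.\/[^'"]+)['"]\s*\)/g;
// Chained call: createRequire(...)('./x'). There is no lowercase `require(` token here,
// so cjsRe cannot match it on its own.
const createRequireRe = /createRequire\([^)]*\)\(\s*['"](\.\/[^'"]+)['"]\s*\)/g;

function collectRequires(source) {
  const direct = [...source.matchAll(cjsRe)].map((m) => m[1]);
  const chained = [...source.matchAll(createRequireRe)].map((m) => m[1]);
  return [...direct, ...chained];
}

console.log(collectRequires("const p = require('./_proxy-utils.cjs');")); // [ './_proxy-utils.cjs' ]
console.log(collectRequires("const p = createRequire(import.meta.url)('./_proxy-utils.cjs');")); // [ './_proxy-utils.cjs' ]
console.log(collectRequires("thisrequire('./x.cjs')")); // []
```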

Added a unit test asserting both forms resolve on known sites
(_seed-utils.mjs and ais-relay.cjs) so the next edit to this file
can't silently drop coverage.