
fix(resilience): revert overall score to domain-weighted average + fix RSF direction#2847

Merged: koala73 merged 3 commits into main from fix/resilience-formula-revert-rsf-direction on Apr 9, 2026


@koala73 koala73 commented Apr 9, 2026

Summary

  • Overall score formula reverted from baseline*(1-stressFactor) to sum(domainScore * domainWeight). The multiplicative formula crushed all scores by 30-50% (Norway scored 60.7 instead of ~83, Germany 39.9 instead of ~68).
  • RSF press freedom direction fixed: normalizeHigherBetter changed to normalizeLowerBetter since RSF uses 0=best, 100=worst. Norway's raw score of 6.52 was producing 7 instead of ~93.
  • Seed script ranking write removed: the handler owns ranking generation with proper greyedOut split. Script becomes a read-only health check.
  • Widget "Impact: -X%" row removed: stressFactor no longer drives the headline score.
  • Cache keys bumped: score v6, ranking v6, history v3 (new formula = incomparable scores).
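The difference between the two formulas can be sketched as follows. This is a minimal illustration, not the code in `_shared.ts`: the weights are the ones quoted in the review summary in this thread, while the domain key names, the function names, and the sample inputs are assumptions made for the example.

```typescript
// Domain weights as quoted in the review summary (they sum to 1.0).
// The key names are hypothetical; only the numeric weights come from the PR.
const DOMAIN_WEIGHTS = {
  economic: 0.22,
  infrastructure: 0.20,
  energy: 0.15,
  socialGovernance: 0.25,
  healthFood: 0.18,
} as const;

type DomainId = keyof typeof DOMAIN_WEIGHTS;
type DomainScores = Record<DomainId, number>;

// Restored formula: sum(domainScore * domainWeight).
function overallScoreWeighted(scores: DomainScores): number {
  return (Object.keys(DOMAIN_WEIGHTS) as DomainId[]).reduce(
    (sum, id) => sum + scores[id] * DOMAIN_WEIGHTS[id],
    0,
  );
}

// Reverted formula: baseline * (1 - stressFactor).
function overallScoreMultiplicative(baseline: number, stressFactor: number): number {
  return baseline * (1 - stressFactor);
}

// Hypothetical inputs for illustration only, not real country data.
const example: DomainScores = {
  economic: 80,
  infrastructure: 85,
  energy: 75,
  socialGovernance: 90,
  healthFood: 85,
};

console.log(overallScoreWeighted(example).toFixed(2));
console.log(overallScoreMultiplicative(83, 0.27).toFixed(2));
```

With the weighted form, a country scoring in the 80s on every domain stays in the 80s overall; the multiplicative form scales the whole baseline down by the stress factor, which is the over-penalization described above.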

Test plan

  • All 85 resilience tests pass (dimension scorers, scorers, release gate, handlers, ranking, seed, widget)
  • TypeScript checks pass (both tsconfig.json and tsconfig.api.json)
  • Pre-push hooks pass (typecheck, tests, lint, edge function isolation)
  • Verify Norway overall >= 75 in production after deploy
  • Verify Germany overall >= 60 in production after deploy
  • Verify Western Sahara is greyed out (not in ranked items)

…x RSF direction

1. overallScore reverted from baseline*(1-stressFactor) to
   sum(domainScore * domainWeight) — the multiplicative formula
   crushed all scores by 30-50%
2. RSF press freedom: normalizeHigherBetter → normalizeLowerBetter
   (RSF 0=best, 100=worst; Norway 6.52 was scoring 7 instead of 93)
3. Seed script ranking write removed (handler owns greyedOut split)
4. Widget Impact row removed (stressFactor no longer drives headline)
5. Cache keys bumped: score v6, ranking v6, history v3

vercel Bot commented Apr 9, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
worldmonitor Ignored Ignored Preview Apr 9, 2026 4:46am



greptile-apps Bot commented Apr 9, 2026

Greptile Summary

This PR reverts the overall resilience score from a multiplicative baseline*(1-stressFactor) formula back to a domain-weighted average, fixes the RSF press-freedom direction (normalizeLowerBetter replaces normalizeHigherBetter), removes the ranking write from the seed script, strips the widget's "Impact" row, and bumps all cache keys to v6/v3 to invalidate stale data.

The formula revert and RSF direction fix are both correct: domain weights sum to exactly 1.0 (0.22+0.20+0.15+0.25+0.18), and RSF scores (0=best, 100=worst) now map as intended so Norway's raw score of 6.52 will produce ~93 rather than ~7.
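The direction bug is easy to see with a pair of linear min-max normalizers. The sketch below assumes both helpers are simple clamped linear maps onto 0-100; the real implementations in `_dimension-scorers.ts` are not shown in this thread and may differ in detail.

```typescript
// Hypothetical linear min-max normalizers; assumed shape, not the repo's code.
function clamp(value: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, value));
}

// Higher raw value maps to a higher normalized score.
function normalizeHigherBetter(value: number, min: number, max: number): number {
  return clamp(((value - min) / (max - min)) * 100, 0, 100);
}

// Lower raw value maps to a higher normalized score.
function normalizeLowerBetter(value: number, min: number, max: number): number {
  return clamp(((max - value) / (max - min)) * 100, 0, 100);
}

// Norway's raw RSF score from the PR description (0 = best, 100 = worst).
const norwayRsf = 6.52;

console.log(Math.round(normalizeHigherBetter(norwayRsf, 0, 100))); // 7  (the bug)
console.log(Math.round(normalizeLowerBetter(norwayRsf, 0, 100))); // 93 (the fix)
```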

Confidence Score: 5/5

Safe to merge — all changes are correct bug fixes with consistent cache-key versioning and full test coverage.

The two core fixes (RSF direction and overall score formula) are mathematically correct; domain weights verified to sum to 1.0; cache keys are bumped consistently across server, seed, and health check; the only remaining finding is a P2 orphaned export with no runtime impact.

No files require special attention; the orphaned RESILIENCE_RANKING_CACHE_TTL_SECONDS export in scripts/seed-resilience-scores.mjs is cosmetic only.

Vulnerabilities

No security concerns identified. The changes are formula corrections, cache key bumps, and UI cleanup with no auth, input validation, or data-exposure implications.

Important Files Changed

| Filename | Overview |
| --- | --- |
| server/worldmonitor/resilience/v1/_shared.ts | Core fix: overall score reverted to domain-weighted average; stressFactor still computed and returned for API compatibility; cache keys bumped to v6/v3; weights sum to 1.0. |
| server/worldmonitor/resilience/v1/_dimension-scorers.ts | RSF press-freedom direction corrected: normalizeHigherBetter → normalizeLowerBetter(rsfScore, 0, 100), aligning with RSF's 0=best, 100=worst scale. |
| scripts/seed-resilience-scores.mjs | Seed script converted to read-only health check; ranking write and helper functions removed; cache prefix/key bumped to v6; RESILIENCE_RANKING_CACHE_TTL_SECONDS kept exported but no longer used inside the script. |
| src/components/ResilienceWidget.ts | Widget "Impact: -X%" row removed; baseline/stress display retained; no other functional changes. |
| api/health.js | Single cache key bump: resilience:ranking:v5 → resilience:ranking:v6; consistent with server-side and seed-script constants. |
| tests/resilience-scores-seed.test.mjs | Tests trimmed to assert v6 key constants and confirm ranking helpers are no longer exported; no logic under test is lost. |
| tests/resilience-handlers.test.mts | History key bumped to v3 in fixtures; handler test asserts score cache uses resilience:score:v6:US, consistent with the new constants. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Country Request] --> B[scoreAllDimensions]
    B --> C[buildDimensionList]
    C --> D["buildDomainList\ncoverage-weighted mean per domain"]
    D --> E["overallScore = sum of domain.score x domain.weight\neconomic 22% + infra 20% + energy 15%\nsocial-gov 25% + health-food 18%"]
    E --> F["classifyResilienceLevel\nhigh>=70 / medium>=40 / low"]
    F --> G["Cache: resilience:score:v6:iso2"]

    subgraph RSF_Fix ["RSF Direction Fix"]
      H["Raw RSF score 0=best 100=worst"]
      H --> I["normalizeLowerBetter(rsfScore, 0, 100)\nNorway 6.52 maps to ~93\nwas normalizeHigherBetter producing ~7"]
    end

    subgraph Ranking ["Ranking (Vercel handler)"]
      G --> J["Build greyedOut split\nwarm missing scores"]
      J --> K["Cache: resilience:ranking:v6"]
    end

    subgraph Seed ["Seed Script - read-only"]
      L["Railway cron every 5h"] --> M["Read static index"]
      M --> N["Count cached scores\nno Redis writes"]
    end
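
The classification and caching steps at the end of the flowchart can be sketched as below. The thresholds and the key prefix come from the flowchart; the function bodies and the `scoreCacheKey` helper name are assumptions for illustration.

```typescript
type ResilienceLevel = "high" | "medium" | "low";

// Thresholds from the flowchart: high >= 70, medium >= 40, otherwise low.
function classifyResilienceLevel(overallScore: number): ResilienceLevel {
  if (overallScore >= 70) return "high";
  if (overallScore >= 40) return "medium";
  return "low";
}

// Hypothetical helper; only the "resilience:score:v6:" prefix is from the PR.
function scoreCacheKey(iso2: string): string {
  return `resilience:score:v6:${iso2}`;
}

// Germany under the reverted multiplicative formula vs. the expected value.
console.log(classifyResilienceLevel(39.9)); // low
console.log(classifyResilienceLevel(68)); // medium
console.log(scoreCacheKey("US")); // resilience:score:v6:US
```

Under these thresholds, the 39.9 that Germany received from the multiplicative formula falls just below the medium cutoff, so the formula change affects the level label as well as the headline number.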

Reviews (1): Last reviewed commit: "fix(resilience): revert overall score to..."

Comment thread scripts/seed-resilience-scores.mjs Outdated
…ead-only seed

1. Validation scripts (backtest, correlation, sensitivity) updated from
   v5 to v6 cache keys. Sensitivity formula updated to domain-weighted.
2. Seed script lock removed — read-only health check needs no lock.
@koala73 koala73 merged commit 09ed68d into main Apr 9, 2026
10 checks passed
@koala73 koala73 deleted the fix/resilience-formula-revert-rsf-direction branch April 9, 2026 04:49
koala73 added a commit that referenced this pull request Apr 11, 2026
Why this PR?

Captures the 90-day plan to upgrade Country Resilience to a
reference-grade composite index, decomposed into three phases:
transparency + calibration, structural rebuild with three pillars,
explanatory product.

Colocates the plan with the internal origin review that prompted
it, so every carried-forward decision has a working reference to
its source.

Locks the construct memo (definition, horizon, audience, polarity,
partly non-compensatory aggregation via penalized weighted mean
with tunable alpha) before any code begins, so the index cannot be
re-argued dimension by dimension later.

Splits the product into an Annual Reference Edition + Live Monitor,
per INFORM, ND-GAIN, WorldRiskIndex, and FSI precedent.

Prerequisite PRs verified merged before landing:

- #2821 (baseline / stress engine)
- #2847 (formula revert + RSF direction fix)
- #2858 (seed direct scoring)

Docs only. docs/internal/ is gitignored; both files are force-added,
matching the existing youtube-desktop.md precedent in the same
directory. No runtime impact.

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
Why this PR?

Ships the Phase 1 T1.1 deliverable from the country-resilience
reference-grade upgrade plan: reproduce the origin-doc "Norway and US
both hit 100" ceiling bug with a failing test before any fix lands.

The plan explicitly committed to reproducing-before-fixing and to
updating the origin doc's changelog if the symptom turned out to be
misattributed. This PR documents the measured outcome and pins the
current correct behavior.

Investigation outcome: the claim does NOT reproduce.

Measured scores under the current release-gate fixtures and the
post-PR-#2847 domain-weighted-average formula:

  NO (elite tier)   overallScore = 86.58, baseline 86.85, stress 84.36
  US (strong tier)  overallScore = 72.80, baseline 73.15, stress 70.58
  Delta             NO - US = 13.78 points

Nothing pins at 100. The ordering elite > strong > stressed > fragile
is preserved, and the domain-weighted average cannot reach 100 unless
every dimension saturates (which does not happen for any fixture tier).
The origin claim is misattributed or stale, probably predating PR #2847
which reverted the multiplicative baseline*(1-stressFactor) formula
that had over-penalized all countries.

What this PR commits:

- A new regression test in tests/resilience-release-gate.test.mts
  asserting that NO and US are not pinned at 100, that NO > US, and
  that the delta is at least 3 points. The threshold leaves room for
  fixture tuning without over-fitting to the measured 13.78 delta.
- A detailed comment block inside the test capturing the measured
  numbers, the conclusion, and the rationale so future regressions to
  a real ceiling bug are caught immediately and so the origin-doc
  changelog update (tracked separately) has a cite-able evidence base.
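
A minimal sketch of the three assertions described above, using the measured scores from this message. The `TierScore` shape and the `assertNoCeiling` helper are hypothetical; the actual test lives in tests/resilience-release-gate.test.mts and is not reproduced here.

```typescript
// Hypothetical shape; only overallScore is needed for the check.
interface TierScore {
  overallScore: number;
}

// Throws if either score is pinned at 100, if the ordering inverts,
// or if the gap shrinks below the minimum delta (3 points per the PR).
function assertNoCeiling(no: TierScore, us: TierScore, minDelta = 3): void {
  if (no.overallScore >= 100 || us.overallScore >= 100) {
    throw new Error("a score is pinned at 100");
  }
  if (!(no.overallScore > us.overallScore)) {
    throw new Error("expected NO > US");
  }
  if (no.overallScore - us.overallScore < minDelta) {
    throw new Error(`expected a delta of at least ${minDelta} points`);
  }
}

// Measured values quoted in this commit message.
assertNoCeiling({ overallScore: 86.58 }, { overallScore: 72.8 });
console.log("no ceiling detected");
```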

The origin-doc changelog update is deliberately deferred to a
trailing commit after PR #2938 (the reference-grade plan, which
contains the origin doc) merges, to avoid a cross-branch conflict.

Side finding, out of scope for this PR, tracked as a follow-up:
the release-gate fixtures use qualityFor(profile) to derive all
inputs from a single quality value per tier (elite/strong/stressed/
fragile), so every country within a tier produces identical scores.
This means the release-gate suite cannot detect within-tier ordering
issues. Not a scorer bug, a fixture-design limitation.

Prerequisite PRs verified merged:
- #2821 (baseline / stress engine)
- #2847 (formula revert + RSF direction fix)
- #2858 (seed direct scoring)

Test suite: 172/172 resilience tests pass locally. Typecheck clean.

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
Why this PR?

Ships Phase 1 T1.4 of the country-resilience reference-grade upgrade
plan: surface the dataVersion field in the resilience widget footer so
analysts can see how fresh the underlying source data is.

The dataVersion field was added to the response proto in PR #2821 and
is already populated end-to-end on the server side: the Railway
static-seed job writes fetchedAt into seed-meta:resilience:static, and
buildResilienceScore in server/worldmonitor/resilience/v1/_shared.ts
reads it back, slices to YYYY-MM-DD, and returns it on every country
score response. The handler test at tests/resilience-handlers.test.mts
already verifies that round-trip.

What was missing: the widget never rendered the field. LOCKED_PREVIEW
and the widget test fixture both carry dataVersion, but no call site
surfaced it to users. This PR closes the gap with a narrow UI-only
change, no schema or scorer edits.

What this PR commits:

- formatResilienceDataVersion(dataVersion) helper in
  src/components/resilience-widget-utils.ts. Validates the value is an
  ISO date YYYY-MM-DD via regex; returns an empty string for null,
  undefined, or malformed inputs so the caller skips rendering instead
  of showing a dangling "Data " label.
- ResilienceWidget.ts footer render: adds a resilience-widget__data-version
  span next to the existing confidence and 30d-delta spans, only when
  the formatter returns non-empty. Tooltip explains the source.
- Three new widget-format tests in tests/resilience-widget.test.mts:
  (1) formats a valid ISO date as "Data YYYY-MM-DD", (2) returns empty
  for malformed or missing input, (3) regression assertion that
  baseResponse still carries dataVersion so future schema edits that
  drop the field break visibly.
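
The formatter behavior described above can be sketched as follows. The exact regex and the parameter type are assumptions; the message only states that the value is validated as YYYY-MM-DD and that invalid input yields an empty string.

```typescript
// Assumed pattern: four digits, two digits, two digits, hyphen-separated.
const ISO_DATE_PATTERN = /^\d{4}-\d{2}-\d{2}$/;

// Returns "Data YYYY-MM-DD" for a well-formed date, or "" so the caller
// can skip rendering instead of showing a dangling "Data " label.
function formatResilienceDataVersion(dataVersion: string | null | undefined): string {
  if (typeof dataVersion !== "string" || !ISO_DATE_PATTERN.test(dataVersion)) {
    return "";
  }
  return `Data ${dataVersion}`;
}

console.log(formatResilienceDataVersion("2026-04-09")); // Data 2026-04-09
console.log(formatResilienceDataVersion("04/09/2026") === ""); // true
console.log(formatResilienceDataVersion(null) === ""); // true
```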

Scope boundaries (not in this PR):

- No per-dimension freshness badges. That is T1.5 (source-recency
  badges, lastObservedAt + staleness per signal) and has its own plan
  task and its own PR.
- No existing em-dash fix in formatResilienceConfidence. The string
  "Low confidence — sparse data" at line 42 of resilience-widget-utils.ts
  predates this PR and is a separate memory-rule violation; left to a
  dedicated cleanup PR.

Prerequisite PRs (verified merged):
- #2821 (baseline / stress engine, added the dataVersion field)
- #2847 (formula revert + RSF direction fix)
- #2858 (seed direct scoring)

Testing:
- npx tsx --test tests/resilience-widget.test.mts: 9/9 passing
- npx tsx --test tests/resilience-*.test.mts tests/resilience-*.test.mjs: 174/174 passing
- npm run typecheck: clean

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
Why this PR?

Ships the foundation slice of Phase 1 T1.7 of the country-resilience
reference-grade upgrade plan: tag every absence-based imputation with
one of four semantic classes so downstream consumers can distinguish
"nothing is happening" from "we do not know" from "upstream is down"
from "the dimension does not apply."

Scope of this PR (foundation-only):

- Adds an `ImputationClass` type union with four values:
  - `stable-absence`: the source publishes globally and the country is
    not listed, which means the tracked phenomenon is not happening
    (e.g., no IPC Phase 3+, no UCDP event). Strong positive signal.
  - `unmonitored`: the source is a curated list that may not cover
    every country. Absence is ambiguous; penalized conservatively.
  - `source-failure`: reserved for the runtime path that consults
    seed-meta failedDatasets. Not yet wired; comment explains this
    will land in T1.9.
  - `not-applicable`: reserved for structural N/A (e.g., landlocked
    country has no maritime exposure). No current scorer branches on
    it; reserved for future dimensions.
- Introduces an `ImputationEntry` interface so both the generic
  `IMPUTATION` table and the per-metric `IMPUTE` overrides share a
  single shape with `score`, `certaintyCoverage`, and `imputationClass`.
- Tags every existing table entry with its class:
  - `crisis_monitoring_absent` (IPC, UCDP, UNHCR global feeds) ->
    stable-absence
  - `curated_list_absent` (BIS, WTO curated lists) -> unmonitored
  - `ipcFood` -> stable-absence (food-specific override)
  - `wtoData` -> unmonitored (trade-specific override)
  - `unhcrDisplacement` -> stable-absence (displacement-specific)
  - `bisEer`, `bisCredit` inherit via shared reference (same object)
- Uses `as const satisfies Record<string, ImputationEntry>` so the
  literal types stay narrow (required for the existing call sites
  that destructure specific fields) while the compiler enforces the
  shape on every entry.
- Exports `IMPUTATION`, `IMPUTE`, and `ImputationClass` so tests and
  (later) downstream consumers in T1.5 / T1.6 / T1.9 can import them.
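
The shape described above can be sketched like this. The type names, field names, table keys, and class assignments follow this message; every numeric `score` and `certaintyCoverage` value is a placeholder chosen for illustration, since the real constants are not quoted here.

```typescript
type ImputationClass =
  | "stable-absence"
  | "unmonitored"
  | "source-failure"
  | "not-applicable";

interface ImputationEntry {
  score: number;
  certaintyCoverage: number;
  imputationClass: ImputationClass;
}

// Placeholder numbers, NOT the real constants from _dimension-scorers.ts.
// `as const satisfies` keeps literal types narrow while checking the shape.
const IMPUTATION = {
  crisis_monitoring_absent: { score: 85, certaintyCoverage: 0.7, imputationClass: "stable-absence" },
  curated_list_absent: { score: 50, certaintyCoverage: 0.3, imputationClass: "unmonitored" },
} as const satisfies Record<string, ImputationEntry>;

const IMPUTE = {
  ipcFood: { score: 88, certaintyCoverage: 0.7, imputationClass: "stable-absence" },
  wtoData: { score: 60, certaintyCoverage: 0.4, imputationClass: "unmonitored" },
  unhcrDisplacement: { score: 85, certaintyCoverage: 0.6, imputationClass: "stable-absence" },
  // Shared reference: both BIS metrics reuse the same parent object.
  bisEer: IMPUTATION.curated_list_absent,
  bisCredit: IMPUTATION.curated_list_absent,
} as const satisfies Record<string, ImputationEntry>;

const VALID_CLASSES: readonly ImputationClass[] = [
  "stable-absence",
  "unmonitored",
  "source-failure",
  "not-applicable",
];

// True when every entry in a table carries one of the four classes.
function allEntriesTagged(table: Record<string, ImputationEntry>): boolean {
  return Object.values(table).every((entry) => VALID_CLASSES.includes(entry.imputationClass));
}

console.log(allEntriesTagged(IMPUTATION), allEntriesTagged(IMPUTE));
```

Because `bisEer` and `bisCredit` point at the parent object rather than copying it, a later change to the class of `curated_list_absent` would carry over to both without touching the override table.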

What is deliberately NOT in this PR:

- No changes to the response schema (GetResilienceScoreResponse,
  ResilienceDimension). Exposing the class breakdown on the response
  is T1.6 (widget dimension confidence bar with imputation icon).
- No changes to the scorer aggregation logic (coverage, certainty,
  score composition). The taxonomy is tagged at definition time, not
  propagated through the 13 dimension scorers yet. Propagation lands
  with T1.5 source-recency so the two schema additions ship together.
- No seed-time tagging. The plan's T1.7 description mentions writing
  `imputationClass` to `resilience:static:signal:<id>:<cc>` keys, but
  the current storage model uses global source keys (UCDP, UNHCR,
  etc.) fetched once per request and keyed by ISO2 inside. A per-
  signal storage refactor is out of scope; classification at read
  time via the IMPUTATION table achieves the same downstream effect
  at zero storage cost.
- No source-failure detection. The seed-meta failedDatasets array is
  already written by the seeder; wiring the scorer to read it and
  re-tag imputations as source-failure lands with T1.9.

New tests (tests/resilience-dimension-scorers.test.mts):

- Every IMPUTATION entry has a valid imputationClass.
- Every IMPUTE entry has a valid imputationClass.
- crisis_monitoring_absent is stable-absence with the expected score
  and certaintyCoverage constants.
- curated_list_absent is unmonitored with the expected constants.
- Per-metric overrides (ipcFood, wtoData, unhcrDisplacement) carry
  the right class; bisEer / bisCredit preserve shared-reference
  semantics and inherit the class from the parent entry.
- Semantic sanity: stable-absence score and certaintyCoverage are
  both higher than unmonitored (fails loudly if the taxonomy ever
  drifts meaning).

Test results:
- npx tsx --test tests/resilience-dimension-scorers.test.mts: 51/51 pass
  (46 existing + 5 new)
- npx tsx --test tests/resilience-*.test.mts tests/resilience-*.test.mjs:
  176/176 pass
- npm run typecheck: clean

Prerequisite PRs verified merged:
- #2821 (baseline / stress engine)
- #2847 (formula revert + RSF direction fix)
- #2858 (seed direct scoring)

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
Why this PR?

Ships Phase 1 T1.3 of the country-resilience reference-grade upgrade
plan: promote the v1.0 methodology document from a scratch markdown
file to a published MDX page at parity with the Country Instability
Index (docs/country-instability-index.mdx).

Before this PR there was no publicly documented resilience methodology
page, only an internal markdown draft. The reference-grade upgrade
plan's origin review explicitly called out methodological opacity as
the biggest gap keeping the product below OECD/JRC standards, and
committed to documenting every dimension, formula, goalpost, cadence,
weight rationale, and imputation rule before any Phase 2 schema work
begins. This PR is that documentation.

What this PR commits:

- git mv docs/methodology/resilience-index.md to
  docs/methodology/country-resilience-index.mdx so git history tracks
  the rename; edits land as modifications on the new path.
- Adds MDX frontmatter (title + description) matching CII shape so the
  page renders correctly in the docs site and shows up in search.
- Prepends a short v1.0 scope note that distinguishes the currently
  shipping behavior (this doc) from the planned v2.0 three-pillar
  rebuild (tracked in the reference-grade upgrade plan). Readers who
  land on the page know exactly what is live and what is coming.
- Fixes an orphan table row (`| Food & Water | Mixed |`) that had
  drifted into the Overall Score section from an earlier edit and
  was breaking rendering around the score-formula block.
- Rewrites the Imputation Taxonomy section to use the formal
  four-class naming from T1.7 (stable-absence, unmonitored,
  source-failure, not-applicable) and adds a mapping table from each
  concrete IMPUTATION / IMPUTE entry to its class. This is the
  public-facing surface of the T1.7 foundation PR.
- Adds a Reproducibility Appendix listing every Redis key used by the
  scorer (score cache, ranking cache, history sorted set, intervals,
  seed-meta, static record, static index), the meaning of the
  `dataVersion` field, and a step-by-step procedure for reproducing
  any published country score by hand from a Redis snapshot.
- Adds a Changelog section with a v1.0 entry that references the
  prerequisite PRs (#2821, #2847, #2858) and the Phase 1 work landing
  so far (T1.1 regression test, T1.4 dataVersion widget wire,
  T1.7 imputation taxonomy foundation), plus a v2.0 placeholder that
  summarizes the reference-grade upgrade plan for readers.
- Adds editorial notes pointing at the Phase 1 T1.8 methodology doc
  linter that will enforce parity between this document and the
  indicator registry in _dimension-scorers.ts.

What is deliberately NOT in this PR:

- No v2.0 content presented as shipping. The three-pillar rebuild,
  recovery capacity pillar, penalized weighted mean aggregation,
  cross-index benchmark, and annual Reference Edition are all tracked
  in the reference-grade upgrade plan but not yet implemented. The
  changelog v2.0 section names them as planned work, not current
  behavior.
- No methodology doc linter. That is T1.8 and ships separately so it
  can be reviewed as a tooling change.
- No deletion of the old .md path. git mv preserves history; the old
  location renders a 404 which is expected.

Prerequisite PRs verified merged:
- #2821 (baseline / stress engine)
- #2847 (formula revert + RSF direction fix)
- #2858 (seed direct scoring)

Testing:
- Markdown-only change, no code.
- npm run typecheck: clean
- Pre-push hook (typecheck + build:full + version:check) passes.

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
…1.8)

Why this PR?

Ships Phase 1 T1.8 of the country-resilience reference-grade upgrade
plan: add a test that fails loudly if the published methodology
document drifts from the scorer's RESILIENCE_DIMENSION_ORDER. This is
the discipline that keeps the methodology page trustworthy over time,
a forever risk on composite indices per the OECD/JRC handbook.

The linter runs on every test pass and checks four things:

1. Every dimension in RESILIENCE_DIMENSION_ORDER has an H4 subsection
   in the methodology document.
2. Every H4 subsection in the methodology maps to a real scorer
   dimension (no stale docs).
3. Every H4 subsection is either a mapped dimension or explicitly
   allowlisted (prevents typos and unwired new sections).
4. HEADING_TO_DIMENSION in the test file maps exactly onto
   RESILIENCE_DIMENSION_ORDER with no extras and no gaps. This makes
   the test file itself the single source of truth for how the doc
   labels map to scorer IDs.
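
The four checks can be sketched against an in-memory document as below. This is an illustration under stated assumptions: the three dimension IDs, the heading text, and the allowlist entry are hypothetical stand-ins (the real map has 13 entries), and the real test reads the methodology file from disk.

```typescript
// Hypothetical subset of scorer dimension IDs, for illustration only.
const RESILIENCE_DIMENSION_ORDER: readonly string[] = ["macroFiscal", "energy", "foodWater"];

// Hypothetical heading-to-ID map; the real one lives in the test file.
const HEADING_TO_DIMENSION: Record<string, string> = {
  "Macro-Fiscal": "macroFiscal",
  "Energy": "energy",
  "Food & Water": "foodWater",
};

// H4 headings that are intentionally not scorer dimensions (assumed example).
const ALLOWLISTED_HEADINGS = new Set<string>(["Worked Example"]);

function extractH4Headings(markdown: string): string[] {
  return markdown
    .split("\n")
    .filter((line) => line.startsWith("#### "))
    .map((line) => line.slice(5).trim());
}

// Returns a list of human-readable problems; an empty list means no drift.
function lintMethodology(markdown: string): string[] {
  const problems: string[] = [];
  const headings = extractH4Headings(markdown);
  const documented = new Set<string>();

  for (const heading of headings) {
    const dimension: string | undefined = HEADING_TO_DIMENSION[heading];
    if (dimension === undefined) {
      // Check 3: unmapped headings must be explicitly allowlisted.
      if (!ALLOWLISTED_HEADINGS.has(heading)) problems.push(`unmapped heading: ${heading}`);
    } else if (!RESILIENCE_DIMENSION_ORDER.includes(dimension)) {
      // Check 2: a mapped heading must point at a real scorer dimension.
      problems.push(`stale heading: ${heading}`);
    } else {
      documented.add(dimension);
    }
  }

  // Check 1: every scorer dimension needs an H4 subsection.
  for (const dimension of RESILIENCE_DIMENSION_ORDER) {
    if (!documented.has(dimension)) problems.push(`missing section for: ${dimension}`);
  }

  // Check 4: the map covers the dimension order with no extras and no gaps.
  const mapped = new Set(Object.values(HEADING_TO_DIMENSION));
  for (const dimension of RESILIENCE_DIMENSION_ORDER) {
    if (!mapped.has(dimension)) problems.push(`map is missing: ${dimension}`);
  }
  for (const dimension of mapped) {
    if (!RESILIENCE_DIMENSION_ORDER.includes(dimension)) problems.push(`map has extra: ${dimension}`);
  }
  return problems;
}

const sampleDoc = ["#### Macro-Fiscal", "text", "#### Energy", "#### Food & Water", "#### Worked Example"].join("\n");
console.log(lintMethodology(sampleDoc)); // []
```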

Location-agnostic: the linter looks for the methodology file at a
short list of candidate paths and prefers the newer
country-resilience-index.mdx once T1.3 lands on main. On the current
origin/main it finds the older resilience-index.md and lints that.
This keeps T1.8 independent of T1.3's merge order so the PRs can land
in either sequence.

What this PR commits:
- New test file tests/resilience-methodology-lint.test.mts with 5
  scenarios covering the four checks above plus a smoke test that
  the file locator works.
- Hardcoded HEADING_TO_DIMENSION map (13 entries, one per scorer
  dimension) as the source of truth for the heading-to-ID mapping.
  Any future dimension add must update this map in lockstep with the
  scorer and the methodology doc, which is exactly the drift
  prevention we want.

What is NOT in this PR:
- No changes to the methodology document or the scorer.
- No automated HTML comment markers in the mdx. The hardcoded map in
  the test file is simpler and produces the same drift-detection
  signal.
- No integration with lint-staged or CI-specific gating. The linter
  runs as part of the standard test:data suite so it fires on every
  pre-push hook run.

Prerequisite PRs verified merged:
- #2821 (baseline / stress engine)
- #2847 (formula revert + RSF direction fix)
- #2858 (seed direct scoring)

Related (not prerequisite) in-flight Phase 1 PRs this session:
- #2941 T1.1 regression test
- #2943 T1.4 dataVersion widget wire
- #2944 T1.7 imputation taxonomy foundation
- #2945 T1.3 methodology mdx promotion

Testing:
- npx tsx --test tests/resilience-methodology-lint.test.mts: 5/5 pass
- npx tsx --test tests/resilience-*.test.mts tests/resilience-*.test.mjs:
  176/176 pass
- npm run typecheck: clean

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
Why this PR?

Ships the foundation-only slice of Phase 1 T1.5 of the country-
resilience reference-grade upgrade plan: a pure staleness classifier
that maps a `lastObservedAt` timestamp and a source cadence to one of
three staleness levels (fresh, aging, stale). This is the primitive
that T1.6 (widget dimension confidence bar with freshness badge) and
the later T1.5 scorer propagation pass both consume.

Same pattern as the T1.7 foundation PR (#2944): define the type and
the primitive in isolation with comprehensive tests, then land the
consumer wiring in separate PRs so each unit is bounded and
reviewable.

What this PR commits:

- New module `server/_shared/resilience-freshness.ts` (110 lines)
  exporting:
    - `ResilienceCadence` type union covering the 5 cadences the
      methodology document lists (realtime, daily, weekly, monthly,
      annual).
    - `StalenessLevel` type union: fresh / aging / stale.
    - `cadenceUnitMs(cadence)` helper returning a canonical duration
      per cadence: realtime = 1 hour, daily = 1 day, weekly = 7 days,
      monthly = 30 days, annual = 365 days.
    - `FRESH_MULTIPLIER` = 1.5 and `AGING_MULTIPLIER` = 3. A signal is
      fresh when age is strictly less than 1.5x its cadence unit,
      aging when strictly less than 3x, stale otherwise.
    - `classifyStaleness({ lastObservedAtMs, cadence, nowMs })` pure
      function returning `{ staleness, ageMs, ageInCadenceUnits }`.
      Null / undefined / NaN / future timestamps return stale with
      positive-infinity age. `nowMs` is accepted as a deterministic
      override for unit testing.
- New test file `tests/resilience-freshness.test.mts` (170 lines, 10
  tests covering cadence ordering, fresh/aging/stale classification
  across all 5 cadences, defensive handling of null/NaN/future
  timestamps, exact threshold boundaries, internal consistency, and
  classifier purity).
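
A sketch of the classifier as specified above. The type names, constants, cadence durations, and return shape follow this message; the function bodies are a reconstruction, and returning positive infinity for `ageInCadenceUnits` on invalid input is an assumption (the message only states it for the age).

```typescript
type ResilienceCadence = "realtime" | "daily" | "weekly" | "monthly" | "annual";
type StalenessLevel = "fresh" | "aging" | "stale";

const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

const FRESH_MULTIPLIER = 1.5;
const AGING_MULTIPLIER = 3;

// Canonical duration per cadence, as listed in the commit message.
function cadenceUnitMs(cadence: ResilienceCadence): number {
  switch (cadence) {
    case "realtime": return HOUR_MS;
    case "daily": return DAY_MS;
    case "weekly": return 7 * DAY_MS;
    case "monthly": return 30 * DAY_MS;
    case "annual": return 365 * DAY_MS;
  }
}

interface StalenessInput {
  lastObservedAtMs: number | null | undefined;
  cadence: ResilienceCadence;
  nowMs?: number; // deterministic override for tests
}

interface StalenessResult {
  staleness: StalenessLevel;
  ageMs: number;
  ageInCadenceUnits: number;
}

function classifyStaleness({ lastObservedAtMs, cadence, nowMs = Date.now() }: StalenessInput): StalenessResult {
  // Null, undefined, NaN, and future timestamps are treated as stale.
  if (lastObservedAtMs == null || !Number.isFinite(lastObservedAtMs) || lastObservedAtMs > nowMs) {
    return { staleness: "stale", ageMs: Infinity, ageInCadenceUnits: Infinity };
  }
  const ageMs = nowMs - lastObservedAtMs;
  const ageInCadenceUnits = ageMs / cadenceUnitMs(cadence);
  const staleness: StalenessLevel =
    ageInCadenceUnits < FRESH_MULTIPLIER ? "fresh"
    : ageInCadenceUnits < AGING_MULTIPLIER ? "aging"
    : "stale";
  return { staleness, ageMs, ageInCadenceUnits };
}

const now = 1_000 * DAY_MS;
console.log(classifyStaleness({ lastObservedAtMs: now - DAY_MS, cadence: "daily", nowMs: now }).staleness); // fresh
console.log(classifyStaleness({ lastObservedAtMs: now - 2 * DAY_MS, cadence: "daily", nowMs: now }).staleness); // aging
```

Both thresholds are strict, so a daily signal that is exactly 1.5 days old is already `aging`, and one that is exactly 3 days old is `stale`.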

What is deliberately NOT in this PR:

- No changes to the 13 dimension scorers. Propagating `lastObservedAt`
  through each scorer and aggregating max age per dimension is the
  next slice of T1.5 and will consume this classifier as a pure
  import.
- No schema changes (proto, OpenAPI, `ResilienceDimension` response
  type). The schema field `freshness: { lastObservedAt, staleness }`
  lands alongside the widget rendering in T1.6.
- No widget rendering. T1.6 owns the per-dimension freshness badge UI
  and will call `classifyStaleness` at render time.

Prerequisite PRs verified merged:
- #2821 (baseline / stress engine)
- #2847 (formula revert + RSF direction fix)
- #2858 (seed direct scoring)

Related in-flight Phase 1 PRs this session:
- #2941 T1.1 regression test
- #2943 T1.4 dataVersion widget wire
- #2944 T1.7 imputation taxonomy foundation
- #2945 T1.3 methodology mdx promotion
- #2946 T1.8 methodology doc linter

Testing:
- npx tsx --test tests/resilience-freshness.test.mts: 10/10 pass
- npm run typecheck: clean

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
Why this PR?

Ships Phase 1 T1.6 of the country-resilience reference-grade upgrade
plan: a compact per-dimension coverage grid below the 5-domain rows
in the resilience widget so analysts can see per-dimension data
provenance without opening the deep-dive panel.

This is a scope-narrowed slice of T1.6. The plan's full description
adds an imputation class icon and a freshness badge per dimension,
but both of those require proto schema additions that have not
landed yet (T1.7 foundation and T1.5 foundation introduced the types
and classifier, but neither exposes the fields through the response
schema). This PR ships the coverage column immediately using the
existing `coverage`, `observedWeight`, `imputedWeight` fields that
are already on every ResilienceDimension, and leaves two follow-up
columns (imputation class icon, freshness badge) to later PRs once
the schema lands.

What this PR commits:

- New utils in `src/components/resilience-widget-utils.ts`:
    - `DIMENSION_LABELS` map with short display labels for each of
      the 13 scorer dimensions (`Macro`, `Currency`, `Trade`, `Cyber`,
      `Logistics`, `Infra`, `Energy`, `Gov`, `Social`, `Border`,
      `Info`, `Health`, `Food`).
    - `getResilienceDimensionLabel(dimensionId)` helper, matching
      the existing `getResilienceDomainLabel` pattern.
    - `DimensionConfidenceInput`, `DimensionCoverageStatus`, and
      `DimensionConfidence` types for the confidence classifier.
    - `formatDimensionConfidence(input)` pure function: returns
      `{ id, label, coveragePct, status, absent }` where status is
      one of `observed`, `partial`, `imputed`, `absent`. The 80%
      observed-share threshold for `observed` vs `partial` matches
      the existing `lowConfidence` rule in `_shared.ts` (where a 40%
      imputation share trips the widget-wide flag), applied per
      dimension so one well-covered dimension is not obscured by the
      domain's worst case.
    - `collectDimensionConfidences(domains)` helper that walks every
      domain and every dimension in scorer order so the widget
      renders a stable grid.
- New render methods in `src/components/ResilienceWidget.ts`:
    - `renderDimensionConfidenceGrid(data)` produces the container.
    - `renderDimensionConfidenceCell(dim)` produces one row per
      dimension with label, coverage bar, and percentage. Status
      enum is on the cell className so CSS can style observed,
      partial, imputed, and absent cells differently.
    - Wired into `renderScoreCard` between the existing domain rows
      and the footer, so the layout is domains, dimension grid,
      footer.
- 8 new tests in `tests/resilience-widget.test.mts` covering:
    - All 13 dimension labels plus the unknown-ID fallback.
    - Observed-heavy classification (observed).
    - Mixed observed and imputed classification (partial).
    - All-imputed classification (imputed).
    - Zero-weight absent classification (absent, `coveragePct=0`,
      `absent: true`).
    - Clamping for out-of-range coverage (above 1, negative) and
      NaN-safe fallback to zero weight and absent status.
    - `collectDimensionConfidences` preserves scorer order across
      domains and returns empty lists for empty responses.
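
The per-dimension classification can be sketched as below. The type names, the four statuses, the return shape, and the 80% observed-share threshold come from this message. Everything else is an assumption made for the example: the way observed share is computed from the two weights, the rule that zero observed weight means `imputed`, the two dimension IDs, and falling back to the raw ID for unknown dimensions.

```typescript
type DimensionCoverageStatus = "observed" | "partial" | "imputed" | "absent";

interface DimensionConfidenceInput {
  id: string;
  coverage: number;
  observedWeight: number;
  imputedWeight: number;
}

interface DimensionConfidence {
  id: string;
  label: string;
  coveragePct: number;
  status: DimensionCoverageStatus;
  absent: boolean;
}

// Two of the 13 labels; the IDs on the left are hypothetical.
const DIMENSION_LABELS: Record<string, string> = {
  macroFiscal: "Macro",
  energy: "Energy",
};

const OBSERVED_SHARE_THRESHOLD = 0.8;

// NaN, negative, and infinite weights all collapse to zero.
function safeWeight(value: number): number {
  return Number.isFinite(value) && value > 0 ? value : 0;
}

function formatDimensionConfidence(input: DimensionConfidenceInput): DimensionConfidence {
  const observed = safeWeight(input.observedWeight);
  const imputed = safeWeight(input.imputedWeight);
  const total = observed + imputed;
  // Clamp coverage into [0, 1]; non-finite coverage counts as zero.
  const coverage = Number.isFinite(input.coverage) ? Math.min(1, Math.max(0, input.coverage)) : 0;

  let status: DimensionCoverageStatus;
  if (total === 0) status = "absent";
  else if (observed === 0) status = "imputed";
  else if (observed / total >= OBSERVED_SHARE_THRESHOLD) status = "observed";
  else status = "partial";

  const absent = status === "absent";
  return {
    id: input.id,
    label: DIMENSION_LABELS[input.id] ?? input.id,
    coveragePct: absent ? 0 : Math.round(coverage * 100),
    status,
    absent,
  };
}

console.log(formatDimensionConfidence({ id: "energy", coverage: 0.92, observedWeight: 0.9, imputedWeight: 0.1 }));
```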

What is deliberately NOT in this PR:

- No imputation class icon per dimension. That requires exposing
  `imputationClass` on the `ResilienceDimension` response type
  (proto change). Tracked as a follow-up after the T1.7 schema pass.
- No freshness badge per dimension. That requires exposing
  `lastObservedAt` and a staleness level on the response type (proto
  change). Tracked as a follow-up after the T1.5 full propagation pass.
- No CSS changes. The new cell classes are scaffolded for styling
  (`--observed`, `--partial`, `--imputed`, `--absent` modifiers) but
  the actual stylesheet edits will be folded into the CSS pass that
  picks up the full three-column dimension row once the icon and
  badge columns land.

Prerequisite PRs verified merged:
- #2821 (baseline / stress engine)
- #2847 (formula revert + RSF direction fix)
- #2858 (seed direct scoring)

Related in-flight Phase 1 PRs from this session:
- #2941 T1.1 regression test
- #2943 T1.4 dataVersion widget wire
- #2944 T1.7 imputation taxonomy foundation
- #2945 T1.3 methodology mdx promotion
- #2946 T1.8 methodology doc linter
- #2947 T1.5 staleness classifier foundation

Testing:
- npx tsx --test tests/resilience-widget.test.mts: 14/14 pass
  (6 existing + 8 new dimension-confidence tests)
- npx tsx --test tests/resilience-*.test.mts tests/resilience-*.test.mjs:
  179/179 pass
- npm run typecheck: clean

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
Appends an editor's note to the origin review preserving the original
text as written and recording the Phase 1 T1.1 investigation outcome.

The review states that "Norway and the US both hit 100 under current
fixtures, which broke the intended ordering and exposed a ceiling
effect at the top end of the ranking." The T1.1 regression test shipped
on PR #2941 investigated this claim and did NOT reproduce it. Measured
scores under the current release-gate fixtures and the post-PR-#2847
domain-weighted-average formula:

  Norway (elite tier):  overallScore = 86.58, baseline 86.85, stress 84.36
  US (strong tier):     overallScore = 72.80, baseline 73.15, stress 70.58
  Delta:                NO minus US = 13.78 points

Neither country approaches 100, the ordering is preserved, and the
scorer cannot produce a hard 100 ceiling under any fixture tier. The
specific Norway=US=100 illustration is retracted; the scorecard
judgment and the six prescribed improvements remain valid.

Also records the side finding from the investigation that the
release-gate fixtures use one quality value per tier (elite, strong,
stressed, fragile), so every country within a tier produces byte-
identical scores. This is a fixture-design limitation, not a scorer
bug, and is tracked as Phase 2 follow-up work.

The original review text is preserved unchanged; the changelog is
appended as an "Editor's Note" section at the end of the file so the
historical record of what was originally filed stays auditable. The
T1.1 regression test itself stays in the release-gate suite so a real
ceiling bug, if ever introduced, is caught immediately by CI.

Closes the one deliberate deferral noted in PR #2941's description
(the "origin-doc changelog update" trailing commit that was waiting
to land alongside the plan once the cross-branch conflict concern was
resolved). Lands as a fourth commit on this PR so both the plan and
the corrected origin doc ship together.

Docs only, no code, no runtime impact.

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
…2945)

* docs(resilience): promote methodology to .mdx at CII parity (T1.3)

Why this PR?

Ships Phase 1 T1.3 of the country-resilience reference-grade upgrade
plan: promote the v1.0 methodology document from a scratch markdown
file to a published MDX page at parity with the Country Instability
Index (docs/country-instability-index.mdx).

Before this PR there was no publicly documented resilience methodology
page, only an internal markdown draft. The reference-grade upgrade
plan's origin review explicitly called out methodological opacity as
the biggest gap keeping the product below OECD/JRC standards, and
committed to documenting every dimension, formula, goalpost, cadence,
weight rationale, and imputation rule before any Phase 2 schema work
begins. This PR is that documentation.

What this PR commits:

- git mv docs/methodology/resilience-index.md to
  docs/methodology/country-resilience-index.mdx so git history tracks
  the rename; edits land as modifications on the new path.
- Adds MDX frontmatter (title + description) matching CII shape so the
  page renders correctly in the docs site and shows up in search.
- Prepends a short v1.0 scope note that distinguishes the currently
  shipping behavior (this doc) from the planned v2.0 three-pillar
  rebuild (tracked in the reference-grade upgrade plan). Readers who
  land on the page know exactly what is live and what is coming.
- Fixes an orphan table row (`| Food & Water | Mixed |`) that had
  drifted into the Overall Score section from an earlier edit and
  was breaking rendering around the score-formula block.
- Rewrites the Imputation Taxonomy section to use the formal
  four-class naming from T1.7 (stable-absence, unmonitored,
  source-failure, not-applicable) and adds a mapping table from each
  concrete IMPUTATION / IMPUTE entry to its class. This is the
  public-facing surface of the T1.7 foundation PR.
- Adds a Reproducibility Appendix listing every Redis key used by the
  scorer (score cache, ranking cache, history sorted set, intervals,
  seed-meta, static record, static index), the meaning of the
  `dataVersion` field, and a step-by-step procedure for reproducing
  any published country score by hand from a Redis snapshot.
- Adds a Changelog section with a v1.0 entry that references the
  prerequisite PRs (#2821, #2847, #2858) and the Phase 1 work landing
  so far (T1.1 regression test, T1.4 dataVersion widget wire,
  T1.7 imputation taxonomy foundation), plus a v2.0 placeholder that
  summarizes the reference-grade upgrade plan for readers.
- Adds editorial notes pointing at the Phase 1 T1.8 methodology doc
  linter that will enforce parity between this document and the
  indicator registry in _dimension-scorers.ts.

What is deliberately NOT in this PR:

- No v2.0 content presented as shipping. The three-pillar rebuild,
  recovery capacity pillar, penalized weighted mean aggregation,
  cross-index benchmark, and annual Reference Edition are all tracked
  in the reference-grade upgrade plan but not yet implemented. The
  changelog v2.0 section names them as planned work, not current
  behavior.
- No methodology doc linter. That is T1.8 and ships separately so it
  can be reviewed as a tooling change.
- No deletion of the old .md path. git mv preserves history; the old
  location renders a 404 which is expected.

Prerequisite PRs verified merged:
- #2821 (baseline / stress engine)
- #2847 (formula revert + RSF direction fix)
- #2858 (seed direct scoring)

Testing:
- Markdown-only change, no code.
- npm run typecheck: clean
- Pre-push hook (typecheck + build:full + version:check) passes.

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(resilience): add v1.0 content to methodology mdx (T1.3 followup)

Companion commit to 4e7f64a (which ended up rename-only because
`git add` after `git mv` re-staged the original file content). This
commit adds the actual v1.0 content the T1.3 task requires:

- MDX frontmatter (title + description) matching the Country
  Instability Index shape so the page renders in the docs site.
- Short v1.0 scope note that distinguishes the currently shipping
  behavior (this doc) from the planned v2.0 three-pillar rebuild
  (tracked in the reference-grade upgrade plan).
- Removes an orphan table row (`| Food & Water | Mixed |`) from the
  Overall Score section that had drifted from an earlier edit.
- Rewrites the Imputation Taxonomy section to use the formal
  four-class naming from T1.7 (stable-absence, unmonitored,
  source-failure, not-applicable) with a mapping table from each
  concrete IMPUTATION / IMPUTE entry to its class.
- Reproducibility Appendix listing every Redis key used by the
  scorer, the dataVersion semantics, and a step-by-step reproduction
  procedure.
- Changelog section with v1.0 (prerequisite PRs + T1.1 / T1.4 / T1.7
  landing) and v2.0 placeholder summarizing the reference-grade
  upgrade plan.
- Editorial notes pointing at the T1.8 methodology doc linter that
  will enforce parity between this document and the indicator
  registry.

Net change: 89 insertions, 13 deletions, file grows from 300 to 376
lines.

Docs-only, no code, no runtime impact.

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(resilience): address PR #2945 review (dead link + roads reuse note)

Addresses two review comments on PR #2945:

1. **P2 Medium: broken reference in the v1.0 scope note at line 8.**
   The intro linked to `../internal/country-resilience-upgrade-plan.md`,
   which resolves to `docs/internal/country-resilience-upgrade-plan.md`.
   That file lives on PR #2938 and is not in this PR's tree or on
   origin/main, so until #2938 merges the link would render as a 404
   on the docs site. Dropped the markdown link and kept the text
   reference to "a separate reference-grade upgrade plan" so readers
   still know context exists elsewhere. The existing prose reference
   to the same file at line 365 is already inside backticks and is
   therefore a plain-text citation, not a link, so it stays.

2. **Additional suggestion: roads series shared between two
   dimensions.** `roadsPavedLogistics` (Logistics & Supply, weight
   0.50) and `roadsPavedInfra` (Infrastructure, weight 0.35) both
   read from World Bank `IS.ROD.PAVE.ZS`. Added an explicit note
   under the Infrastructure dimension table clarifying that this is
   deliberate source reuse (Logistics uses it as a transit-viability
   proxy, Infrastructure uses it as a baseline-public-capital proxy)
   and pointing forward to the v2.0 plan's consolidation of shared
   upstream signals into a single indicator registry. No change to
   the indicator tables themselves; this is a clarifying paragraph.

No content semantics change beyond the clarifying paragraph and the
dead-link removal. Docs-only, no code.

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
…1.8) (#2946)

Why this PR?

Ships Phase 1 T1.8 of the country-resilience reference-grade upgrade
plan: add a test that fails loudly if the published methodology
document drifts from the scorer's RESILIENCE_DIMENSION_ORDER. This is
the discipline that keeps the methodology page trustworthy over time;
doc-to-code drift is a perennial risk for composite indices per the
OECD/JRC handbook.

The linter runs on every test pass and checks four things:

1. Every dimension in RESILIENCE_DIMENSION_ORDER has an H4 subsection
   in the methodology document.
2. Every H4 subsection in the methodology maps to a real scorer
   dimension (no stale docs).
3. Every H4 subsection is either a mapped dimension or explicitly
   allowlisted (prevents typos and unwired new sections).
4. HEADING_TO_DIMENSION in the test file maps exactly onto
   RESILIENCE_DIMENSION_ORDER with no extras and no gaps. This makes
   the test file itself the single source of truth for how the doc
   labels map to scorer IDs.
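
A minimal sketch of how the four checks above could be expressed, not the
actual test file. The dimension IDs, heading strings, allowlist entry, and
the `findDrift` / `extractH4Headings` helper names are all hypothetical
placeholders; only `RESILIENCE_DIMENSION_ORDER` and `HEADING_TO_DIMENSION`
are names taken from this PR, and their contents here are made up.

```typescript
// Hypothetical stand-ins: the real order is exported by the scorer and the
// real map lives in tests/resilience-methodology-lint.test.mts.
const RESILIENCE_DIMENSION_ORDER: readonly string[] = ["macro", "currency"];
const HEADING_TO_DIMENSION = new Map<string, string>([
  ["Macro", "macro"],
  ["Currency", "currency"],
]);
// Hypothetical allowlist for H4 headings that are not dimensions.
const ALLOWLISTED_HEADINGS = new Set<string>(["Worked Example"]);

// Collect the text of every H4 heading ("#### Title") in a markdown document.
function extractH4Headings(markdown: string): string[] {
  const headings: string[] = [];
  for (const line of markdown.split(/\r?\n/)) {
    const match = /^####\s+(.+?)\s*$/.exec(line);
    if (match) headings.push(match[1]);
  }
  return headings;
}

interface DriftReport {
  missingFromDoc: string[]; // scorer dimensions with no H4 subsection
  unknownHeadings: string[]; // H4 headings neither mapped nor allowlisted
  mapMismatch: string[]; // IDs where the map and the scorer order disagree
}

function findDrift(markdown: string): DriftReport {
  const documented = new Set<string>();
  const unknownHeadings: string[] = [];
  for (const heading of extractH4Headings(markdown)) {
    const id = HEADING_TO_DIMENSION.get(heading);
    if (id !== undefined) documented.add(id);
    else if (!ALLOWLISTED_HEADINGS.has(heading)) unknownHeadings.push(heading);
  }
  const mapped = new Set(HEADING_TO_DIMENSION.values());
  const order = new Set(RESILIENCE_DIMENSION_ORDER);
  const mapMismatch = [
    ...RESILIENCE_DIMENSION_ORDER.filter((id) => !mapped.has(id)),
    ...Array.from(mapped).filter((id) => !order.has(id)),
  ];
  return {
    missingFromDoc: RESILIENCE_DIMENSION_ORDER.filter((id) => !documented.has(id)),
    unknownHeadings,
    mapMismatch,
  };
}
```

In this sketch a test would assert that all three arrays in the report are
empty, so a misspelled heading or an undocumented dimension fails the run.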

Location-agnostic: the linter looks for the methodology file at a
short list of candidate paths and prefers the newer
country-resilience-index.mdx once T1.3 lands on main. On the current
origin/main it finds the older resilience-index.md and lints that.
This keeps T1.8 independent of T1.3's merge order so the PRs can land
in either sequence.

What this PR commits:
- New test file tests/resilience-methodology-lint.test.mts with 5
  scenarios covering the four checks above plus a smoke test that
  the file locator works.
- Hardcoded HEADING_TO_DIMENSION map (13 entries, one per scorer
  dimension) as the source of truth for the heading-to-ID mapping.
  Any future dimension add must update this map in lockstep with the
  scorer and the methodology doc, which is exactly the drift
  prevention we want.

What is NOT in this PR:
- No changes to the methodology document or the scorer.
- No automated HTML comment markers in the mdx. The hardcoded map in
  the test file is simpler and produces the same drift-detection
  signal.
- No integration with lint-staged or CI-specific gating. The linter
  runs as part of the standard test:data suite so it fires on every
  pre-push hook run.

Prerequisite PRs verified merged:
- #2821 (baseline / stress engine)
- #2847 (formula revert + RSF direction fix)
- #2858 (seed direct scoring)

Related (not prerequisite) in-flight Phase 1 PRs this session:
- #2941 T1.1 regression test
- #2943 T1.4 dataVersion widget wire
- #2944 T1.7 imputation taxonomy foundation
- #2945 T1.3 methodology mdx promotion

Testing:
- npx tsx --test tests/resilience-methodology-lint.test.mts: 5/5 pass
- npx tsx --test tests/resilience-*.test.mts tests/resilience-*.test.mjs:
  176/176 pass
- npm run typecheck: clean

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
Why this PR?

Ships the foundation-only slice of Phase 1 T1.5 of the country-
resilience reference-grade upgrade plan: a pure staleness classifier
that maps a `lastObservedAt` timestamp and a source cadence to one of
three staleness levels (fresh, aging, stale). This is the primitive
that T1.6 (widget dimension confidence bar with freshness badge) and
the later T1.5 scorer propagation pass both consume.

Same pattern as the T1.7 foundation PR (#2944): define the type and
the primitive in isolation with comprehensive tests, then land the
consumer wiring in separate PRs so each unit is bounded and
reviewable.

What this PR commits:

- New module `server/_shared/resilience-freshness.ts` (110 lines)
  exporting:
    - `ResilienceCadence` type union covering the 5 cadences the
      methodology document lists (realtime, daily, weekly, monthly,
      annual).
    - `StalenessLevel` type union: fresh / aging / stale.
    - `cadenceUnitMs(cadence)` helper returning a canonical duration
      per cadence: realtime = 1 hour, daily = 1 day, weekly = 7 days,
      monthly = 30 days, annual = 365 days.
    - `FRESH_MULTIPLIER` = 1.5 and `AGING_MULTIPLIER` = 3. A signal is
      fresh when age is strictly less than 1.5x its cadence unit,
      aging when strictly less than 3x, stale otherwise.
    - `classifyStaleness({ lastObservedAtMs, cadence, nowMs })` pure
      function returning `{ staleness, ageMs, ageInCadenceUnits }`.
      Null / undefined / NaN / future timestamps return stale with
      positive-infinity age. `nowMs` is accepted as a deterministic
      override for unit testing.
- New test file `tests/resilience-freshness.test.mts` (170 lines, 10
  tests covering cadence ordering, fresh/aging/stale classification
  across all 5 cadences, defensive handling of null/NaN/future
  timestamps, exact threshold boundaries, internal consistency, and
  classifier purity).
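
The classifier described above can be sketched as follows. This is an
illustration built only from the constants and rules listed in this
description, not the contents of `server/_shared/resilience-freshness.ts`.
The input and result interface names are hypothetical, and two details are
assumptions: "future" is read as strictly later than `nowMs`, and
`ageInCadenceUnits` is set to positive infinity in the defensive case.

```typescript
type ResilienceCadence = "realtime" | "daily" | "weekly" | "monthly" | "annual";
type StalenessLevel = "fresh" | "aging" | "stale";

const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Canonical duration per cadence, as listed in the PR description.
function cadenceUnitMs(cadence: ResilienceCadence): number {
  switch (cadence) {
    case "realtime": return HOUR_MS;
    case "daily": return DAY_MS;
    case "weekly": return 7 * DAY_MS;
    case "monthly": return 30 * DAY_MS;
    case "annual": return 365 * DAY_MS;
  }
}

const FRESH_MULTIPLIER = 1.5;
const AGING_MULTIPLIER = 3;

// Hypothetical interface names; the field names follow the description.
interface StalenessInput {
  lastObservedAtMs: number | null | undefined;
  cadence: ResilienceCadence;
  nowMs?: number; // deterministic override for unit tests
}
interface StalenessResult {
  staleness: StalenessLevel;
  ageMs: number;
  ageInCadenceUnits: number;
}

function classifyStaleness({ lastObservedAtMs, cadence, nowMs = Date.now() }: StalenessInput): StalenessResult {
  // Null, undefined, NaN, and future timestamps are treated as stale.
  if (lastObservedAtMs == null || !Number.isFinite(lastObservedAtMs) || lastObservedAtMs > nowMs) {
    return { staleness: "stale", ageMs: Infinity, ageInCadenceUnits: Infinity };
  }
  const ageMs = nowMs - lastObservedAtMs;
  const ageInCadenceUnits = ageMs / cadenceUnitMs(cadence);
  // Both thresholds are strict: exactly 1.5x is aging, exactly 3x is stale.
  const staleness: StalenessLevel =
    ageInCadenceUnits < FRESH_MULTIPLIER ? "fresh"
    : ageInCadenceUnits < AGING_MULTIPLIER ? "aging"
    : "stale";
  return { staleness, ageMs, ageInCadenceUnits };
}
```

Passing `nowMs` explicitly keeps the function pure, which is what lets the
boundary cases (exactly 1.5x and exactly 3x the cadence unit) be tested
without mocking the clock.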

What is deliberately NOT in this PR:

- No changes to the 13 dimension scorers. Propagating `lastObservedAt`
  through each scorer and aggregating max age per dimension is the
  next slice of T1.5 and will consume this classifier as a pure
  import.
- No schema changes (proto, OpenAPI, `ResilienceDimension` response
  type). The schema field `freshness: { lastObservedAt, staleness }`
  lands alongside the widget rendering in T1.6.
- No widget rendering. T1.6 owns the per-dimension freshness badge UI
  and will call `classifyStaleness` at render time.

Prerequisite PRs verified merged:
- #2821 (baseline / stress engine)
- #2847 (formula revert + RSF direction fix)
- #2858 (seed direct scoring)

Related in-flight Phase 1 PRs this session:
- #2941 T1.1 regression test
- #2943 T1.4 dataVersion widget wire
- #2944 T1.7 imputation taxonomy foundation
- #2945 T1.3 methodology mdx promotion
- #2946 T1.8 methodology doc linter

Testing:
- npx tsx --test tests/resilience-freshness.test.mts: 10/10 pass
- npm run typecheck: clean

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
Why this PR?

Ships Phase 1 T1.6 of the country-resilience reference-grade upgrade
plan: a compact per-dimension coverage grid below the 5-domain rows
in the resilience widget so analysts can see per-dimension data
provenance without opening the deep-dive panel.

This is a scope-narrowed slice of T1.6. The plan's full description
adds an imputation class icon and a freshness badge per dimension,
but both of those require proto schema additions that have not
landed yet (T1.7 foundation and T1.5 foundation introduced the types
and classifier, but neither exposes the fields through the response
schema). This PR ships the coverage column immediately using the
existing `coverage`, `observedWeight`, `imputedWeight` fields that
are already on every ResilienceDimension, and leaves two follow-up
columns (imputation class icon, freshness badge) to later PRs once
the schema lands.

What this PR commits:

- New utils in `src/components/resilience-widget-utils.ts`:
    - `DIMENSION_LABELS` map with short display labels for each of
      the 13 scorer dimensions (`Macro`, `Currency`, `Trade`, `Cyber`,
      `Logistics`, `Infra`, `Energy`, `Gov`, `Social`, `Border`,
      `Info`, `Health`, `Food`).
    - `getResilienceDimensionLabel(dimensionId)` helper, matching
      the existing `getResilienceDomainLabel` pattern.
    - `DimensionConfidenceInput`, `DimensionCoverageStatus`, and
      `DimensionConfidence` types for the confidence classifier.
    - `formatDimensionConfidence(input)` pure function: returns
      `{ id, label, coveragePct, status, absent }` where status is
      one of `observed`, `partial`, `imputed`, `absent`. The 80%
      observed-share threshold for `observed` vs `partial` matches
      the existing `lowConfidence` rule in `_shared.ts` (where a 40%
      imputation share trips the widget-wide flag), applied per
      dimension so one well-covered dimension is not obscured by the
      domain's worst case.
    - `collectDimensionConfidences(domains)` helper that walks every
      domain and every dimension in scorer order so the widget
      renders a stable grid.
- New render methods in `src/components/ResilienceWidget.ts`:
    - `renderDimensionConfidenceGrid(data)` produces the container.
    - `renderDimensionConfidenceCell(dim)` produces one row per
      dimension with label, coverage bar, and percentage. Status
      enum is on the cell className so CSS can style observed,
      partial, imputed, and absent cells differently.
    - Wired into `renderScoreCard` between the existing domain rows
      and the footer, so the layout is domains, dimension grid,
      footer.
- 8 new tests in `tests/resilience-widget.test.mts` covering:
    - All 13 dimension labels plus the unknown-ID fallback.
    - Observed-heavy classification (observed).
    - Mixed observed and imputed classification (partial).
    - All-imputed classification (imputed).
    - Zero-weight absent classification (absent, `coveragePct=0`,
      `absent: true`).
    - Clamping for out-of-range coverage (above 1, negative) and
      NaN-safe fallback to zero weight and absent status.
    - `collectDimensionConfidences` preserves scorer order across
      domains and returns empty lists for empty responses.
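
A rough sketch of the per-dimension classifier described above, assuming
the behavior listed in the test bullets. It is not the shipped
`formatDimensionConfidence`. The dimension IDs in the label map are
hypothetical, and several details are assumptions: the unknown-ID fallback
returns the raw ID, the 80% threshold is inclusive, `imputed` means zero
observed weight, and `coveragePct` is rounded to a whole number.

```typescript
type DimensionCoverageStatus = "observed" | "partial" | "imputed" | "absent";

interface DimensionConfidenceInput {
  id: string;
  coverage: number;
  observedWeight: number;
  imputedWeight: number;
}

interface DimensionConfidence {
  id: string;
  label: string;
  coveragePct: number;
  status: DimensionCoverageStatus;
  absent: boolean;
}

// Hypothetical IDs; only the short labels come from the PR description.
const DIMENSION_LABELS = new Map<string, string>([
  ["macro", "Macro"],
  ["currency", "Currency"],
]);

const OBSERVED_SHARE_THRESHOLD = 0.8;

// NaN, infinite, or negative weights fall back to zero.
const safeWeight = (value: number): number =>
  Number.isFinite(value) && value > 0 ? value : 0;

// Clamp coverage into [0, 1]; NaN falls back to zero.
const clamp01 = (value: number): number =>
  Number.isFinite(value) ? Math.min(1, Math.max(0, value)) : 0;

function formatDimensionConfidence(input: DimensionConfidenceInput): DimensionConfidence {
  const label = DIMENSION_LABELS.get(input.id) ?? input.id; // assumed fallback
  const observed = safeWeight(input.observedWeight);
  const imputed = safeWeight(input.imputedWeight);
  const total = observed + imputed;

  // No usable weight at all: the dimension is absent.
  if (total === 0) {
    return { id: input.id, label, coveragePct: 0, status: "absent", absent: true };
  }

  const observedShare = observed / total;
  const status: DimensionCoverageStatus =
    observed === 0 ? "imputed"
    : observedShare >= OBSERVED_SHARE_THRESHOLD ? "observed"
    : "partial";

  return {
    id: input.id,
    label,
    coveragePct: Math.round(clamp01(input.coverage) * 100),
    status,
    absent: false,
  };
}
```

Keeping the function free of DOM and runtime imports is what allows it to
be exercised directly from a plain node test runner.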

What is deliberately NOT in this PR:

- No imputation class icon per dimension. That requires exposing
  `imputationClass` on the `ResilienceDimension` response type
  (proto change). Tracked as a follow-up after the T1.7 schema pass.
- No freshness badge per dimension. That requires exposing
  `lastObservedAt` and a staleness level on the response type (proto
  change). Tracked as a follow-up after the T1.5 full propagation pass.
- No CSS changes. The new cell classes are scaffolded for styling
  (`--observed`, `--partial`, `--imputed`, `--absent` modifiers) but
  the actual stylesheet edits will be folded into the CSS pass that
  picks up the full three-column dimension row once the icon and
  badge columns land.

Prerequisite PRs verified merged:
- #2821 (baseline / stress engine)
- #2847 (formula revert + RSF direction fix)
- #2858 (seed direct scoring)

Related in-flight Phase 1 PRs from this session:
- #2941 T1.1 regression test
- #2943 T1.4 dataVersion widget wire
- #2944 T1.7 imputation taxonomy foundation
- #2945 T1.3 methodology mdx promotion
- #2946 T1.8 methodology doc linter
- #2947 T1.5 staleness classifier foundation

Testing:
- npx tsx --test tests/resilience-widget.test.mts: 14/14 pass
  (6 existing + 8 new dimension-confidence tests)
- npx tsx --test tests/resilience-*.test.mts tests/resilience-*.test.mjs:
  179/179 pass
- npm run typecheck: clean

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
* feat(resilience): per-dimension confidence grid in widget (T1.6)

* fix(resilience): ship CSS + preview data for T1.6 grid (PR #2949 review)

Addresses the P2 REQUEST_CHANGES review on PR #2949:

> "New confidence-grid DOM is added without matching stylesheet
>  support. The widget now renders resilience-widget__dimension-grid
>  and resilience-widget__dimension-cell, but the stylesheet only
>  covers the existing domain rows and footer. That means the new
>  section will render as a tall unstyled stack instead of the
>  compact grid the PR describes. The locked preview path also stays
>  effectively empty because LOCKED_PREVIEW still has empty dimension
>  arrays, so gated users get a blank gap instead of a representative
>  preview."

Two changes in one pass:

1. **CSS for the dimension grid.** Added .resilience-widget__dimension-grid
   (2-column grid on desktop, 1-column under 560px), .__dimension-cell
   (72px label + flex bar + 28px pct), .__dimension-bar-track and
   .__dimension-bar-fill, .__dimension-label, .__dimension-pct, plus
   the four status modifiers (--observed, --partial, --imputed,
   --absent) which tint the bar fill with the existing resilience
   visual-level palette (#84cc16 observed, #eab308 partial, #f97316
   imputed, text-faint absent) so the grid stays in the same
   chromatic family as the domain bars. Added a mobile breakpoint
   rule so the grid collapses to one column on narrow widths.
   Inserted between the existing .__domains and .__footer rules in
   src/styles/country-deep-dive.css so ordering stays obvious.

2. **Populated LOCKED_PREVIEW with representative dimension data.**
   Every domain in the locked preview now carries real-looking
   dimension entries (id, score, coverage, observedWeight,
   imputedWeight) so non-entitled users see a blurred grid that
   matches the shape of a real card, not a blank gap between the
   domain bars and the footer. The exact values do not need to match
   any real country, because the preview is blurred and non-interactive
   via the .resilience-widget__preview CSS rule; they just need to fill
   all 13 dimensions with plausible coverage values.

Also moved LOCKED_PREVIEW out of ResilienceWidget.ts and into
resilience-widget-utils.ts so the new regression test (see below)
can import it without dragging in the full transitive import graph of
the ResilienceWidget class. The class indirectly depends on `import.meta.env.DEV`
via proxy.ts, which breaks plain node test runners. The utils file is
already dependency-free, so putting the fixture there is consistent
with the existing split between pure helpers and runtime widget code.

New regression test in tests/resilience-widget.test.mts:
`LOCKED_PREVIEW populates all 13 dimensions for the gated preview`
asserts that collectDimensionConfidences(LOCKED_PREVIEW.domains)
returns exactly 13 entries, every cell resolves to a short display
label (no raw IDs leaking through), and no cell is `absent`. If a
future edit accidentally drops a dimension from the preview, this
test fails loudly instead of producing a silent blank gap for gated
users.

Testing:
- npx tsx --test tests/resilience-widget.test.mts: 15/15 pass
  (14 existing + 1 new LOCKED_PREVIEW regression)
- npx tsx --test tests/resilience-*.test.mts tests/resilience-*.test.mjs:
  180/180 pass
- npm run typecheck: clean

Addresses the reviewer's requested changes directly; no DOM changes,
no new helpers, no scope expansion beyond the CSS + preview-data
pass.

Generated with Claude Opus 4.6 (1M context) via Claude Code
+ Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
koala73 added a commit that referenced this pull request Apr 11, 2026
…se-out

Final PR of the Phase 1 reference-grade upgrade. Closes out the Phase 1
acceptance gate with two deliverables:

1. T1.9 cache-key / health-registry sync regression test

   New tests/resilience-cache-keys-health-sync.test.mts asserts that
   the RESILIENCE_RANKING_CACHE_KEY string from _shared.ts literally
   appears in api/health.js, and that the score/history prefixes match
   the expected resilience:<kind>:v<n>: shape. This guards against
   future silent version drift where a cache-key bump in _shared.ts
   could leave the health registry watching the old key indefinitely.

   No cache keys were bumped in Phase 1 because every schema addition
   (imputationClass, freshness) was additive with default fallbacks on
   the existing resilience:score:v7 / ranking:v8 / history:v4 keys.

2. Methodology changelog v1.1 + Phase 1 self-assessment scorecard

   docs/methodology/country-resilience-index.mdx:
   - v1.0 entry trimmed to only reference the actual v1.0 baseline PRs
     (#2821, #2847, #2858). Moved from "Current published version" to
     "Baseline".
   - New v1.1 entry inserted between v1.0 and v2.0, marked as "Current
     published version". Lists all Phase 1 tasks T1.1-T1.9 with the
     specific PRs that shipped each slice.
   - New "Scorecard (v1.1 self-assessment)" section at the end of the
     changelog with ratings on six standard composite-indicator review
     axes: Methodology 7.5, Explainability 7.5, Reproducibility 8.0,
     Source quality 7.0, Timeliness 6.5, Sensitivity 7.0. Every axis has
     a named rationale and a named gap tied to a Phase 2 or Phase 3
     task. Both Phase 1 required thresholds (Methodology >=7.5,
     Explainability >=7.5) are met.

   docs/internal/country-resilience-upgrade-plan.md:
   - Phase 1 acceptance checklist: 7 of 7 items flipped to [x] with
     PR references.
   - T1.9 task bullet closed out with a reference to this PR.
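
The two assertions in deliverable 1 (key shape and literal presence in the
health registry) can be sketched roughly as below. This is an illustration,
not the contents of tests/resilience-cache-keys-health-sync.test.mts. The
helper names `parseCachePrefix` and `healthRegistryWatches` are
hypothetical, and the exact ranking key string used in the usage note is an
assumption inferred from the version numbers quoted above.

```typescript
// Expected prefix shape: resilience:<kind>:v<n>:
const CACHE_PREFIX_SHAPE = /^resilience:([a-z]+):v(\d+):$/;

interface ParsedPrefix {
  kind: string;
  version: number;
}

// Returns the kind and version for a well-formed prefix, or null otherwise.
function parseCachePrefix(prefix: string): ParsedPrefix | null {
  const match = CACHE_PREFIX_SHAPE.exec(prefix);
  if (!match) return null;
  return { kind: match[1], version: Number(match[2]) };
}

// True when the ranking cache key appears literally in the health registry
// source text, which is the drift signal the T1.9 test relies on.
function healthRegistryWatches(healthSource: string, rankingKey: string): boolean {
  return healthSource.includes(rankingKey);
}
```

For example, `parseCachePrefix("resilience:score:v7:")` yields kind `score`
and version 7, while a registry source that still mentions only an older
ranking key would make `healthRegistryWatches` return false for the bumped
key.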

What is deliberately NOT in this PR

- No code changes to the scorer, response builder, or widget.
- No new proto fields or schema changes.
- No cache-key bumps.
- No external expert review (Phase 3 T3.8b, runs in parallel).
- No Phase 2 work.

Depends on nothing at code level; references PRs 2959/2961/2962/2964
which are still in the review queue. This PR can land independently.

Verified

- tests/resilience-cache-keys-health-sync.test.mts 3/3 passing
- methodology doc linter (T1.8) still passes on the updated mdx
- lint:md clean
- test:data clean (4355/4355)
- typecheck clean
- typecheck:api clean
koala73 added a commit that referenced this pull request Apr 11, 2026
…se-out (#2965)

* chore(resilience): T1.9 cache-key health sync + Phase 1 scorecard close-out

* docs(resilience): correct Timeliness scorecard gap (#2965 P2)

Greptile P2 finding: the Timeliness row in the Phase 1 self-assessment
scorecard claimed "no real-time signals in v1.1" and described
conflict events and power outages as Phase 2 additions, which is
factually wrong. Thirteen stress-side indicators already run at
realtime or daily cadence via the cross-source stack:

  realtime: ucdpConflict, internetOutages, infraOutages,
            unrestEvents, socialVelocity
  daily:    sanctionCount, cyberThreats, gpsJamming,
            shippingStress, transitDisruption, gasStorageStress,
            energyPriceStress, newsThreatScore

The real cadence gap is that structural sources (WGI, GPI, RSF, WHO,
IMF macro) are annual and still carry the majority of index weight,
while the live-shock pillar is already rolling. Phase 2 T2.2 adds
FX volatility at daily cadence to narrow the gap on the
currency-external dimension.

The score itself is unchanged (6.5); it was a defensible number.
Only the gap rationale is corrected to match reality.
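
The cadence split listed above can be written down as a small lookup to confirm the count of thirteen. This is only a sketch: the indicator names are taken from the commit message, while the map, the type, and the helper are hypothetical and do not reflect how the cross-source stack is actually declared.

```typescript
type Cadence = "realtime" | "daily";

// Indicator names from the commit message; the map itself is illustrative.
const STRESS_INDICATOR_CADENCE: Record<string, Cadence> = {
  ucdpConflict: "realtime",
  internetOutages: "realtime",
  infraOutages: "realtime",
  unrestEvents: "realtime",
  socialVelocity: "realtime",
  sanctionCount: "daily",
  cyberThreats: "daily",
  gpsJamming: "daily",
  shippingStress: "daily",
  transitDisruption: "daily",
  gasStorageStress: "daily",
  energyPriceStress: "daily",
  newsThreatScore: "daily",
};

// Tally how many indicators run at each cadence.
function countByCadence(map: Record<string, Cadence>): Record<Cadence, number> {
  const counts: Record<Cadence, number> = { realtime: 0, daily: 0 };
  for (const cadence of Object.values(map)) {
    counts[cadence] += 1;
  }
  return counts;
}

const cadenceCounts = countByCadence(STRESS_INDICATOR_CADENCE);
console.log(cadenceCounts); // { realtime: 5, daily: 8 }
console.log(cadenceCounts.realtime + cadenceCounts.daily); // 13
```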