Add target diagnostics to eCPS comparison#85

Merged
MaxGhenis merged 1 commit into main from codex/ecps-target-diagnostics-sidecar-20260529
May 29, 2026

@MaxGhenis
Contributor

Summary

  • add target-level loss diagnostics to the sound eCPS replacement comparison payload
  • write a target_loss_diagnostics.json sidecar with top regressions, top improvements, full target rows, and family-level deltas
  • expose --target-diagnostics-path and --target-diagnostics-top-k on the comparison CLI
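A minimal sketch of how a sidecar like the one described above could be assembled. This is not the code from this PR: the function names, the row fields `candidate_loss` and `baseline_loss`, and the output keys are all hypothetical, and a negative delta is assumed to mean the candidate dataset beat the eCPS baseline on that target.

```python
import json
from collections import defaultdict
from pathlib import Path


def build_target_diagnostics(rows, top_k=10):
    """Summarize per-target losses (hypothetical schema, not the PR's).

    Each row is a dict with `target_name`, `family`, `candidate_loss`,
    and `baseline_loss`. A negative `loss_delta` is an improvement.
    """
    enriched = [
        {**row, "loss_delta": row["candidate_loss"] - row["baseline_loss"]}
        for row in rows
    ]
    by_delta = sorted(enriched, key=lambda r: r["loss_delta"])

    # Most negative deltas first: the largest improvements.
    improvements = [r for r in by_delta if r["loss_delta"] < 0][:top_k]
    # Most positive deltas first: the largest regressions.
    regressions = [r for r in reversed(by_delta) if r["loss_delta"] > 0][:top_k]

    # Sum deltas within each family to get a family-level view.
    family_deltas = defaultdict(float)
    for r in enriched:
        family_deltas[r["family"]] += r["loss_delta"]

    return {
        "top_regressions": regressions,
        "top_improvements": improvements,
        "targets": enriched,
        "family_deltas": dict(family_deltas),
    }


def write_sidecar(payload, path):
    """Write the diagnostics payload as JSON and return the path."""
    path = Path(path)
    path.write_text(json.dumps(payload, indent=2, sort_keys=True))
    return path
```

In this sketch, `top_k` plays the role of `--target-diagnostics-top-k` and `path` the role of `--target-diagnostics-path`. Keeping the full target rows alongside the top-k lists means the truncated views can always be recomputed from the sidecar alone.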

Tests

  • uv run --python 3.13 --extra dev ruff check src/microplex_us/pipelines/ecps_replacement_comparison.py tests/pipelines/test_ecps_replacement_comparison.py
  • uv run --python 3.13 --extra dev --extra policyengine pytest tests/pipelines/test_ecps_replacement_comparison.py
  • uv run --python 3.13 --extra dev --extra policyengine pytest tests/pipelines/test_mp300k_artifact_gates.py -k ecps

Refs #11.

@MaxGhenis MaxGhenis merged commit a314efe into main May 29, 2026
4 checks passed
@MaxGhenis MaxGhenis deleted the codex/ecps-target-diagnostics-sidecar-20260529 branch May 29, 2026 17:14
MaxGhenis added a commit that referenced this pull request May 29, 2026
…nment

The build-manifest's build-time calibration_diagnostics shares target_name /
family / split keys with Codex's comparison-time target_diagnostics (#85), so
the two join cleanly instead of becoming a third divergent schema. Spell out the
complementary split: comparison-time = 'did mp beat eCPS on this target',
build-time = 'what did this build hit + where did the target come from' (Pavel's
per-variable-aggregate gap).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
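
The join described in the commit message can be sketched as follows. Only the shared keys `target_name`, `family`, and `split` come from the message; the function name, the other row fields, and the output layout are assumptions made for illustration.

```python
JOIN_KEYS = ("target_name", "family", "split")


def join_diagnostics(build_rows, comparison_rows):
    """Left-join build-time rows to comparison-time rows on the shared keys.

    Hypothetical helper: both inputs are lists of dicts that carry the
    three shared keys plus whatever fields each side records.
    """
    def key(row):
        return tuple(row[k] for k in JOIN_KEYS)

    def payload(row):
        # Everything except the join keys belongs to that side's payload.
        return {k: v for k, v in row.items() if k not in JOIN_KEYS}

    comparison_by_key = {key(r): r for r in comparison_rows}

    joined = []
    for build_row in build_rows:
        match = comparison_by_key.get(key(build_row))
        joined.append({
            **{k: build_row[k] for k in JOIN_KEYS},
            "build": payload(build_row),
            # None marks a target with no comparison-time diagnostics.
            "comparison": None if match is None else payload(match),
        })
    return joined
```

Nesting each side under its own key keeps the two questions separate: the `build` entry holds what the build hit and where the target came from, and the `comparison` entry holds whether the candidate beat eCPS on that target.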

1 participant