
Require holdout and MSRE eCPS wins#246

Merged
MaxGhenis merged 1 commit into main from codex/ecps-holdout-msre-gate-20260606 on Jun 6, 2026

@MaxGhenis
Contributor

Summary

  • require the mp300k eCPS comparison gate to fail when candidate holdout loss does not beat the pinned eCPS holdout loss
  • require the gate to fail when candidate unweighted MSRE does not beat the pinned eCPS unweighted MSRE
  • expose the holdout/MSRE values in gate metrics/details and add adverse-regression tests

Why

The clean-surface rollback showed that a candidate can be unsafe even when other content gates pass. The release gate should enforce the actual promotion rule: MP must beat production eCPS on full loss, holdout loss, and unweighted MSRE using the frozen benchmark evidence.
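The promotion rule above can be sketched as a three-way comparison. This is a minimal illustration, not the code in `mp300k_artifact_gates.py`: the `LossMetrics` class, the `ecps_comparison_gate` function, and the shape of the returned dictionary are all hypothetical names chosen for the example. It also assumes that lower is better for every metric and that "beat" means strictly lower, so a tie fails the gate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossMetrics:
    """Hypothetical container for the three compared metrics (lower is better)."""

    full_loss: float
    holdout_loss: float
    unweighted_msre: float


# Metric names checked by the gate, in reporting order.
GATED_METRICS = ("full_loss", "holdout_loss", "unweighted_msre")


def ecps_comparison_gate(candidate: LossMetrics, pinned_ecps: LossMetrics) -> dict:
    """Pass only if the candidate beats the pinned eCPS value on every metric.

    Assumption: "beats" means strictly lower, so equal values count as a failure.
    """
    details = {}
    for name in GATED_METRICS:
        candidate_value = getattr(candidate, name)
        pinned_value = getattr(pinned_ecps, name)
        details[name] = {
            "candidate": candidate_value,
            "pinned_ecps": pinned_value,
            "beats_ecps": candidate_value < pinned_value,
        }
    # Collect every metric that did not improve, so the report names all of them.
    failures = [name for name, entry in details.items() if not entry["beats_ecps"]]
    return {"passed": not failures, "failures": failures, "details": details}


if __name__ == "__main__":
    # Illustrative numbers only: the candidate wins on full loss and MSRE
    # but regresses on holdout loss, so the gate fails.
    pinned = LossMetrics(full_loss=1.0, holdout_loss=1.0, unweighted_msre=0.6)
    candidate = LossMetrics(full_loss=0.8, holdout_loss=1.2, unweighted_msre=0.5)
    result = ecps_comparison_gate(candidate, pinned)
    print(result["passed"], result["failures"])
```

Returning the per-metric values alongside the verdict mirrors the summary's point about exposing holdout and MSRE values in the gate details, so a failing report shows which comparison regressed and by how much.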

Tests

  • uv run ruff format src/microplex_us/pipelines/mp300k_artifact_gates.py tests/pipelines/test_mp300k_artifact_gates.py
  • uv run ruff check src/microplex_us/pipelines/mp300k_artifact_gates.py tests/pipelines/test_mp300k_artifact_gates.py
  • uv run --extra dev --extra policyengine python -m pytest -q tests/pipelines/test_mp300k_artifact_gates.py

@MaxGhenis force-pushed the codex/ecps-holdout-msre-gate-20260606 branch from cffa309 to 6cea1bf on June 6, 2026 09:03
@MaxGhenis merged commit 3921e7b into main on Jun 6, 2026
5 checks passed
@MaxGhenis deleted the codex/ecps-holdout-msre-gate-20260606 branch on June 6, 2026 09:05