contracts(ccpa): v1.32.0 - add FALSIFY-CCPA-019 + FALSIFY-CCPA-020 at PROPOSED #1794
Merged
… PROPOSED

Two-axis bump: catch up to companion-led v1.31.0 + ship Phase 6 gate in one PR. Gate registry: 18 → 20 entries.

v1.31.0 SKIPPED (companion-led at companion-repo M236 / PR #221 squash 188a328 without aprender-side authoring); v1.30.0 → v1.32.0 directly, same SKIP pattern v1.28.0 → v1.30.0 used for the auto-closed aprender#1705 PR.

## FALSIFY-CCPA-019 calibration_required_before_verdict (PROPOSED)

Codifies the M196-M224 4-bug-stack lesson. Any future verdict on CCPA-016/017/018 (promotion PROPOSED → ACTIVE_RUNTIME OR treating an evidence file as discharging the gate) requires a fresh calibration record (identity_pass + regression_fail, ≤30 days old) at evidence/calibration/calibration-runs.json.

Bidirectional-sensitivity: a meter that ALWAYS-passes would pass identity but also pass regression (caught); a meter that ALWAYS-fails would fail regression correctly but also fail identity (caught). Freshness window catches infrastructure drift (rustc bumps, apr CLI changes, claude CLI changes) without weekly runs.

Test scaffold: companion-repo crates/ccpa-differ/tests/falsify_ccpa_019_calibration.rs (7 active synthetic + 1 #[ignore]'d live-evidence).

The M234 calibration evidence (evidence/calibration/calibration-runs.json) records both the trivial in-house identity fixture + decy#39 regression dispatch; discharges the gate currently.

## FALSIFY-CCPA-020 contract_compliance_per_turn (PROPOSED)

Codifies the Phase 6 operator-directive (companion-repo M250+): the right experiment for paiml-org is claude-bound-by-pmat-comply-and-pv vs apr-bound-by-pmat-comply-and-pv, NOT raw-vs-raw. Every paiml commit must pass pmat comply + pv validate to merge.

Per-turn pmat comply check --strict + pv validate fire on every Write/Edit in the under-contract regime (ArenaSession::with_compliance(N)). Compound oracle (cargo test + pmat comply + pv validate) gates OraclePassed.

Bidirectional sensitivity:
- Identity: clean-history-with-pass MUST satisfy
- Regression: pass-with-failing-compliance-turn MUST be falsified

Test scaffold: companion-repo crates/ccpa-arena/tests/falsify_ccpa_020_contract_compliance.rs (7 active synthetic + 1 #[ignore]'d live-evidence).

## Companion-side ship trail (M250-M264)

M250 plan + n=20 corpus; M252 schema; M254 dispatch hook + trap; M256 compound oracle; M258 CCPA-020 gate; M260 first valid n=15 calibration evidence; M262 Toyota-Way root-cause + upstream fixes (#1782 timeout + #1790 matmul guard, both MERGED); M264 P6.6 bench runner (operator-dispatchable end-to-end).

## Activation path

CCPA-019 + CCPA-020 stay PROPOSED until first operator-dispatched Phase 6 bench produces evidence/under-contract/scores.json AND a fresh calibration record. ACTIVE_RUNTIME flip awaits both.

`pv validate contracts/claude-code-parity-apr-v1.yaml` clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Two-axis bump on `contracts/claude-code-parity-apr-v1.yaml`: catch up to companion-led v1.31.0 + ship Phase 6 gate in one PR. Gate registry: 18 → 20 entries.
v1.31.0 SKIPPED (companion-led at companion-repo M236 without aprender-side authoring); v1.30.0 → v1.32.0 directly, same SKIP pattern v1.28.0 → v1.30.0 used.
FALSIFY-CCPA-019 calibration_required_before_verdict
Codifies M196-M224 4-bug-stack lesson. Any verdict on CCPA-016/017/018 requires fresh calibration (identity_pass + regression_fail, ≤30 days). Bidirectional sensitivity catches always-pass and always-fail meters.
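A minimal sketch of the CCPA-019 precondition, assuming a hypothetical in-memory `CalibrationRecord` (the real schema of `calibration-runs.json` is not shown in this PR). The names `CalibrationRecord`, `CalibrationError`, `MAX_AGE` and `check_calibration` are illustrative, not the companion repo's API. It shows why both flags are needed: an always-pass meter trips the regression check, an always-fail meter trips the identity check, and the age check covers the 30-day freshness window.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical view of one calibration record (illustrative field names).
#[derive(Debug, Clone, Copy)]
pub struct CalibrationRecord {
    /// The meter passed on a known-good (identity) fixture.
    pub identity_pass: bool,
    /// The meter failed on a known-bad (regression) fixture.
    pub regression_fail: bool,
    /// When the calibration run was recorded.
    pub recorded_at: SystemTime,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CalibrationError {
    /// Identity fixture failed: consistent with an always-fail meter.
    IdentityFailed,
    /// Regression fixture passed: consistent with an always-pass meter.
    RegressionPassed,
    /// Record is older than the freshness window.
    Stale,
    /// Record is timestamped after `now` (clock skew or bad data).
    FromFuture,
}

/// Freshness window from the gate text: at most 30 days old.
pub const MAX_AGE: Duration = Duration::from_secs(30 * 24 * 60 * 60);

/// Returns Ok(()) only if the record shows bidirectional sensitivity
/// and falls inside the freshness window relative to `now`.
pub fn check_calibration(
    rec: &CalibrationRecord,
    now: SystemTime,
) -> Result<(), CalibrationError> {
    if !rec.identity_pass {
        return Err(CalibrationError::IdentityFailed);
    }
    if !rec.regression_fail {
        return Err(CalibrationError::RegressionPassed);
    }
    match now.duration_since(rec.recorded_at) {
        Ok(age) if age <= MAX_AGE => Ok(()),
        Ok(_) => Err(CalibrationError::Stale),
        Err(_) => Err(CalibrationError::FromFuture),
    }
}
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check deterministic under synthetic test fixtures.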
FALSIFY-CCPA-020 contract_compliance_per_turn
Phase 6 operator-directive: measure claude-bound-by-pmat-comply-and-pv vs apr-bound-by-pmat-comply-and-pv, NOT raw-vs-raw. Per-turn `pmat comply check --strict` + `pv validate` fire on every Write/Edit; compound oracle gates OraclePassed.
Companion-side ship trail
M250 plan + n=20 corpus → M252 schema → M254 dispatch hook + trap → M256 compound oracle → M258 CCPA-020 gate → M260 first valid n=15 calibration → M262 Toyota-Way root-cause (upstreamed as aprender#1782 + #1790, both MERGED) → M264 P6.6 bench runner.
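The compound oracle (M256) and the CCPA-020 gate (M258) in the trail above can be pictured with a small sketch. It assumes a session passes only if `cargo test` succeeds and every Write/Edit turn passed both compliance checks. The types `TurnCompliance` and `Session` and the functions `first_noncompliant_turn` and `oracle_passed` are hypothetical, and the actual `ArenaSession` logic in the companion repo may differ.

```rust
/// Result of the two per-turn checks after one Write/Edit (hypothetical shape).
#[derive(Debug, Clone, Copy)]
pub struct TurnCompliance {
    /// `pmat comply check --strict` succeeded on this turn.
    pub pmat_comply_ok: bool,
    /// `pv validate` succeeded on this turn.
    pub pv_validate_ok: bool,
}

/// Hypothetical summary of one under-contract session.
#[derive(Debug, Clone)]
pub struct Session {
    /// Final `cargo test` result.
    pub cargo_test_ok: bool,
    /// One entry per Write/Edit turn, in order.
    pub turns: Vec<TurnCompliance>,
}

/// Index of the first turn that failed either compliance check, if any.
pub fn first_noncompliant_turn(turns: &[TurnCompliance]) -> Option<usize> {
    turns
        .iter()
        .position(|t| !(t.pmat_comply_ok && t.pv_validate_ok))
}

/// Compound oracle under the stated assumption: tests pass AND the
/// compliance history is clean on every turn.
pub fn oracle_passed(session: &Session) -> bool {
    session.cargo_test_ok && first_noncompliant_turn(&session.turns).is_none()
}
```

Under this reading, the identity case is a session with passing tests and a clean history, and the regression case is a session with passing tests but at least one failing compliance turn, which `oracle_passed` rejects.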
Activation path
Both new gates stay PROPOSED until the first operator-dispatched Phase 6 bench produces `evidence/under-contract/scores.json` AND a fresh calibration record. ACTIVE_RUNTIME flip awaits both.
Test plan
`pv validate contracts/claude-code-parity-apr-v1.yaml` → 0 errors, 0 warnings