Add deterministic incident response page triage fixture family #133
Code Review
This pull request introduces a new family of deterministic fixtures for incident response triage, ranging from baseline to severe degradation levels. Each fixture includes dependency graphs, state information, and operational/relational contracts. Feedback highlights that the 'moderate' and 'mild' fixtures are currently redundant and should be differentiated to provide a meaningful progression. Additionally, the new test suite should be expanded to cover all degradation levels and refactored to use the fixture metadata for validation instead of assuming all contracts fail in the severe case.
      "fixture_id": "incident_response_page_triage_moderate_v1",
      "fixture_version": "1.0.0",
      "category": "incident_response",
      "family": "incident_response_page_triage",
      "degradation_level": "moderate",
      "path": "fixtures/incident_response_page_triage_moderate_v1",
      "expected_admissible": false,
      "contracts": [
        "alert_ack_before_mitigation",
        "no_orphan_mitigation_steps",
        "rollback_reachable",
        "root_cause_links_incident"
      ],
      "expected_failure_labels": [
        "RECOVERY_PATH_INVALID"
      ]
    },
The incident_response_page_triage_moderate_v1 fixture is currently identical to the mild version in both its reconstructed artifacts and its expected failure labels. To provide a meaningful progression of degradation within this family, the moderate level should introduce additional failures (e.g., CAUSAL_DEPENDENCY_LOSS) or different artifact drift. Even if the overlap is intentional for now, consider differentiating the two levels to avoid redundancy in the test suite.
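One way to catch this kind of overlap early is a small manifest check that flags fixtures in the same family with identical expected failure labels. The sketch below is illustrative only: the `find_redundant_levels` helper, the `mild` and `severe` entries, and the severe label set are assumptions, not taken from the PR, and it compares labels only, not reconstructed artifacts.

```python
# Hypothetical manifest excerpt: only the fields needed for the check.
# The mild and severe entries (and the severe label set) are assumed.
MANIFEST_ENTRIES = [
    {
        "fixture_id": "incident_response_page_triage_mild_v1",
        "family": "incident_response_page_triage",
        "expected_failure_labels": ["RECOVERY_PATH_INVALID"],
    },
    {
        "fixture_id": "incident_response_page_triage_moderate_v1",
        "family": "incident_response_page_triage",
        "expected_failure_labels": ["RECOVERY_PATH_INVALID"],
    },
    {
        "fixture_id": "incident_response_page_triage_degraded_v1",
        "family": "incident_response_page_triage",
        "expected_failure_labels": ["CAUSAL_DEPENDENCY_LOSS", "RECOVERY_PATH_INVALID"],
    },
]


def find_redundant_levels(entries):
    """Return (first_id, duplicate_id) pairs that share a family and label set."""
    first_seen = {}
    redundant = []
    for entry in entries:
        # Label order is irrelevant, so compare as a frozenset.
        key = (entry["family"], frozenset(entry["expected_failure_labels"]))
        if key in first_seen:
            redundant.append((first_seen[key], entry["fixture_id"]))
        else:
            first_seen[key] = entry["fixture_id"]
    return redundant


redundant_pairs = find_redundant_levels(MANIFEST_ENTRIES)
```

With the sample entries above, the mild and moderate fixtures are reported as a redundant pair, while the severe fixture is not.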
def test_incident_response_severe_emits_only_expected_failure_labels() -> None:
    original, reconstructed, contracts = _payload(SEVERE_ROOT)
    expected_admissibility = _load_json(SEVERE_ROOT / "expected/admissibility.json")
    expected_failures = _load_json(SEVERE_ROOT / "expected/failures.json")

    results = ContractValidator().validate_contracts(original=original, reconstructed=reconstructed, contracts=contracts)

    assert expected_admissibility["expected_admissible"] is False
    assert all(not result.passed for result in results)

    observed_contracts = sorted(result.contract_id for result in results)
    assert observed_contracts == sorted(expected_admissibility["must_fail_contracts"])

    observed_labels = sorted({result.failure_label for result in results if result.failure_label is not None})
    assert observed_labels == sorted(expected_failures["expected_failures"])
The current test suite only validates the baseline and severe (degraded) fixtures, leaving mild and moderate untested. Additionally, the severe test assumes all contracts fail (all(not result.passed for result in results)), which is fragile if the fixture is updated to include passing contracts. I recommend refactoring these tests to iterate through all degradation levels and verify the passed status of each contract against the must_hold_contracts and must_fail_contracts lists defined in the fixture's admissibility.json.
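A minimal sketch of the per-contract check suggested above, assuming each result exposes `contract_id` and `passed` as in the quoted test, and that admissibility.json carries `must_hold_contracts` and `must_fail_contracts` lists. `StubResult`, `check_against_expectations`, and the sample data are hypothetical stand-ins, not part of the repository; in particular, which contract fails at the mild level is assumed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubResult:
    """Stand-in for the validator's result objects (attributes assumed)."""
    contract_id: str
    passed: bool
    failure_label: Optional[str] = None


def check_against_expectations(results, admissibility):
    """Return mismatch messages; an empty list means results match expectations."""
    must_hold = set(admissibility.get("must_hold_contracts", []))
    must_fail = set(admissibility.get("must_fail_contracts", []))
    listed = must_hold | must_fail
    problems = []
    seen = set()
    for result in results:
        seen.add(result.contract_id)
        if result.contract_id not in listed:
            problems.append(f"{result.contract_id}: not listed in admissibility.json")
        elif result.contract_id in must_hold and not result.passed:
            problems.append(f"{result.contract_id}: expected pass, got fail")
        elif result.contract_id in must_fail and result.passed:
            problems.append(f"{result.contract_id}: expected fail, got pass")
    # Contracts the fixture lists but the validator never reported on.
    for missing in sorted(listed - seen):
        problems.append(f"{missing}: no result produced")
    return problems


# Hypothetical mild-level expectations: one failing contract, three passing.
mild_admissibility = {
    "expected_admissible": False,
    "must_hold_contracts": [
        "alert_ack_before_mitigation",
        "no_orphan_mitigation_steps",
        "root_cause_links_incident",
    ],
    "must_fail_contracts": ["rollback_reachable"],
}
mild_results = [
    StubResult("alert_ack_before_mitigation", True),
    StubResult("no_orphan_mitigation_steps", True),
    StubResult("root_cause_links_incident", True),
    StubResult("rollback_reachable", False, "RECOVERY_PATH_INVALID"),
]
mild_problems = check_against_expectations(mild_results, mild_admissibility)
```

In the real suite, a helper like this could be called from a single test parametrized over the four fixture roots (for example with `pytest.mark.parametrize`), so a fixture that later gains passing contracts does not break an `all(not result.passed ...)` assumption.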
Summary:
- Added fixture family incident_response_page_triage with four levels (baseline, mild, moderate, severe) and deterministic manifest registration to demonstrate admissibility degradation beyond the coding-workflow family.

Changed files:
- fixtures/manifest.json (registered four new fixtures in deterministic order)
- fixtures/incident_response_page_triage_v1/** (baseline bundle: original/reconstructed traces, state, dependency graph, original/contracts, expected/*)
- fixtures/incident_response_page_triage_mild_v1/** (mild bundle)
- fixtures/incident_response_page_triage_moderate_v1/** (moderate bundle)
- fixtures/incident_response_page_triage_degraded_v1/** (severe bundle)
- tests/test_fixture_manifest.py (updated EXPECTED_FIXTURE_ORDER to include new fixture ids)
- tests/test_incident_response_fixture_contract_bundle.py (new targeted validator tests for baseline and severe fixtures)

Testing:
- Ran pytest tests/test_fixture_manifest.py -q and it passed (8 passed).
- Ran pytest tests/test_contract_validator.py -q and it passed (10 passed).
- Ran pytest tests/test_fixture_contract_bundle.py -q and pytest tests/test_negative_fixture_contract_bundle.py -q and both passed.
- Ran pytest tests/test_incident_response_fixture_contract_bundle.py -q and it passed (2 tests).
- Ran pytest -q and observed all tests passing (199 passed).
- Ran npm run check and it completed successfully.

Risks:
- The mild level is implemented as a deterministic weak degradation that yields one contract failure (RECOVERY_PATH_INVALID) because current validators do not express non-contract drift; this is an intentional, conservative choice to stay within existing validator semantics.

Next:
- artifacts/layered_admissibility_results.json) in a follow-up PR only if multi-family artifact inclusion is explicitly requested.