test: cover invalid quality probe state shape by devkade · Pull Request #107 · devkade/ilchul

devkade · 2026-05-14T11:02:45Z

Summary

Add regression coverage for the issue [Cross-mode] Build quality probe matrix for Ralph and Autoresearch readiness #15 quality-probe CLI adapter when state.json is parseable JSON but not a valid workflow-state shape.
Prove kapi report --json fails closed with a state_json_invalid_shape state-dimension blocker instead of reporting quality readiness or crashing.

Linked issue

Refs #15

Problem

Issue #15 asks for a repeatable Ralph/Autoresearch readiness probe that exposes blockers across artifacts, state, parseability, runtime, approval, evidence, and closeout surfaces.

The existing quality-probe adapter already handled several trust-boundary cases, but this lane needed the smallest bounded probe-matrix slice: a concrete regression that invalid-but-parseable workflow state remains blocked and operator-visible through kapi report --json.

Without this regression, a future refactor could accidentally treat malformed workflow-state shape as ready, crash JSON report output, or hide the blocker outside the quality-probe matrix.

Options considered

Broaden the quality-probe implementation in this PR.
- Pros: Could close more of the original matrix request in one branch.
- Cons: Larger PR, higher review friction, and likely overlap with prior issue [Cross-mode] Build quality probe matrix for Ralph and Autoresearch readiness #15 slices already merged.
Add a focused CLI-regression slice for invalid state shape.
- Pros: Small, directly tied to a known fail-closed matrix contract, low review risk, and preserves existing implementation boundaries.
- Cons: Does not expand every remaining quality dimension in this PR.
Documentation-only readiness checklist.
- Pros: Lowest implementation risk.
- Cons: Does not protect the machine-readable qualityProbe contract from regressions.

Selected approach

Selected option: focused CLI-regression slice.
Why this one: it advances [Cross-mode] Build quality probe matrix for Ralph and Autoresearch readiness #15 with the smallest useful quality-probe matrix hardening slice while keeping changed lines well below the kapi-agent review gate.
Risks/trade-offs: this is intentionally a test-only hardening PR; it verifies existing fail-closed behavior rather than adding a new operator-facing command surface.

Implementation by file/surface

test/cli-args.test.ts
- Adds CLI report quality probe fails closed on invalid workflow state shape.
- Creates a fixture repo and safe .kapi/workflows/ralph/<slug> artifact root.
- Writes state.json with an invalid workflow-state shape (artifacts: null) while keeping events.jsonl and snapshot.json parseable.
- Saves a matching registry entry, runs kapi report <slug> --json, and asserts the state dimension includes state_json_invalid_shape.

Why this fixes it

The regression exercises the actual CLI report adapter path rather than only the pure domain matrix. It proves the report surface converts invalid-but-parseable state into a blocked quality-probe state dimension, preserving the fail-closed readiness contract required by issue #15.

QA / Verification

npm test -- test/cli-args.test.ts test/quality-probe-matrix.test.ts — pass (tsx --test test/*.test.ts ..., 374 tests, 363 pass, 11 skipped)
npm run check — pass
npm run check:unused — pass
npm run quality:budgets — pass; existing budget output still reports code_smells as warn and coverage as not configured
npm run verify — pass (374 tests, 363 pass, 11 skipped, then typecheck/unused/quality budgets pass)
git diff --check origin/dev...HEAD — pass

Anomalies observed

Running the test suite flips src/cli/kapi-review-cli.ts to executable mode locally (100644 => 100755) with no content changes; restored it to 644 before PR creation.
The first worker-side gh pr create attempt reported HTTP 401, but supervisor auth was valid via gh auth status; PR creation was retried from the supervisor shell.
npm run quality:budgets passes while reporting the existing code_smells warning budget and coverage as not configured.

Risks / Follow-up

This PR is a deliberately small test-only slice and does not attempt to complete every acceptance criterion from the broader [Cross-mode] Build quality probe matrix for Ralph and Autoresearch readiness #15 matrix issue.
Future slices can add more live Ralph/Autoresearch readiness cases; this PR intentionally references rather than closes [Cross-mode] Build quality probe matrix for Ralph and Autoresearch readiness #15.

kapi-agent review expectations and current-head merge gate

Expected review focus: confirm this stays a bounded [Cross-mode] Build quality probe matrix for Ralph and Autoresearch readiness #15 regression slice and that the fixture exercises CLI adapter behavior without masking malformed state.
Merge gate: require current-head kapi-agent approval/check success before merging into dev.

devkade · 2026-05-14T11:03:29Z

@kapi-agent review

kapi-agent

kapi-agent review

Verdict: APPROVE

Final approval summary

Review journey

The original goal is to harden issue #15’s quality-probe contract with a small regression test proving kapi report --json fails closed when state.json is parseable JSON but not a valid workflow-state shape. The semantic scope is intentionally test-only: one CLI adapter regression in test/cli-args.test.ts. There is no prior kapi-agent feedback or fix ping-pong for this PR.

What changed

The PR adds one test:

test/cli-args.test.ts
- Creates a fixture git repo.
- Creates a safe .kapi/workflows/ralph/<slug> artifact root.
- Writes parseable but invalid state.json with artifacts: null.
- Writes parseable events.jsonl and snapshot.json.
- Saves a matching registry entry.
- Runs kapi report <slug> --from <workspace> --json.
- Asserts the state dimension reports state_json_invalid_shape.

Why this is correct

The test exercises the real CLI report path rather than only a pure helper. It specifically covers the trust-boundary case where JSON parsing succeeds but workflow-state validation fails, which is the regression risk described in the PR. The existing implementation maps that condition through isWorkflowState(...) ? validateWorkflowState(...) : invalidQualityStateReport(..., "state_json_invalid_shape"), so the test locks in the desired fail-closed behavior.

Evidence

Verifier gate: PASS — npm ci && npm run verify exited 0.
Size gate: PASS — 22 changed lines, below the 200-line limit.
Revision explanation status: not required; found.
Inspected files/evidence:
- test/cli-args.test.ts
- src/cli/kapi-cli.ts
- Diff against origin/dev...HEAD
Local/CI evidence supplied by the gate includes full test, typecheck, unused-check, and quality-budget verification passing.

Remaining risks and approval rationale

Remaining risks are minimal and appropriate for a test-only PR. This does not complete every broader #15 acceptance surface, but it accurately protects one bounded quality-probe matrix contract. Approval is justified because the PR is small, scoped, verified, and adds meaningful regression coverage without changing runtime behavior.

Blocking issues

None.

Warnings / risks

test/cli-args.test.ts: the new assertion only checks that the state blockers include state_json_invalid_shape; it does not assert the entire quality-probe shape or readiness status. This is acceptable for the intended focused regression but leaves broader matrix assertions to other tests.

Suggestions

Consider adding a small helper for extracting a named quality-probe dimension if more CLI quality-probe tests are added, to avoid repeated loose JSON casts.

Looks good

The fixture exercises the actual runKapiCli(["report", ... "--json"]) path.
The invalid state is parseable JSON but structurally invalid, matching the targeted regression.
Safe artifact-root setup avoids conflating this case with artifact-root safety failures.
The test keeps events.jsonl and snapshot.json parseable, isolating the blocker to the workflow-state shape.

Verification notes

Verifier gate status: PASS — npm ci && npm run verify exited 0.
Size gate status: PASS — 22 changed lines < 200.
Revision-explanation status: not required for this opened PR; explanation present in PR body.
Evidence reviewed: PR diff, current implementation around quality-probe state validation, and supplied verifier output.

Engine: pi

kapi-agent

kapi-agent review

Verdict: APPROVE

Final approval summary

Review journey

The original goal is to add a small regression test for issue #15 proving the quality-probe CLI report path fails closed when state.json is valid JSON but has an invalid workflow-state shape. The semantic scope is test-only and limited to test/cli-args.test.ts. A prior kapi-agent review already approved this current head, and the revision metadata indicates no new revision explanation is required now.

What changed

test/cli-args.test.ts
- Adds a CLI-level regression test for kapi report <slug> --from <workspace> --json.
- Creates a fixture repo and safe workflow artifact root.
- Writes invalid-but-parseable state.json with artifacts: null.
- Writes parseable events.jsonl and snapshot.json.
- Asserts the reported quality-probe state dimension includes state_json_invalid_shape.

Why this is correct

The test exercises the real CLI adapter/report path rather than only a lower-level helper. It targets the relevant trust-boundary behavior: parseable JSON must still be schema/shape-validated before quality readiness is reported. The expected blocker demonstrates fail-closed behavior and protects against future regressions that could hide malformed workflow state or incorrectly mark the workflow ready.

Evidence

Verifier gate: PASS — npm ci && npm run verify exited 0.
Size gate: PASS — 22 changed lines, below the 200-line rejection threshold.
Revision-explanation status: not required now; found true.
Inspected files/diff:
- test/cli-args.test.ts
- src/cli/kapi-cli.ts quality-probe state inspection path around state_json_invalid_shape.

Remaining risks and approval rationale

No blocking risks remain. This is a focused regression-test PR with low implementation risk, clear issue linkage, passing verification, and a bounded semantic scope. Approval is justified.

Blocking issues

None.

Warnings / risks

None blocking. This PR intentionally does not expand the broader issue #15 quality matrix beyond the invalid-state-shape regression.

Suggestions

Consider a future small slice that asserts the top-level qualityProbe.ok === false for this same invalid-shape case, if not already covered elsewhere.

Looks good

The fixture keeps the artifact root inside the workspace, so the test isolates invalid state shape rather than unsafe-path behavior.
The test uses the CLI report path and JSON output, matching the operator-facing contract.
The malformed state is parseable JSON, so it specifically covers validation failure rather than parse failure.

Verification notes

Verifier gate: PASS — npm ci && npm run verify exited 0.
Size gate: PASS — 22 changed lines < 200.
Revision-explanation status: not required now; found true.
Local review inspected the PR diff and the relevant quality-probe implementation path in src/cli/kapi-cli.ts.

Engine: pi

devkade · 2026-05-14T11:05:46Z

@kapi-agent review

Revision explanation: What changed: updated the PR body linked-issue wording from Closes #15 to Refs #15 because this PR is intentionally a small bounded #15 slice rather than the full umbrella closure. Why: avoids accidentally closing the broader quality-probe matrix issue while preserving traceability. Verification: head remains dc171f73e9a7f693af377801c38070180bc45b28; prior commands passed (npm test -- test/cli-args.test.ts test/quality-probe-matrix.test.ts, npm run check, npm run check:unused, npm run quality:budgets, git diff --check origin/dev...HEAD).

kapi-agent Bot approved these changes May 14, 2026

View reviewed changes

test: cover invalid quality probe state shape

dc171f7

devkade force-pushed the feat/coding-15 branch from edc408a to dc171f7 Compare May 14, 2026 11:04

kapi-agent Bot approved these changes May 14, 2026

View reviewed changes

devkade merged commit b82d5ef into dev May 14, 2026
1 check passed

devkade deleted the feat/coding-15 branch May 15, 2026 09:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: cover invalid quality probe state shape#107

test: cover invalid quality probe state shape#107
devkade merged 1 commit into
devfrom
feat/coding-15

devkade commented May 14, 2026 •

edited

Loading

Uh oh!

devkade commented May 14, 2026

Uh oh!

kapi-agent Bot left a comment

Uh oh!

kapi-agent Bot left a comment

Uh oh!

devkade commented May 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

devkade commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Linked issue

Problem

Options considered

Selected approach

Implementation by file/surface

Why this fixes it

QA / Verification

Anomalies observed

Risks / Follow-up

kapi-agent review expectations and current-head merge gate

Uh oh!

devkade commented May 14, 2026

Uh oh!

kapi-agent Bot left a comment

Choose a reason for hiding this comment

kapi-agent review

Final approval summary

Review journey

What changed

Why this is correct

Evidence

Remaining risks and approval rationale

Blocking issues

Warnings / risks

Suggestions

Looks good

Verification notes

Uh oh!

kapi-agent Bot left a comment

Choose a reason for hiding this comment

kapi-agent review

Final approval summary

Review journey

What changed

Why this is correct

Evidence

Remaining risks and approval rationale

Blocking issues

Warnings / risks

Suggestions

Looks good

Verification notes

Uh oh!

devkade commented May 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

devkade commented May 14, 2026 •

edited

Loading