feat(gate): add pure decision kernel by rianjs · Pull Request #47 · open-cli-collective/codereview-cli

rianjs · 2026-05-31T18:56:30Z

Summary

add a pure internal/gate decision kernel for resume, early-exit, repair, retry-posts, stale abort, fresh, and error outcomes
pin canonical gate summary enums and invalid-input behavior
cover the §3.3/§4.4 cross-product with table-driven tests plus a stdlib-only production import guard

Verification

rtk go test ./internal/gate
rtk go test ./...
rtk go test -race ./internal/gate
rtk make check

Closes #21

rianjs · 2026-05-31T18:58:05Z

Blocker
None.

Major

internal/gate/gate.go: PRStatePartial accepts PartialRun without verifying PartialRun.RunID == PR.RunID. The design rule is “dispatch by the marker run_id’s local row”; with the current API a caller bug can pass a different local row and the kernel will stop/abort/fresh on the wrong run. Add a consistency check for partial summaries and a regression test that a mismatched PR.RunID/PartialRun.RunID returns DecisionError with ErrorInvalidInput.

Minor

internal/gate/gate.go: Attempt is used for latest-run selection but is never validated. A zero or negative attempt can be selected if supplied in summaries, which weakens the “latest means highest attempt” contract. Reject non-positive attempts for selected/local run summaries and cover it in invalid-input tests.

Nit
None.

rianjs · 2026-05-31T19:01:25Z

Blocker
None.

Major
None.

Minor
None.

Nit
None.

The prior findings are resolved: partial marker reconciliation now rejects mismatched local run IDs, and non-positive attempts are validated with test coverage. I did not find new architectural drift or product-scope issues in the fix.

rianjs · 2026-05-31T19:03:37Z

Findings

Major: The suite never combines a resumable exact row with stale-base candidates or --retry-posts/--rerun inputs. The current resume coverage at gate_test.go:29, stale-base coverage at gate_test.go:118, and flag coverage at gate_test.go:161 and gate_test.go:180 each pass in isolation, so a regression that changes precedence between these branches could still slip through.
Minor: Stale-base behavior is only exercised with one candidate at a time, so multi-candidate aggregation and warning preservation on an abort_stale decision are still unpinned at gate_test.go:118.

monit-reviewer

Automated PR Review

Reviewed commit: d4abb0b

Summary

Reviewer	Findings
harness-engineering:harness-architecture-reviewer	3
harness-engineering:harness-enforcement-reviewer	4
harness-engineering:harness-self-documenting-code-reviewer	2

harness-engineering:harness-architecture-reviewer (3 findings)

💡 Suggestion - internal/gate/gate.go:265

Resume short-circuits before decideStaleBase is called, so stale-base candidates with LockStateHeld and HeartbeatStale=true never generate their warnings when a resumable exact run is present. A potentially stuck stale-base run goes silently unnoticed whenever a resume is available. The precedence rule is undocumented and untested.

💡 Suggestion - internal/gate/gate.go:318

decideStaleBase calls validateRunShape before the resumable(run) filter, so a non-resumable stale candidate with an invalid PostMode, non-positive Attempt, or invalid FailureClass returns DecisionError instead of being silently skipped. Callers supplying denormalized terminal/aborted rows from a broader query will hit unexpected errors. Either move validation after the resumable guard or document this as a strict-inputs contract and add a test covering an invalid non-resumable candidate.

💡 Suggestion - internal/gate/gate.go:426

validatePR does not check RunID for PRStateStaleBase (the case falls through to 'return Decision{}' without a RunID check), but decidePR echoes pr.RunID in the returned Decision for that state. Every other non-fresh PR state explicitly validates RunID. An empty RunID on a stale-base marker silently produces a Decision with RunID="", losing the audit reference the stale-base message explicitly documents.

harness-engineering:harness-enforcement-reviewer (4 findings)

⚠️ Should Fix - internal/gate/gate.go:242

DryRun is checked before the Rerun+RetryPosts mutual-exclusion guard, so passing DryRun=true combined with --rerun or --retry-posts silently returns DecisionFresh with no error, hiding an almost-certainly unintentional flag combination. The mutual-exclusion check covers only Rerun+RetryPosts. Add guards for DryRun+Rerun and DryRun+RetryPosts, or document that DryRun is always highest-priority and add tests for each silenced combination.

💡 Suggestion - internal/gate/gate.go:297

decideStaleBase only checks run.RunID == "" for resumable candidates; non-resumable stale-base entries with an empty RunID pass validateRunShape and are silently skipped without a validation error. This accepts garbage inputs inconsistently with the contract enforced by validateRun for ExactRuns. Lift the RunID check to apply before the resumable guard.

💡 Suggestion - internal/gate/gate.go:455

retryEligible selects by PostModeLive && requiredPending+requiredFailedTerminal > 0 without filtering by run state. A RunStateRunning row with non-zero pending counts satisfies the predicate; decideRetryPosts would then return DecisionRetryPosts targeting an actively-running row, risking duplicate post actions if the live run completes normally afterward. Consider excluding RunStateRunning and RunStateIncomplete from retry-posts eligibility, or add a test asserting a running row with pending counts is not selected.

💡 Suggestion - internal/gate/gate_test.go:29

No test combines a resumable exact row, stale-base candidates, and a non-trivial PR marker (e.g. PRStateCompleteReview or PRStatePartial) in the same request. The code returns on the first resumable hit before reaching stale-base or PR branches, so correctness is not in doubt, but there is no regression guard for the three-way precedence invariant. This was an open finding in the PR discussion (comment 3) that the current diff does not fully address.

harness-engineering:harness-self-documenting-code-reviewer (2 findings)

💡 Suggestion - internal/gate/gate.go:351

In the PRStatePartial switch, the explicit cases for RunStateRunning and RunStateIncomplete are unreachable: the preceding resumable(*partialRun) check already returns ErrorPartialResumableInconsistent for both states. The default branch is also unreachable because validateRun rejects unknown states before decidePR is reached. These dead arms misrepresent which states the switch actually handles.

💡 Suggestion - internal/gate/gate.go:387

decidePR for PRStateCompleteReview returns Decision{Kind: DecisionEarlyExit, RunID: pr.RunID} with no Outcome field, silently dropping any valid Outcome present on the marker. PRStateCompleteNoDiff propagates its outcome. Either forward the outcome in the EarlyExit decision or have validatePR reject a non-empty outcome for PRStateCompleteReview to make the contract explicit.

4 info-level observations excluded. Run with --verbose to include.

3 PR discussion threads considered.

Completed in 5m 42s | $1.75 | sonnet | daemon 0.2.121 | Glorfindel

Field	Value
Model	sonnet
Reviewers	hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine	claude · sonnet
Reviewed by	pr-review-daemon · monit-pr-reviewer
Duration	5m 42s wall · 14m 49s compute (Reviewers: 3m 37s · Synthesis: 2m 02s)
Cost	$1.75
Tokens	287.5k in / 56.3k out
Turns	10

Per-workstream usage

Workstream	Model	In	Out	Cache read	Cache create	Cost
hybrid-synthesis	sonnet	32.5k	8.1k	18.6k	13.8k (1h)	$0.19
harness-engineering:harness-architecture-reviewer	sonnet	54.4k	13.3k	6.9k	47.5k (1h)	$0.39
harness-engineering:harness-enforcement-reviewer	sonnet	58.0k	10.2k	6.9k	51.1k (1h)	$0.36
harness-engineering:harness-knowledge-reviewer	sonnet	60.5k	12.4k	6.9k	53.6k (1h)	$0.40
harness-engineering:harness-self-documenting-code-reviewer	sonnet	56.8k	8.7k	6.9k	49.9k (1h)	$0.33
security:security-code-auditor	haiku	25.4k	3.6k	0	25.4k (1h)	$0.07

_{Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.}

monit-reviewer · 2026-05-31T19:10:24Z

+	AbortStaleRunIDs []string
+	Warnings         []string
+	ErrorReason      ErrorReason
+	FailureClass     FailureClass


🟡 Medium (harness-engineering:harness-enforcement-reviewer): DryRun is checked before the Rerun+RetryPosts mutual-exclusion guard, so passing DryRun=true combined with --rerun or --retry-posts silently returns DecisionFresh with no error, hiding an almost-certainly unintentional flag combination. The mutual-exclusion check covers only Rerun+RetryPosts. Add guards for DryRun+Rerun and DryRun+RetryPosts, or document that DryRun is always highest-priority and add tests for each silenced combination.

Reply to this thread when addressed.

Fixed - reject dry-run with rerun or retry-posts.

monit-reviewer · 2026-05-31T19:10:24Z

+		return decision
+	}
+
+	if run, ok := latestMatchingRun(req.ExactRuns, resumable); ok {


🔵 Low (harness-engineering:harness-architecture-reviewer): Resume short-circuits before decideStaleBase is called, so stale-base candidates with LockStateHeld and HeartbeatStale=true never generate their warnings when a resumable exact run is present. A potentially stuck stale-base run goes silently unnoticed whenever a resume is available. The precedence rule is undocumented and untested.

Reply to this thread when addressed.

Fixed - added explicit resume precedence coverage.

monit-reviewer · 2026-05-31T19:10:24Z

+		return decision
+	}
+	var supersede []string
+	for _, run := range runs {


🔵 Low (harness-engineering:harness-enforcement-reviewer): decideStaleBase only checks run.RunID == "" for resumable candidates; non-resumable stale-base entries with an empty RunID pass validateRunShape and are silently skipped without a validation error. This accepts garbage inputs inconsistently with the contract enforced by validateRun for ExactRuns. Lift the RunID check to apply before the resumable guard.

Reply to this thread when addressed.

Fixed - stale candidates now require run IDs.

monit-reviewer · 2026-05-31T19:10:24Z

+		return errorDecision(ErrorRetryPostsIneligible, "no live run has required pending or failed_terminal actions")
+	}
+	return Decision{Kind: DecisionRetryPosts, RunID: run.RunID}
+}


🔵 Low (harness-engineering:harness-architecture-reviewer): decideStaleBase calls validateRunShape before the resumable(run) filter, so a non-resumable stale candidate with an invalid PostMode, non-positive Attempt, or invalid FailureClass returns DecisionError instead of being silently skipped. Callers supplying denormalized terminal/aborted rows from a broader query will hit unexpected errors. Either move validation after the resumable guard or document this as a strict-inputs contract and add a test covering an invalid non-resumable candidate.

Reply to this thread when addressed.

Fixed - strict stale candidate validation is tested.

monit-reviewer · 2026-05-31T19:10:24Z

+	}
+	return warnings, Decision{}
+}
+


🔵 Low (harness-engineering:harness-self-documenting-code-reviewer): In the PRStatePartial switch, the explicit cases for RunStateRunning and RunStateIncomplete are unreachable: the preceding resumable(*partialRun) check already returns ErrorPartialResumableInconsistent for both states. The default branch is also unreachable because validateRun rejects unknown states before decidePR is reached. These dead arms misrepresent which states the switch actually handles.

Reply to this thread when addressed.

Fixed - simplified partial-state dispatch.

monit-reviewer · 2026-05-31T19:10:24Z

+			return invalidInput(fmt.Sprintf("partial marker local row %q has unsupported state %q", partialRun.RunID, partialRun.State))
+		}
+	default:
+		return invalidInput(fmt.Sprintf("invalid PR state %q", pr.State))


🔵 Low (harness-engineering:harness-self-documenting-code-reviewer): decidePR for PRStateCompleteReview returns Decision{Kind: DecisionEarlyExit, RunID: pr.RunID} with no Outcome field, silently dropping any valid Outcome present on the marker. PRStateCompleteNoDiff propagates its outcome. Either forward the outcome in the EarlyExit decision or have validatePR reject a non-empty outcome for PRStateCompleteReview to make the contract explicit.

Reply to this thread when addressed.

Fixed - reject outcomes on complete-review markers.

monit-reviewer · 2026-05-31T19:10:24Z

+	if !run.FailureClass.Valid() {
+		return invalidInput(fmt.Sprintf("run %q has invalid failure class %q", run.RunID, run.FailureClass))
+	}
+	return Decision{}


🔵 Low (harness-engineering:harness-architecture-reviewer): validatePR does not check RunID for PRStateStaleBase (the case falls through to 'return Decision{}' without a RunID check), but decidePR echoes pr.RunID in the returned Decision for that state. Every other non-fresh PR state explicitly validates RunID. An empty RunID on a stale-base marker silently produces a Decision with RunID="", losing the audit reference the stale-base message explicitly documents.

Reply to this thread when addressed.

Fixed - stale-base PR markers require run IDs.

monit-reviewer · 2026-05-31T19:10:24Z

+		}
+		if !pr.Outcome.RealVerdict() {
+			return invalidInput("partial marker must carry a real verdict outcome")
+		}


🔵 Low (harness-engineering:harness-enforcement-reviewer): retryEligible selects by PostModeLive && requiredPending+requiredFailedTerminal > 0 without filtering by run state. A RunStateRunning row with non-zero pending counts satisfies the predicate; decideRetryPosts would then return DecisionRetryPosts targeting an actively-running row, risking duplicate post actions if the live run completes normally afterward. Consider excluding RunStateRunning and RunStateIncomplete from retry-posts eligibility, or add a test asserting a running row with pending counts is not selected.

Reply to this thread when addressed.

Fixed - retry-posts excludes resumable rows.

monit-reviewer · 2026-05-31T19:10:24Z

+			want: Decision{Kind: DecisionRepair, RunID: "run-partial", Outcome: PROutcomeApproved},
+		},
+		{
+			name: "resumable exact row wins before complete PR marker",


🔵 Low (harness-engineering:harness-enforcement-reviewer): No test combines a resumable exact row, stale-base candidates, and a non-trivial PR marker (e.g. PRStateCompleteReview or PRStatePartial) in the same request. The code returns on the first resumable hit before reaching stale-base or PR branches, so correctness is not in doubt, but there is no regression guard for the three-way precedence invariant. This was an open finding in the PR discussion (comment 3) that the current diff does not fully address.

Reply to this thread when addressed.

Fixed - added three-way precedence coverage.

monit-reviewer

Automated PR Review

Reviewed commit: ffb4ff8 | Previous: d4abb0b (incremental)

Summary

No issues found.

12 PR discussion threads considered.

Completed in 1m 04s | $0.99 | sonnet | daemon 0.2.121 | Glorfindel

Field	Value
Model	sonnet
Mode	Re-review · Cycle 2 · Session resumed
Reviewers	hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine	claude · sonnet
Reviewed by	pr-review-daemon · monit-pr-reviewer
Duration	1m 04s wall · 1m 27s compute (Reviewers: 25s · Synthesis: 19s)
Cost	$0.99
Tokens	328.4k in / 5.4k out
Turns	6

Per-workstream usage

Workstream	Model	In	Out	Cache read	Cache create	Cost
hybrid-synthesis	sonnet	45.1k	1.1k	18.6k	26.5k (1h)	$0.12
harness-engineering:harness-architecture-reviewer	sonnet	54.6k	152	4.0k	50.6k (1h)	$0.19
harness-engineering:harness-enforcement-reviewer	sonnet	51.6k	145	4.0k	47.5k (1h)	$0.18
harness-engineering:harness-knowledge-reviewer	sonnet	53.7k	136	4.0k	49.7k (1h)	$0.19
harness-engineering:harness-self-documenting-code-reviewer	sonnet	50.1k	164	4.0k	46.0k (1h)	$0.18
security:security-code-auditor	haiku	45.6k	2.7k	0	45.6k (1h)	$0.07
discussion-summarizer	—	27.7k	945	18.6k	9.1k (1h)	$0.06

_{Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.}

feat(gate): add pure decision kernel

590cff5

Closes #21

fix(gate): validate partial run summaries

2079530

test(gate): cover decision precedence combinations

d4abb0b

monit-reviewer suggested changes May 31, 2026

View reviewed changes

fix(gate): tighten daemon review contracts

ffb4ff8

monit-reviewer approved these changes May 31, 2026

View reviewed changes

fix(gate): restore retry-post plan semantics

5cb391e

rianjs merged commit ef77a8e into main May 31, 2026
8 checks passed

rianjs deleted the feat/cr-14a-gate-kernel branch May 31, 2026 19:23

Conversation

rianjs commented May 31, 2026

Summary

Verification

Uh oh!

rianjs commented May 31, 2026

Uh oh!

rianjs commented May 31, 2026

Uh oh!

rianjs commented May 31, 2026

Uh oh!

monit-reviewer left a comment

Choose a reason for hiding this comment

Automated PR Review

Summary

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

monit-reviewer left a comment

Choose a reason for hiding this comment

Automated PR Review

Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants