Skip to content

feat(gate): add pure decision kernel#47

Merged
rianjs merged 5 commits into
mainfrom
feat/cr-14a-gate-kernel
May 31, 2026
Merged

feat(gate): add pure decision kernel#47
rianjs merged 5 commits into
mainfrom
feat/cr-14a-gate-kernel

Conversation

@rianjs
Copy link
Copy Markdown
Contributor

@rianjs rianjs commented May 31, 2026

Summary

  • add a pure internal/gate decision kernel for resume, early-exit, repair, retry-posts, stale abort, fresh, and error outcomes
  • pin canonical gate summary enums and invalid-input behavior
  • cover the §3.3/§4.4 cross-product with table-driven tests plus a stdlib-only production import guard

Verification

  • rtk go test ./internal/gate
  • rtk go test ./...
  • rtk go test -race ./internal/gate
  • rtk make check

Closes #21

@rianjs
Copy link
Copy Markdown
Contributor Author

rianjs commented May 31, 2026

Blocker
None.

Major

  • internal/gate/gate.go: PRStatePartial accepts PartialRun without verifying PartialRun.RunID == PR.RunID. The design rule is “dispatch by the marker run_id’s local row”; with the current API a caller bug can pass a different local row and the kernel will stop/abort/fresh on the wrong run. Add a consistency check for partial summaries and a regression test that a mismatched PR.RunID/PartialRun.RunID returns DecisionError with ErrorInvalidInput.

Minor

  • internal/gate/gate.go: Attempt is used for latest-run selection but is never validated. A zero or negative attempt can be selected if supplied in summaries, which weakens the “latest means highest attempt” contract. Reject non-positive attempts for selected/local run summaries and cover it in invalid-input tests.

Nit
None.

@rianjs
Copy link
Copy Markdown
Contributor Author

rianjs commented May 31, 2026

Blocker
None.

Major
None.

Minor
None.

Nit
None.

The prior findings are resolved: partial marker reconciliation now rejects mismatched local run IDs, and non-positive attempts are validated with test coverage. I did not find new architectural drift or product-scope issues in the fix.

@rianjs
Copy link
Copy Markdown
Contributor Author

rianjs commented May 31, 2026

Findings

  • Major: The suite never combines a resumable exact row with stale-base candidates or --retry-posts/--rerun inputs. The current resume coverage at gate_test.go:29, stale-base coverage at gate_test.go:118, and flag coverage at gate_test.go:161 and gate_test.go:180 each pass in isolation, so a regression that changes precedence between these branches could still slip through.
  • Minor: Stale-base behavior is only exercised with one candidate at a time, so multi-candidate aggregation and warning preservation on an abort_stale decision are still unpinned at gate_test.go:118.

Copy link
Copy Markdown

@monit-reviewer monit-reviewer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR Review

Reviewed commit: d4abb0b

Summary

Reviewer Findings
harness-engineering:harness-architecture-reviewer 3
harness-engineering:harness-enforcement-reviewer 4
harness-engineering:harness-self-documenting-code-reviewer 2
harness-engineering:harness-architecture-reviewer (3 findings)

💡 Suggestion - internal/gate/gate.go:265

Resume short-circuits before decideStaleBase is called, so stale-base candidates with LockStateHeld and HeartbeatStale=true never generate their warnings when a resumable exact run is present. A potentially stuck stale-base run goes silently unnoticed whenever a resume is available. The precedence rule is undocumented and untested.

💡 Suggestion - internal/gate/gate.go:318

decideStaleBase calls validateRunShape before the resumable(run) filter, so a non-resumable stale candidate with an invalid PostMode, non-positive Attempt, or invalid FailureClass returns DecisionError instead of being silently skipped. Callers supplying denormalized terminal/aborted rows from a broader query will hit unexpected errors. Either move validation after the resumable guard or document this as a strict-inputs contract and add a test covering an invalid non-resumable candidate.

💡 Suggestion - internal/gate/gate.go:426

validatePR does not check RunID for PRStateStaleBase (the case falls through to 'return Decision{}' without a RunID check), but decidePR echoes pr.RunID in the returned Decision for that state. Every other non-fresh PR state explicitly validates RunID. An empty RunID on a stale-base marker silently produces a Decision with RunID="", losing the audit reference the stale-base message explicitly documents.

harness-engineering:harness-enforcement-reviewer (4 findings)

⚠️ Should Fix - internal/gate/gate.go:242

DryRun is checked before the Rerun+RetryPosts mutual-exclusion guard, so passing DryRun=true combined with --rerun or --retry-posts silently returns DecisionFresh with no error, hiding an almost-certainly unintentional flag combination. The mutual-exclusion check covers only Rerun+RetryPosts. Add guards for DryRun+Rerun and DryRun+RetryPosts, or document that DryRun is always highest-priority and add tests for each silenced combination.

💡 Suggestion - internal/gate/gate.go:297

decideStaleBase only checks run.RunID == "" for resumable candidates; non-resumable stale-base entries with an empty RunID pass validateRunShape and are silently skipped without a validation error. This accepts garbage inputs inconsistently with the contract enforced by validateRun for ExactRuns. Lift the RunID check to apply before the resumable guard.

💡 Suggestion - internal/gate/gate.go:455

retryEligible selects by PostModeLive && requiredPending+requiredFailedTerminal > 0 without filtering by run state. A RunStateRunning row with non-zero pending counts satisfies the predicate; decideRetryPosts would then return DecisionRetryPosts targeting an actively-running row, risking duplicate post actions if the live run completes normally afterward. Consider excluding RunStateRunning and RunStateIncomplete from retry-posts eligibility, or add a test asserting a running row with pending counts is not selected.

💡 Suggestion - internal/gate/gate_test.go:29

No test combines a resumable exact row, stale-base candidates, and a non-trivial PR marker (e.g. PRStateCompleteReview or PRStatePartial) in the same request. The code returns on the first resumable hit before reaching stale-base or PR branches, so correctness is not in doubt, but there is no regression guard for the three-way precedence invariant. This was an open finding in the PR discussion (comment 3) that the current diff does not fully address.

harness-engineering:harness-self-documenting-code-reviewer (2 findings)

💡 Suggestion - internal/gate/gate.go:351

In the PRStatePartial switch, the explicit cases for RunStateRunning and RunStateIncomplete are unreachable: the preceding resumable(*partialRun) check already returns ErrorPartialResumableInconsistent for both states. The default branch is also unreachable because validateRun rejects unknown states before decidePR is reached. These dead arms misrepresent which states the switch actually handles.

💡 Suggestion - internal/gate/gate.go:387

decidePR for PRStateCompleteReview returns Decision{Kind: DecisionEarlyExit, RunID: pr.RunID} with no Outcome field, silently dropping any valid Outcome present on the marker. PRStateCompleteNoDiff propagates its outcome. Either forward the outcome in the EarlyExit decision or have validatePR reject a non-empty outcome for PRStateCompleteReview to make the contract explicit.

4 info-level observations excluded. Run with --verbose to include.

3 PR discussion threads considered.


Completed in 5m 42s | $1.75 | sonnet | daemon 0.2.121 | Glorfindel
Field Value
Model sonnet
Reviewers hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine claude · sonnet
Reviewed by pr-review-daemon · monit-pr-reviewer
Duration 5m 42s wall · 14m 49s compute (Reviewers: 3m 37s · Synthesis: 2m 02s)
Cost $1.75
Tokens 287.5k in / 56.3k out
Turns 10

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost
hybrid-synthesis sonnet 32.5k 8.1k 18.6k 13.8k (1h) $0.19
harness-engineering:harness-architecture-reviewer sonnet 54.4k 13.3k 6.9k 47.5k (1h) $0.39
harness-engineering:harness-enforcement-reviewer sonnet 58.0k 10.2k 6.9k 51.1k (1h) $0.36
harness-engineering:harness-knowledge-reviewer sonnet 60.5k 12.4k 6.9k 53.6k (1h) $0.40
harness-engineering:harness-self-documenting-code-reviewer sonnet 56.8k 8.7k 6.9k 49.9k (1h) $0.33
security:security-code-auditor haiku 25.4k 3.6k 0 25.4k (1h) $0.07

Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.

Comment thread internal/gate/gate.go
AbortStaleRunIDs []string
Warnings []string
ErrorReason ErrorReason
FailureClass FailureClass
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium (harness-engineering:harness-enforcement-reviewer): DryRun is checked before the Rerun+RetryPosts mutual-exclusion guard, so passing DryRun=true combined with --rerun or --retry-posts silently returns DecisionFresh with no error, hiding an almost-certainly unintentional flag combination. The mutual-exclusion check covers only Rerun+RetryPosts. Add guards for DryRun+Rerun and DryRun+RetryPosts, or document that DryRun is always highest-priority and add tests for each silenced combination.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - reject dry-run with rerun or retry-posts.

Comment thread internal/gate/gate.go
return decision
}

if run, ok := latestMatchingRun(req.ExactRuns, resumable); ok {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-architecture-reviewer): Resume short-circuits before decideStaleBase is called, so stale-base candidates with LockStateHeld and HeartbeatStale=true never generate their warnings when a resumable exact run is present. A potentially stuck stale-base run goes silently unnoticed whenever a resume is available. The precedence rule is undocumented and untested.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - added explicit resume precedence coverage.

Comment thread internal/gate/gate.go
return decision
}
var supersede []string
for _, run := range runs {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-enforcement-reviewer): decideStaleBase only checks run.RunID == "" for resumable candidates; non-resumable stale-base entries with an empty RunID pass validateRunShape and are silently skipped without a validation error. This accepts garbage inputs inconsistently with the contract enforced by validateRun for ExactRuns. Lift the RunID check to apply before the resumable guard.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - stale candidates now require run IDs.

Comment thread internal/gate/gate.go
return errorDecision(ErrorRetryPostsIneligible, "no live run has required pending or failed_terminal actions")
}
return Decision{Kind: DecisionRetryPosts, RunID: run.RunID}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-architecture-reviewer): decideStaleBase calls validateRunShape before the resumable(run) filter, so a non-resumable stale candidate with an invalid PostMode, non-positive Attempt, or invalid FailureClass returns DecisionError instead of being silently skipped. Callers supplying denormalized terminal/aborted rows from a broader query will hit unexpected errors. Either move validation after the resumable guard or document this as a strict-inputs contract and add a test covering an invalid non-resumable candidate.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - strict stale candidate validation is tested.

Comment thread internal/gate/gate.go
}
return warnings, Decision{}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-self-documenting-code-reviewer): In the PRStatePartial switch, the explicit cases for RunStateRunning and RunStateIncomplete are unreachable: the preceding resumable(*partialRun) check already returns ErrorPartialResumableInconsistent for both states. The default branch is also unreachable because validateRun rejects unknown states before decidePR is reached. These dead arms misrepresent which states the switch actually handles.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - simplified partial-state dispatch.

Comment thread internal/gate/gate.go
return invalidInput(fmt.Sprintf("partial marker local row %q has unsupported state %q", partialRun.RunID, partialRun.State))
}
default:
return invalidInput(fmt.Sprintf("invalid PR state %q", pr.State))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-self-documenting-code-reviewer): decidePR for PRStateCompleteReview returns Decision{Kind: DecisionEarlyExit, RunID: pr.RunID} with no Outcome field, silently dropping any valid Outcome present on the marker. PRStateCompleteNoDiff propagates its outcome. Either forward the outcome in the EarlyExit decision or have validatePR reject a non-empty outcome for PRStateCompleteReview to make the contract explicit.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - reject outcomes on complete-review markers.

Comment thread internal/gate/gate.go
if !run.FailureClass.Valid() {
return invalidInput(fmt.Sprintf("run %q has invalid failure class %q", run.RunID, run.FailureClass))
}
return Decision{}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-architecture-reviewer): validatePR does not check RunID for PRStateStaleBase (the case falls through to 'return Decision{}' without a RunID check), but decidePR echoes pr.RunID in the returned Decision for that state. Every other non-fresh PR state explicitly validates RunID. An empty RunID on a stale-base marker silently produces a Decision with RunID="", losing the audit reference the stale-base message explicitly documents.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - stale-base PR markers require run IDs.

Comment thread internal/gate/gate.go
}
if !pr.Outcome.RealVerdict() {
return invalidInput("partial marker must carry a real verdict outcome")
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-enforcement-reviewer): retryEligible selects by PostModeLive && requiredPending+requiredFailedTerminal > 0 without filtering by run state. A RunStateRunning row with non-zero pending counts satisfies the predicate; decideRetryPosts would then return DecisionRetryPosts targeting an actively-running row, risking duplicate post actions if the live run completes normally afterward. Consider excluding RunStateRunning and RunStateIncomplete from retry-posts eligibility, or add a test asserting a running row with pending counts is not selected.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - retry-posts excludes resumable rows.

want: Decision{Kind: DecisionRepair, RunID: "run-partial", Outcome: PROutcomeApproved},
},
{
name: "resumable exact row wins before complete PR marker",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-enforcement-reviewer): No test combines a resumable exact row, stale-base candidates, and a non-trivial PR marker (e.g. PRStateCompleteReview or PRStatePartial) in the same request. The code returns on the first resumable hit before reaching stale-base or PR branches, so correctness is not in doubt, but there is no regression guard for the three-way precedence invariant. This was an open finding in the PR discussion (comment 3) that the current diff does not fully address.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - added three-way precedence coverage.

Copy link
Copy Markdown

@monit-reviewer monit-reviewer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR Review

Reviewed commit: ffb4ff8 | Previous: d4abb0b (incremental)

Summary

No issues found.

12 PR discussion threads considered.


Completed in 1m 04s | $0.99 | sonnet | daemon 0.2.121 | Glorfindel
Field Value
Model sonnet
Mode Re-review · Cycle 2 · Session resumed
Reviewers hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine claude · sonnet
Reviewed by pr-review-daemon · monit-pr-reviewer
Duration 1m 04s wall · 1m 27s compute (Reviewers: 25s · Synthesis: 19s)
Cost $0.99
Tokens 328.4k in / 5.4k out
Turns 6

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost
hybrid-synthesis sonnet 45.1k 1.1k 18.6k 26.5k (1h) $0.12
harness-engineering:harness-architecture-reviewer sonnet 54.6k 152 4.0k 50.6k (1h) $0.19
harness-engineering:harness-enforcement-reviewer sonnet 51.6k 145 4.0k 47.5k (1h) $0.18
harness-engineering:harness-knowledge-reviewer sonnet 53.7k 136 4.0k 49.7k (1h) $0.19
harness-engineering:harness-self-documenting-code-reviewer sonnet 50.1k 164 4.0k 46.0k (1h) $0.18
security:security-code-auditor haiku 45.6k 2.7k 0 45.6k (1h) $0.07
discussion-summarizer 27.7k 945 18.6k 9.1k (1h) $0.06

Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.

@rianjs rianjs merged commit ef77a8e into main May 31, 2026
8 checks passed
@rianjs rianjs deleted the feat/cr-14a-gate-kernel branch May 31, 2026 19:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CR-14a: gate **decision kernel** (pure)

2 participants