feat(gate): add gate IO integration by rianjs · Pull Request #48 · open-cli-collective/codereview-cli

rianjs · 2026-05-31T20:03:55Z

Summary

add gate IO orchestration that maps ledger/provider state into the pure gate kernel
add lock-authoritative stale-base cleanup, live fresh allocation, local resume, early exit, and base-moved abort handling
add scoped ledger lookup and reusable lock path helpers for resume-key integration

Tests

go test ./internal/gateio
go test ./internal/ledger
go test ./internal/statepaths
go test ./...
go test -race ./internal/gateio
make check

Closes #22

rianjs · 2026-05-31T20:09:34Z

Blocker
None.

Major

internal/gateio/gateio.go:128 / internal/gateio/gateio.go:241: Evaluate eagerly builds the full gate state before calling the pure kernel, including stale-base lock probes and provider marker reads. That means exact local resume, --rerun, and --retry-posts decisions can be blocked by stale-lock acquisition errors or ListIssueComments / ListReviews failures even though CR-14a gives those decisions precedence over stale/PR state. This weakens the local-resume regression guarantee. Add a test with an exact resumable row plus provider read failure/stale lock acquisition failure, and restructure to decide/short-circuit local-only precedence before PR/stale IO where possible.
internal/gateio/gateio.go:64 / internal/gateio/gateio.go:496: Request accepts PRRef, PR, and PRKey, but validation only checks that each is shaped enough to use. A mismatched PRKey or PR.Ref can make gate IO acquire a lock and scan markers for one PR while listing/allocating ledger rows under another PR key. Since this is a data-integrity boundary, derive PRKey from PRRef internally or validate PRKey and PR.Ref are consistent with PRRef.
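
The second finding can be sketched as follows. This is a minimal illustration, not the package's code: the PRRef, PullRequest, and Request shapes are hypothetical stand-ins, and the "owner/repo#number" key format is an assumption made only so the example runs.

```go
package main

import "fmt"

// PRRef is a hypothetical stand-in for the real reference type.
type PRRef struct {
	Owner  string
	Repo   string
	Number int
}

// Key derives a ledger key from the ref. The "owner/repo#number" format is
// assumed for illustration only.
func (r PRRef) Key() string {
	return fmt.Sprintf("%s/%s#%d", r.Owner, r.Repo, r.Number)
}

// PullRequest is a hypothetical, trimmed-down provider PR value.
type PullRequest struct {
	Ref PRRef
}

// Request mirrors the three fields named in the review comment.
type Request struct {
	PRRef PRRef
	PR    PullRequest
	PRKey string // optional; must equal PRRef.Key() when set
}

// normalizeRequest rejects mismatched identities and derives PRKey from PRRef,
// so locks, marker scans, and ledger rows all refer to the same PR.
func normalizeRequest(req Request) (Request, error) {
	want := req.PRRef.Key()
	if req.PR.Ref != req.PRRef {
		return Request{}, fmt.Errorf("PR.Ref %s does not match PRRef %s", req.PR.Ref.Key(), want)
	}
	if req.PRKey != "" && req.PRKey != want {
		return Request{}, fmt.Errorf("PRKey %q does not match PRRef %s", req.PRKey, want)
	}
	req.PRKey = want
	return req, nil
}

func main() {
	ref := PRRef{Owner: "acme", Repo: "tool", Number: 7}
	good, err := normalizeRequest(Request{PRRef: ref, PR: PullRequest{Ref: ref}})
	fmt.Println(good.PRKey, err)

	_, err = normalizeRequest(Request{PRRef: ref, PR: PullRequest{Ref: ref}, PRKey: "acme/other#7"})
	fmt.Println(err)
}
```

Deriving the key once at the boundary means later code never has to choose between two possibly conflicting identities.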

Minor

The implementation is missing material plan-pinned tests for subtle marker precedence and mapping behavior: mixed stale/current markers, paired real-verdict rollup plus same-run submit review, retry-posts unsupported classification, normal dry-run fresh behavior, and full ledger-outcome-to-gate-state mapping. The code appears to implement several of these, but they are exactly the cases most likely to regress when CR-14c wires repair/retry execution.

Nit
None.

rianjs · 2026-05-31T20:36:11Z

Blocker
None.

Major
None.

Minor
None.

Nit
None.

rianjs · 2026-05-31T20:46:38Z

Findings

Blocker: TestAbortIfBaseMoved only exercises the helper in isolation (internal/gateio/gateio_test.go:412). It does not prove that the actual gate IO flow calls this logic, so the PR could still ship with base-moved abort missing from Evaluate and all tests would pass.
Major: TestEvaluateStaleBaseLockAuthority (internal/gateio/gateio_test.go:235) only checks whether the stale run ended up aborted and whether a warning substring was emitted. It does not assert the final Evaluate status/decision after the cleanup/retry loop, so an implementation that aborts the stale run but returns the wrong terminal result would still pass.
Major: TestEvaluateLocalResumeSkipsExternalFailures and TestEvaluateRerunSupersedesResumable (internal/gateio/gateio_test.go:63 and internal/gateio/gateio_test.go:282) inject provider errors, but they do not prove the provider was never consulted. A buggy implementation that still hits provider IO and ignores or absorbs those errors could still satisfy the assertions, so the local-resume and --rerun regression guards are weaker than the ticket intent calls for.
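
One way to close the last gap is a test double that records every call, so a test can assert on the call count rather than on the absence of an error. The sketch below is illustrative only: the Provider interface is trimmed to the two method names mentioned in this PR with assumed signatures, and evaluate is a hypothetical stand-in for the real flow.

```go
package main

import (
	"errors"
	"fmt"
)

// Provider is a hypothetical, trimmed-down interface; the real signatures
// are not shown in this PR and are assumed here.
type Provider interface {
	ListIssueComments() ([]string, error)
	ListReviews() ([]string, error)
}

// recordingProvider fails on demand and remembers every method it was asked for.
type recordingProvider struct {
	fail  bool
	calls []string
}

func (p *recordingProvider) record(name string) ([]string, error) {
	p.calls = append(p.calls, name)
	if p.fail {
		return nil, errors.New(name + ": provider unavailable")
	}
	return nil, nil
}

func (p *recordingProvider) ListIssueComments() ([]string, error) {
	return p.record("ListIssueComments")
}

func (p *recordingProvider) ListReviews() ([]string, error) {
	return p.record("ListReviews")
}

// evaluate stands in for the flow under test: a local resume decision
// short-circuits before any provider IO happens.
func evaluate(p Provider, hasExactResumable bool) (string, error) {
	if hasExactResumable {
		return "resume", nil
	}
	if _, err := p.ListIssueComments(); err != nil {
		return "", err
	}
	if _, err := p.ListReviews(); err != nil {
		return "", err
	}
	return "fresh", nil
}

func main() {
	p := &recordingProvider{fail: true}
	decision, err := evaluate(p, true)
	// A strict test asserts len(p.calls) == 0, not merely err == nil.
	fmt.Println(decision, err, len(p.calls))
}
```

An implementation that calls the provider and swallows the error would still return "resume" here, but it would leave a non-empty calls slice and fail the stricter assertion.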

monit-reviewer

Automated PR Review

Reviewed commit: 4ca580b

Summary

Reviewer	Findings
harness-engineering:harness-architecture-reviewer	3
harness-engineering:harness-enforcement-reviewer	4
harness-engineering:harness-knowledge-reviewer	2

harness-engineering:harness-architecture-reviewer (3 findings)

⚠️ Should Fix - internal/gateio/gateio.go:496

validateOptions checks Store, Provider, and StaleHeartbeatThreshold but not opts.Layout. A zero-value Layout (empty DataRoot) passes validation silently and causes currentLockPath / lockPathForRun to produce bare relative paths resolved against the process CWD, undermining the lock-authority guarantee the entire package depends on. Add a Layout.DataRoot != "" check to surface misconfiguration at the API boundary instead of deep in path helpers.
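
A minimal sketch of the suggested check, assuming a Layout type with only a DataRoot field and a hypothetical lockPath helper (the "locks" directory name is invented for the example). It also shows why the empty value is dangerous: filepath.Join drops empty elements, so the result silently becomes a relative path.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// Layout is a hypothetical stand-in with only the field named in the finding.
type Layout struct {
	DataRoot string
}

var errMissingDataRoot = errors.New("gateio: layout data root is required")

// validateLayout surfaces a zero-value Layout at the API boundary.
func validateLayout(l Layout) error {
	if l.DataRoot == "" {
		return errMissingDataRoot
	}
	return nil
}

// lockPath is an assumed helper. With an empty DataRoot, filepath.Join
// ignores the empty element and returns a path relative to the process CWD.
func lockPath(l Layout, name string) string {
	return filepath.Join(l.DataRoot, "locks", name+".lock")
}

func main() {
	var zero Layout
	fmt.Println(filepath.IsAbs(lockPath(zero, "pr-48")))
	fmt.Println(validateLayout(zero))
}
```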

⚠️ Should Fix - internal/gateio/gateio_test.go:432

TestEvaluateAbortsIfBaseMoved only covers the DecisionFresh path (no prior runs, so kernel allocates fresh then aborts). The DecisionResume branch in executeDecision also calls AbortIfBaseMoved, but no test exercises the case where a resumable exact-scope run exists and the PR base has since moved. A regression in that branch would not be caught by the current suite.

💡 Suggestion - internal/gateio/gateio.go:149

AbortIfBaseMoved is exported for standalone use but delegates to validateOptions, which requires opts.StaleHeartbeatThreshold > 0 — a field AbortIfBaseMoved never reads. Callers using the standalone path must supply a meaningless positive threshold or receive a confusing validation error. Consider a narrower validateOptionsForAbort or omitting threshold validation from AbortIfBaseMoved.
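
The narrower validation could look roughly like this. The Options fields are reduced to the three named in the finding, their types are assumptions, and validateAbortOptions is a hypothetical name for the smaller check.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Options is a hypothetical reduction to the three fields named above.
type Options struct {
	Store                   any
	Provider                any
	StaleHeartbeatThreshold time.Duration
}

// validateAbortOptions checks only what a base-moved abort actually reads.
func validateAbortOptions(o Options) error {
	if o.Store == nil {
		return errors.New("store is required")
	}
	if o.Provider == nil {
		return errors.New("provider is required")
	}
	return nil
}

// validateOptions layers the threshold check on top for the full gate flow.
func validateOptions(o Options) error {
	if err := validateAbortOptions(o); err != nil {
		return err
	}
	if o.StaleHeartbeatThreshold <= 0 {
		return errors.New("stale heartbeat threshold must be positive")
	}
	return nil
}

func main() {
	opts := Options{Store: "store", Provider: "provider"} // no threshold set
	fmt.Println(validateAbortOptions(opts))
	fmt.Println(validateOptions(opts))
}
```

Composing the full validator from the narrow one keeps the shared checks in a single place.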

harness-engineering:harness-enforcement-reviewer (4 findings)

⚠️ Should Fix - internal/gateio/gateio.go:118

The for {} retry loop in Evaluate has no iteration bound. If the gate kernel re-emits DecisionAbortStale after all stale runs have been aborted (e.g., aborted rows re-read before transaction commits), or if every stale lock is held by another process so abortStaleRuns skips all candidates and state is rebuilt unchanged, the loop runs indefinitely. Context cancellation propagates through store calls but is not checked at the loop head. Add a max-iterations guard (e.g., if iterations > len(state.staleRuns)+2 { return error }) to make liveness failure observable.

⚠️ Should Fix - internal/gateio/gateio.go:348

abortStaleRuns silently skips any run ID in decision.AbortStaleRunIDs that is absent from state.staleLocks. If buildLocalState and attachExternalState disagree on which runs are stale, or the kernel emits an ID that was never probed, the abort is silently dropped and the retry loop re-enters with identical state — making no progress. Either return an error when a requested abort target is not in staleLocks, or document and test the invariant that kernel IDs are always a subset of probed candidates.
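
The first option, returning an error for an unprobed abort target, can be sketched like this. The staleLock type and locksForAbort helper are hypothetical; only the idea of matching AbortStaleRunIDs against the probed lock set comes from the finding.

```go
package main

import "fmt"

// staleLock is a hypothetical handle for a lock acquired during probing.
type staleLock struct {
	Path string
}

// locksForAbort resolves every requested run ID to a held lock and fails
// loudly when an ID was never probed, instead of dropping it silently.
func locksForAbort(abortIDs []string, held map[string]*staleLock) ([]*staleLock, error) {
	out := make([]*staleLock, 0, len(abortIDs))
	for _, id := range abortIDs {
		l, ok := held[id]
		if !ok {
			return nil, fmt.Errorf("abort target %q has no probed stale lock", id)
		}
		out = append(out, l)
	}
	return out, nil
}

func main() {
	held := map[string]*staleLock{"run-1": {Path: "run-1.lock"}}
	_, err := locksForAbort([]string{"run-1", "run-2"}, held)
	fmt.Println(err)
}
```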

⚠️ Should Fix - internal/gateio/gateio_test.go:235

TestEvaluateStaleBaseLockAuthority asserts result.Decision.Kind == gate.DecisionFresh and result.Run.RunID != "" but does not verify (a) that the returned run's BaseSHA equals the current testBaseSHA rather than the stale one, or (b) that exactly one new run was allocated. An implementation that returns the stale run itself or allocates zero new runs would still pass. Add assertions that result.Run.RunID != staleRun.RunID and that the run count increases by exactly one for the 'free lock aborts' case.

💡 Suggestion - internal/gateio/gateio.go:619

newestSubmit unconditionally dereferences selected at the end (return *selected), which panics if the map is empty. The only call site guards with len(submits) > 0, so there is no current reachable crash, but the function itself is unsafe to call directly. Either return (markerRecord, bool) to make the empty case part of the contract, or add an explicit named panic guard so the precondition is visible at the function boundary.

harness-engineering:harness-knowledge-reviewer (2 findings)

⚠️ Should Fix - internal/gateio/gateio.go:183

buildLocalState calls summarizeRun (which calls Store.ListPlannedActions) for every run including stale-base runs before the localDecisionApplies short-circuit is evaluated. A store failure on ListPlannedActions for a stale run blocks local resume, --rerun, and --retry-posts decisions even though those have local precedence over stale/PR state. The test TestEvaluateLocalResumeSkipsExternalFailures covers provider failures but not store failures on stale run summaries. Defer stale run summarization to attachExternalState, or add a test with a store error on a stale run's planned actions to verify local resume still proceeds.

⚠️ Should Fix - internal/ledger/ledger.go:668

ListRunsForHeadScope's SELECT omits pr_url. If the shared scanRun function scans a pr_url column (as GetRun's SELECT presumably includes given AllocateRunParams carries PRURL), this query will fail at runtime with a column count mismatch. Confirm scanRun's column list matches this SELECT exactly; if pr_url is needed on returned Run values, add it to the SELECT.

3 info-level observations excluded. Run with --verbose to include.

3 PR discussion threads considered.

Completed in 5m 12s | $1.70 | sonnet | daemon 0.2.121 | Glorfindel

Field	Value
Model	sonnet
Reviewers	hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine	claude · sonnet
Reviewed by	pr-review-daemon · monit-pr-reviewer
Duration	5m 12s wall · 15m 00s compute (Reviewers: 3m 44s · Synthesis: 1m 25s)
Cost	$1.70
Tokens	446.8k in / 52.8k out
Turns	11

Per-workstream usage

Workstream	Model	In	Out	Cache read	Cache create	Cost
hybrid-synthesis	sonnet	33.8k	5.9k	18.6k	15.2k (1h)	$0.16
harness-engineering:harness-architecture-reviewer	sonnet	85.2k	11.9k	41.2k	44.0k (1h)	$0.38
harness-engineering:harness-enforcement-reviewer	sonnet	81.2k	8.3k	41.2k	40.0k (1h)	$0.31
harness-engineering:harness-knowledge-reviewer	sonnet	122.9k	10.7k	84.0k	38.9k (1h)	$0.36
harness-engineering:harness-self-documenting-code-reviewer	sonnet	86.0k	12.6k	41.2k	44.8k (1h)	$0.40
security:security-code-auditor	haiku	37.6k	3.5k	0	37.6k (1h)	$0.09

Re-reviews only run when @monit-reviewer is re-requested as a reviewer; push as many commits as you need, then re-request when ready. PRs targeting branches other than main or master are skipped, even when @monit-reviewer is re-requested.

monit-reviewer · 2026-05-31T21:19:24Z

+		return Result{Status: StatusDryRunFresh, Decision: decision}, nil
+	}
+
+	lockPath, err := currentLockPath(opts.Layout, req)


🟡 Medium (harness-engineering:harness-enforcement-reviewer): The for {} retry loop in Evaluate has no iteration bound. If the gate kernel re-emits DecisionAbortStale after all stale runs have been aborted (e.g., aborted rows re-read before transaction commits), or if every stale lock is held by another process so abortStaleRuns skips all candidates and state is rebuilt unchanged, the loop runs indefinitely. Context cancellation propagates through store calls but is not checked at the loop head. Add a max-iterations guard (e.g., if iterations > len(state.staleRuns)+2 { return error }) to make liveness failure observable.

Reply to this thread when addressed.

Fixed - added context and retry bounds.

monit-reviewer · 2026-05-31T21:19:24Z

+		emitWarnings(opts.Warnings, result.Warnings)
+		return result, false, nil
+	default:
+		state.releaseStaleLocks()


🟡 Medium (harness-engineering:harness-enforcement-reviewer): abortStaleRuns silently skips any run ID in decision.AbortStaleRunIDs that is absent from state.staleLocks. If buildLocalState and attachExternalState disagree on which runs are stale, or the kernel emits an ID that was never probed, the abort is silently dropped and the retry loop re-enters with identical state — making no progress. Either return an error when a requested abort target is not in staleLocks, or document and test the invariant that kernel IDs are always a subset of probed candidates.

Reply to this thread when addressed.

Fixed - missing stale abort locks now error.

monit-reviewer · 2026-05-31T21:19:25Z

+	}
+	return gate.PRSummary{State: gate.PRStateFresh}
+}
+


🟡 Medium (harness-engineering:harness-architecture-reviewer): validateOptions checks Store, Provider, and StaleHeartbeatThreshold but not opts.Layout. A zero-value Layout (empty DataRoot) passes validation silently and causes currentLockPath / lockPathForRun to produce bare relative paths resolved against the process CWD, undermining the lock-authority guarantee the entire package depends on. Add a Layout.DataRoot != "" check to surface misconfiguration at the API boundary instead of deep in path helpers.

Reply to this thread when addressed.

Fixed - validate layout data root.

monit-reviewer · 2026-05-31T21:19:25Z

+		Run:    run,
+		Decision: gate.Decision{
+			Kind:    gate.DecisionError,
+			Message: fmt.Sprintf("base moved from %s to %s", run.BaseSHA, pr.Base.SHA),


🟡 Medium (harness-engineering:harness-knowledge-reviewer): buildLocalState calls summarizeRun (which calls Store.ListPlannedActions) for every run including stale-base runs before the localDecisionApplies short-circuit is evaluated. A store failure on ListPlannedActions for a stale run blocks local resume, --rerun, and --retry-posts decisions even though those have local precedence over stale/PR state. The test TestEvaluateLocalResumeSkipsExternalFailures covers provider failures but not store failures on stale run summaries. Defer stale run summarization to attachExternalState, or add a test with a store error on a stale run's planned actions to verify local resume still proceeds.

Reply to this thread when addressed.

Fixed - stale action summaries are deferred.

monit-reviewer · 2026-05-31T21:19:25Z

+	}
+	rows, err := s.db.QueryContext(ctx, `
+SELECT run_id, pr_key, sha, base_sha, attempt, profile, posting_identity, post_mode,
+	started_at, heartbeat_at, completed_at, outcome, artifact_path,


🟡 Medium (harness-engineering:harness-knowledge-reviewer): ListRunsForHeadScope's SELECT omits pr_url. If the shared scanRun function scans a pr_url column (as GetRun's SELECT presumably includes given AllocateRunParams carries PRURL), this query will fail at runtime with a column count mismatch. Confirm scanRun's column list matches this SELECT exactly; if pr_url is needed on returned Run values, add it to the SELECT.

Reply to this thread when addressed.

Disagree - scanRun and ListRunsForHeadScope use the same run-column list; PRURL belongs to PR rows, and the query is covered by ListRunsForHeadScope tests.

monit-reviewer · 2026-05-31T21:19:25Z

+	if err != nil {
+		t.Fatalf("Evaluate: %v", err)
+	}
+	if result.Status != StatusRepairUnsupported || result.Decision.Kind != gate.DecisionRepair {


🟡 Medium (harness-engineering:harness-enforcement-reviewer): TestEvaluateStaleBaseLockAuthority asserts result.Decision.Kind == gate.DecisionFresh and result.Run.RunID != "" but does not verify (a) that the returned run's BaseSHA equals the current testBaseSHA rather than the stale one, or (b) that exactly one new run was allocated. An implementation that returns the stale run itself or allocates zero new runs would still pass. Add assertions that result.Run.RunID != staleRun.RunID and that the run count increases by exactly one for the 'free lock aborts' case.

Reply to this thread when addressed.

Fixed - assert new current-base allocation.

monit-reviewer · 2026-05-31T21:19:25Z

+	moved.Base.SHA = testOldBase
+	if err := fixture.provider.SetPR(fixture.req.PRRef, moved); err != nil {
+		t.Fatalf("SetPR moved: %v", err)
+	}


🟡 Medium (harness-engineering:harness-architecture-reviewer): TestEvaluateAbortsIfBaseMoved only covers the DecisionFresh path (no prior runs, so kernel allocates fresh then aborts). The DecisionResume branch in executeDecision also calls AbortIfBaseMoved, but no test exercises the case where a resumable exact-scope run exists and the PR base has since moved. A regression in that branch would not be caught by the current suite.

Reply to this thread when addressed.

Fixed - added resumed base-moved abort coverage.

monit-reviewer · 2026-05-31T21:19:25Z

+		}
+
+		result, retry, err := executeDecision(ctx, opts, req, state, decision, currentLock, &releaseCurrent)
+		if err != nil {


🔵 Low (harness-engineering:harness-architecture-reviewer): AbortIfBaseMoved is exported for standalone use but delegates to validateOptions, which requires opts.StaleHeartbeatThreshold > 0 — a field AbortIfBaseMoved never reads. Callers using the standalone path must supply a meaningless positive threshold or receive a confusing validation error. Consider a narrower validateOptionsForAbort or omitting threshold validation from AbortIfBaseMoved.

Reply to this thread when addressed.

Fixed - abort helper uses narrower option validation.

monit-reviewer · 2026-05-31T21:19:25Z

+func validateRequest(req Request) error {
+	if err := req.PRRef.Validate(); err != nil {
+		return err
+	}


🔵 Low (harness-engineering:harness-enforcement-reviewer): newestSubmit unconditionally dereferences selected at the end (return *selected), which panics if the map is empty. The only call site guards with len(submits) > 0, so there is no current reachable crash, but the function itself is unsafe to call directly. Either return (markerRecord, bool) to make the empty case part of the contract, or add an explicit named panic guard so the precondition is visible at the function boundary.

Reply to this thread when addressed.

Fixed - newestSubmit returns ok for empty input.

monit-reviewer

Automated PR Review

Reviewed commit: 1d5e925 | Previous: 4ca580b (incremental)

Summary

No issues found.

12 PR discussion threads considered.

Completed in 1m 24s | $1.50 | sonnet | daemon 0.2.121 | Glorfindel

Field	Value
Model	sonnet
Mode	Re-review · Cycle 2 · Session resumed
Reviewers	hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine	claude · sonnet
Reviewed by	pr-review-daemon · monit-pr-reviewer
Duration	1m 24s wall · 1m 42s compute (Reviewers: 48s · Synthesis: 16s)
Cost	$1.50
Tokens	451.5k in / 5.2k out
Turns	6

Per-workstream usage

Workstream	Model	In	Out	Cache read	Cache create	Cost
hybrid-synthesis	sonnet	44.3k	943	18.6k	25.7k (1h)	$0.12
harness-engineering:harness-architecture-reviewer	sonnet	78.2k	97	0	78.2k (1h)	$0.29
harness-engineering:harness-enforcement-reviewer	sonnet	74.6k	124	0	74.6k (1h)	$0.28
harness-engineering:harness-knowledge-reviewer	sonnet	77.0k	80	0	77.0k (1h)	$0.29
harness-engineering:harness-self-documenting-code-reviewer	sonnet	78.9k	129	0	78.9k (1h)	$0.30
security:security-code-auditor	haiku	70.5k	2.9k	0	70.5k (1h)	$0.10
discussion-summarizer	—	28.0k	898	0	28.0k (1h)	$0.12


monit-reviewer

Automated PR Review

Reviewed commit: 413cd62 | Previous: 1d5e925 (incremental)

Summary

No issues found.

12 PR discussion threads considered.

Completed in 1m 15s | $2.01 | sonnet | daemon 0.2.121 | Glorfindel

Field	Value
Model	sonnet
Mode	Re-review · Cycle 3 · Session resumed
Reviewers	hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine	claude · sonnet
Reviewed by	pr-review-daemon · monit-pr-reviewer
Duration	1m 15s wall · 1m 31s compute (Reviewers: 51s · Synthesis: 5s)
Cost	$2.01
Tokens	607.4k in / 7.6k out
Turns	6

Per-workstream usage

Workstream	Model	In	Out	Cache read	Cache create	Cost
hybrid-synthesis	sonnet	47.8k	124	18.6k	29.2k (1h)	$0.12
harness-engineering:harness-architecture-reviewer	sonnet	108.0k	124	0	108.0k (1h)	$0.41
harness-engineering:harness-enforcement-reviewer	sonnet	104.5k	86	0	104.5k (1h)	$0.39
harness-engineering:harness-knowledge-reviewer	sonnet	106.8k	67	0	106.8k (1h)	$0.40
harness-engineering:harness-self-documenting-code-reviewer	sonnet	108.8k	143	0	108.8k (1h)	$0.41
security:security-code-auditor	haiku	103.4k	6.1k	0	103.4k (1h)	$0.16
discussion-summarizer	—	28.0k	933	0	28.0k (1h)	$0.12


feat(gate): add gate IO integration

29eece4

Closes #22

fix(gateio): preserve local gate precedence

0d1a6b5

test(gateio): cover TDD gate integration gaps

4ca580b

monit-reviewer suggested changes May 31, 2026


fix(gateio): tighten daemon review contracts

1d5e925

monit-reviewer approved these changes May 31, 2026


fix(gateio): precheck base before fresh mutation

43fc46f

rianjs changed the title ~~CR-14b: gate IO integration~~ feat(gate): add gate IO integration May 31, 2026

chore: retrigger PR checks

413cd62

monit-reviewer approved these changes May 31, 2026


rianjs merged commit a3a9337 into main May 31, 2026
8 checks passed

rianjs deleted the feat/cr-14b-gate-io branch May 31, 2026 23:37
