Skip to content

feat(gate): add gate IO integration#48

Merged
rianjs merged 6 commits into
mainfrom
feat/cr-14b-gate-io
May 31, 2026
Merged

feat(gate): add gate IO integration#48
rianjs merged 6 commits into
mainfrom
feat/cr-14b-gate-io

Conversation

@rianjs
Copy link
Copy Markdown
Contributor

@rianjs rianjs commented May 31, 2026

Summary

  • add gate IO orchestration that maps ledger/provider state into the pure gate kernel
  • add lock-authoritative stale-base cleanup, live fresh allocation, local resume, early exit, and base-moved abort handling
  • add scoped ledger lookup and reusable lock path helpers for resume-key integration

Tests

  • go test ./internal/gateio
  • go test ./internal/ledger
  • go test ./internal/statepaths
  • go test ./...
  • go test -race ./internal/gateio
  • make check

Closes #22

@rianjs
Copy link
Copy Markdown
Contributor Author

rianjs commented May 31, 2026

Blocker
None.

Major

  • internal/gateio/gateio.go:128 / internal/gateio/gateio.go:241: Evaluate eagerly builds the full gate state before calling the pure kernel, including stale-base lock probes and provider marker reads. That means exact local resume, --rerun, and --retry-posts decisions can be blocked by stale-lock acquisition errors or ListIssueComments / ListReviews failures even though CR-14a gives those decisions precedence over stale/PR state. This weakens the local-resume regression guarantee. Add a test with an exact resumable row plus provider read failure/stale lock acquisition failure, and restructure to decide/short-circuit local-only precedence before PR/stale IO where possible.

  • internal/gateio/gateio.go:64 / internal/gateio/gateio.go:496: Request accepts PRRef, PR, and PRKey, but validation only checks that each is shaped enough to use. A mismatched PRKey or PR.Ref can make gate IO acquire a lock and scan markers for one PR while listing/allocating ledger rows under another PR key. Since this is a data-integrity boundary, derive PRKey from PRRef internally or validate PRKey and PR.Ref are consistent with PRRef.

Minor

  • The implementation is missing material plan-pinned tests for subtle marker precedence and mapping behavior: mixed stale/current markers, paired real-verdict rollup plus same-run submit review, retry-posts unsupported classification, normal dry-run fresh behavior, and full ledger-outcome-to-gate-state mapping. The code appears to implement several of these, but they are exactly the cases most likely to regress when CR-14c wires repair/retry execution.

Nit
None.

@rianjs
Copy link
Copy Markdown
Contributor Author

rianjs commented May 31, 2026

Blocker
None.

Major
None.

Minor
None.

Nit
None.

@rianjs
Copy link
Copy Markdown
Contributor Author

rianjs commented May 31, 2026

Findings

  • Blocker: TestAbortIfBaseMoved only exercises the helper in isolation ([internal/gateio/gateio_test.go](/Users/rianjs/dev/codereview-cli/internal/gateio/gateio_test.go#L412)). It does not prove the actual gate IO flow calls this logic, so the PR could still ship with base-moved abort missing from Evaluate and all tests would pass.
  • Major: TestEvaluateStaleBaseLockAuthority ([internal/gateio/gateio_test.go](/Users/rianjs/dev/codereview-cli/internal/gateio/gateio_test.go#L235)) only checks whether the stale run ended up aborted and whether a warning substring was emitted. It does not assert the final Evaluate status/decision after the cleanup/retry loop, so an implementation that aborts the stale run but returns the wrong terminal result would still pass.
  • Major: TestEvaluateLocalResumeSkipsExternalFailures and TestEvaluateRerunSupersedesResumable ([internal/gateio/gateio_test.go](/Users/rianjs/dev/codereview-cli/internal/gateio/gateio_test.go#L63) and [internal/gateio/gateio_test.go](/Users/rianjs/dev/codereview-cli/internal/gateio/gateio_test.go#L282)) inject provider errors, but they do not prove the provider was never consulted. A buggy implementation that still hits provider IO and ignores or absorbs those errors could still satisfy the assertions, so the local-resume and --rerun regression guards are weaker than the ticket intent calls for.

Copy link
Copy Markdown

@monit-reviewer monit-reviewer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR Review

Reviewed commit: 4ca580b

Summary

Reviewer Findings
harness-engineering:harness-architecture-reviewer 3
harness-engineering:harness-enforcement-reviewer 4
harness-engineering:harness-knowledge-reviewer 2
harness-engineering:harness-architecture-reviewer (3 findings)

⚠️ Should Fix - internal/gateio/gateio.go:496

validateOptions checks Store, Provider, and StaleHeartbeatThreshold but not opts.Layout. A zero-value Layout (empty DataRoot) passes validation silently and causes currentLockPath / lockPathForRun to produce bare relative paths resolved against the process CWD, undermining the lock-authority guarantee the entire package depends on. Add a Layout.DataRoot != "" check to surface misconfiguration at the API boundary instead of deep in path helpers.

⚠️ Should Fix - internal/gateio/gateio_test.go:432

TestEvaluateAbortsIfBaseMoved only covers the DecisionFresh path (no prior runs, so kernel allocates fresh then aborts). The DecisionResume branch in executeDecision also calls AbortIfBaseMoved, but no test exercises the case where a resumable exact-scope run exists and the PR base has since moved. A regression in that branch would not be caught by the current suite.

💡 Suggestion - internal/gateio/gateio.go:149

AbortIfBaseMoved is exported for standalone use but delegates to validateOptions, which requires opts.StaleHeartbeatThreshold > 0 — a field AbortIfBaseMoved never reads. Callers using the standalone path must supply a meaningless positive threshold or receive a confusing validation error. Consider a narrower validateOptionsForAbort or omitting threshold validation from AbortIfBaseMoved.

harness-engineering:harness-enforcement-reviewer (4 findings)

⚠️ Should Fix - internal/gateio/gateio.go:118

The for {} retry loop in Evaluate has no iteration bound. If the gate kernel re-emits DecisionAbortStale after all stale runs have been aborted (e.g., aborted rows re-read before transaction commits), or if every stale lock is held by another process so abortStaleRuns skips all candidates and state is rebuilt unchanged, the loop runs indefinitely. Context cancellation propagates through store calls but is not checked at the loop head. Add a max-iterations guard (e.g., if iterations > len(state.staleRuns)+2 { return error }) to make liveness failure observable.

⚠️ Should Fix - internal/gateio/gateio.go:348

abortStaleRuns silently skips any run ID in decision.AbortStaleRunIDs that is absent from state.staleLocks. If buildLocalState and attachExternalState disagree on which runs are stale, or the kernel emits an ID that was never probed, the abort is silently dropped and the retry loop re-enters with identical state — making no progress. Either return an error when a requested abort target is not in staleLocks, or document and test the invariant that kernel IDs are always a subset of probed candidates.

⚠️ Should Fix - internal/gateio/gateio_test.go:235

TestEvaluateStaleBaseLockAuthority asserts result.Decision.Kind == gate.DecisionFresh and result.Run.RunID != "" but does not verify (a) that the returned run's BaseSHA equals the current testBaseSHA rather than the stale one, or (b) that exactly one new run was allocated. An implementation that returns the stale run itself or allocates zero new runs would still pass. Add assertions that result.Run.RunID != staleRun.RunID and that the run count increases by exactly one for the 'free lock aborts' case.

💡 Suggestion - internal/gateio/gateio.go:619

newestSubmit unconditionally dereferences selected at the end (return *selected), which panics if the map is empty. The only call site guards with len(submits) > 0, so there is no current reachable crash, but the function itself is unsafe to call directly. Either return (markerRecord, bool) to make the empty case part of the contract, or add an explicit named panic guard so the precondition is visible at the function boundary.

harness-engineering:harness-knowledge-reviewer (2 findings)

⚠️ Should Fix - internal/gateio/gateio.go:183

buildLocalState calls summarizeRun (which calls Store.ListPlannedActions) for every run including stale-base runs before the localDecisionApplies short-circuit is evaluated. A store failure on ListPlannedActions for a stale run blocks local resume, --rerun, and --retry-posts decisions even though those have local precedence over stale/PR state. The test TestEvaluateLocalResumeSkipsExternalFailures covers provider failures but not store failures on stale run summaries. Defer stale run summarization to attachExternalState, or add a test with a store error on a stale run's planned actions to verify local resume still proceeds.

⚠️ Should Fix - internal/ledger/ledger.go:668

ListRunsForHeadScope's SELECT omits pr_url. If the shared scanRun function scans a pr_url column (as GetRun's SELECT presumably includes given AllocateRunParams carries PRURL), this query will fail at runtime with a column count mismatch. Confirm scanRun's column list matches this SELECT exactly; if pr_url is needed on returned Run values, add it to the SELECT.

3 info-level observations excluded. Run with --verbose to include.

3 PR discussion threads considered.


Completed in 5m 12s | $1.70 | sonnet | daemon 0.2.121 | Glorfindel
Field Value
Model sonnet
Reviewers hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine claude · sonnet
Reviewed by pr-review-daemon · monit-pr-reviewer
Duration 5m 12s wall · 15m 00s compute (Reviewers: 3m 44s · Synthesis: 1m 25s)
Cost $1.70
Tokens 446.8k in / 52.8k out
Turns 11

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost
hybrid-synthesis sonnet 33.8k 5.9k 18.6k 15.2k (1h) $0.16
harness-engineering:harness-architecture-reviewer sonnet 85.2k 11.9k 41.2k 44.0k (1h) $0.38
harness-engineering:harness-enforcement-reviewer sonnet 81.2k 8.3k 41.2k 40.0k (1h) $0.31
harness-engineering:harness-knowledge-reviewer sonnet 122.9k 10.7k 84.0k 38.9k (1h) $0.36
harness-engineering:harness-self-documenting-code-reviewer sonnet 86.0k 12.6k 41.2k 44.8k (1h) $0.40
security:security-code-auditor haiku 37.6k 3.5k 0 37.6k (1h) $0.09

Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.

Comment thread internal/gateio/gateio.go
return Result{Status: StatusDryRunFresh, Decision: decision}, nil
}

lockPath, err := currentLockPath(opts.Layout, req)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium (harness-engineering:harness-enforcement-reviewer): The for {} retry loop in Evaluate has no iteration bound. If the gate kernel re-emits DecisionAbortStale after all stale runs have been aborted (e.g., aborted rows re-read before transaction commits), or if every stale lock is held by another process so abortStaleRuns skips all candidates and state is rebuilt unchanged, the loop runs indefinitely. Context cancellation propagates through store calls but is not checked at the loop head. Add a max-iterations guard (e.g., if iterations > len(state.staleRuns)+2 { return error }) to make liveness failure observable.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - added context and retry bounds.

Comment thread internal/gateio/gateio.go
emitWarnings(opts.Warnings, result.Warnings)
return result, false, nil
default:
state.releaseStaleLocks()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium (harness-engineering:harness-enforcement-reviewer): abortStaleRuns silently skips any run ID in decision.AbortStaleRunIDs that is absent from state.staleLocks. If buildLocalState and attachExternalState disagree on which runs are stale, or the kernel emits an ID that was never probed, the abort is silently dropped and the retry loop re-enters with identical state — making no progress. Either return an error when a requested abort target is not in staleLocks, or document and test the invariant that kernel IDs are always a subset of probed candidates.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - missing stale abort locks now error.

Comment thread internal/gateio/gateio.go
}
return gate.PRSummary{State: gate.PRStateFresh}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium (harness-engineering:harness-architecture-reviewer): validateOptions checks Store, Provider, and StaleHeartbeatThreshold but not opts.Layout. A zero-value Layout (empty DataRoot) passes validation silently and causes currentLockPath / lockPathForRun to produce bare relative paths resolved against the process CWD, undermining the lock-authority guarantee the entire package depends on. Add a Layout.DataRoot != "" check to surface misconfiguration at the API boundary instead of deep in path helpers.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - validate layout data root.

Comment thread internal/gateio/gateio.go Outdated
Run: run,
Decision: gate.Decision{
Kind: gate.DecisionError,
Message: fmt.Sprintf("base moved from %s to %s", run.BaseSHA, pr.Base.SHA),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium (harness-engineering:harness-knowledge-reviewer): buildLocalState calls summarizeRun (which calls Store.ListPlannedActions) for every run including stale-base runs before the localDecisionApplies short-circuit is evaluated. A store failure on ListPlannedActions for a stale run blocks local resume, --rerun, and --retry-posts decisions even though those have local precedence over stale/PR state. The test TestEvaluateLocalResumeSkipsExternalFailures covers provider failures but not store failures on stale run summaries. Defer stale run summarization to attachExternalState, or add a test with a store error on a stale run's planned actions to verify local resume still proceeds.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - stale action summaries are deferred.

Comment thread internal/ledger/ledger.go
}
rows, err := s.db.QueryContext(ctx, `
SELECT run_id, pr_key, sha, base_sha, attempt, profile, posting_identity, post_mode,
started_at, heartbeat_at, completed_at, outcome, artifact_path,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium (harness-engineering:harness-knowledge-reviewer): ListRunsForHeadScope's SELECT omits pr_url. If the shared scanRun function scans a pr_url column (as GetRun's SELECT presumably includes given AllocateRunParams carries PRURL), this query will fail at runtime with a column count mismatch. Confirm scanRun's column list matches this SELECT exactly; if pr_url is needed on returned Run values, add it to the SELECT.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disagree - scanRun and ListRunsForHeadScope use the same run-column list; PRURL belongs to PR rows, and the query is covered by ListRunsForHeadScope tests.

if err != nil {
t.Fatalf("Evaluate: %v", err)
}
if result.Status != StatusRepairUnsupported || result.Decision.Kind != gate.DecisionRepair {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium (harness-engineering:harness-enforcement-reviewer): TestEvaluateStaleBaseLockAuthority asserts result.Decision.Kind == gate.DecisionFresh and result.Run.RunID != "" but does not verify (a) that the returned run's BaseSHA equals the current testBaseSHA rather than the stale one, or (b) that exactly one new run was allocated. An implementation that returns the stale run itself or allocates zero new runs would still pass. Add assertions that result.Run.RunID != staleRun.RunID and that the run count increases by exactly one for the 'free lock aborts' case.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - assert new current-base allocation.

moved.Base.SHA = testOldBase
if err := fixture.provider.SetPR(fixture.req.PRRef, moved); err != nil {
t.Fatalf("SetPR moved: %v", err)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium (harness-engineering:harness-architecture-reviewer): TestEvaluateAbortsIfBaseMoved only covers the DecisionFresh path (no prior runs, so kernel allocates fresh then aborts). The DecisionResume branch in executeDecision also calls AbortIfBaseMoved, but no test exercises the case where a resumable exact-scope run exists and the PR base has since moved. A regression in that branch would not be caught by the current suite.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - added resumed base-moved abort coverage.

Comment thread internal/gateio/gateio.go
}

result, retry, err := executeDecision(ctx, opts, req, state, decision, currentLock, &releaseCurrent)
if err != nil {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-architecture-reviewer): AbortIfBaseMoved is exported for standalone use but delegates to validateOptions, which requires opts.StaleHeartbeatThreshold > 0 — a field AbortIfBaseMoved never reads. Callers using the standalone path must supply a meaningless positive threshold or receive a confusing validation error. Consider a narrower validateOptionsForAbort or omitting threshold validation from AbortIfBaseMoved.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - abort helper uses narrower option validation.

Comment thread internal/gateio/gateio.go
func validateRequest(req Request) error {
if err := req.PRRef.Validate(); err != nil {
return err
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-enforcement-reviewer): newestSubmit unconditionally dereferences selected at the end (return *selected), which panics if the map is empty. The only call site guards with len(submits) > 0, so there is no current reachable crash, but the function itself is unsafe to call directly. Either return (markerRecord, bool) to make the empty case part of the contract, or add an explicit named panic guard so the precondition is visible at the function boundary.

Reply to this thread when addressed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed - newestSubmit returns ok for empty input.

Copy link
Copy Markdown

@monit-reviewer monit-reviewer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR Review

Reviewed commit: 1d5e925 | Previous: 4ca580b (incremental)

Summary

No issues found.

12 PR discussion threads considered.


Completed in 1m 24s | $1.50 | sonnet | daemon 0.2.121 | Glorfindel
Field Value
Model sonnet
Mode Re-review · Cycle 2 · Session resumed
Reviewers hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine claude · sonnet
Reviewed by pr-review-daemon · monit-pr-reviewer
Duration 1m 24s wall · 1m 42s compute (Reviewers: 48s · Synthesis: 16s)
Cost $1.50
Tokens 451.5k in / 5.2k out
Turns 6

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost
hybrid-synthesis sonnet 44.3k 943 18.6k 25.7k (1h) $0.12
harness-engineering:harness-architecture-reviewer sonnet 78.2k 97 0 78.2k (1h) $0.29
harness-engineering:harness-enforcement-reviewer sonnet 74.6k 124 0 74.6k (1h) $0.28
harness-engineering:harness-knowledge-reviewer sonnet 77.0k 80 0 77.0k (1h) $0.29
harness-engineering:harness-self-documenting-code-reviewer sonnet 78.9k 129 0 78.9k (1h) $0.30
security:security-code-auditor haiku 70.5k 2.9k 0 70.5k (1h) $0.10
discussion-summarizer 28.0k 898 0 28.0k (1h) $0.12

Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.

@rianjs rianjs changed the title CR-14b: gate IO integration feat(gate): add gate IO integration May 31, 2026
Copy link
Copy Markdown

@monit-reviewer monit-reviewer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR Review

Reviewed commit: 413cd62 | Previous: 1d5e925 (incremental)

Summary

No issues found.

12 PR discussion threads considered.


Completed in 1m 15s | $2.01 | sonnet | daemon 0.2.121 | Glorfindel
Field Value
Model sonnet
Mode Re-review · Cycle 3 · Session resumed
Reviewers hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine claude · sonnet
Reviewed by pr-review-daemon · monit-pr-reviewer
Duration 1m 15s wall · 1m 31s compute (Reviewers: 51s · Synthesis: 5s)
Cost $2.01
Tokens 607.4k in / 7.6k out
Turns 6

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost
hybrid-synthesis sonnet 47.8k 124 18.6k 29.2k (1h) $0.12
harness-engineering:harness-architecture-reviewer sonnet 108.0k 124 0 108.0k (1h) $0.41
harness-engineering:harness-enforcement-reviewer sonnet 104.5k 86 0 104.5k (1h) $0.39
harness-engineering:harness-knowledge-reviewer sonnet 106.8k 67 0 106.8k (1h) $0.40
harness-engineering:harness-self-documenting-code-reviewer sonnet 108.8k 143 0 108.8k (1h) $0.41
security:security-code-auditor haiku 103.4k 6.1k 0 103.4k (1h) $0.16
discussion-summarizer 28.0k 933 0 28.0k (1h) $0.12

Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.

@rianjs rianjs merged commit a3a9337 into main May 31, 2026
8 checks passed
@rianjs rianjs deleted the feat/cr-14b-gate-io branch May 31, 2026 23:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CR-14b: gate **IO integration**

2 participants