feat(gate): add gate IO integration#48
Conversation
|
Blocker Major
Minor
Nit |
|
Blocker Major Minor Nit |
|
Findings
|
monit-reviewer
left a comment
There was a problem hiding this comment.
Automated PR Review
Reviewed commit: 4ca580b
Summary
| Reviewer | Findings |
|---|---|
| harness-engineering:harness-architecture-reviewer | 3 |
| harness-engineering:harness-enforcement-reviewer | 4 |
| harness-engineering:harness-knowledge-reviewer | 2 |
harness-engineering:harness-architecture-reviewer (3 findings)
internal/gateio/gateio.go:496
validateOptionschecksStore,Provider, andStaleHeartbeatThresholdbut notopts.Layout. A zero-valueLayout(emptyDataRoot) passes validation silently and causescurrentLockPath/lockPathForRunto produce bare relative paths resolved against the process CWD, undermining the lock-authority guarantee the entire package depends on. Add aLayout.DataRoot != ""check to surface misconfiguration at the API boundary instead of deep in path helpers.
internal/gateio/gateio_test.go:432
TestEvaluateAbortsIfBaseMovedonly covers theDecisionFreshpath (no prior runs, so kernel allocates fresh then aborts). TheDecisionResumebranch inexecuteDecisionalso callsAbortIfBaseMoved, but no test exercises the case where a resumable exact-scope run exists and the PR base has since moved. A regression in that branch would not be caught by the current suite.
💡 Suggestion - internal/gateio/gateio.go:149
AbortIfBaseMovedis exported for standalone use but delegates tovalidateOptions, which requiresopts.StaleHeartbeatThreshold > 0— a fieldAbortIfBaseMovednever reads. Callers using the standalone path must supply a meaningless positive threshold or receive a confusing validation error. Consider a narrowervalidateOptionsForAbortor omitting threshold validation fromAbortIfBaseMoved.
harness-engineering:harness-enforcement-reviewer (4 findings)
internal/gateio/gateio.go:118
The
for {}retry loop inEvaluatehas no iteration bound. If the gate kernel re-emitsDecisionAbortStaleafter all stale runs have been aborted (e.g., aborted rows re-read before transaction commits), or if every stale lock is held by another process soabortStaleRunsskips all candidates and state is rebuilt unchanged, the loop runs indefinitely. Context cancellation propagates through store calls but is not checked at the loop head. Add a max-iterations guard (e.g.,if iterations > len(state.staleRuns)+2 { return error }) to make liveness failure observable.
internal/gateio/gateio.go:348
abortStaleRunssilently skips any run ID indecision.AbortStaleRunIDsthat is absent fromstate.staleLocks. IfbuildLocalStateandattachExternalStatedisagree on which runs are stale, or the kernel emits an ID that was never probed, the abort is silently dropped and the retry loop re-enters with identical state — making no progress. Either return an error when a requested abort target is not instaleLocks, or document and test the invariant that kernel IDs are always a subset of probed candidates.
internal/gateio/gateio_test.go:235
TestEvaluateStaleBaseLockAuthorityassertsresult.Decision.Kind == gate.DecisionFreshandresult.Run.RunID != ""but does not verify (a) that the returned run'sBaseSHAequals the currenttestBaseSHArather than the stale one, or (b) that exactly one new run was allocated. An implementation that returns the stale run itself or allocates zero new runs would still pass. Add assertions thatresult.Run.RunID != staleRun.RunIDand that the run count increases by exactly one for the 'free lock aborts' case.
💡 Suggestion - internal/gateio/gateio.go:619
newestSubmitunconditionally dereferencesselectedat the end (return *selected), which panics if the map is empty. The only call site guards withlen(submits) > 0, so there is no current reachable crash, but the function itself is unsafe to call directly. Either return(markerRecord, bool)to make the empty case part of the contract, or add an explicit named panic guard so the precondition is visible at the function boundary.
harness-engineering:harness-knowledge-reviewer (2 findings)
internal/gateio/gateio.go:183
buildLocalStatecallssummarizeRun(which callsStore.ListPlannedActions) for every run including stale-base runs before thelocalDecisionAppliesshort-circuit is evaluated. A store failure onListPlannedActionsfor a stale run blocks local resume,--rerun, and--retry-postsdecisions even though those have local precedence over stale/PR state. The testTestEvaluateLocalResumeSkipsExternalFailurescovers provider failures but not store failures on stale run summaries. Defer stale run summarization toattachExternalState, or add a test with a store error on a stale run's planned actions to verify local resume still proceeds.
internal/ledger/ledger.go:668
ListRunsForHeadScope's SELECT omitspr_url. If the sharedscanRunfunction scans apr_urlcolumn (asGetRun's SELECT presumably includes givenAllocateRunParamscarriesPRURL), this query will fail at runtime with a column count mismatch. ConfirmscanRun's column list matches this SELECT exactly; ifpr_urlis needed on returnedRunvalues, add it to the SELECT.
3 info-level observations excluded. Run with --verbose to include.
3 PR discussion threads considered.
Completed in 5m 12s | $1.70 | sonnet | daemon 0.2.121 | Glorfindel
| Field | Value |
|---|---|
| Model | sonnet |
| Reviewers | hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor |
| Engine | claude · sonnet |
| Reviewed by | pr-review-daemon · monit-pr-reviewer |
| Duration | 5m 12s wall · 15m 00s compute (Reviewers: 3m 44s · Synthesis: 1m 25s) |
| Cost | $1.70 |
| Tokens | 446.8k in / 52.8k out |
| Turns | 11 |
Per-workstream usage
| Workstream | Model | In | Out | Cache read | Cache create | Cost |
|---|---|---|---|---|---|---|
| hybrid-synthesis | sonnet | 33.8k | 5.9k | 18.6k | 15.2k (1h) | $0.16 |
| harness-engineering:harness-architecture-reviewer | sonnet | 85.2k | 11.9k | 41.2k | 44.0k (1h) | $0.38 |
| harness-engineering:harness-enforcement-reviewer | sonnet | 81.2k | 8.3k | 41.2k | 40.0k (1h) | $0.31 |
| harness-engineering:harness-knowledge-reviewer | sonnet | 122.9k | 10.7k | 84.0k | 38.9k (1h) | $0.36 |
| harness-engineering:harness-self-documenting-code-reviewer | sonnet | 86.0k | 12.6k | 41.2k | 44.8k (1h) | $0.40 |
| security:security-code-auditor | haiku | 37.6k | 3.5k | 0 | 37.6k (1h) | $0.09 |
Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.
| return Result{Status: StatusDryRunFresh, Decision: decision}, nil | ||
| } | ||
|
|
||
| lockPath, err := currentLockPath(opts.Layout, req) |
There was a problem hiding this comment.
🟡 Medium (harness-engineering:harness-enforcement-reviewer): The for {} retry loop in Evaluate has no iteration bound. If the gate kernel re-emits DecisionAbortStale after all stale runs have been aborted (e.g., aborted rows re-read before transaction commits), or if every stale lock is held by another process so abortStaleRuns skips all candidates and state is rebuilt unchanged, the loop runs indefinitely. Context cancellation propagates through store calls but is not checked at the loop head. Add a max-iterations guard (e.g., if iterations > len(state.staleRuns)+2 { return error }) to make liveness failure observable.
Reply to this thread when addressed.
There was a problem hiding this comment.
Fixed - added context and retry bounds.
| emitWarnings(opts.Warnings, result.Warnings) | ||
| return result, false, nil | ||
| default: | ||
| state.releaseStaleLocks() |
There was a problem hiding this comment.
🟡 Medium (harness-engineering:harness-enforcement-reviewer): abortStaleRuns silently skips any run ID in decision.AbortStaleRunIDs that is absent from state.staleLocks. If buildLocalState and attachExternalState disagree on which runs are stale, or the kernel emits an ID that was never probed, the abort is silently dropped and the retry loop re-enters with identical state — making no progress. Either return an error when a requested abort target is not in staleLocks, or document and test the invariant that kernel IDs are always a subset of probed candidates.
Reply to this thread when addressed.
There was a problem hiding this comment.
Fixed - missing stale abort locks now error.
| } | ||
| return gate.PRSummary{State: gate.PRStateFresh} | ||
| } | ||
|
|
There was a problem hiding this comment.
🟡 Medium (harness-engineering:harness-architecture-reviewer): validateOptions checks Store, Provider, and StaleHeartbeatThreshold but not opts.Layout. A zero-value Layout (empty DataRoot) passes validation silently and causes currentLockPath / lockPathForRun to produce bare relative paths resolved against the process CWD, undermining the lock-authority guarantee the entire package depends on. Add a Layout.DataRoot != "" check to surface misconfiguration at the API boundary instead of deep in path helpers.
Reply to this thread when addressed.
There was a problem hiding this comment.
Fixed - validate layout data root.
| Run: run, | ||
| Decision: gate.Decision{ | ||
| Kind: gate.DecisionError, | ||
| Message: fmt.Sprintf("base moved from %s to %s", run.BaseSHA, pr.Base.SHA), |
There was a problem hiding this comment.
🟡 Medium (harness-engineering:harness-knowledge-reviewer): buildLocalState calls summarizeRun (which calls Store.ListPlannedActions) for every run including stale-base runs before the localDecisionApplies short-circuit is evaluated. A store failure on ListPlannedActions for a stale run blocks local resume, --rerun, and --retry-posts decisions even though those have local precedence over stale/PR state. The test TestEvaluateLocalResumeSkipsExternalFailures covers provider failures but not store failures on stale run summaries. Defer stale run summarization to attachExternalState, or add a test with a store error on a stale run's planned actions to verify local resume still proceeds.
Reply to this thread when addressed.
There was a problem hiding this comment.
Fixed - stale action summaries are deferred.
| } | ||
| rows, err := s.db.QueryContext(ctx, ` | ||
| SELECT run_id, pr_key, sha, base_sha, attempt, profile, posting_identity, post_mode, | ||
| started_at, heartbeat_at, completed_at, outcome, artifact_path, |
There was a problem hiding this comment.
🟡 Medium (harness-engineering:harness-knowledge-reviewer): ListRunsForHeadScope's SELECT omits pr_url. If the shared scanRun function scans a pr_url column (as GetRun's SELECT presumably includes given AllocateRunParams carries PRURL), this query will fail at runtime with a column count mismatch. Confirm scanRun's column list matches this SELECT exactly; if pr_url is needed on returned Run values, add it to the SELECT.
Reply to this thread when addressed.
There was a problem hiding this comment.
Disagree - scanRun and ListRunsForHeadScope use the same run-column list; PRURL belongs to PR rows, and the query is covered by ListRunsForHeadScope tests.
| if err != nil { | ||
| t.Fatalf("Evaluate: %v", err) | ||
| } | ||
| if result.Status != StatusRepairUnsupported || result.Decision.Kind != gate.DecisionRepair { |
There was a problem hiding this comment.
🟡 Medium (harness-engineering:harness-enforcement-reviewer): TestEvaluateStaleBaseLockAuthority asserts result.Decision.Kind == gate.DecisionFresh and result.Run.RunID != "" but does not verify (a) that the returned run's BaseSHA equals the current testBaseSHA rather than the stale one, or (b) that exactly one new run was allocated. An implementation that returns the stale run itself or allocates zero new runs would still pass. Add assertions that result.Run.RunID != staleRun.RunID and that the run count increases by exactly one for the 'free lock aborts' case.
Reply to this thread when addressed.
There was a problem hiding this comment.
Fixed - assert new current-base allocation.
| moved.Base.SHA = testOldBase | ||
| if err := fixture.provider.SetPR(fixture.req.PRRef, moved); err != nil { | ||
| t.Fatalf("SetPR moved: %v", err) | ||
| } |
There was a problem hiding this comment.
🟡 Medium (harness-engineering:harness-architecture-reviewer): TestEvaluateAbortsIfBaseMoved only covers the DecisionFresh path (no prior runs, so kernel allocates fresh then aborts). The DecisionResume branch in executeDecision also calls AbortIfBaseMoved, but no test exercises the case where a resumable exact-scope run exists and the PR base has since moved. A regression in that branch would not be caught by the current suite.
Reply to this thread when addressed.
There was a problem hiding this comment.
Fixed - added resumed base-moved abort coverage.
| } | ||
|
|
||
| result, retry, err := executeDecision(ctx, opts, req, state, decision, currentLock, &releaseCurrent) | ||
| if err != nil { |
There was a problem hiding this comment.
🔵 Low (harness-engineering:harness-architecture-reviewer): AbortIfBaseMoved is exported for standalone use but delegates to validateOptions, which requires opts.StaleHeartbeatThreshold > 0 — a field AbortIfBaseMoved never reads. Callers using the standalone path must supply a meaningless positive threshold or receive a confusing validation error. Consider a narrower validateOptionsForAbort or omitting threshold validation from AbortIfBaseMoved.
Reply to this thread when addressed.
There was a problem hiding this comment.
Fixed - abort helper uses narrower option validation.
| func validateRequest(req Request) error { | ||
| if err := req.PRRef.Validate(); err != nil { | ||
| return err | ||
| } |
There was a problem hiding this comment.
🔵 Low (harness-engineering:harness-enforcement-reviewer): newestSubmit unconditionally dereferences selected at the end (return *selected), which panics if the map is empty. The only call site guards with len(submits) > 0, so there is no current reachable crash, but the function itself is unsafe to call directly. Either return (markerRecord, bool) to make the empty case part of the contract, or add an explicit named panic guard so the precondition is visible at the function boundary.
Reply to this thread when addressed.
There was a problem hiding this comment.
Fixed - newestSubmit returns ok for empty input.
monit-reviewer
left a comment
There was a problem hiding this comment.
Automated PR Review
Reviewed commit: 1d5e925 | Previous: 4ca580b (incremental)
Summary
No issues found.
12 PR discussion threads considered.
Completed in 1m 24s | $1.50 | sonnet | daemon 0.2.121 | Glorfindel
| Field | Value |
|---|---|
| Model | sonnet |
| Mode | Re-review · Cycle 2 · Session resumed |
| Reviewers | hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor |
| Engine | claude · sonnet |
| Reviewed by | pr-review-daemon · monit-pr-reviewer |
| Duration | 1m 24s wall · 1m 42s compute (Reviewers: 48s · Synthesis: 16s) |
| Cost | $1.50 |
| Tokens | 451.5k in / 5.2k out |
| Turns | 6 |
Per-workstream usage
| Workstream | Model | In | Out | Cache read | Cache create | Cost |
|---|---|---|---|---|---|---|
| hybrid-synthesis | sonnet | 44.3k | 943 | 18.6k | 25.7k (1h) | $0.12 |
| harness-engineering:harness-architecture-reviewer | sonnet | 78.2k | 97 | 0 | 78.2k (1h) | $0.29 |
| harness-engineering:harness-enforcement-reviewer | sonnet | 74.6k | 124 | 0 | 74.6k (1h) | $0.28 |
| harness-engineering:harness-knowledge-reviewer | sonnet | 77.0k | 80 | 0 | 77.0k (1h) | $0.29 |
| harness-engineering:harness-self-documenting-code-reviewer | sonnet | 78.9k | 129 | 0 | 78.9k (1h) | $0.30 |
| security:security-code-auditor | haiku | 70.5k | 2.9k | 0 | 70.5k (1h) | $0.10 |
| discussion-summarizer | — | 28.0k | 898 | 0 | 28.0k (1h) | $0.12 |
Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.
monit-reviewer
left a comment
There was a problem hiding this comment.
Automated PR Review
Reviewed commit: 413cd62 | Previous: 1d5e925 (incremental)
Summary
No issues found.
12 PR discussion threads considered.
Completed in 1m 15s | $2.01 | sonnet | daemon 0.2.121 | Glorfindel
| Field | Value |
|---|---|
| Model | sonnet |
| Mode | Re-review · Cycle 3 · Session resumed |
| Reviewers | hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor |
| Engine | claude · sonnet |
| Reviewed by | pr-review-daemon · monit-pr-reviewer |
| Duration | 1m 15s wall · 1m 31s compute (Reviewers: 51s · Synthesis: 5s) |
| Cost | $2.01 |
| Tokens | 607.4k in / 7.6k out |
| Turns | 6 |
Per-workstream usage
| Workstream | Model | In | Out | Cache read | Cache create | Cost |
|---|---|---|---|---|---|---|
| hybrid-synthesis | sonnet | 47.8k | 124 | 18.6k | 29.2k (1h) | $0.12 |
| harness-engineering:harness-architecture-reviewer | sonnet | 108.0k | 124 | 0 | 108.0k (1h) | $0.41 |
| harness-engineering:harness-enforcement-reviewer | sonnet | 104.5k | 86 | 0 | 104.5k (1h) | $0.39 |
| harness-engineering:harness-knowledge-reviewer | sonnet | 106.8k | 67 | 0 | 106.8k (1h) | $0.40 |
| harness-engineering:harness-self-documenting-code-reviewer | sonnet | 108.8k | 143 | 0 | 108.8k (1h) | $0.41 |
| security:security-code-auditor | haiku | 103.4k | 6.1k | 0 | 103.4k (1h) | $0.16 |
| discussion-summarizer | — | 28.0k | 933 | 0 | 28.0k (1h) | $0.12 |
Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.
Summary
Tests
Closes #22