Push-rejection loop guard + SSH transport auto-detect by subsetpark · Pull Request #315 · flowglad/onton

subsetpark · 2026-05-23T22:31:25Z

Summary

A patch worktree entered a 54-cycle tight retry loop (Orchestrator.Start → Session_push_failed → Start every few seconds) because GitHub refused every push and onton had no way to tell a transient race apart from a permanent block. Root cause: the patch modified .github/workflows/ci.yml under an OAuth token without the workflow scope, and onton's managed clone always used HTTPS+OAuth, structurally forcing the rejection.

This PR addresses the incident at two levels:

Classify push rejections and stop looping on permanent ones. New pure module Push_reject_classify turns the porcelain ! reject into a named reason (workflow scope / branch protection / push pattern / lease / hook). New push_failure_count counter on Patch_agent flips needs_intervention at ≥3; a permanent reason short-circuits straight to Given_up. Activity log now reports the real reason and a follow-up detail line on permanent failures.
Auto-detect SSH transport via a reachability probe. Managed_repo.probe_ssh_available runs a non-interactive git ls-remote -h git@github.com:<owner>/<repo>.git with GIT_SSH_COMMAND="ssh -o BatchMode=yes -o ConnectTimeout=5". Exit 0 means SSH can authenticate to GitHub and reach this repo — the same condition SSH push needs. SSH wins on success (bypassing per-OAuth-scope restrictions entirely); HTTPS is the fallback. The choice is announced on stderr and persisted to config.json, so the probe only runs on first invocation. A --clone-scheme=https|ssh CLI flag forces a transport and skips the probe.

Plus three adjacent correctness fixes:

Reconciler.detect_main_freshness_drift + poller wiring: when origin/main advances past local main (e.g. a sibling patch squash-merges), every agent on main gets a Rebase enqueued. Without this, detect_stale_bases silently passes because both sides read the same stale branch name. The main-freshness check runs in its own fiber.
Patch_agent.branch_rebased_onto_sha + pure Rebase_decision.upstream + wiring in Worktree.rebase_onto: when find_old_base falls back to plain rebase, use the SHA the previous base resolved to at the last rebase as the --onto anchor instead. Drops commits whose content was absorbed into a squash-merge on origin.
Shared Env.fetch_mutex across runner + poller fibers, so concurrent git fetch calls can't race on the ref lock.

Plus authentication docs in AGENTS.md and README.md.

What changed (by commit)

Push-failure loop guard: classify rejections and escalate the permanent ones
Auto-detect SSH transport for managed clones from sibling user clones (superseded by CodeRabbit Generated Unit Tests: Add unit tests for changes #12 — see below)
OAuth scope pre-flight: warn at startup when the configured token lacks a scope a patch in the gameplan will need (superseded by [onton-port] Patch 15: Terminal UI #11 — see below)
Document the authentication model and SSH transport option
Use the recorded prev-base SHA as the rebase --onto anchor
Wire scope_preflight and persist the resolved transport scheme (scope_preflight wiring superseded by [onton-port] Patch 15: Terminal UI #11; persistence kept)
Poller: detect when origin/main advances and enqueue rebases
Add --clone-scheme=https|ssh flag to override the SSH auto-detect
Share fetch_mutex globally across runner + poller fibers
Run main-freshness check in its own fiber
Address PR 315 review comments (also drops the startup OAuth-scope preflight — see below)
Replace sibling-clone heuristic with an SSH reachability probe

Notes for reviewers

OAuth scope preflight removed. An earlier revision of this PR included a /user scope probe at startup. After review, that was dropped (commit [onton-port] Patch 15: Terminal UI #11): the real fix for the workflow-scope failure is having a working SSH transport, and the preflight added complexity without a documented in-the-wild failure mode where SSH wasn't the right answer.
Sibling-clone scanner replaced. The first SSH auto-detect implementation ([onton-port] Patch 1: Core types #2) scanned conventional directories for an existing user clone and inherited its transport. That was brittle (only worked when a sibling happened to exist) and unreliable (a sibling using SSH didn't guarantee SSH worked on this machine for this repo). Commit CodeRabbit Generated Unit Tests: Add unit tests for changes #12 replaces it with a real reachability probe.

Property tests

New pure modules each ship with their own QCheck2 suite:

Push_reject_classify: 14 PRC-* (totality, recognizers per fingerprint, permanence ↔ variant, label / excerpt bounds)
Rebase_decision: 5 RD-* (totality, Some-wins, None-falls-back, empty-SHA-falls-back, idempotence on well-formed input)
Github_target (SSH/scheme): 32 total — SSH clone_url round-trip, scheme_of_url totality + recognizers, resolve_scheme ~override ~ssh_available (override wins, true→Ssh, false→Https, total)
Reconciler (main-freshness): 5 MFD-* (totality, equal-SHA silent, distinct-SHA fires, empty-side silent, whitespace stripping)

Extensions to existing suites:

test_properties.ml — PSF-4..8 (push_failure_count increment / reset / threshold, permanent-rejection immediate Given_up, transient-rejection non-escalation)
test_interleaving_properties.ml — CV-6 (3-iteration convergence without Human-in-queue), CV-7 (1-iteration convergence on permanent rejection)

All tests pass: dune build clean, dune runtest exit 0, dune fmt clean.

Test plan

Re-trigger the incident: in a fresh project where the gameplan declares a patch that touches .github/workflows/*.yml, run onton with an OAuth token lacking workflow scope, on a machine without SSH access to the repo. Expected: HTTPS clone (probe falls back), push fails with activity-log line runner: push rejected after session — workflow_scope_missing + follow-up detail; push_failure_count flips to Given_up on the first event; agent appears in TUI intervention list.
SSH probe — happy path: on a machine with an SSH key authorized for the repo, start onton fresh. Expected: stderr onton: SSH probe to git@github.com succeeded — cloning managed repo via SSH; config.json records url_scheme: "ssh"; the same workflow-touching patch pushes successfully (SSH bypasses OAuth scopes).
SSH probe — fallback: on a machine without SSH set up, start onton fresh. Expected: stderr onton: SSH probe to git@github.com unavailable — cloning managed repo via HTTPS; config.json records url_scheme: "https"; first run takes ~5s longer (probe timeout) but subsequent runs skip the probe.
--clone-scheme=https and --clone-scheme=ssh override the probe; persisted config.json value also acts as override on subsequent runs.
Squash-merge a dep on origin: verify the dependent rebase replays only the dependent's own commits (not the merged dep's originals) when branch_rebased_onto_sha was recorded at the previous rebase.
Confirm origin/main advance triggers rebases: with two patches based on main, squash-merge one externally, watch the poller's next tick log origin/main advanced to <sha> and enqueue a Rebase for the other.

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

New Features
- Added SSH transport support for managed repository clones with automatic detection and persistence across runs.
- Implemented push rejection classification to better identify and handle server-side failures.
- Added automatic main branch freshness detection to trigger rebases when upstream advances.
- Enhanced push failure tracking to escalate intervention after repeated failures.
Documentation
- Added comprehensive authentication guide covering HTTPS+OAuth and SSH transport options.
- Documented SSH configuration and credential handling for git operations.

…nt ones Background: a patch worktree entered a tight retry loop — 54 cycles of Orchestrator.Start → Session_push_failed → Start in 3.5 minutes — because the remote refused every push and onton had no way to tell a transient race apart from a permanent block, so no counter ever fired needs_intervention. This change adds pure decision logic to (1) classify the porcelain '!' rejection from stderr into a small set of named reasons (workflow-scope, branch-protection, push-pattern, lease, hook) and (2) drive a new push_failure_count counter that flips needs_intervention at ≥3, with an immediate short-circuit to session_fallback = Given_up when the rejection is permanent (workflow-scope etc.). The full reject reason also surfaces in the activity log. Adjacent groundwork while the patch_agent / persistence / orchestrator files were already open: a branch_rebased_onto_sha field and a pure Rebase_decision.upstream helper, so a future worktree.ml integration can use the recorded old-base SHA as the rebase --onto anchor and drop stale-dep commits when a transitive dep squash-merges. Plus a tiny detect_main_freshness_drift predicate in Reconciler for the matching poller integration. Both are pure-side only; the effectful wiring lands in follow-up commits. New pure modules under lib_core/ (each with property tests): - push_reject_classify (14 PRC-* properties) - oauth_scopes (10 OSC-* properties — used by P0-E later) - rebase_decision (5 RD-* properties) Patch_agent: push_failure_count (threshold ≥3) and branch_rebased_onto_sha fields, increment_/reset_ helpers, threshold disjunct in needs_intervention, bulk reset in reset_intervention_state, restore parameter + persistence round-trip. Orchestrator: Session_push_failed now carries Push_reject_classify.rejection option (Some for server rejections, None for transport-level Push_error); apply_session_result increments the counter and, on is_permanent reasons, jumps straight to Given_up. Session_driver / runner_fiber_impl: activity-log line uses short_label, with a follow-up event carrying the captured stderr line on permanent rejections. Properties added or extended: PSF-4..8 (counter + permanent escalation), CV-6 (3-iteration convergence without Human), CV-7 (immediate Given_up on permanent reasons), MFD-1..5 (main-freshness predicate totality and boundaries). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Onton's managed clone at ~/.local/share/onton/<project>/repo was always HTTPS so the GIT_ASKPASS shim in lib/git_env.ml could inject the OAuth token. That structurally forced every push through the per-OAuth-scope restrictions GitHub enforces (workflow scope, etc.), even when the user's own clone of the same repo used SSH and would not have hit those restrictions. This commit makes the managed clone inherit the user's transport. Before cloning, scan a small set of conventional locations ($PWD/.., ~/code-src/, ~/src/, ~/code/, ~/dev/, ~/projects/) for a sibling clone of owner/repo. If any sibling's origin is SSH, pick git@github.com:owner/repo.git for the managed clone; otherwise default to https://github.com/owner/repo.git as before. The resolved transport is announced on stderr at first run and persisted to config.json so subsequent invocations are stable. An explicit override (intended for a future --clone-scheme CLI flag, or anyone editing config.json) wins over auto-detect. When a managed clone already exists on disk, its current origin URL is authoritative — we don't flip transports mid-life. Pure changes (lib_core/github_target): - new url_scheme variant (Https | Ssh) with yojson + show + eq - clone_url is now scheme-aware - new scheme_of_url parser and pure resolve_scheme decision - 9 SSH/scheme property tests added (32 total in the suite) Effectful changes (lib/managed_repo): - sibling_clone_candidates / read_remote_urls / discover_sibling_clones (best-effort, total on failure) - resolve_clone_scheme with the one-line stderr announce - ensure_managed_repo now returns (repo_root, resolved_scheme); the caller persists the resolved value to config.json Schema: - Project_store.stored_config gains url_scheme : string option (None on legacy configs; back-filled on next ensure_managed_repo) - save_config gains an optional ~url_scheme parameter Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

a scope a patch in the gameplan will need When a patch modifies .github/workflows/*.yml under a token without the workflow OAuth scope, GitHub silently refuses the push with a clear remote-rejected message. With the push-reject classifier in place we now route those failures straight to needs_intervention, but the user finds out at first push attempt — after the agent has already written the change. This commit adds a startup check that hits /user with the configured token, reads X-OAuth-Scopes, and compares against the scopes the gameplan files would require. Missing scopes emit a clear stderr warning with the suggested gh auth refresh fix. The check is gated on the resolved managed-clone transport: SSH doesn't consult OAuth scopes, so the warning is a false positive there and we skip it entirely. Github.fetch_oauth_scopes_for is a thin wrapper around the existing cohttp-eio plumbing that returns the parsed scope list, plus a public re-export from github.mli that mirrors make's argument shape so the preflight handler doesn't need to thread a Github.t. The pure Oauth_scopes module shipped in the previous commit; this is the effectful side. Scope_preflight.run is intended to be called from bin/main.ml after Managed_repo.ensure_managed_repo returns the resolved transport scheme and before any patch sessions start; that wiring is not part of this commit (bin/main.ml's Cmdliner dependency is currently broken in the dev env), but the public API is in place. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds an Authentication section to AGENTS.md covering the HTTPS+OAuth default, the workflow-scope requirement, the SSH auto-detect heuristic inherited from a sibling user clone, and the gh auth setup-git tip for running git by hand inside a worktree. Mirrors the same content into README.md as a sub-section under the existing GitHub authentication discussion, with the additional context that SSH bypasses per-OAuth-scope restrictions entirely (which is the actual escape hatch for users who hit workflow-scope errors and don't want to gh auth refresh). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Wires the [branch_rebased_onto_sha] field added earlier into the actual rebase machinery. Two halves: (1) [Worktree.rebase_onto] now takes [?prev_base_sha:string option] and threads it into the fallback path. The default branch (find_old_base returns Ok with cherry-pick-isolated commits) is unchanged. When find_old_base returns Error AND the orchestrator recorded a SHA the previous base resolved to at the last rebase / start, the supervisor now runs [git rebase --onto target sha] instead of the legacy plain [git rebase target]. That drops commits whose content was absorbed into a squash-merge on origin/main (the patch-6 case) — only the patch's own commits in [sha..HEAD] are replayed. When [prev_base_sha] is [None] the behavior is bit-for-bit the legacy plain rebase, so agents whose [branch_rebased_onto_sha] was never recorded keep working exactly as before. An optional argument terminator [()] is added because OCaml can't otherwise erase the optional; mechanical update at every call site. (2) After a successful rebase the runner fiber reads [origin/<new_base>]'s resolved SHA via the new [W.read_branch_sha] helper, and stores it on the agent via [Orchestrator.set_branch_rebased_onto_sha]. The next rebase therefore has the right anchor. [Worktree.S] gains [read_branch_sha] and the dependency-injection test stub gains a [None] implementation. The integration tests in [test/test_rebase_onto.ml] all keep passing — the new code path is inactive (prev_base_sha defaults to None there). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Threads the [url_scheme] resolved by [Managed_repo.ensure_managed_repo] through [repo_coords] / [finalize_run] into [Project_store.save_config] so subsequent runs read the same value from [config.json] rather than re-detecting from sibling clones (which can move). The resume path reads the persisted value via [Managed_repo.url_scheme_of_string]. At the top of [Eio_main.run], invokes [Scope_preflight.run] with the network + clock capabilities, the resolved scheme, and the gameplan. The scheme is re-read from [config.json] at the Eio entry rather than plumbed through [Resolved_config.t] — the persistence happened inside [finalize_run] which ran before [Eio_main.run], so the file is already up-to-date by the time the preflight needs it. Gating on [Github_target.Https] inside [Scope_preflight.run] means SSH clones skip the preflight (the warning would be a false positive there). Effect for users: on first push attempt that would have been rejected with [refusing to allow an OAuth App to create or update workflow …], they now see a clear stderr warning at startup with the suggested [gh auth refresh -h github.com -s workflow] fix. The push-time classifier still routes the failure to needs_intervention if they proceed regardless. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The pure [Reconciler.detect_main_freshness_drift] predicate shipped earlier compares local-main-sha to origin-main-sha. This commit wires it into the poller loop so that when a squash-merge lands on origin/main, every agent currently based on main gets a [Rebase] enqueued automatically — without it, [detect_stale_bases] passes because [base_branch] and [Graph.initial_base] both read the same stale branch name and never disagree. Per loop tick (before the per-patch poll fan-out): 1. [W.fetch_origin] on the managed repo root. 2. Read [main] and [origin/main] SHAs via [W.read_branch_sha]. 3. If [detect_main_freshness_drift] fires AND we haven't already announced this particular [origin_sha] (a [last_main_sha] ref prevents log + enqueue spam when the next loop iteration sees the same drift), log the transition and enqueue Rebase for every agent on main that has a PR, isn't merged, and doesn't already have a Rebase queued. [W.fetch_origin] races with per-worktree fetches in the runner fiber on shared ref storage — pre-existing condition I'm not fixing here. Best-effort: any "cannot lock ref" defers this check by one [poll_interval], and the rebase plan executor's [Fetch_origin] step still runs from the worktree side so the rebase target ends up fresh. The fast-forward step from the plan (advance local main to match origin) is intentionally not done here: rebases use [origin/<main>] directly via [Worktree_plan.for_rebase], so once the poller wakes the agents, the rebase targets are correct regardless of what local main resolves to in the managed-clone HEAD. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-05-23T22:31:31Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
onton	Ready	Preview, Comment	May 24, 2026 3:58pm

coderabbitai · 2026-05-23T22:32:02Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e37c1021-c9b8-4edf-8548-32421c8a8ec1

📥 Commits

Reviewing files that changed from the base of the PR and between 3ce016b and 2173002.

📒 Files selected for processing (7)

bin/main.ml
lib/managed_repo.ml
lib/managed_repo.mli
lib/poller_fiber.ml
lib_core/github_target.ml
lib_core/github_target.mli
test/test_github_target_properties.ml

📝 Walkthrough

Walkthrough

Adds push-rejection classification, scheme-aware managed clones (HTTPS/SSH) with persistence and CLI override, per-patch push-failure tracking and rebase-SHA anchoring, a shared fetch mutex for serialized git fetches, main-branch freshness detection, and corresponding rebase/push flow and tests updates.

Changes

Push Rejection Classification & Tests

Layer / File(s)	Summary
Push rejection classifier and tests `lib_core/push_reject_classify.{mli,ml}`, `test/test_push_reject_classify_properties.ml`	New `rejection` variant and helpers; `classify` maps git stderr to typed rejection; `short_label`, `detail_excerpt`, `is_permanent`; property tests for totality, correctness, and determinism.

GitHub target & Managed clone scheme

Layer / File(s)	Summary
GitHub target scheme helpers `lib_core/github_target.{mli,ml}`, `test/test_github_target_properties.ml`	Add `url_scheme` (Https/Ssh), `clone_url ~scheme`, `scheme_of_url`, and `resolve_scheme`; tests for formatting, round-trip, and override/availability behavior.
Managed repo scheme detection & CLI persistence `lib/managed_repo.{mli,ml}`, `bin/main.ml`, `lib/project_store.{ml,mli}`	Make managed clones scheme-aware, probe SSH when needed, `ensure_managed_repo` accepts optional `clone_scheme` and returns resolved scheme; add `--clone-scheme` CLI and persist `url_scheme` in project config.

Fetch mutex and environment wiring

Layer / File(s)	Summary
Fetch mutex infrastructure `lib/run_env.{ml,mli}`, `bin/main.ml`, `lib/runner_fiber_impl.ml`, `test/test_dependency_injection_functors.ml`	Introduce `fetch_mutex : Eio.Mutex.t` in run env and FIBER_ENV, create mutex in capabilities, and wire it into runner/poller/tui envs to serialize `git fetch` across worktrees.

Patch agent state and persistence

Layer / File(s)	Summary
Patch agent push-failure & rebase-SHA `lib_core/patch_agent.{ml,mli}`, `lib/persistence.ml`	Adds `push_failure_count` and `branch_rebased_onto_sha` fields, init/restore/persist them, add increment/reset helpers, and include in snapshots and tests.

Worktree rebase & push integration

Layer / File(s)	Summary
Worktree rebase & parser changes `lib/worktree.{ml,mli}`, `lib_core/worktree_parser.ml`, `lib/worktree_plan_executor.ml`	Add `read_branch_sha`; thread optional `?prev_base_sha` into `rebase_onto` and choose upstream via Rebase_decision.upstream; `Push_rejected` now carries a rejection payload and parser calls classifier.
Orchestrator session escalation & logging `lib/orchestrator.{ml,mli}`, `lib/session_driver.ml`	`Session_push_failed` now carries `Push_reject_classify.rejection option`; `apply_session_result` updates push-failure counters and forces escalation for permanent rejections; `combine_session_and_push` maps push outcomes into parameterized session failures; session driver logs short_label and excerpts.
Runner post-rebase capture `lib/runner_fiber_impl.ml`, `lib/worktree_plan_executor.ml`	Runner reads `origin/<base>` after successful rebase to capture `post_rebase_sha` and calls `Orchestrator.set_branch_rebased_onto_sha`; logging updated to use classified labels/excerpts.

Main freshness detection & tests

Layer / File(s)	Summary
Main freshness detection `lib_core/reconciler.{ml,mli}`, `lib/poller_fiber.ml`	Add detect_main_freshness_drift predicate; freshness_loop fetches `origin/<main>` under `fetch_mutex`, detects drift, logs advancement, and enqueues rebases for eligible agents.
Tests, fixtures, and docs `test/*`, `AGENTS.md`, `README.md`	Add/adjust many property tests and test fixtures for new types/fields and parameterized session failures; update AGENTS.md and README with Authentication / SSH transport docs and guidance.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

flowglad/onton#296: Prior work on fiber env wiring that this change extends by threading fetch_mutex through FIBER_ENV and Make_fibers.
flowglad/onton#307: Related runner env changes; both PRs touch fetch-mutex/resolver wiring between runner and environment.

"A rabbit's note on code that fetches and sings:
Schemes choose their path — SSH or HTTPS wings.
Pushes now classified, counters kept tight,
Mutex guards the fetches through day and night.
Tests hop in chorus: all green, take flight!"

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch push-reject-classify-and-ssh-auto-detect

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

flowglad-review-service

Code Review: 3 comments

flowglad-review-service · 2026-05-23T22:34:52Z

Review Summary

The PR is well-structured with thorough property coverage for all new pure modules and clean separation of pure/effectful code. Three findings:

must-fix (push_reject_classify.ml line 75): The Push_pattern_block recognizer matches "Repository rule violations found", which also appears verbatim in the canonical workflow_scope_missing stderr fixture (the GH013 header). Correctness depends on Workflow_scope_missing being checked first. A reorder would silently produce the wrong classification and suppress the gh auth refresh hint. Adding an explicit ordering comment or a targeted property that would falsify on reorder would lock in the invariant.
must-fix (orchestrator.ml line 713): For permanent rejections, push_failure_count is incremented to 1 before the Given_up escalation and is not reset during that escalation. needs_intervention is safe because it short-circuits on Given_up, and reset_intervention_state correctly zeros the counter. However, the counter's semantics ("consecutive transient failures") are muddied by incrementing it on a permanent rejection. The simpler fix is to skip the increment in the permanent-rejection arm — the counter's purpose is to break transient retry loops, and permanent rejections break the loop via a different mechanism.
should-fix (poller_fiber.ml line 173): The fetch_lock created locally inside the poller fiber never interacts with the runner fiber's separate fetch mutex. Since check_main_freshness is called serially inside loop (), the local lock serializes nothing. The comment acknowledges the inter-fiber race is best-effort, but the lock implies a guard that isn't actually there. Remove it or wire in the shared mutex.

Severity	Count
Must-fix	2
Should-fix	1
Note	0

Variant: convergence-v2

Candidates: 3 | Posted: 3 | Suppressed: 0

3 comments posted · Model: claude-sonnet-4-6 (sonnet) · Tokens: 67719 in / 6295 out · Cache: 321788 created / 1281993 read · Context: 5 items · Review mode: agentic · Turns: 15 · Tool calls: 24 · Forced submit: yes · Tool mix: read_file=21, search_codebase=3

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

lib/github.ml (1)
535-596: 💤 Low value

Consider extracting common HTTP client setup (optional refactor).

The TLS config and client initialization (lines 552-570) duplicate the same logic in the request function (lines 473-494). While the duplication is localized and the functions have different purposes (body vs header extraction), extracting a shared make_client helper would reduce maintenance burden if the setup ever needs updating.

Not blocking since the duplication is manageable and the use case is specialized.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/github.ml` around lines 535 - 596, The TLS config and Cohttp_eio client
creation are duplicated between fetch_oauth_scopes and the request function;
extract a shared helper (e.g. make_client) that calls https_config(), builds the
https_fun wrapper and returns a Cohttp_eio.Client.t (or a tuple of client and
tls_config if needed), then replace the inline TLS/client setup in
fetch_oauth_scopes and in request with calls to make_client so both reuse the
same initialization path (refer to functions fetch_oauth_scopes, request,
https_config, and https_fun to implement and wire the helper).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib_core/oauth_scopes.ml`:
- Around line 27-35: required_for_files currently returns [Workflow] when
workflows are touched and [] otherwise, omitting the mandatory Repo scope;
update the function required_for_files to always include the Repo scope and add
Workflow only when paths touch .github/workflows (i.e., return [Repo; Workflow]
when touches_workflows is true, otherwise return [Repo]), referencing the
Workflow and Repo variants to locate the change.

In `@test/test_github_target_properties.ml`:
- Around line 409-417: Remove the unnecessary try...with wrappers around calls
to pure/total functions in the QCheck properties: specifically edit
prop_scheme_of_url_total (and the similar property at lines ~457-474) to stop
catching exceptions and instead directly call Github_target.scheme_of_url (and
the other pure function) in the property body, returning the boolean result
(e.g., call the function, ignore its result, and return true) so that any
unexpected raises surface as test failures rather than being swallowed.

---

Nitpick comments:
In `@lib/github.ml`:
- Around line 535-596: The TLS config and Cohttp_eio client creation are
duplicated between fetch_oauth_scopes and the request function; extract a shared
helper (e.g. make_client) that calls https_config(), builds the https_fun
wrapper and returns a Cohttp_eio.Client.t (or a tuple of client and tls_config
if needed), then replace the inline TLS/client setup in fetch_oauth_scopes and
in request with calls to make_client so both reuse the same initialization path
(refer to functions fetch_oauth_scopes, request, https_config, and https_fun to
implement and wire the helper).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7d61a156-3352-408d-bdad-8d5f5ac14db1

📥 Commits

Reviewing files that changed from the base of the PR and between 65ddc81 and d01c310.

📒 Files selected for processing (49)

AGENTS.md
README.md
bin/main.ml
lib/event_log.ml
lib/github.ml
lib/github.mli
lib/managed_repo.ml
lib/managed_repo.mli
lib/orchestrator.ml
lib/orchestrator.mli
lib/persistence.ml
lib/poller_fiber.ml
lib/project_store.ml
lib/project_store.mli
lib/runner_fiber_impl.ml
lib/scope_preflight.ml
lib/scope_preflight.mli
lib/session_driver.ml
lib/worktree.ml
lib/worktree.mli
lib/worktree_plan_executor.ml
lib_core/github_target.ml
lib_core/github_target.mli
lib_core/oauth_scopes.ml
lib_core/oauth_scopes.mli
lib_core/patch_agent.ml
lib_core/patch_agent.mli
lib_core/push_reject_classify.ml
lib_core/push_reject_classify.mli
lib_core/rebase_decision.ml
lib_core/rebase_decision.mli
lib_core/reconciler.ml
lib_core/reconciler.mli
lib_core/worktree_parser.ml
lib_test/test_generators.ml
test/dune
test/test_dependency_injection_functors.ml
test/test_github_target_properties.ml
test/test_interleaving_properties.ml
test/test_oauth_scopes_properties.ml
test/test_orchestrator_properties.ml
test/test_patch_agent.ml
test/test_patch_controller.ml
test/test_poll_log_properties.ml
test/test_properties.ml
test/test_push_reject_classify_properties.ml
test/test_rebase_decision_properties.ml
test/test_rebase_onto.ml
test/test_reconciler_properties.ml

The SSH auto-detect heuristic from the previous commit picks a transport from the surrounding filesystem; sometimes the user wants to force one explicitly (e.g. running onton in a context where the sibling clone is misleading, or pinning HTTPS in a script). This commit exposes a CLI flag. Precedence on each run: CLI --clone-scheme > persisted config.json url_scheme > auto-detect Both the fresh / gameplan path and the resume path honor the override. Invalid values (anything other than [https] or [ssh]) exit with a clear error. After [ensure_managed_repo] returns the resolved scheme, [finalize_run] still persists it to config.json so the override only needs to be passed once. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Before: the runner fiber created its own fetch_mutex (per-fiber lifetime) and the poller's main-freshness fetch added on the previous commit also created its own. Both serialize their own internal calls but the two mutexes don't see each other — two simultaneous [git fetch origin] on the same managed repo (one from a runner action, one from the poller tick) race on the shared [refs/remotes/origin/*] ref store and the loser gets "cannot lock ref ...: is at X but expected Y". Best-effort tolerance was added in the previous commit; this commit eliminates the race entirely. Promotes [fetch_mutex] into the [Run_env.S] module type so every fiber that includes Run_env (Runner, Poller, Tui, the per-fiber sub-envs for Worktree_setup / Session_driver) sees the same mutex. The mutex is created once at the top of [build_capabilities] in [bin/main.ml], stored on [constructed_capabilities], and threaded through [FIBER_ENV] to every sub-fiber env. Runner_fiber_impl and Poller_fiber both drop their local creation and read [Env.fetch_mutex] instead. The dependency-injection functor test stubs gain a matching field (three real mutexes + one [Obj.magic ()] for a stub that pre-existed that pattern). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Before: the check ran at the top of every poll loop iteration, blocking the per-patch poll fan-out behind a [git fetch] for origin/<main>. A slow fetch (e.g. cold connection, large repo, network hiccup) delayed every patch's GitHub poll by the same amount. This commit splits the poller into two parallel fibers via [Eio.Fiber.both]: - the existing patch-poll loop (PR state, CI, base updates, etc.) - a new freshness_loop that runs [check_main_freshness] then sleeps [poll_interval] The two share state through [Runtime.update_orchestrator] (used inside [check_main_freshness] to enqueue Rebase on detected drift) and serialize their [git fetch] calls through [Env.fetch_mutex] (the shared mutex added in the previous commit), so there's no ref-store race. Cadence for the freshness loop is the same [poll_interval] as the patch loop; could be tuned independently later (probably slower, since main usually moves less often than per-patch state) but the symmetry keeps the operational model simple for now. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

flowglad-review-service · 2026-05-23T23:07:31Z

Review Summary

This delta push wires together the previously reviewed pure modules (push-reject classification, SSH transport, OAuth scope preflight, rebase-decision, main-freshness drift) into the effectful fiber layer. The code is generally correct and the property-test coverage is comprehensive.

One pre-existing note: the previously-filed outstanding finding at lib/poller_fiber.ml:173 (the fetch lock) was actually resolved by this push — fetch_mutex is now correctly threaded as a shared mutex via Env.fetch_mutex rather than a per-function local. That is an improvement.

The remaining two previously-filed outstanding comments (lib/orchestrator.ml:713 on the push_failure_count increment before permanent-rejection escalation, and lib_core/push_reject_classify.ml:75 on the recognizer ordering dependency) were not addressed in this push and remain live on the PR thread.

No new must-fix or should-fix issues were found in the changed lines of this push.

Severity	Count
Must-fix	0
Should-fix	0
Note	0

Variant: convergence-v2

Candidates: 0 | Posted: 0 | Suppressed: 0

0 comments posted · Model: claude-sonnet-4-6 (sonnet) · Tokens: 72578 in / 3029 out · Cache: 303324 created / 1336503 read · Context: 5 items · Review mode: agentic · Turns: 15 · Tool calls: 20 · Forced submit: yes · Tool mix: read_file=16, search_codebase=4

flowglad-review-service · 2026-05-24T15:29:11Z

Review Summary

Severity	Count
Must-fix	0
Should-fix	0
Note	0

Variant: convergence-v2

Candidates: 0 | Posted: 0 | Suppressed: 0

0 comments posted · Model: claude-sonnet-4-6 (sonnet) · Tokens: 72581 in / 3418 out · Cache: 321118 created / 1313224 read · Context: 5 items · Review mode: agentic · Turns: 15 · Tool calls: 19 · Forced submit: yes · Tool mix: read_file=14, search_codebase=5

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

lib/runner_fiber_impl.ml (1)

909-917: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve the recorded upstream SHA on noop rebases.

Worktree.Noop still means the branch is already aligned with origin/<base>. Leaving post_rebase_sha as None there keeps the old/empty branch_rebased_onto_sha, so the next prev_base_sha-driven rebase can fall back to the wrong --onto anchor after squash-merges.

Suggested fix

                         let post_rebase_sha =
                           match rebase_result with
-                          | Worktree.Ok ->
+                          | Worktree.Ok | Worktree.Noop ->
                               W.read_branch_sha ~path:wt_path
                                 ~ref_name:("origin/" ^ Branch.to_string new_base)
-                          | Worktree.Noop | Worktree.Conflict _
-                          | Worktree.Error _ ->
+                          | Worktree.Conflict _ | Worktree.Error _ ->
                               None
                         in

Also applies to: 943-947

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/runner_fiber_impl.ml` around lines 909 - 917, The pattern match that sets
post_rebase_sha skips recording the branch SHA when rebase_result is
Worktree.Noop, which causes branch_rebased_onto_sha to remain empty and later
prev_base_sha-driven rebases to use the wrong --onto anchor; update the match
for rebase_result in the post-rebase logic so that both Worktree.Ok and
Worktree.Noop call W.read_branch_sha ~path:wt_path ~ref_name:("origin/" ^
Branch.to_string new_base) (i.e., preserve the recorded upstream SHA on noop
rebases), and apply the same change to the second occurrence of this pattern
(the block referenced around lines 943-947) so branch_rebased_onto_sha is
consistently set after noop or ok results.

bin/main.ml (1)

1259-1265: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Teach PR-op stripping that --clone-scheme consumes the next argv token.

extract_pr_ops_from_argv currently treats the token after --clone-scheme as a positional argument. onton PROJECT --clone-scheme ssh +123 will fail before Cmdliner gets control.

Suggested fix

-      (* Keep this in sync with Cmdliner [opt] arguments below:
-         --gameplan, --token, --backend, --model, --repo, --main-branch,
-         --poll-interval, and --max-concurrency. *)
+      (* Keep this in sync with Cmdliner [opt] arguments below:
+         --gameplan, --token, --backend, --model, --repo, --main-branch,
+         --poll-interval, --max-concurrency, and --clone-scheme. *)
       match s with
       | "--gameplan" | "--token" | "--backend" | "--model" | "--repo"
-      | "--main-branch" | "--poll-interval" | "--max-concurrency" ->
+      | "--main-branch" | "--poll-interval" | "--max-concurrency"
+      | "--clone-scheme" ->
           true

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bin/main.ml` around lines 1259 - 1265, The PR-op extraction logic treats the
token after "--clone-scheme" as positional; update extract_pr_ops_from_argv to
mark "--clone-scheme" as an option that consumes the next argv token (similar to
"--gameplan"/"--token" handling) so the following value isn't mis-parsed as a PR
op; locate extract_pr_ops_from_argv and add "--clone-scheme" to the set of
recognized consuming options (or implement a branch that skips the next element
when encountering "--clone-scheme") so "onton PROJECT --clone-scheme ssh +123"
correctly keeps "ssh" as the clone scheme and "+123" as the PR op.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@bin/main.ml`:
- Around line 597-611: The resolved clone scheme returned by
Managed_repo.ensure_managed_repo is currently ignored so the code keeps
persisting the old stored Project_store.url_scheme; update the resume path to
capture the effective scheme returned by Managed_repo.ensure_managed_repo and
persist that value into repo_coords.url_scheme (and the stored Project_store)
instead of reusing stored_scheme or rebuild from old config; locate usages
around clone_scheme_override, stored_scheme, effective_scheme and ensure the
value produced by Managed_repo.ensure_managed_repo is threaded into subsequent
logic (including the OAuth preflight decision) and written back to persistent
store.

---

Outside diff comments:
In `@bin/main.ml`:
- Around line 1259-1265: The PR-op extraction logic treats the token after
"--clone-scheme" as positional; update extract_pr_ops_from_argv to mark
"--clone-scheme" as an option that consumes the next argv token (similar to
"--gameplan"/"--token" handling) so the following value isn't mis-parsed as a PR
op; locate extract_pr_ops_from_argv and add "--clone-scheme" to the set of
recognized consuming options (or implement a branch that skips the next element
when encountering "--clone-scheme") so "onton PROJECT --clone-scheme ssh +123"
correctly keeps "ssh" as the clone scheme and "+123" as the PR op.

In `@lib/runner_fiber_impl.ml`:
- Around line 909-917: The pattern match that sets post_rebase_sha skips
recording the branch SHA when rebase_result is Worktree.Noop, which causes
branch_rebased_onto_sha to remain empty and later prev_base_sha-driven rebases
to use the wrong --onto anchor; update the match for rebase_result in the
post-rebase logic so that both Worktree.Ok and Worktree.Noop call
W.read_branch_sha ~path:wt_path ~ref_name:("origin/" ^ Branch.to_string
new_base) (i.e., preserve the recorded upstream SHA on noop rebases), and apply
the same change to the second occurrence of this pattern (the block referenced
around lines 943-947) so branch_rebased_onto_sha is consistently set after noop
or ok results.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e2cfd56b-44f5-43fb-8985-7c058cf1c8d3

📥 Commits

Reviewing files that changed from the base of the PR and between d01c310 and a06aa1a.

📒 Files selected for processing (6)

bin/main.ml
lib/poller_fiber.ml
lib/run_env.ml
lib/run_env.mli
lib/runner_fiber_impl.ml
test/test_dependency_injection_functors.ml

flowglad-review-service

Code Review: 1 comment

flowglad-review-service · 2026-05-24T15:37:45Z

Review Summary

One new issue in this push: freshness_loop in lib/poller_fiber.ml calls check_main_freshness() immediately at startup before sleeping, causing a redundant git fetch at t=0 (right after ensure_managed_repo already fetched) and firing a freshness check before the main loop's first reconciler pass. The fetch_mutex prevents a data race but the eager first call is wasteful and the resulting Rebase enqueue is immediately superseded by the main loop. Adding an initial Eio.Time.sleep clock poll_interval before the first check_main_freshness() call aligns behavior with the main loop.\n\nAll three outstanding round-1 comments remain live and are not re-raised here.

Severity	Count
Must-fix	0
Should-fix	1
Note	0

Variant: convergence-v2

Candidates: 1 | Posted: 1 | Suppressed: 0

1 comment posted · Model: claude-sonnet-4-6 (sonnet) · Tokens: 64953 in / 3928 out · Cache: 288549 created / 1273467 read · Context: 5 items · Review mode: agentic · Turns: 15 · Tool calls: 17 · Forced submit: yes · Tool mix: read_file=16, search_codebase=1

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

bin/main.ml (1)

1236-1247: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Register --clone-scheme as a value-taking option in argv preprocessing.

extract_pr_ops_from_argv only preserves separate option values for names listed in cli_option_takes_value. Because --clone-scheme is missing there, onton PROJECT --clone-scheme ssh treats ssh as a stray positional argument instead of the option value.

Suggested fix

-      (* Keep this in sync with Cmdliner [opt] arguments below:
-         --gameplan, --token, --backend, --model, --repo, --main-branch,
-         --poll-interval, and --max-concurrency. *)
+      (* Keep this in sync with Cmdliner [opt] arguments below:
+         --gameplan, --token, --backend, --model, --repo, --main-branch,
+         --clone-scheme, --poll-interval, and --max-concurrency. *)
       match s with
       | "--gameplan" | "--token" | "--backend" | "--model" | "--repo"
-      | "--main-branch" | "--poll-interval" | "--max-concurrency" ->
+      | "--main-branch" | "--clone-scheme" | "--poll-interval"
+      | "--max-concurrency" ->
           true

Also applies to: 1351-1364

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bin/main.ml` around lines 1236 - 1247, The argv preprocessing function
cli_option_takes_value currently omits "--clone-scheme", causing
extract_pr_ops_from_argv to treat its value as a positional argument; update
cli_option_takes_value to include the "--clone-scheme" option in the match arm
that returns true (the same arm that contains "--gameplan", "--token",
"--backend", "--model", "--repo", "--main-branch", "--poll-interval", and
"--max-concurrency") so that "--clone-scheme <value>" is preserved as a single
option/value pair during argv parsing.

♻️ Duplicate comments (1)

bin/main.ml (1)

596-633: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Persist the scheme returned by ensure_managed_repo on resume.

Line 612 discards the resolved scheme, and Lines 629-632 rebuild repo_coords.url_scheme from the old stored value. Resume can therefore refresh the checkout via SSH or --clone-scheme, then immediately re-save stale config and flip back on the next run.

Suggested fix

-                      match
-                        Managed_repo.ensure_managed_repo
-                          ~clone_scheme:effective_scheme ~project_name:proj
-                          ~token ~owner ~repo ()
-                      with
-                      | Ok _ -> ()
-                      | Error msg ->
-                          Printf.eprintf
-                            "onton: warning: %s (resuming with local state)\n\
-                             %!"
-                            msg);
+                      let resolved_scheme =
+                        match
+                          Managed_repo.ensure_managed_repo
+                            ~clone_scheme:effective_scheme ~project_name:proj
+                            ~token ~owner ~repo ()
+                        with
+                        | Ok (_repo_root, scheme) -> Some scheme
+                        | Error msg ->
+                            Printf.eprintf
+                              "onton: warning: %s (resuming with local state)\n\
+                               %!"
+                              msg;
+                            stored_scheme
+                      in
@@
-                      url_scheme =
-                        Managed_repo.url_scheme_of_string
-                          (Option.value stored.Project_store.url_scheme
-                             ~default:"");
+                      url_scheme = resolved_scheme;

Based on learnings Onton auto-detects SSH transport when a sibling clone of owner/repo exists and persists the managed clone transport as url_scheme in config.json.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bin/main.ml` around lines 596 - 633, The call to
Managed_repo.ensure_managed_repo currently discards the resolved clone scheme
(match ... | Ok _ -> ()), then later repo_coords.url_scheme is rebuilt from the
old stored.Project_store.url_scheme, so any auto-detected or --clone-scheme
resolution is not persisted; change the ensure_managed_repo match to capture the
returned scheme (e.g. | Ok resolved_scheme -> ...), use that resolved_scheme
when constructing repo_coords.url_scheme (instead of calling
Managed_repo.url_scheme_of_string on stored.Project_store.url_scheme), and
update/persist stored.Project_store.url_scheme with the resolved value before
continuing so resume saves the effective transport (reference symbols:
Managed_repo.ensure_managed_repo, repo_coords.url_scheme,
stored.Project_store.url_scheme).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@bin/main.ml`:
- Around line 1236-1247: The argv preprocessing function cli_option_takes_value
currently omits "--clone-scheme", causing extract_pr_ops_from_argv to treat its
value as a positional argument; update cli_option_takes_value to include the
"--clone-scheme" option in the match arm that returns true (the same arm that
contains "--gameplan", "--token", "--backend", "--model", "--repo",
"--main-branch", "--poll-interval", and "--max-concurrency") so that
"--clone-scheme <value>" is preserved as a single option/value pair during argv
parsing.

---

Duplicate comments:
In `@bin/main.ml`:
- Around line 596-633: The call to Managed_repo.ensure_managed_repo currently
discards the resolved clone scheme (match ... | Ok _ -> ()), then later
repo_coords.url_scheme is rebuilt from the old stored.Project_store.url_scheme,
so any auto-detected or --clone-scheme resolution is not persisted; change the
ensure_managed_repo match to capture the returned scheme (e.g. | Ok
resolved_scheme -> ...), use that resolved_scheme when constructing
repo_coords.url_scheme (instead of calling Managed_repo.url_scheme_of_string on
stored.Project_store.url_scheme), and update/persist
stored.Project_store.url_scheme with the resolved value before continuing so
resume saves the effective transport (reference symbols:
Managed_repo.ensure_managed_repo, repo_coords.url_scheme,
stored.Project_store.url_scheme).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 94bc132f-035a-4f7c-90f5-6465d5358464

📥 Commits

Reviewing files that changed from the base of the PR and between a06aa1a and 3ce016b.

📒 Files selected for processing (6)

bin/main.ml
lib/orchestrator.ml
lib_core/push_reject_classify.ml
test/dune
test/test_github_target_properties.ml
test/test_properties.ml

💤 Files with no reviewable changes (1)

test/dune

The sibling scan only worked when the user happened to have another GitHub clone in one of a handful of conventional directories, and could pick SSH even on machines where SSH to GitHub didn't actually work (no key, blocked port 22, passphrase + no agent). Swap it for a non-interactive [git ls-remote -h git@github.com:owner/repo.git] with [BatchMode=yes] and [ConnectTimeout=5] — exit 0 means SSH can authenticate to GitHub and reach this repo, which is the same condition SSH push needs. The resolved scheme is still persisted to config.json so the probe only runs on the first invocation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

flowglad-review-service

Code Review: 1 comment

flowglad-review-service · 2026-05-24T15:50:51Z

Review Summary

One new should-fix found in this push: ensure_managed_repo unconditionally calls resolve_clone_scheme (firing the 5-second SSH probe and printing a misleading "cloning managed repo via SSH/HTTPS" message) before checking whether the managed clone already exists on disk. On every resume with a legacy config.json (url_scheme = null), the probe runs and its result is immediately discarded in favour of the existing remote URL — wasting 5 seconds and showing the user a false "cloning" log line. The three outstanding prior comments (poller fetch_lock no-op, push_failure_count increment in permanent rejection arm, freshness_loop immediate first tick) are addressed or unchanged and are not re-raised here.

Severity	Count
Must-fix	0
Should-fix	1
Note	0

Variant: convergence-v2

Candidates: 1 | Posted: 1 | Suppressed: 0

1 comment posted · Model: claude-sonnet-4-6 (sonnet) · Tokens: 64791 in / 4006 out · Cache: 281614 created / 1249026 read · Context: 5 items · Review mode: agentic · Turns: 15 · Tool calls: 17 · Forced submit: yes · Tool mix: read_file=16, search_codebase=1

…fiber Three findings from the review of 116df0a, all confirmed against current code: 1. bin/main.ml resume path discarded the scheme returned by [ensure_managed_repo] (matched [Ok _]) and rebuilt [repo_coords.url_scheme] from [stored.url_scheme], so a CLI override or a probe-determined scheme on a legacy config never reached [finalize_run]'s [save_config] call. Capture the resolved scheme and thread it through. 2. [Managed_repo.ensure_managed_repo] called [resolve_clone_scheme] unconditionally before the [has_git_dir] check. On a resume with a pre-existing clone (the common case), this ran a 5-second SSH reachability probe whose result was then discarded in favor of the on-disk [origin] URL, and printed a misleading "cloning managed repo via X" line even though no clone was performed. Defer the probe into the two clone branches so it only runs when we're actually about to clone. 3. [Poller_fiber.freshness_loop] fired [check_main_freshness] at t=0 before sleeping, duplicating the fetch that [ensure_managed_repo] already ran at startup. Sleep first to match the main poll loop's cadence and avoid the redundant fetch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

flowglad-review-service

Code Review: 2 comments

flowglad-review-service · 2026-05-24T16:00:35Z

+  (`git@github.com:owner/repo.git`), the managed clone at
+  `~/.local/share/onton/<project>/repo` is created with the same SSH
+  transport, and pushes go through the user's ssh-agent / `~/.ssh/config`.
+  SSH bypasses the per-OAuth-scope restrictions entirely. The chosen


[SHOULD FIX] AGENTS.md:120 — The SSH section in the new ## Authentication block still describes the old sibling-clone heuristic (commit #2, which was superseded by commit #12). It says onton "auto-detects whether the user already maintains a clone of owner/repo under one of $PWD/.., ~/code-src/, …" and references the log message onton: detected SSH sibling clone at … — cloning managed repo via SSH.

The actual implementation introduced in this PR (resolve_clone_scheme / probe_ssh_available in lib/managed_repo.ml) does no sibling-clone scanning. It runs a non-interactive git ls-remote -h git@github.com:owner/repo.git probe and emits:

onton: SSH probe to git@github.com succeeded — cloning managed repo via SSH

onton: SSH probe to git@github.com unavailable — cloning managed repo via HTTPS

A user following these docs will look for a sibling clone that doesn't exist, and the log messages won't match. The section needs to describe the probe-based detection and the actual emitted messages.

Reviewed at 2173002

flowglad-review-service · 2026-05-24T16:00:35Z

+workflow … without 'workflow' scope` message — and onton now classifies that
+as `workflow_scope_missing` and routes the agent straight to
+`needs_intervention` instead of retrying.
+


[SHOULD FIX] README.md:200 — Same stale-docs issue as AGENTS.md: the #### SSH transport section added in this PR still documents the old sibling-clone heuristic ($PWD/.., ~/code-src/, ~/src/, ~/code/, ~/dev/, ~/projects/, and onton: detected SSH sibling clone at …). The implementation shipped in this PR is probe-based and emits onton: SSH probe to git@github.com succeeded — cloning managed repo via SSH — no sibling-clone directory scanning is involved. The section should be updated to reflect the actual probe mechanism, its fallback behaviour, and the correct log messages.

Reviewed at 2173002

flowglad-review-service · 2026-05-24T16:00:36Z

Review Summary

Delta review (round 5). Two should-fix findings, both on documentation sections newly added in this push:

AGENTS.md and README.md — The ## Authentication (AGENTS.md) and #### SSH transport (README.md) sections were written to describe commit #2's sibling-clone heuristic, but that commit was superseded by commit #12 (SSH reachability probe). Both sections still reference non-existent sibling-clone directory scanning and the stale log message onton: detected SSH sibling clone at …. The actual implementation emits onton: SSH probe to git@github.com succeeded/unavailable — cloning managed repo via SSH/HTTPS and does no directory scanning at all. Both sections should be updated to match the probe-based behaviour.

No previously-raised findings are re-raised. The lib/orchestrator.ml permanent-rejection counter bug (round 1) and the freshness_loop startup double-fetch (round 3) are both correctly addressed in this push.

Severity	Count
Must-fix	0
Should-fix	2
Note	0

Variant: convergence-v2

Candidates: 2 | Posted: 2 | Suppressed: 0

2 comments posted · Model: claude-sonnet-4-6 (sonnet) · Tokens: 64743 in / 3771 out · Cache: 347913 created / 865249 read · Context: 5 items · Review mode: agentic · Turns: 12 · Tool calls: 11 · Tool mix: read_file=9, search_codebase=2

--force-with-lease alone only verifies that the remote tracking ref matches actual remote — once a background fetch refreshes that tracking ref, the lease passes even when local is strictly behind, silently wiping remote commits. PR #315 was force-pushed empty and auto-closed this way. --force-if-includes additionally requires that the remote tip is in the local branch's reflog, which refuses pushes that would lose commits the local clone never observed. Requires git ≥ 2.30 (already a prerequisite for the worktree features onton relies on). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

[Worktree.create] used to take the local branch ref as authoritative when it existed in the user's working clone. The user's [refs/heads/<branch>] could be stale (left at the PR's base, behind the real PR head), so the worktree would start missing every PR commit — exactly the state that wiped PR #315 when the session pushed it. Split the decision into a pure module [Start_point_plan] in lib_core/ that takes observed local/remote-ref SHAs and a precomputed two-way ancestry, returns one of [Reset_and_use_remote_tracking], [Use_local_branch_unchanged], [Create_new_branch_from_base], or a [refusal] for the diverged/local-ahead cases. The effectful side in [Worktree.create] gathers the inputs, calls the planner, executes the chosen action via [git worktree add -B/-b]/[add], and returns [(t, refusal) Result.t]. [Worktree_setup.ensure_worktree] runs a new [fetch_origin_branch] before [create] so the planner observes a fresh remote ref. Property tests (SPP-1..9) cover totality, determinism, the "no-silent-clobber" invariant, remote authority, label bounds, and exhaustive variant reachability. An integration test exercises every arm against real git fixtures, including a direct reproduction of the #315 stale-local-ref state. Refusals are logged through the existing activity-log channel and return [None] from [ensure_worktree], so the next scheduling tick sees the agent idle. Future commits will route refusals through [needs_intervention] explicitly via [Push_reject_classify]. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

[Worktree.force_push_with_lease] used to gate on [git rev-list --count <base>..HEAD] (worktree HEAD) and then push the named branch — so an agent that [git switch]ed to a different local branch mid-session could pass the gate with commits the push command would never upload. Combined with --force-with-lease's blind spot for local commit loss (the lease passes whenever the local tracking ref matches actual remote, even if local strictly lacks remote commits), this put PR #315 in the position where a push of a stale branch wiped its remote. New pure module [Push_plan] in lib_core/ takes the observed worktree HEAD, branch ref SHA, remote tracking SHA, two-way ancestry, and [base..refs/heads/<branch>] commit count, and returns [Push (Force_push_if_includes | Initial_push)] or [Refuse (No_commits_ahead_of_base | Worktree_missing | Branch_ref_missing | Branch_switched | Local_missing_remote_commits)]. The named branch — never HEAD — is what gets measured and pushed. [Push_reject_classify.rejection] gains a [Local_state_unsafe] variant (permanent), and [Push_plan.to_push_reject_classify_rejection] maps the three "supervisor said no" refusals onto it so the orchestrator's existing [is_permanent] → [needs_intervention] escalation applies without new state. Property tests (PPP-1..11) cover totality, branch-switch refusal, local-missing-remote refusal, zero-commits skip, initial-push, happy-path, worktree-missing pre-emption, determinism, exhaustive variant reachability, refusal-to-rejection permanence, and label bounds. PRC tests extend their existing variant scans for Local_state_unsafe. An integration test drives real-git fixtures for the branch-switched and local-missing-remote refusal arms plus the happy-path force-push, asserting nothing reaches the remote when the planner refuses. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…#318) * Add --force-if-includes to session-end force-push --force-with-lease alone only verifies that the remote tracking ref matches actual remote — once a background fetch refreshes that tracking ref, the lease passes even when local is strictly behind, silently wiping remote commits. PR #315 was force-pushed empty and auto-closed this way. --force-if-includes additionally requires that the remote tip is in the local branch's reflog, which refuses pushes that would lose commits the local clone never observed. Requires git ≥ 2.30 (already a prerequisite for the worktree features onton relies on). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Plan worktree start point against fresh remote refs [Worktree.create] used to take the local branch ref as authoritative when it existed in the user's working clone. The user's [refs/heads/<branch>] could be stale (left at the PR's base, behind the real PR head), so the worktree would start missing every PR commit — exactly the state that wiped PR #315 when the session pushed it. Split the decision into a pure module [Start_point_plan] in lib_core/ that takes observed local/remote-ref SHAs and a precomputed two-way ancestry, returns one of [Reset_and_use_remote_tracking], [Use_local_branch_unchanged], [Create_new_branch_from_base], or a [refusal] for the diverged/local-ahead cases. The effectful side in [Worktree.create] gathers the inputs, calls the planner, executes the chosen action via [git worktree add -B/-b]/[add], and returns [(t, refusal) Result.t]. [Worktree_setup.ensure_worktree] runs a new [fetch_origin_branch] before [create] so the planner observes a fresh remote ref. Property tests (SPP-1..9) cover totality, determinism, the "no-silent-clobber" invariant, remote authority, label bounds, and exhaustive variant reachability. An integration test exercises every arm against real git fixtures, including a direct reproduction of the #315 stale-local-ref state. Refusals are logged through the existing activity-log channel and return [None] from [ensure_worktree], so the next scheduling tick sees the agent idle. Future commits will route refusals through [needs_intervention] explicitly via [Push_reject_classify]. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Plan pushes against worktree state, not just lease [Worktree.force_push_with_lease] used to gate on [git rev-list --count <base>..HEAD] (worktree HEAD) and then push the named branch — so an agent that [git switch]ed to a different local branch mid-session could pass the gate with commits the push command would never upload. Combined with --force-with-lease's blind spot for local commit loss (the lease passes whenever the local tracking ref matches actual remote, even if local strictly lacks remote commits), this put PR #315 in the position where a push of a stale branch wiped its remote. New pure module [Push_plan] in lib_core/ takes the observed worktree HEAD, branch ref SHA, remote tracking SHA, two-way ancestry, and [base..refs/heads/<branch>] commit count, and returns [Push (Force_push_if_includes | Initial_push)] or [Refuse (No_commits_ahead_of_base | Worktree_missing | Branch_ref_missing | Branch_switched | Local_missing_remote_commits)]. The named branch — never HEAD — is what gets measured and pushed. [Push_reject_classify.rejection] gains a [Local_state_unsafe] variant (permanent), and [Push_plan.to_push_reject_classify_rejection] maps the three "supervisor said no" refusals onto it so the orchestrator's existing [is_permanent] → [needs_intervention] escalation applies without new state. Property tests (PPP-1..11) cover totality, branch-switch refusal, local-missing-remote refusal, zero-commits skip, initial-push, happy-path, worktree-missing pre-emption, determinism, exhaustive variant reachability, refusal-to-rejection permanence, and label bounds. PRC tests extend their existing variant scans for Local_state_unsafe. An integration test drives real-git fixtures for the branch-switched and local-missing-remote refusal arms plus the happy-path force-push, asserting nothing reaches the remote when the planner refuses. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Address PR 318 review comments * Fix local-missing-remote integration fixture * Distinguish refused worktree setup * Route repo-root worktree refusal permanently --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

subsetpark and others added 7 commits May 23, 2026 15:15

flowglad-review-service Bot reviewed May 23, 2026

View reviewed changes

Comment thread lib_core/push_reject_classify.ml

Comment thread lib/orchestrator.ml Outdated

Comment thread lib/poller_fiber.ml

coderabbitai Bot reviewed May 23, 2026

View reviewed changes

Comment thread lib_core/oauth_scopes.ml Outdated

Comment thread test/test_github_target_properties.ml Outdated

subsetpark and others added 3 commits May 23, 2026 15:58

vercel Bot deployed to Preview May 23, 2026 23:05 View deployment

subsetpark commented May 24, 2026

View reviewed changes

Comment thread lib/scope_preflight.ml Outdated

Address PR 315 review comments

3ce016b

subsetpark closed this May 24, 2026

subsetpark force-pushed the push-reject-classify-and-ssh-auto-detect branch from a06aa1a to 94fda74 Compare May 24, 2026 15:19

vercel Bot deployed to Preview May 24, 2026 15:19 View deployment

subsetpark reopened this May 24, 2026

coderabbitai Bot reviewed May 24, 2026

View reviewed changes

Comment thread bin/main.ml Outdated

vercel Bot deployed to Preview May 24, 2026 15:35 View deployment

flowglad-review-service Bot reviewed May 24, 2026

View reviewed changes

Comment thread lib/poller_fiber.ml Outdated

coderabbitai Bot reviewed May 24, 2026

View reviewed changes

vercel Bot deployed to Preview May 24, 2026 15:48 View deployment

subsetpark changed the title ~~Push-rejection loop guard + SSH auto-detect + OAuth scope preflight~~ Push-rejection loop guard + SSH transport auto-detect May 24, 2026

flowglad-review-service Bot reviewed May 24, 2026

View reviewed changes

Comment thread lib/managed_repo.ml Outdated

vercel Bot deployed to Preview May 24, 2026 15:58 View deployment

subsetpark merged commit b9f08ce into main May 24, 2026
8 of 9 checks passed

flowglad-review-service Bot reviewed May 24, 2026

View reviewed changes

subsetpark deleted the push-reject-classify-and-ssh-auto-detect branch May 24, 2026 16:00

subsetpark mentioned this pull request May 24, 2026

Harden worktree start point + push safety against PR #315-class wipes #318

Merged

5 tasks

coderabbitai Bot mentioned this pull request May 24, 2026

Reliable stacked-PR rebase anchors #319

Merged

8 tasks

Conversation

subsetpark commented May 23, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What changed (by commit)

Notes for reviewers

Property tests

Test plan

Summary by CodeRabbit

Release Notes

Uh oh!

vercel Bot commented May 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented May 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Uh oh!

flowglad-review-service Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

flowglad-review-service Bot commented May 23, 2026

Review Summary

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

flowglad-review-service Bot commented May 23, 2026

Review Summary

Uh oh!

Uh oh!

flowglad-review-service Bot commented May 24, 2026

Review Summary

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

flowglad-review-service Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

flowglad-review-service Bot commented May 24, 2026

Review Summary

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

flowglad-review-service Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

flowglad-review-service Bot commented May 24, 2026

Review Summary

Uh oh!

Uh oh!

flowglad-review-service Bot left a comment

Choose a reason for hiding this comment

Uh oh!

flowglad-review-service Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

flowglad-review-service Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

flowglad-review-service Bot commented May 24, 2026

Review Summary

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

subsetpark commented May 23, 2026 •

edited by coderabbitai Bot

Loading

vercel Bot commented May 23, 2026 •

edited

Loading

coderabbitai Bot commented May 23, 2026 •

edited

Loading