docs(mt#1603): Document reviewer.md as cousin of Railway reviewer service (mt#1603) by minsky-ai[bot] · Pull Request #964 · edobry/minsky

minsky-ai · 2026-05-06T17:38:16Z

Summary

Closes the last documentation gap on mt#1073's .claude/agents/reviewer.md success criterion (residual #5 from the 2026-05-05 hand-off). The local reviewer subagent and the deployed Railway reviewer service (services/reviewer/) both operate under the same Critic Constitution, but the agent definition didn't explicitly say so — readers could mistake the two surfaces for unrelated implementations.

The Chinese-wall constraints (read-only file access, no implementer-session memory, evidence-based output, severity classification) were already mechanically encoded in .claude/agents/reviewer.md — Mode 1 hard guard against direct posting, tools list excluding Edit/Write, evidence-before-flag rule, anchor-validation discipline. This PR closes the doc gap by naming the cousin relationship; nothing else changes.

Key Changes

.claude/agents/reviewer.md — adds a 3-sentence block-quote note immediately after the frontmatter (lines 19-30) identifying the Railway service as the cousin surface enforcing the same Critic Constitution. References services/reviewer/, docs/architecture/adr-005-forgebackend-subinterfaces.md (sub-interface shape), and docs/architecture/adr-006-agent-identity.md (bot identity).

Testing

This is a docs-only edit. No tests, no verification artifact (§7a counter-example: "Docs-only or config-only changes"). Acceptance tests verified post-edit:

AT1 — services/reviewer/ appears within first 30 lines after frontmatter (line 21). ✓
AT2 — Mode 1 hard guard sentence (Mode 1 hard guard: never call mcp__minsky__session_pr_review_submit yourself.) is byte-identical at its post-shift line. ✓
AT3 — Tools list in frontmatter byte-identical (insertion happened between lines 17-19; no edits to lines 1-17). ✓
SC build(deps-dev): bump lint-staged from 15.5.2 to 16.0.0 #4 — No protocol changes; Mode 1 / Mode 2 mechanics, severity rules, anchor-validation discipline all unchanged. ✓

Parent: mt#1073 (Adversarial reviewer App architecture umbrella) — closes residual chore(deps-dev): bump eslint from 8.57.1 to 9.27.0 #5 of the 2026-05-05 hand-off.
Sibling residuals on mt#1073: mt#1086 (timeouts, blocked-by mt#1255), mt#1510 (identity routing, IN-REVIEW as PR feat(mt#1510): Identity routing in session_pr_review_submit (mt#1510) #957), mt#1515 (seeded-bug harness), mt#1516 (position paper + mt#1065 deprecation).
The Railway service it points at was shipped via mt#1083 (DONE) and is currently active — see PR feat(mt#1510): Identity routing in session_pr_review_submit (mt#1510) #957 R1 review header (Reviewer: minsky-reviewer[bot] via openai:gpt-5) for live evidence.

Adds a 3-sentence block-quote note immediately after the frontmatter in .claude/agents/reviewer.md identifying the deployed Railway reviewer service (services/reviewer/) as the other surface enforcing the same Critic Constitution. References ADR-005 (sub-interface shape) and ADR-006 (bot identity) for the shared constraint contract. No protocol changes — Mode 1/Mode 2 mechanics, severity classification, anchor-validation discipline, and the tools list are byte-identical. Closes the last documentation gap on mt#1073's reviewer.md success criterion (Chinese-wall constraints were already mechanically encoded).

minsky-reviewer

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown

Docs-only PR adding a cousin-note to .claude/agents/reviewer.md that links the local reviewer subagent to the deployed Railway reviewer service. The change is narrowly scoped and aligns with the stated intent. I noted several non-blocking documentation nits: potential ambiguity around “read-only file access” given the Bash tool presence, plain-code cross-references that could be clickable links, a minor phrasing improvement, a qualification around “differ only in how they are invoked,” and a process note on brittle line-number-based acceptance checks. No blocking issues for security, behavior, scope, or licensing were found. Recommend merging after optional wording/linking polish.

Findings

[NON-BLOCKING] .claude/agents/reviewer.md:19 — “Read-only file access” claim may be misleading given Bash tool is present
The new cousin-note asserts both surfaces share "read-only file access" constraints, but this subagent's frontmatter tools includes Bash alongside Read/Glob/Grep (see .claude/agents/reviewer.md frontmatter), and the body later allows running commands for smoke/adoption sweeps. While the narrative also says "Cannot modify code — posting a GitHub review is an allowed write (Mode 2 only)", the presence of Bash could allow unintended mutations if not explicitly scoped.

Suggestion: tighten the wording to "no workspace mutations; commands must be read-only" and/or explicitly state that any Bash usage must avoid file-modifying operations. If the Railway service enforces additional write guards, note them for parity so the "same constraints" claim remains accurate.

[NON-BLOCKING] .claude/agents/reviewer.md:19 — Cross-references rely on services/reviewer/ and ADR docs existing; linkability not verified in this doc
The block-quote references services/reviewer/, docs/architecture/adr-005-forgebackend-subinterfaces.md, and docs/architecture/adr-006-agent-identity.md. While these paths exist in-repo (verified), they are plain code ticks rather than clickable relative links. For discoverability, consider converting to Markdown links: [services/reviewer/](/services/reviewer/), [ADR-005](../architecture/adr-005-forgebackend-subinterfaces.md), [ADR-006](../architecture/adr-006-agent-identity.md).

Non-blocking docs nit, but improves navigation for readers landing in this file.

[NON-BLOCKING] .claude/agents/reviewer.md:30 — AT2/AT3 in PR description assert byte-identical lines; this file’s inserted block shifts line numbers and may invalidate those checks
The PR description claims acceptance tests verified specific post-shift line numbers (e.g., Mode 1 hard guard sentence remains byte-identical at its post-shift line). While the content indeed appears unchanged, the tests as described depend on absolute line numbers that will drift with future edits. Consider anchoring such checks to unique substrings rather than fixed line numbers in any automated acceptance scripts to avoid brittle tests.

Marking as a process/docs note; no code change required here.

[NON-BLOCKING] .claude/agents/reviewer.md:21 — Security/permissions nuance: stating “no implementer-session memory” without clarifying telemetry/logging may overpromise
The cousin note lists "no implementer-session memory" as a constraint. Elsewhere in the repo, reviewers may log findings or errors to service logs (e.g., the Railway service). To avoid overpromising, consider qualifying this as "no access to implementer user memory or prior chat context; transient service logs may exist but do not feed into review judgments." This preserves the Chinese-wall claim while being precise about operational telemetry.

Inline comments

.claude/agents/reviewer.md:23 — Tiny phrasing nit: "service in services/reviewer/ — webhook-driven, runs on PR events" might read smoother as "webhook-driven service that runs on PR events". Consider tightening for readability.
.claude/agents/reviewer.md:28 — Potential ambiguity: "differ only in how they are invoked" — the Railway service also differs in posting identity and operational context (headless, webhook-triggered). Maybe: "share the same Constitution and review protocol; they differ only in invocation mode and runtime surface (in-conversation vs webhook service posting as minsky-reviewer[bot])."

Tightens the cousin-surface note in .claude/agents/reviewer.md per the 4 NON-BLOCKING + 2 inline findings from minsky-reviewer bot R1 review: - read-only file access wording corrected to acknowledge that Bash is in the tools list while pinning the actual mutation envelope: no workspace mutations beyond the sanctioned review-posting write, with Bash scoped to read-only commands. - no implementer-session memory wording made precise: no access to the implementer agents prior conversation or working state. This describes the isolation surface without overpromising vs operational logging. - Plain-code refs converted to relative Markdown links: services/reviewer link, ADR-005 link, ADR-006 link. - Inline 1: webhook-driven, runs on PR events tightened to webhook-driven service that runs on pull_request events. - Inline 2: differ only in how they are invoked replaced with differ only in invocation mode and runtime surface, plus adds the minsky-reviewer bot identity dimension in sentence 2. - evidence-based output strengthened to evidence-based output anchored to source, mirroring the bodys read-source-before-flag rule. Still 4 sentences, still satisfies SC 1. Tools list and all protocol mechanics (Mode 1/2, severity, anchor validation) remain byte-identical.

minsky-reviewer

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown

Docs-only change adding a clear cousin-surface note to .claude/agents/reviewer.md. The new block is accurate and improves orientation, and the referenced ADRs and services/reviewer/ path exist. I flagged a few non-blocking items: the statement that Bash is for read-only commands is policy but not mechanically enforced here; consider explicitly forbidding writeful commands and clarifying smoke-test constraints. Wording could better acknowledge identity/runtime context differences and telemetry nuances, and link styling for services/reviewer/ can be normalized. Also, acceptance checks relying on absolute line numbers are brittle. No blocking issues for this docs-only PR; proceed as-is if desired.

Findings

[NON-BLOCKING] .claude/agents/reviewer.md:27 — “Bash is for read-only commands” is policy but not mechanically enforced here
The new cousin note constrains Bash usage to read-only operations ("Bash is for read-only commands"), which is the right intent, but nothing in this agent definition enforces that mechanically. Given later sections instruct running smoke tests, please explicitly forbid filesystem/network mutations (e.g., git add/commit, bun test --updateSnapshot, rm -rf, curl -X POST), and state that any smoke test must be side-effect-free or skipped. Consider adding a one-liner guard in the body near smoke-test guidance to clarify that writeful commands are out-of-bounds for this surface.
[NON-BLOCKING] .claude/agents/reviewer.md:20 — Cross-references are present but could be clickable relative links for navigation
The new block links to services/reviewer/, ADR-005, and ADR-006. While the ADRs are linked with Markdown, services/reviewer/ is presented in code ticks inside a Markdown link, which renders fine but is unconventional; elsewhere we often use [services/reviewer/](../../services/reviewer/) without backticks. Consider normalizing the style and verifying all three paths render as clickable links in GitHub preview for consistency.
[NON-BLOCKING] .claude/agents/reviewer.md:31 — “They differ only in invocation mode and runtime surface” may understate identity/context differences
The text claims the two surfaces "differ only in invocation mode and runtime surface." The paragraph does mention posting under minsky-reviewer[bot], but consider acknowledging that the deployed service also differs in posting identity and execution context (headless, webhook-triggered, potentially different environment/network policies) to avoid readers inferring absolute equivalence beyond the Critic Constitution.
[NON-BLOCKING] .claude/agents/reviewer.md:19 — Prior acceptance tests in PR description rely on absolute line numbers; this insertion invalidates stability of that approach
Your PR description’s AT2/AT3 talk about verifying post-shift exact line numbers remaining byte-identical. With this block inserted, those line indices will drift again on the next doc change. If you have any scripts that enforce these, consider anchoring checks to unique substrings instead of absolute line numbers to avoid brittle tests. This is a process/doc concern only; no code change required.
[NON-BLOCKING] .claude/agents/reviewer.md:26 — “No access to the implementer agent’s prior conversation or working state” could clarify telemetry/logging exceptions
To avoid overpromising, consider qualifying the "no implementer-session memory" claim with an operational note like "transient service logs/metrics may exist but do not feed into review judgments" — especially since the service surface is webhook-driven and typically logs request metadata. This preserves the Chinese-wall guarantee while being precise about runtime telemetry.

Inline comments

.claude/agents/reviewer.md:22 — Minor: We now state the service is triggered on pull_request events — if it also reacts to pull_request_target or review-comment webhooks for retries/threading, consider broadening the phrasing to "GitHub webhook events (e.g., pull_request)" to avoid over-specificity. If pull_request is the only one by design, ignoring this is fine.
.claude/agents/reviewer.md:18 — Nice addition. For parity with the Constitution claims, maybe append a sentence like: "Both surfaces run the same review protocol; the service enforces anchor validation and severity rules identically via the shared aggregator." That would reassure readers that semantics, not just philosophy, are shared.

minsky-ai

Review: mt#1603 reviewer.md cousin-note

CI status: PASS — 2/2 checks succeeded (build, Prevent Placeholder Tests).

Summary

Docs-only PR adding a 16-line block-quote note to .claude/agents/reviewer.md that names the deployed Railway reviewer service as the cousin surface enforcing the same Critic Constitution. Diff is one file, +16 lines, 0 deletions; current cumulative diff reflects R1 review feedback applied. All four spec success criteria are met and all three acceptance tests pass. Zero blocking findings; the two prior minsky-reviewer[bot] review rounds (R1 + R2) raised only NON-BLOCKING wording/styling nits, R1 findings were addressed in commit e04b372bf, R2 raised follow-on nits in the same areas (bikeshedding pattern per feedback_reviewer_bot_self_reversal_signal), and R2 explicitly states "no blocking issues for this docs-only PR; proceed as-is if desired."

Spec verification

Task: mt#1603

Criterion	Status	Evidence
Note ≤ 4 sentences, names Railway service as cousin, identifies Critic Constitution as shared contract	Met	`.claude/agents/reviewer.md:19-33`. 4-sentence block-quote: opens "one of two surfaces enforcing the same Critic Constitution", names "deployed Railway reviewer service in `services/reviewer/`".
Links to `services/reviewer/` and ADR-005 + ADR-006	Met	`.claude/agents/reviewer.md:21` (`[`services/reviewer/`](../../services/reviewer/)`), `:24` (ADR-005), `:25` (ADR-006). All three are clickable relative links.
Note placed near top (after frontmatter, before Mode-selection rules)	Met	Frontmatter closes at line 17; cousin-note block occupies lines 19-33; "You are a code review analyst" opening shifts to line 35; Mode-selection bullets follow.
No protocol changes — Mode 1/2 mechanics, severity rules, anchor-validation discipline, tools list byte-identical	Met	`parsedDiff` shows a single hunk with insertions only (no deletions). Tools list at lines 11-13 untouched. Mode 1 hard-guard sentence ("Mode 1 hard guard: never call `mcp__minsky__session_pr_review_submit` yourself") preserved at its post-shift position.

Acceptance tests

Test	Result	Evidence
`services/reviewer/` literal within first 30 lines after frontmatter	Pass	Appears at line 21 (within line range 18-47).
Mode 1 hard-guard sentence still appears unchanged	Pass	Confirmed by reading the file post-edit.
Tools list in frontmatter byte-identical	Pass	Diff confirms no edits to lines 1-17.

Documentation impact

Updated .claude/agents/reviewer.md in this PR. No other documentation needs updating — the cousin-note content is self-contained and doesn't change architectural behavior described in docs/architecture.md, docs/theory-of-operation.md, CONTRIBUTING.md, or README.md. The ADR cross-references in the note point at existing canonical docs (ADR-005, ADR-006) which require no edits.

Notes on prior bot-reviewer iteration

R1 (commit 8e60222dd): 4 NON-BLOCKING + 2 inline. Findings: tighten "read-only file access" wording, convert plain-code refs to clickable links, qualify "no implementer-session memory", phrasing nit, "differ only in how they are invoked" missing dimensions. All addressed in R1→R2 commit e04b372bf.
R2 (commit e04b372bf): 5 NON-BLOCKING + 2 inline. Same areas R1 flagged, asking for incremental refinement (e.g. R1 "tighten read-only file access" → R2 "Bash policy not mechanically enforced here"; R1 "make refs clickable" → R2 "code ticks inside Markdown link is unconventional"). Both rounds explicitly green-light merge ("Recommend merging after optional wording/linking polish" / "proceed as-is if desired").

Per feedback_bot_pr_convergence_via_bypass: self-authored bot PRs cannot reach APPROVE in-tool because of GitHub's self-approval block plus the same-App-identity Chinese-wall constraint. R1+R2 substantive iteration is complete; R3+ is low-signal bikeshedding. Convergence is reached.

(Had Claude look into this — AI-assisted review.)

minsky-ai Bot added the authorship/co-authored Co-authored by human and AI agent label May 6, 2026

minsky-reviewer Bot reviewed May 6, 2026

View reviewed changes

minsky-ai Bot commented May 6, 2026

View reviewed changes

minsky-ai Bot added the review:bot-commented Reviewer bot has commented on this PR (informational) label May 6, 2026

edobry merged commit 956566a into main May 6, 2026
2 checks passed

edobry deleted the task/mt-1603 branch May 6, 2026 18:02

This was referenced May 6, 2026

feat(mt#1510): Identity routing in session_pr_review_submit (mt#1510) #957

Merged

docs(mt#1516): Identity-routing note in mcp-tools rule + Phase-2 docs closeout (mt#1516) #965

Merged

feat(mt#1086): Network-call timeouts for reviewer service (mt#1086) #969

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs(mt#1603): Document reviewer.md as cousin of Railway reviewer service (mt#1603)#964

docs(mt#1603): Document reviewer.md as cousin of Railway reviewer service (mt#1603)#964
edobry merged 2 commits into
mainfrom
task/mt-1603

minsky-ai Bot commented May 6, 2026

Uh oh!

minsky-reviewer Bot left a comment

Uh oh!

minsky-reviewer Bot left a comment

Uh oh!

minsky-ai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

minsky-ai Bot commented May 6, 2026

Summary

Key Changes

Testing

Related

Uh oh!

minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

Findings

Inline comments

Uh oh!

minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

Findings

Inline comments

Uh oh!

minsky-ai Bot left a comment

Choose a reason for hiding this comment

Review: mt#1603 reviewer.md cousin-note

Summary

Spec verification

Acceptance tests

Documentation impact

Notes on prior bot-reviewer iteration

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant