Skip to content

docs(mt#1603): Document reviewer.md as cousin of Railway reviewer service (mt#1603)#964

Merged
edobry merged 2 commits into
mainfrom
task/mt-1603
May 6, 2026
Merged

docs(mt#1603): Document reviewer.md as cousin of Railway reviewer service (mt#1603)#964
edobry merged 2 commits into
mainfrom
task/mt-1603

Conversation

@minsky-ai
Copy link
Copy Markdown
Contributor

@minsky-ai minsky-ai Bot commented May 6, 2026

Summary

Closes the last documentation gap on mt#1073's .claude/agents/reviewer.md success criterion (residual #5 from the 2026-05-05 hand-off). The local reviewer subagent and the deployed Railway reviewer service (services/reviewer/) both operate under the same Critic Constitution, but the agent definition didn't explicitly say so — readers could mistake the two surfaces for unrelated implementations.

The Chinese-wall constraints (read-only file access, no implementer-session memory, evidence-based output, severity classification) were already mechanically encoded in .claude/agents/reviewer.md — Mode 1 hard guard against direct posting, tools list excluding Edit/Write, evidence-before-flag rule, anchor-validation discipline. This PR closes the doc gap by naming the cousin relationship; nothing else changes.

Key Changes

  • .claude/agents/reviewer.md — adds a 3-sentence block-quote note immediately after the frontmatter (lines 19-30) identifying the Railway service as the cousin surface enforcing the same Critic Constitution. References services/reviewer/, docs/architecture/adr-005-forgebackend-subinterfaces.md (sub-interface shape), and docs/architecture/adr-006-agent-identity.md (bot identity).

Testing

This is a docs-only edit. No tests, no verification artifact (§7a counter-example: "Docs-only or config-only changes"). Acceptance tests verified post-edit:

  • AT1services/reviewer/ appears within first 30 lines after frontmatter (line 21). ✓
  • AT2 — Mode 1 hard guard sentence (Mode 1 hard guard: never call mcp__minsky__session_pr_review_submit yourself.) is byte-identical at its post-shift line. ✓
  • AT3 — Tools list in frontmatter byte-identical (insertion happened between lines 17-19; no edits to lines 1-17). ✓
  • SC build(deps-dev): bump lint-staged from 15.5.2 to 16.0.0 #4 — No protocol changes; Mode 1 / Mode 2 mechanics, severity rules, anchor-validation discipline all unchanged. ✓

Related

Adds a 3-sentence block-quote note immediately after the frontmatter in
.claude/agents/reviewer.md identifying the deployed Railway reviewer
service (services/reviewer/) as the other surface enforcing the same
Critic Constitution. References ADR-005 (sub-interface shape) and ADR-006
(bot identity) for the shared constraint contract.

No protocol changes — Mode 1/Mode 2 mechanics, severity classification,
anchor-validation discipline, and the tools list are byte-identical.
Closes the last documentation gap on mt#1073's reviewer.md success
criterion (Chinese-wall constraints were already mechanically encoded).
@minsky-ai minsky-ai Bot added the authorship/co-authored Co-authored by human and AI agent label May 6, 2026
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


Docs-only PR adding a cousin-note to .claude/agents/reviewer.md that links the local reviewer subagent to the deployed Railway reviewer service. The change is narrowly scoped and aligns with the stated intent. I noted several non-blocking documentation nits: potential ambiguity around “read-only file access” given the Bash tool presence, plain-code cross-references that could be clickable links, a minor phrasing improvement, a qualification around “differ only in how they are invoked,” and a process note on brittle line-number-based acceptance checks. No blocking issues for security, behavior, scope, or licensing were found. Recommend merging after optional wording/linking polish.

Findings

  • [NON-BLOCKING] .claude/agents/reviewer.md:19 — “Read-only file access” claim may be misleading given Bash tool is present
    The new cousin-note asserts both surfaces share "read-only file access" constraints, but this subagent's frontmatter tools includes Bash alongside Read/Glob/Grep (see .claude/agents/reviewer.md frontmatter), and the body later allows running commands for smoke/adoption sweeps. While the narrative also says "Cannot modify code — posting a GitHub review is an allowed write (Mode 2 only)", the presence of Bash could allow unintended mutations if not explicitly scoped.

Suggestion: tighten the wording to "no workspace mutations; commands must be read-only" and/or explicitly state that any Bash usage must avoid file-modifying operations. If the Railway service enforces additional write guards, note them for parity so the "same constraints" claim remains accurate.

  • [NON-BLOCKING] .claude/agents/reviewer.md:19 — Cross-references rely on services/reviewer/ and ADR docs existing; linkability not verified in this doc
    The block-quote references services/reviewer/, docs/architecture/adr-005-forgebackend-subinterfaces.md, and docs/architecture/adr-006-agent-identity.md. While these paths exist in-repo (verified), they are plain code ticks rather than clickable relative links. For discoverability, consider converting to Markdown links: [services/reviewer/](/services/reviewer/), [ADR-005](../architecture/adr-005-forgebackend-subinterfaces.md), [ADR-006](../architecture/adr-006-agent-identity.md).

Non-blocking docs nit, but improves navigation for readers landing in this file.

  • [NON-BLOCKING] .claude/agents/reviewer.md:30 — AT2/AT3 in PR description assert byte-identical lines; this file’s inserted block shifts line numbers and may invalidate those checks
    The PR description claims acceptance tests verified specific post-shift line numbers (e.g., Mode 1 hard guard sentence remains byte-identical at its post-shift line). While the content indeed appears unchanged, the tests as described depend on absolute line numbers that will drift with future edits. Consider anchoring such checks to unique substrings rather than fixed line numbers in any automated acceptance scripts to avoid brittle tests.

Marking as a process/docs note; no code change required here.

  • [NON-BLOCKING] .claude/agents/reviewer.md:21 — Security/permissions nuance: stating “no implementer-session memory” without clarifying telemetry/logging may overpromise
    The cousin note lists "no implementer-session memory" as a constraint. Elsewhere in the repo, reviewers may log findings or errors to service logs (e.g., the Railway service). To avoid overpromising, consider qualifying this as "no access to implementer user memory or prior chat context; transient service logs may exist but do not feed into review judgments." This preserves the Chinese-wall claim while being precise about operational telemetry.

Inline comments

  • .claude/agents/reviewer.md:23 — Tiny phrasing nit: "service in services/reviewer/ — webhook-driven, runs on PR events" might read smoother as "webhook-driven service that runs on PR events". Consider tightening for readability.
  • .claude/agents/reviewer.md:28 — Potential ambiguity: "differ only in how they are invoked" — the Railway service also differs in posting identity and operational context (headless, webhook-triggered). Maybe: "share the same Constitution and review protocol; they differ only in invocation mode and runtime surface (in-conversation vs webhook service posting as minsky-reviewer[bot])."

Tightens the cousin-surface note in .claude/agents/reviewer.md per the
4 NON-BLOCKING + 2 inline findings from minsky-reviewer bot R1 review:

- read-only file access wording corrected to acknowledge that Bash is in
  the tools list while pinning the actual mutation envelope: no workspace
  mutations beyond the sanctioned review-posting write, with Bash scoped
  to read-only commands.
- no implementer-session memory wording made precise: no access to the
  implementer agents prior conversation or working state. This describes
  the isolation surface without overpromising vs operational logging.
- Plain-code refs converted to relative Markdown links: services/reviewer
  link, ADR-005 link, ADR-006 link.
- Inline 1: webhook-driven, runs on PR events tightened to webhook-driven
  service that runs on pull_request events.
- Inline 2: differ only in how they are invoked replaced with differ only
  in invocation mode and runtime surface, plus adds the minsky-reviewer
  bot identity dimension in sentence 2.
- evidence-based output strengthened to evidence-based output anchored
  to source, mirroring the bodys read-source-before-flag rule.

Still 4 sentences, still satisfies SC 1. Tools list and all protocol
mechanics (Mode 1/2, severity, anchor validation) remain byte-identical.
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


Docs-only change adding a clear cousin-surface note to .claude/agents/reviewer.md. The new block is accurate and improves orientation, and the referenced ADRs and services/reviewer/ path exist. I flagged a few non-blocking items: the statement that Bash is for read-only commands is policy but not mechanically enforced here; consider explicitly forbidding writeful commands and clarifying smoke-test constraints. Wording could better acknowledge identity/runtime context differences and telemetry nuances, and link styling for services/reviewer/ can be normalized. Also, acceptance checks relying on absolute line numbers are brittle. No blocking issues for this docs-only PR; proceed as-is if desired.

Findings

  • [NON-BLOCKING] .claude/agents/reviewer.md:27 — “Bash is for read-only commands” is policy but not mechanically enforced here
    The new cousin note constrains Bash usage to read-only operations ("Bash is for read-only commands"), which is the right intent, but nothing in this agent definition enforces that mechanically. Given later sections instruct running smoke tests, please explicitly forbid filesystem/network mutations (e.g., git add/commit, bun test --updateSnapshot, rm -rf, curl -X POST), and state that any smoke test must be side-effect-free or skipped. Consider adding a one-liner guard in the body near smoke-test guidance to clarify that writeful commands are out-of-bounds for this surface.
  • [NON-BLOCKING] .claude/agents/reviewer.md:20 — Cross-references are present but could be clickable relative links for navigation
    The new block links to services/reviewer/, ADR-005, and ADR-006. While the ADRs are linked with Markdown, services/reviewer/ is presented in code ticks inside a Markdown link, which renders fine but is unconventional; elsewhere we often use [services/reviewer/](../../services/reviewer/) without backticks. Consider normalizing the style and verifying all three paths render as clickable links in GitHub preview for consistency.
  • [NON-BLOCKING] .claude/agents/reviewer.md:31 — “They differ only in invocation mode and runtime surface” may understate identity/context differences
    The text claims the two surfaces "differ only in invocation mode and runtime surface." The paragraph does mention posting under minsky-reviewer[bot], but consider acknowledging that the deployed service also differs in posting identity and execution context (headless, webhook-triggered, potentially different environment/network policies) to avoid readers inferring absolute equivalence beyond the Critic Constitution.
  • [NON-BLOCKING] .claude/agents/reviewer.md:19 — Prior acceptance tests in PR description rely on absolute line numbers; this insertion invalidates stability of that approach
    Your PR description’s AT2/AT3 talk about verifying post-shift exact line numbers remaining byte-identical. With this block inserted, those line indices will drift again on the next doc change. If you have any scripts that enforce these, consider anchoring checks to unique substrings instead of absolute line numbers to avoid brittle tests. This is a process/doc concern only; no code change required.
  • [NON-BLOCKING] .claude/agents/reviewer.md:26 — “No access to the implementer agent’s prior conversation or working state” could clarify telemetry/logging exceptions
    To avoid overpromising, consider qualifying the "no implementer-session memory" claim with an operational note like "transient service logs/metrics may exist but do not feed into review judgments" — especially since the service surface is webhook-driven and typically logs request metadata. This preserves the Chinese-wall guarantee while being precise about runtime telemetry.

Inline comments

  • .claude/agents/reviewer.md:22 — Minor: We now state the service is triggered on pull_request events — if it also reacts to pull_request_target or review-comment webhooks for retries/threading, consider broadening the phrasing to "GitHub webhook events (e.g., pull_request)" to avoid over-specificity. If pull_request is the only one by design, ignoring this is fine.
  • .claude/agents/reviewer.md:18 — Nice addition. For parity with the Constitution claims, maybe append a sentence like: "Both surfaces run the same review protocol; the service enforces anchor validation and severity rules identically via the shared aggregator." That would reassure readers that semantics, not just philosophy, are shared.

Copy link
Copy Markdown
Contributor Author

@minsky-ai minsky-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: mt#1603 reviewer.md cousin-note

CI status: PASS — 2/2 checks succeeded (build, Prevent Placeholder Tests).

Summary

Docs-only PR adding a 16-line block-quote note to .claude/agents/reviewer.md that names the deployed Railway reviewer service as the cousin surface enforcing the same Critic Constitution. Diff is one file, +16 lines, 0 deletions; current cumulative diff reflects R1 review feedback applied. All four spec success criteria are met and all three acceptance tests pass. Zero blocking findings; the two prior minsky-reviewer[bot] review rounds (R1 + R2) raised only NON-BLOCKING wording/styling nits, R1 findings were addressed in commit e04b372bf, R2 raised follow-on nits in the same areas (bikeshedding pattern per feedback_reviewer_bot_self_reversal_signal), and R2 explicitly states "no blocking issues for this docs-only PR; proceed as-is if desired."

Spec verification

Task: mt#1603

Criterion Status Evidence
Note ≤ 4 sentences, names Railway service as cousin, identifies Critic Constitution as shared contract Met .claude/agents/reviewer.md:19-33. 4-sentence block-quote: opens "one of two surfaces enforcing the same Critic Constitution", names "deployed Railway reviewer service in services/reviewer/".
Links to services/reviewer/ and ADR-005 + ADR-006 Met .claude/agents/reviewer.md:21 ([services/reviewer/](../../services/reviewer/)), :24 (ADR-005), :25 (ADR-006). All three are clickable relative links.
Note placed near top (after frontmatter, before Mode-selection rules) Met Frontmatter closes at line 17; cousin-note block occupies lines 19-33; "You are a code review analyst" opening shifts to line 35; Mode-selection bullets follow.
No protocol changes — Mode 1/2 mechanics, severity rules, anchor-validation discipline, tools list byte-identical Met parsedDiff shows a single hunk with insertions only (no deletions). Tools list at lines 11-13 untouched. Mode 1 hard-guard sentence ("Mode 1 hard guard: never call mcp__minsky__session_pr_review_submit yourself") preserved at its post-shift position.

Acceptance tests

Test Result Evidence
services/reviewer/ literal within first 30 lines after frontmatter Pass Appears at line 21 (within line range 18-47).
Mode 1 hard-guard sentence still appears unchanged Pass Confirmed by reading the file post-edit.
Tools list in frontmatter byte-identical Pass Diff confirms no edits to lines 1-17.

Documentation impact

Updated .claude/agents/reviewer.md in this PR. No other documentation needs updating — the cousin-note content is self-contained and doesn't change architectural behavior described in docs/architecture.md, docs/theory-of-operation.md, CONTRIBUTING.md, or README.md. The ADR cross-references in the note point at existing canonical docs (ADR-005, ADR-006) which require no edits.

Notes on prior bot-reviewer iteration

  • R1 (commit 8e60222dd): 4 NON-BLOCKING + 2 inline. Findings: tighten "read-only file access" wording, convert plain-code refs to clickable links, qualify "no implementer-session memory", phrasing nit, "differ only in how they are invoked" missing dimensions. All addressed in R1→R2 commit e04b372bf.
  • R2 (commit e04b372bf): 5 NON-BLOCKING + 2 inline. Same areas R1 flagged, asking for incremental refinement (e.g. R1 "tighten read-only file access" → R2 "Bash policy not mechanically enforced here"; R1 "make refs clickable" → R2 "code ticks inside Markdown link is unconventional"). Both rounds explicitly green-light merge ("Recommend merging after optional wording/linking polish" / "proceed as-is if desired").

Per feedback_bot_pr_convergence_via_bypass: self-authored bot PRs cannot reach APPROVE in-tool because of GitHub's self-approval block plus the same-App-identity Chinese-wall constraint. R1+R2 substantive iteration is complete; R3+ is low-signal bikeshedding. Convergence is reached.

(Had Claude look into this — AI-assisted review.)

@minsky-ai minsky-ai Bot added the review:bot-commented Reviewer bot has commented on this PR (informational) label May 6, 2026
@edobry edobry merged commit 956566a into main May 6, 2026
2 checks passed
@edobry edobry deleted the task/mt-1603 branch May 6, 2026 18:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

authorship/co-authored Co-authored by human and AI agent review:bot-commented Reviewer bot has commented on this PR (informational)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant