fix(mt#1122): Grant reviewer subagent MCP tools for Chinese-wall posting#720
Conversation
The reviewer agent was defined for large-PR sectioning only (Mode 1), but the Chinese-wall review practice introduced later requires it to fetch context and post findings independently (Mode 2). Changes: - .claude/agents/reviewer.md: add three MCP tools to allowlist (session_pr_review_context, session_pr_review_submit, tasks_spec_get); update description; add Mode 2 protocol section for whole-PR reviews - memory/feedback_review_before_asking.md: update to reflect reviewer now posts directly via session_pr_review_submit (gap closed)
There was a problem hiding this comment.
Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown
Findings
-
[BLOCKING] Spec/diff mismatch: memory file update missing
- File: memory/feedback_review_before_asking.md
- Evidence: The PR description claims this file was updated “to reflect reviewer now posts directly,” but the provided diff shows no changes to this path. This leaves stale guidance in place, creating conflicting instructions about who posts reviews and undermining the stated goal of the PR.
-
[BLOCKING] Ambiguous mode switching can cause unintended direct posting
- File: .claude/agents/reviewer.md:8–15, 31–41
- Evidence: The agent “operates in two modes depending on what the parent agent gives you.” Mode 2 triggers when “Parent gives you a task ID or PR number” (line ~35). Many Mode 1 invocations pass a PR number for context alongside a diff path/range. With this change, the agent may self-select Mode 2 and post a full review directly, bypassing the parent’s aggregator and risking double-posts or premature posting. There is no explicit mode flag or precedence rule to disambiguate; selection is inferred solely from presence of a task ID/PR number, which is too loose.
-
[BLOCKING] Input type inconsistency: “task ID or PR number” vs tool contract
- File: .claude/agents/reviewer.md:35, 39–41
- Evidence: The Mode 2 input says “task ID (e.g., mt#847) or PR number” (line ~35), but Step 1 then instructs to call mcp__minsky__session_pr_review_context “with the task ID” (line ~39). If the parent supplies only a PR number, the tool may reject it (schema mismatch) or fetch the wrong record. The contract needs to define accepted parameter(s) and how to map PR number to task ID (or support both in the tool).
-
[BLOCKING] Tool name mismatch in instructions vs allowlist
- File: .claude/agents/reviewer.md:14
- Evidence: The body refers to “session_pr_review_submit” (bare) as the tool to call, while the allowlisted tool name is “mcp__minsky__session_pr_review_submit” (frontmatter tools line ~4) and all other references use the fully prefixed name (lines ~39, ~43). This inconsistency is a likely source of tool selection errors by the LLM (attempting to call a non-existent alias).
-
[BLOCKING] Event selection relies on unspecified identity introspection
- File: .claude/agents/reviewer.md:45–49
- Evidence: Guidance says “Use event: 'COMMENT' when you are reviewing a PR authored by the same bot identity,” but does not specify how the subagent can determine its own GitHub identity vs the PR author. session_pr_review_context may provide the author identity, but the agent also needs knowledge of its own identity for comparison. Without that, the agent may mistakenly try APPROVE on a self-authored PR (which GitHub will reject) or over-conservatively COMMENT. This is an undocumented assumption that affects correctness of event selection.
-
[BLOCKING] Event enum values and payload schema are undocumented
- File: .claude/agents/reviewer.md:45–49
- Evidence: The instruction uses uppercase strings "COMMENT", "REQUEST_CHANGES", "APPROVE" for the event. There is no reference to the MCP tool schema for mcp__minsky__session_pr_review_submit, and no example payload. If the tool expects different casing or field names (e.g., review_event, body, task_id), calls will fail at runtime. Given this is a cross-agent behavioral change (now posting directly), schema drift here is high-impact.
-
[NON-BLOCKING] Mode 2 can exceed model context limits on large PRs
- File: .claude/agents/reviewer.md:39–43
- Evidence: Mode 2 fetches “PR metadata, diff, CI check runs, and the task spec in a single call” and then instructs a whole-PR review in one run. For large PRs, this likely exceeds the context window for model: sonnet. There is no guidance to page/chunk the diff or to iteratively process sections, risking truncation or incomplete analysis.
-
[NON-BLOCKING] Incomplete failure handling for missing spec/context
- File: .claude/agents/reviewer.md:39–43
- Evidence: Step 2 says to fall back to mcp__minsky__tasks_spec_get if spec missing, but there’s no instruction for when both APIs fail, or if the returned spec is inconsistent with the PR. The agent should be told how to proceed (e.g., mark spec criteria N/A and continue; choose COMMENT vs REQUEST_CHANGES) to avoid hanging or silent failure.
-
[NON-BLOCKING] Privilege escalation not accompanied by concurrency/duplication guardrails
- File: .claude/agents/reviewer.md:3–5, 14, 43
- Evidence: The agent moves from read-only reporting to directly posting reviews. There is no mention of rate limiting, idempotency, or coordination to prevent multiple reviewer subagents from posting to the same PR simultaneously in Mode 2, which could spam the PR with duplicate reviews.
-
[NON-BLOCKING] Naming clarity
- File: .claude/agents/reviewer.md:3–5, 14, 39–43
- Evidence: The frontmatter description and body alternate between bare and fully prefixed tool names. Unify references to exact tool names as listed in the allowlist to reduce LLM confusion.
-
[PRE-EXISTING] Bash tool vs “Cannot modify code”
- File: .claude/agents/reviewer.md:4, 14
- Evidence: The agent includes Bash in tools but the body forbids modifying code. This policy/tooling mismatch predates this PR, but remains a footgun for accidental file edits or git operations during review.
Spec verification table
- Frontmatter includes all three MCP tools (mcp__minsky__session_pr_review_context, mcp__minsky__session_pr_review_submit, mcp__minsky__tasks_spec_get): Met (see .claude/agents/reviewer.md:4)
- Agent body describes both Mode 1 (sectioning) and Mode 2 (Chinese-wall): Met (see .claude/agents/reviewer.md:10–15, 31–49)
- Memory file no longer references the posting gap or mt#1122 workaround: Not Met (no diff for memory/feedback_review_before_asking.md)
- Confirm no other agent definitions or hooks were disturbed: N/A in this review; not verifiable from provided diff
- On next bot-authored PR, dispatch reviewer subagent in Mode 2 and confirm it posts directly: N/A (post-merge operational test)
Documentation impact
- Yes. This PR changes reviewer agent behavior and privileges. Update:
- docs on Chinese-wall review workflow to include Mode 2 invocation and event selection rules
- the parent “review-pr” skill/runbook to pass an explicit mode flag to avoid accidental Mode 2
- tool schema documentation for mcp__minsky__session_pr_review_context and mcp__minsky__session_pr_review_submit with exact parameter names and event enums
- memory/feedback_review_before_asking.md to remove outdated workaround references (currently missing from diff)
- governance/permissions notes to reflect the reviewer’s new ability to post reviews directly
REQUEST_CHANGES
There was a problem hiding this comment.
Review: PR #720 (mt#1122) — reviewer agent MCP tool grant
Files reviewed: 5 (.claude/agents/reviewer.md, feedback_review_before_asking.md, .claude/skills/review-pr/SKILL.md, plus confirmation of three other agent files unchanged)
Findings
No issues found.
Checked and clear
.claude/agents/reviewer.md — all four changes confirmed:
tools:line now readsRead, Glob, Grep, Bash, mcp__minsky__session_pr_review_context, mcp__minsky__session_pr_review_submit, mcp__minsky__tasks_spec_get(seven tools total)description:rewritten from "Read-only code review agent for analyzing diff sections..." to the Mode 2-aware variant with "Cannot modify code — posting a GitHub review is an allowed write" clarification- Body restructured with Mode 1 / Mode 2 sections, each with separate Input and Protocol subsections
- Mode 2 Protocol includes correct event-selection guidance (
COMMENTfor same-identity,REQUEST_CHANGESfor blocking,APPROVEonly when not author) - "What to check", severity classification, output format, and anti-patterns sections from the original prompt preserved intact
feedback_review_before_asking.md memory — updated correctly. No mention of any "main agent posts" workaround or mt#1122 gap. Line 11 now states the reviewer-posts-directly flow: "The reviewer agent's tool allowlist includes session_pr_review_context, session_pr_review_submit, tasks_spec_get — it fetches context and posts in one self-contained run." Consistent with the agent file changes.
.claude/skills/review-pr/SKILL.md — no changes. The skill still correctly references the reviewer agent as a dispatcher for large PRs (Mode 1), with its own event-selection guidance in §7. No Mode 2 content needed here since the skill is the parent-side flow, not the reviewer-internal protocol.
Other agent files — refactor.md, verify-completion.md, fixture.md all unchanged. No scope creep.
Spec verification
Task: mt#1122 (Option A)
| Criterion | Status | Evidence |
|---|---|---|
Add session_pr_review_context to reviewer tools: |
Met | reviewer.md:4 |
Add session_pr_review_submit to reviewer tools: |
Met | reviewer.md:4 |
Add tasks_spec_get to reviewer tools: |
Met | reviewer.md:4 |
Update description: to reflect review-posting |
Met | reviewer.md:3 — "posting a GitHub review is an allowed write" |
| Body covers Mode 1 + Mode 2 | Met | two protocol sections with distinct Input / Protocol blocks |
| Memory reflects new direct-post flow | Met | feedback_review_before_asking.md:11 |
| No other agent definitions touched | Met | refactor.md, verify-completion.md, fixture.md unchanged |
Documentation impact
No update needed — this is an agent configuration change and a memory update. No architectural docs, theory-of-operation, CONTRIBUTING, or README affected.
(Had Claude look into this — Chinese-wall review retry, AI-assisted)
…ol-name consistency, event-selection fallback
There was a problem hiding this comment.
Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown
Findings
[BLOCKING] .claude/agents/reviewer.md:12–15 vs 49–53 — Contradictory Mode 2 trigger guidance
- At lines 12–13: “Mode 2 if the parent provides a task ID or PR number AND no diff file path.”
- At lines 49–53: “If the parent gives you only a bare PR number, ask the parent to resolve it to a task ID… do not attempt to map PR number to task ID yourself.”
- These two statements conflict. One says a PR number alone (no diff) is sufficient to enter Mode 2; the other says a PR number alone is not sufficient and should be rejected. This ambiguity can cause inconsistent mode selection and execution paths.
[BLOCKING] .claude/agents/reviewer.md:56 and 60 — Mode 2 return semantics conflict with Chinese-wall requirement
- Line 56: “If both … fail, submit a COMMENT review … and return BLOCKING-FOUND.”
- Line 60: “Do not return findings to the parent for posting; post them yourself.”
- Mode 2 is defined to keep findings isolated and to post directly, yet the instruction explicitly tells the agent to “return BLOCKING-FOUND.” That reintroduces a parent-visible signal and is undefined elsewhere (no consumer for a BLOCKING-FOUND sentinel is documented). This is a protocol incoherence that will cause orchestration bugs or leak status across the wall.
[BLOCKING] PR description vs diff — Missing change to memory/feedback_review_before_asking.md
- The PR summary and test plan state the memory file was updated to remove the “main-agent-posts” workaround. The provided diff contains no changes to memory/feedback_review_before_asking.md. This is a spec/diff mismatch and leaves the system-level guidance inconsistent with the new Mode 2 behavior.
[NON-BLOCKING] .claude/agents/reviewer.md:58 — Escalation path for “200+ files” threshold is undefined
- Line 58 instructs: “For large PRs (200+ files), request Mode 1 sectioning from the parent…” There is no defined mechanism in this agent’s protocol for initiating that escalation (who to call, what tool to use, or what output signal triggers the parent to re-dispatch). Without a concrete handshake, this guidance is likely to be ignored by the orchestrator.
[NON-BLOCKING] .claude/agents/reviewer.md:4, 60, 70 — Unvalidated assumptions about MCP tool schemas and parameters
- The doc adopts exact param names for mcp__minsky__session_pr_review_context (“task:”, “sessionId:”, line 49–50) and mcp__minsky__session_pr_review_submit (“task”, “body”, “event”, “comments[]”, line 70) without linking to or validating against the actual tool schema in-repo. If any names differ (e.g., taskId vs task, review_comments vs comments), runtime failures will occur. Please link to the authoritative schema or reflect it verbatim to prevent drift.
[NON-BLOCKING] .claude/agents/reviewer.md:68 — Retry guidance may cause unbounded retries
- “If a submit call is rejected with ‘Review cannot request changes on your own pull request’ or similar, retry with event: ‘COMMENT’.” If the rejection reason differs or the call repeatedly fails for another reason, this could prompt repeated attempts without a bound or backoff strategy. Consider specifying a single retry with a downgrade to COMMENT, then stop.
[NON-BLOCKING] .claude/agents/reviewer.md:4, 12–21 — No technical guard preventing Mode 1 from posting directly
- The posting-capable MCP tools are now on the allowlist for all modes. The doc says “never post directly unless the parent clearly requested Mode 2,” but there’s no technical enforcement (e.g., an execution-time assertion based on input shape). This increases risk of accidental direct posting from a Mode 1 dispatch and system-level incoherence with the aggregator.
[NON-BLOCKING] .claude/agents/reviewer.md:10 — Wording “input shape, not presence of identifiers” conflicts with later identifier-based rules
- The doc claims selection is based on “input shape, not presence of identifiers,” yet Mode 2 relies on presence of a task or PR identifier to trigger. Clarify to avoid implementer confusion.
[NON-BLOCKING] .claude/agents/reviewer.md:65 — Assumption that pr.author is exposed in context response
- The event-selection guidance depends on pr.author being returned by mcp__minsky__session_pr_review_context. If that field isn’t provided, the guidance is unusable. Consider explicitly stating the required fields from the context response or adding a fallback rule.
Spec verification table
- Frontmatter includes all three MCP tools (mcp__minsky__session_pr_review_context, mcp__minsky__session_pr_review_submit, mcp__minsky__tasks_spec_get): Met
- Agent body describes both Mode 1 (sectioning) and Mode 2 (Chinese-wall): Met
- Event-selection guidance for Mode 2 included: Met
- Memory file no longer references the posting gap or mt#1122 workaround: Not Met (no change present in diff for memory/feedback_review_before_asking.md)
- Confirm no other agent definitions or hooks were disturbed: N/A (cannot verify from this diff alone)
- On next bot-authored PR: dispatch reviewer subagent in Mode 2 and confirm it posts directly: N/A (manual/integration step)
Documentation impact
- The contradictions and undefined handshakes in reviewer.md (mode triggers, return semantics, escalation path) require edits to this doc before merging.
- If memory/feedback_review_before_asking.md is intended to steer orchestration away from the old “main-agent-posts” workaround, that doc must be updated in this PR to avoid process-level confusion.
- Consider adding links or references to the authoritative MCP tool schemas to prevent parameter-name drift.
Event: REQUEST_CHANGES
…tocol return signal)
There was a problem hiding this comment.
Incremental review: PR #720 (mt#1122) — commit bde2d6a01
Covers only changes since the prior review (3c109d437).
Changes in this commit
Addresses two legitimate contradictions flagged by the adversarial reviewer:
-
Mode 2 trigger disambiguation — removed "or PR number" from the Mode 2 trigger description. Trigger is now "task ID (or session ID) AND no diff file path." Consistent with the existing Mode 2 Input section that rejects bare PR numbers.
-
Protocol return-signal coherence — removed the "return BLOCKING-FOUND" sentinel from the context-fetch failure path in step 1. Instructions now say "submit a COMMENT review documenting the context-fetch failure, then stop. Do not return findings to the parent — Mode 2 is self-contained." Matches step 5's "post them yourself" directive. No parent-visible signal from Mode 2, per Chinese-wall principle.
Note on unaddressed finding
The bot's other blocking finding — "Spec/diff mismatch: memory file update missing from diff" — is structurally unfixable: feedback_review_before_asking.md lives at ~/.claude/projects/... (Claude user-memory directory), not under git. It cannot appear in a PR diff. The PR body already documents this; the memory file WAS updated (verified by reading it directly). The adversarial reviewer's model has no visibility into paths outside the repo diff, which is a known limitation being tracked separately. This finding will recur on any PR that updates user-level memory alongside repo code.
Documentation impact
No update needed — doc-only fix to reviewer agent definition; no architectural or workflow change beyond the Mode 2 clarifications already in the underlying PR.
(Had Claude look into this — incremental review after bot iteration, AI-assisted)
Dismissed: two legitimate contradictions (Mode 2 trigger, protocol return-signal) addressed in bde2d6a. Third finding (memory-file-not-in-diff) is structurally unfixable — that file lives outside git in the Claude user-memory directory. Documented in PR body.
Summary
The gap: The
revieweragent was defined for the original large-PR sectioning mode (Read, Glob, Grep, Bashonly). The Chinese-wall review practice — introduced later — requires the reviewer to fetch PR context and post findings independently, but the agent lacked the MCP tools to do so. The main agent had to copy-paste findings, breaking the Chinese wall at posting time.Option A chosen: Grant the three MCP tools needed for self-contained Chinese-wall reviews, preserving full context isolation. The reviewer fetches context and posts in one self-contained run without returning findings to the parent.
What changed per file:
.claude/agents/reviewer.md: addedmcp__minsky__session_pr_review_context,mcp__minsky__session_pr_review_submit, andmcp__minsky__tasks_spec_getto thetools:allowlist; updateddescription:; added Mode 2 protocol section (Chinese-wall whole-PR review) alongside the pre-existing Mode 1 (large-PR sectioning); updated event-selection guidance for Mode 2memory/feedback_review_before_asking.md(user global config): updated to reflect reviewer now posts directly — removed reference to mt#1122 gap and the main-agent-posts workaroundTest plan
.claude/agents/reviewer.mdfrontmatter includes all three MCP toolsReviewer-bot false positive note
The minsky-reviewer[bot] flagged a missing
memory/feedback_review_before_asking.mdupdate in the diff. This is a false positive: that file lives in the Claude user-memory directory (~/.claude/projects/-Users-edobry-Projects-minsky/memory/), which is not tracked by git. The update was made, but it cannot appear in any PR diff because it is not a repo-tracked file. Future reviewer-bot runs may flag this class of concern repeatedly unless the bot is taught to distinguish user-memory files from repo-tracked files.🤖 Generated with Claude Code — had Claude implement this