Skip to content

feat(mt#1340): Update reviewer prompt to emit line-anchored comments[] from parsedDiff#886

Merged
edobry merged 7 commits into
mainfrom
task/mt-1340
May 1, 2026
Merged

feat(mt#1340): Update reviewer prompt to emit line-anchored comments[] from parsedDiff#886
edobry merged 7 commits into
mainfrom
task/mt-1340

Conversation

@minsky-ai
Copy link
Copy Markdown
Contributor

@minsky-ai minsky-ai Bot commented Apr 30, 2026

Summary

Updates .claude/agents/reviewer.md and .claude/skills/review-pr/SKILL.md to consume parsedDiff (available from session_pr_review_context per mt#1336) and emit line-anchored comments[] entries for every location-bearing finding, instead of putting them as Markdown text in the review body.

Motivation & Context

Previously, all findings were placed in the review body as Markdown text with file:line references. GitHub's inline comment UI (the red-box thread pinned to specific diff lines) is the primary surface reviewers read — body-only findings are buried and easy to miss. This change routes location-bearing findings through comments[] so they appear inline in the diff.

See mt#1337 for the multi-line range capability (startLine + startSide) that this change leverages.

Design/Approach

  • Reserve the review body for: executive summary, spec-verification table, CI status, cross-cutting concerns, and findings that failed anchor validation.
  • Each inline comment carries a severity prefix ([BLOCKING] / [NON-BLOCKING]) so downstream tooling can classify without parsing the body.
  • Anchor validation is required before submitting: invalid anchors cause GitHub to 422-reject the entire review (all comments, not just the bad one). The protocol is: look up (path, line, side) in parsedDiff before building the entry; if not found, fall back to the body under "Unanchored findings".
  • CONTEXT lines (unchanged context in the diff) must be mapped to LEFT or RIGHT before being used as GitHub anchors — "CONTEXT" is not a valid GitHub side value.

Key Changes

.claude/agents/reviewer.md

  • Mode 2 Protocol: added anchor-validation step (step 4) that walks parsedDiff to confirm each (path, line, side) before building a comments[] entry
  • Added parsedDiff shape section documenting DiffFile, DiffHunk, DiffLine fields relevant to anchor selection
  • Added side-mapping rule table: RIGHT/LEFT/CONTEXT → GitHub side values + which line number field to use
  • Added "Structured finding output and comments[]" section: rule that location-bearing findings MUST be comments[] entries, severity-prefix requirement, ReviewComment interface, GitHub 422-rejection constraint for mismatched startSide/side
  • Added three worked examples: single-line NON-BLOCKING RIGHT, multi-line BLOCKING RIGHT range (with startLine/startSide), single-line BLOCKING LEFT deletion
  • Mode 1 output format: extended to include a comments[] JSON block for parent aggregation
  • Mode 2 review body format: updated to reflect body-vs-comments split (summary, spec table, CI status, cross-cutting, unanchored)
  • Anti-patterns: added CONTEXT side-mapping, anchor validation, and body-only-findings anti-patterns

.claude/skills/review-pr/SKILL.md

  • Step 7: updated comments[] shape to include startLine?/startSide?, added anchor validation requirement and side-mapping table
  • Step 8: updated review body format to match new body-vs-comments split

Testing

This task modifies prompt text for the reviewer subagent — there are no unit tests to run. Verification expectation per spec: the next reviewer-bot review on a non-trivial PR should post anchored comments[] entries for each location-bearing finding, with the body reserved for summary content, spec-verification table, CI status, and cross-cutting concerns.

Ancillary Changes

None — the changes are strictly confined to the two prompt/skill files listed in the spec's "Files of interest."

…] from parsedDiff

- Add parsedDiff shape documentation to reviewer.md (DiffFile/DiffHunk/DiffLine types)
- Add side-mapping rule table (RIGHT/LEFT/CONTEXT -> GitHub side values + line number selection)
- Add structured comments[] output rule: every location-bearing finding must be a comments[] entry
- Reserve review body for executive summary, spec-verification table, CI status, cross-cutting concerns
- Add anchor validation step in Mode 2 Protocol: validate (path, line, side) against parsedDiff before building comment
- Add three worked examples: single-line NON-BLOCKING RIGHT, multi-line BLOCKING RIGHT range, single-line BLOCKING LEFT deletion
- Update Mode 1 output format to include comments[] JSON block alongside findings text
- Update review body format for Mode 2 to reflect new body-vs-comments split
- Update anti-patterns section with CONTEXT side-mapping and anchor validation rules
- Update review-pr/SKILL.md step 7 with matching anchor validation, side-mapping table, and body format update
@minsky-ai minsky-ai Bot added the authorship/co-authored Co-authored by human and AI agent label Apr 30, 2026
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


⚠️ Reviewer did not emit a conclude_review call. Event derived from severity counts: REQUEST_CHANGES (2 BLOCKING / 3 NON-BLOCKING / 0 PRE-EXISTING findings). Executive summary unavailable.

Findings

  • [BLOCKING] .claude/skills/review-pr/SKILL.md:1 — Step 1 does not mention parsedDiff even though later steps require it
    In the section "1. Gather context", the text says the session tool returns "PR metadata, CI check runs, the diff, and the task spec" but omits parsedDiff. However, later guidance in this same file (anchor validation and side-mapping) depends on parsedDiff being present. This is inconsistent with .claude/agents/reviewer.md which explicitly documents parsedDiff availability from mcp__minsky__session_pr_review_context. Please update Step 1 to include parsedDiff in the returned payload to align with the new protocol and avoid implementers missing this requirement.
  • [BLOCKING] .claude/agents/reviewer.md:245 — Malformed markdown fence uses four backticks, breaking the rendered format
    At the start of the "Output format — Mode 1 MANDATORY" example block, the fence opens with four backticks () but closes with four as well inside nested code blocks. The opening fence directly after the sentence "return a structured object..." uses which likely mis-nests with the inner triple-backtick JSON fence. This risks breaking rendering or confusing the agent's parsing. Standardize outer fences to triple backticks and escape or indent inner code blocks appropriately to maintain clarity.
  • [NON-BLOCKING] .claude/agents/reviewer.md:114 — Ambiguity: how to validate multi-line ranges across hunks or when start/end lines are not contiguous
    The anchor validation step describes confirming a DiffLine exists with the target line number, but multi-line comments (with startLine/line) may span only within a single hunk. GitHub disallows ranges across separate hunks. The guidance should state that both start and end anchors must exist within the same DiffHunk and on the same side; otherwise, fallback to single-line or body. Add this constraint to avoid 422 rejections.
  • [NON-BLOCKING] .claude/agents/reviewer.md:169 — Missing guidance for handling deleted files (status: "deleted") during anchor validation
    parsedDiff documents status: "deleted", but the anchor-validation and side-mapping sections don't state that only LEFT-side anchors are possible for deleted files. Adding a note to skip RIGHT anchors and to validate against oldLine only would prevent invalid anchors on deleted files.
  • [NON-BLOCKING] .claude/agents/reviewer.md:81 — Potential sequencing confusion between steps 4–6 in Mode 2
    You added step 4 (Anchor-validate findings), step 5 (Verify against task spec), step 6 (Post). However, practical flow is: analyze diff → produce findings → verify anchors → build comments → select event → post. Consider adding an explicit step ordering note that anchor validation applies to each location-bearing finding just before constructing comments[], to avoid agents performing validation too early or losing association with built findings.

Inline comments

  • .claude/agents/reviewer.md:233 — Consider explicitly stating that file.oldPath should be used for evidence text when commenting on renames, but anchors must target file.path. Right now we note oldPath exists but don't guide how to handle rename anchors.
  • .claude/skills/review-pr/SKILL.md:145 — Nit: In the tool signature example, consider clarifying that comments is omitted entirely when empty to reduce payload size, and that default side is RIGHT if omitted.

Reviewer-bot finding round 1: 2 BLOCKING in the reviewer prompt and skill.

1. SKILL.md step 1 listed only metadata, CI checks, diff, and task spec
   in the session_pr_review_context payload. Added parsedDiff (the
   hunks structure used for line-anchored comment selection) and
   reviewThreads (existing thread state) to the description.

2. The Mode 1 output template in reviewer.md used a 4-backtick outer
   fence that closed prematurely, leaving the trailing Checked-and-clear
   section outside the fence and an orphan opening 4-backtick after it.
   Switched to a 5-backtick outer markdown fence that contains the
   inner 3-backtick json fence cleanly with a single close. Added
   guidance that implementers copy the inside of the outer fence.
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


⚠️ Reviewer did not emit a conclude_review call. Event derived from severity counts: REQUEST_CHANGES (3 BLOCKING / 3 NON-BLOCKING / 0 PRE-EXISTING findings). Executive summary unavailable.

Findings

  • [BLOCKING] .claude/agents/reviewer.md:1 — Mode 1 requires emitting comments[] but provides no parsedDiff or anchor-validation path, risking invalid anchors
    In the Mode 1 Protocol (step 4) you mandate returning a comments[] array with line-anchored entries for every location-bearing finding, and the Mode 1 output format adds a comments[] block. However, Mode 1 inputs only provide a raw diff file path and line range; there is no parsedDiff available in Mode 1, nor any guidance on how to validate or even derive GitHub-valid (path, line, side) anchors from a diff chunk. This is inconsistent with the Mode 2 requirement to validate anchors against parsedDiff to avoid GitHub 422 failures. As written, Mode 1 reviewers may fabricate anchors that the parent later posts — a single bad anchor will 422-reject the entire review. Please either: (a) add parsedDiff to Mode 1 inputs and require anchor validation there, or (b) change Mode 1 to emit only body findings or provisional file:line references and delegate anchor construction/validation to the parent aggregator.
  • [BLOCKING] .claude/agents/reviewer.md:126 — ReviewComment schema omits required GitHub path semantics for renamed files and does not clarify oldPath usage
    The parsedDiff section documents DiffFile.oldPath for renames, but the ReviewComment interface and guidance require path to "match DiffFile.path" without specifying behavior for renamed files. GitHub's review comment API expects the current filename for RIGHT-side anchors and the previous filename for LEFT-side anchors on renames; using only DiffFile.path for both can produce invalid anchors or attach to the wrong side. Provide explicit rules: for LEFT-side (deletions/old version) on renames, path must be the oldPath; for RIGHT-side (additions/new version), use path. Include examples for a renamed file to prevent 422s or misplaced comments.
  • [BLOCKING] .claude/agents/reviewer.md:152 — Severity-prefix rules exclude PRE-EXISTING despite being a supported classification elsewhere
    This doc establishes severity classes including PRE-EXISTING (see "Severity classification"), and mandates that "Every location-bearing finding MUST be emitted as a comments[] entry." However, the inline comment body rule only permits two prefixes — "[BLOCKING]" or "[NON-BLOCKING]" — with no guidance for PRE-EXISTING. This creates an enforcement gap: PRE-EXISTING findings with concrete locations either violate the MUST-anchor rule or lack a valid prefix for downstream tooling. Please either: (a) introduce and document a [PRE-EXISTING] prefix for inline comments, or (b) explicitly exempt PRE-EXISTING from comments[] and require they go in the body under a separate section.
  • [NON-BLOCKING] .claude/agents/reviewer.md:83 — Anchor-validation step lacks handling for OUTDATED hunks and previously-commented threads
    You require validating (path, line, side) against parsedDiff.hunks[].lines[], but real-world reviews often need to reference lines that GitHub marks as OUTDATED after subsequent pushes. The SKILL.md mentions "existing review threads with resolved/outdated state" in the context payload, but the agent doc doesn't mention avoiding duplicate inline comments or gracefully moving a finding to body if the only matching line is in an outdated hunk. Consider adding: (1) de-duplication against existing open threads; (2) fallback behavior when a target line is no longer in the latest diff due to force-push (move to Unanchored findings rather than attempting a comment that will 422).
  • [NON-BLOCKING] .claude/skills/review-pr/SKILL.md:19 — Step 1 adjusted to include parsedDiff, but fallback path does not reconstruct parsedDiff for anchor validation
    You updated Step 1 to state the session call returns the structured parsedDiff, but the fallback path (direct GitHub reads) only mentions metadata, check runs, and changed files — not reconstructing a parsed diff. Later steps require anchor validation against parsedDiff before emitting comments[]. Without a reconstruction step, reviews executed via fallback cannot safely emit anchors and risk 422 rejection. Suggest adding: when falling back, either (a) do not emit comments[] and put findings in the body under Unanchored, or (b) build a parsedDiff structure from the raw diff (unified format) before proceeding.
  • [NON-BLOCKING] .claude/agents/reviewer.md:103 — CONTEXT mapping guidance may mislead on unchanged lines lacking newLine/oldLine in truncated hunks
    You advise mapping CONTEXT lines to LEFT or RIGHT and using newLine or oldLine accordingly. Some diff parsers omit newLine for leading/trailing context beyond the displayed hunk window or collapse large context blocks. If parsedDiff carries a warning for truncated paths (you mention skipping such files), clarify behavior for hunks with elided context ranges: only anchor to context lines that are explicitly present in hunks[].lines[]; do not synthesize line numbers for omitted context.

Inline comments

  • .claude/agents/reviewer.md:164 — Nit: The worked JSON examples include backticked identifiers (e.g., resolveSession) inside a JSON string. While valid, it risks copy/paste confusion. Consider escaping backticks or using plain quotes to avoid readers thinking it's Markdown code fencing.

Round 2 surfaced 3 architectural BLOCKINGs in reviewer.md plus 1 in
SKILL.md. Round 3 ships an interim Mode 1 reshape (raw observations
from subagents, parent aggregator validates anchors), with the proper
architectural reshape filed as mt#1485 follow-up to mt#1340.

1. Mode 1 anchor-validation paradox (reviewer.md step 4 + Output format
   section + SKILL.md proportionality rule + Anti-patterns):
   Mode 1 subagents previously had to emit a fully-anchored comments
   array with no parsedDiff to validate against. Reshaped Mode 1 to
   emit raw observations only; the parent aggregator (which holds
   parsedDiff, spec, CI, body) validates each anchor, dedupes, assigns
   severity, picks event, writes body, posts. The deeper architectural
   reshape (subagents as readers, parent as judge) lives in mt#1485.

2. Renamed-file path semantics: added a Renamed files subsection to
   the comments parameter shape. RIGHT-side anchors use DiffFile.path
   (current filename); LEFT-side anchors on renames use DiffFile.oldPath
   (previous filename). Validate against newLine vs oldLine accordingly.

3. PRE-EXISTING prefix gap: added PRE-EXISTING to the Severity
   classification block with explicit body-only routing, and updated
   the inline-comment body rule to state only BLOCKING and NON-BLOCKING
   are valid inline prefixes. PRE-EXISTING never goes into comments.

The bot also raised a NON-BLOCKING about graphql in the prior round
that was a false positive on the Octokit v22 surface.
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


⚠️ Reviewer did not emit a conclude_review call. Event derived from severity counts: REQUEST_CHANGES (4 BLOCKING / 2 NON-BLOCKING / 0 PRE-EXISTING findings). Executive summary unavailable.

Findings

  • [BLOCKING] .claude/agents/reviewer.md:1 — Anchor-validation lookup ignores oldPath for LEFT-side anchors on renames
    In the Mode 2 Protocol step 4 (Anchor-validate findings), the first bullet says: “Find the DiffFile in parsedDiff where file.path === path (skip warning-flagged files — file.warning set).” For LEFT-side anchors on renamed files, your own “Renamed files” section instructs reviewers to use DiffFile.oldPath for the comment path. If validation only matches on file.path, it will fail to locate the DiffFile when given oldPath, causing legitimate LEFT anchors on renames to be treated as unanchorable (and demoted to the body) or, worse, lead to mismatched anchors. The lookup must consider file.oldPath === path for LEFT-side anchors (and possibly both path and oldPath generically) to be consistent with the rename rules.
  • [BLOCKING] .claude/agents/reviewer.md:120 — Mode 1 “raw observations” protocol conflicts with top-of-file description claiming direct posting via submit tool
    At the header description (lines 1–12), the agent is described as “posts findings directly via mcp__minsky__session_pr_review_submit,” which suggests direct posting is the default capability. However, the Mode 1 Protocol (line ~48 onward) explicitly states Mode 1 subagents must NOT commit anchored comments[] and only emit raw observations for a parent aggregator to validate/post. This inconsistency can cause subagents to incorrectly post directly in Mode 1, bypassing the intended aggregation and anchor validation. The top-level description should be narrowed (e.g., “posts directly in Mode 2; Mode 1 returns observations to parent”) or the Mode selection rules should be clarified to avoid contradictory guidance.
  • [BLOCKING] .claude/agents/reviewer.md:110 — Multi-line anchor validation omits startLine/startSide checks, risking GitHub 422 rejections
    In Mode 2 step 4 (Anchor-validate findings), the procedure only verifies that “a DiffLine exists with the target line number,” which covers the end line, but it does not require validating startLine/startSide for multi-line comments. GitHub enforces that both endpoints of the range exist on the specified side and that startSide === side; a missing or invalid startLine will 422-reject the entire review even if line is valid. The validation protocol should explicitly require: (1) confirming a DiffLine exists for both startLine and line on the chosen side, (2) ensuring all lines in the inclusive range exist (or at least both endpoints) within the same file, and (3) asserting startSide === side before building comments[].
  • [BLOCKING] .claude/skills/review-pr/SKILL.md:108 — Parent aggregator instructions omit rename path rule when validating Mode 1 subagent anchors
    In the Mode 1 aggregator paragraph (around lines 60–76) and the Anchor validation section (lines ~108–128), the guidance says to match file.path === path and then validate newLine/oldLine based on side. However, the “Renamed files” rule exists only in the agent doc, not in this skill. Without stating that LEFT-side anchors on renames must use oldPath while RIGHT uses path, the parent may reject valid subagent anchors or attach to wrong filenames. Please add a ‘Renamed files’ subsection mirroring the agent guidance so the parent’s validation logic handles oldPath correctly.
  • [NON-BLOCKING] .claude/agents/reviewer.md:83 — Ambiguity on handling DiffFile.warning files: skip entirely or allow body-only findings?
    Step 4 says “skip warning-flagged files — file.warning set.” The Structured output section later says that findings failing anchor validation should go to the body under “Unanchored findings.” It would help to clarify whether reviewers should still analyze changes in warning-flagged files (and emit body-only findings) versus skipping analysis altogether. Today’s phrasing could be interpreted as not reviewing those files at all, which would reduce coverage.
  • [NON-BLOCKING] .claude/skills/review-pr/SKILL.md:19 — Step 1 claims existing review threads are returned by session_pr_review_context without documenting shape or how to use them
    The updated Step 1 adds “existing review threads with resolved/outdated state” to the returned payload, but the rest of the skill does not document the structure of those threads or instruct how to use them (e.g., avoiding duplicate comments, linking to previous discussions). Either remove the claim or add a brief section detailing the relevant fields and how to de-duplicate or reference prior threads when posting new comments[].

Inline comments

  • .claude/agents/reviewer.md:64 — Consider adding an explicit note in Mode selection that “Mode 1 never posts to GitHub; Mode 2 always posts,” to prevent subagents from inferring they can post directly when they happen to also have a task ID in context.

Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


⚠️ Reviewer did not emit a conclude_review call. Event derived from severity counts: REQUEST_CHANGES (3 BLOCKING / 3 NON-BLOCKING / 0 PRE-EXISTING findings). Executive summary unavailable.

Findings

  • [BLOCKING] .claude/agents/reviewer.md:120 — Anchor-validation step fails for LEFT-side anchors on renamed files (matches only file.path, ignores oldPath)
    In the Mode 2 Protocol anchor-validation step, bullet 1 says: “Find the DiffFile in parsedDiff where file.path === path (skip warning-flagged files — file.warning set).” For LEFT-side anchors on renamed files, the correct anchor path is DiffFile.oldPath per your own “Renamed files” section. Restricting lookup to file.path will incorrectly treat valid LEFT-side anchors that use oldPath as missing and demote them to the body or, worse, tempt implementers to use the wrong path. Update the validation logic to match either file.path or file.oldPath depending on side (RIGHT ⇒ path; LEFT ⇒ oldPath). Cite: this file’s “Anchor-validate findings” step vs. the later “Renamed files” guidance.
  • [BLOCKING] .claude/skills/review-pr/SKILL.md:113 — Anchor validation instructions omit renamed-file path selection (oldPath vs path) leading to 422 or mis-anchored comments
    In Step 7 under “Anchor validation before submitting,” the 3-step checklist validates only by matching DiffFile where file.path === path and checking newLine/oldLine. There is no guidance for renamed files where LEFT-side anchors must use DiffFile.oldPath and RIGHT-side anchors must use DiffFile.path. Without this, valid LEFT-side anchors on renames will fail validation or lead authors to use the wrong path. Mirror theReviewer agent’s “Renamed files (DiffFile.oldPath !== DiffFile.path)” section: choose path based on side (RIGHT ⇒ DiffFile.path; LEFT ⇒ DiffFile.oldPath) and validate against the corresponding line number space.
  • [BLOCKING] .claude/agents/reviewer.md:55 — Mode selection rule conflicts with Mode 1 prohibition on posting comments[] directly; ambiguity risks subagents posting without validation
    Top section states: “When ambiguous, default to Mode 1 (never post directly unless the parent clearly requested Mode 2).” But Mode 1 Protocol step 4 now says Mode 1 subagents do NOT commit anchored comments[] and should return raw observations for the parent to aggregate. There’s no explicit prohibition against Mode 1 subagents calling mcp__minsky__session_pr_review_submit, which could happen when an agent misparses inputs. Add an explicit guard: in Mode 1, you MUST NOT call submit; return observations only. Include a short 'Do not post in Mode 1' rule to remove ambiguity.
  • [NON-BLOCKING] .claude/agents/reviewer.md:170 — Side-mapping guidance for CONTEXT lines lacks an explicit rule for multi-line ranges spanning mixed CONTEXT/RIGHT/LEFT lines
    The “Side-mapping rule” advises mapping CONTEXT to LEFT or RIGHT and which line number to use, but multi-line ranges can span both addition and context lines (e.g., startLine on a CONTEXT line, end line on a RIGHT line). GitHub requires startSide == side; an anchor that straddles differing sides is invalid. Recommend adding guidance: for ranges, ensure all covered lines share the same effective side; if not, split into multiple comments. Also advise validating that each line from startLine..line exists with the same side in parsedDiff before constructing range anchors.
  • [NON-BLOCKING] .claude/agents/reviewer.md:94 — Mode 2 step order may cause body demotion before spec verification; clarify that spec verification happens regardless of anchor status
    You moved anchor validation to step 4 and spec verification to step 5. The narrative says “If no matching line exists, record the finding in the review body instead of comments[].” Please clarify that spec verification (step 5) is independent of anchor status and must still be performed before posting. Otherwise, an implementer might short-circuit once anchors fail and skip spec verification, which is mandated later. A one-line clarification in step 4 or 5 would resolve this ambiguity.
  • [NON-BLOCKING] .claude/skills/review-pr/SKILL.md:91 — Coverage gate lacks explicit instruction to include coverage numbers in the posted body
    The coverage requirement states: “State explicitly: 'Coverage: X/Y files reviewed.' If coverage is not 100%, do NOT post.” However, the Review body format (section 8) doesn’t include a place to surface the coverage figure, risking omission from the posted review. Suggest adding a 'Coverage' line in the body template under Summary, e.g., 'Coverage: 23/23 files reviewed' to make enforcement observable to readers.

Spec verification

Criterion Status Evidence
Reserve the review body for summary/spec/CI/cross-cutting; emit line-anchored comments[] for every location-bearing finding. Met .claude/agents/reviewer.md:188-233 mandates all location-bearing findings be in comments[] with severity prefixes; .claude/skills/review-pr/SKILL.md:146-179 and 8. Review body format sections reinforce body-vs-comments split.

Round 3 surfaced 4 BLOCKINGs across two consecutive bot reviews on the
same commit (55530ba) at 18:53 and 18:58 UTC. Findings overlap but
are distinct; addressing all four:

1. LEFT-side rename anchor lookup ignored oldPath (reviewer.md Mode 2
   step 4 + SKILL.md anchor validation block):
   The Renamed files subsection added in round 2 says LEFT-side anchors
   on renames use DiffFile.oldPath, but the validation steps still
   matched only on file.path === path. RIGHT anchors use path; LEFT
   anchors match path OR oldPath. Updated both docs to reflect the
   asymmetric lookup.

2. Top-of-file frontmatter description claimed posts findings directly
   for the agent in general, conflicting with the Mode 1 reshape.
   Updated description to say Mode 2 posts directly; Mode 1 returns
   raw observations and MUST NOT call submit. Added an explicit Mode 1
   hard-guard paragraph to the Mode 1 Protocol section so subagents
   that have a task ID in context do not reach for the submit tool.

3. Multi-line anchor validation only checked the line endpoint, not
   startLine. Updated both reviewer.md and SKILL.md anchor-validation
   blocks to require: startLine exists on the same side, both endpoints
   fall within the SAME DiffHunk (GitHub 422s ranges that span hunks),
   and startSide === side.

4. SKILL.md anchor-validation block missing rename oldPath rule. Mirrored
   reviewer.md change so the parent aggregator picks the right field for
   each side.

Bot also asked for OUTDATED-hunk handling, dedup against existing open
threads, and warning-flag analysis-vs-skip clarification (NON-BLOCKING).
Skipping for round 3; can roll into mt#1485 if surfaced again.
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


Solid progress: the prompts now validate anchors against parsedDiff and route location-bearing findings into comments[], with clear side-mapping and worked examples. However, I found blocking issues: (1) incorrect guidance stating oldPath === path for non-renames though oldPath is absent when not renamed; (2) missing explicit handling for deleted files (LEFT-only anchors and correct path usage); and (3) a mode-selection default that can deadlock posting (defaults to Mode 1 with a hard no-submit guard even when a task/session ID implies Mode 2). I also noted two non-blocking clarifications around side derivation before lookup and stricter multi-line range validation. Please address these before merge.

Findings

  • [BLOCKING] .claude/agents/reviewer.md:1 — Contradictory guidance: claims oldPath === path for non-renames, but oldPath is only set for renames
    In the "Renamed files (DiffFile.oldPath !== DiffFile.path)" section, the text states: "For non-renames, oldPath === path and the distinction is moot." However, the parsedDiff interface above defines oldPath?: string; // only set for renamed files. For non-renamed files, oldPath is undefined, not equal to path. This is incorrect guidance and can mislead implementers constructing anchors or validators to expect an oldPath on non-renamed files. Please correct the statement to reflect that oldPath is absent (undefined) for non-renamed files, and that for LEFT-side anchors on non-renamed deletions/modifications, path should be used (with oldLine for line selection).
  • [BLOCKING] .claude/skills/review-pr/SKILL.md:1 — Missing handling for deleted files in anchor validation and side/path selection
    The updated anchor-validation and side-mapping rules cover RIGHT and LEFT, and renames via oldPath, but the guidance omits explicit handling for files with status: "deleted". For a pure deletion, GitHub only permits LEFT-side anchors, and the correct comments[].path must still be the current PR path value GitHub exposes for the deletion (often the old path). The instructions should: (a) call out that deleted files accept only LEFT-side anchors (no RIGHT since there's no new file), (b) clarify which path value to use for deletions in parsedDiff (for non-renames, use file.path; for renames-deleted scenarios, use oldPath), and (c) ensure the validation step checks that the file status permits the chosen side to avoid 422s. As written, practitioners could try a RIGHT-side anchor on a deleted file or choose the wrong path field.
  • [BLOCKING] .claude/agents/reviewer.md:61 — Mode selection default conflicts with Mode 1 hard guard and could cause non-posting when parent expects Mode 2
    At the top, "When ambiguous, default to Mode 1 (never post directly unless the parent clearly requested Mode 2)." Combined with the new Mode 1 hard guard (never submit), this risks deadlock: if the parent accidentally omits the diff file path yet intends a whole-PR review (Mode 2) but passes a PR number or task ID along with some context, the agent will default to Mode 1 and refuse to post, contrary to the PR’s stated goal of having the reviewer post inline comments in Mode 2. Consider tightening the mode-selection rule to: if a task ID or session ID is provided and no diff path is provided, treat it as Mode 2; only default to Mode 1 when a diff path is explicitly present. The current text creates ambiguity and inconsistent behavior.
  • [NON-BLOCKING] .claude/agents/reviewer.md:65 — Ambiguity: lookup rule implies side known before validation, but side often derives from target line
    In Mode 2 step 4 (Anchor-validate findings), the lookup algorithm branches on LEFT vs RIGHT to decide whether to match file.path vs file.oldPath. However, in practice, when mapping a human-provided file:line without explicit side, you determine side by searching the file's hunks for that line number (newLine vs oldLine) or by examining DiffLine.side including CONTEXT mapping. The doc should clarify whether the finding already carries a side (and if so, how it was determined), or provide a side-derivation step before file lookup. Otherwise, implementers may perform a LEFT-only or RIGHT-only lookup prematurely and misclassify valid anchors as unanchored.
  • [NON-BLOCKING] .claude/agents/reviewer.md:120 — Range validation requires more than endpoints-in-same-hunk; GitHub also requires contiguous coverage
    The multi-line range validation says to "confirm a DiffLine exists at startLine on the same side, AND that both endpoints fall within the SAME DiffHunk." In practice, GitHub's review API will 422 if the selected start/end lines are not both present on that side within the same hunk, but there are pathological cases where lines are present but the span crosses gaps (e.g., only context lines with missing side-specific lines). Consider tightening guidance: require that every line between startLine and line (inclusive) is representable on the chosen side within that hunk, or at least that both endpoints correspond to actual DiffLine entries with non-null oldLine/newLine and the span doesn't include an interleaved opposite-side-only region. Add a caution and recommend preferring single-line comments when in doubt.
  • [NON-BLOCKING] .claude/skills/review-pr/SKILL.md:139 — Event selection guidance may still produce 422 on self-approval attempts; suggest pre-check
    The skill notes GitHub blocks self-approval and advises fallback to COMMENT if the submit call is rejected. To reduce noisy retries, suggest adding a pre-check using the author identity from session_pr_review_context.pr.author vs the token identity, and selecting COMMENT up front if they match. This avoids an intentional 4xx and simplifies logs. The reviewer.md mentions a similar rule; mirroring it here would align guidance across files.

Inline comments

  • .claude/agents/reviewer.md:97 — Consider documenting how to handle files with status: "added" vs "deleted". For added files, only RIGHT-side anchors make sense; for deleted files, only LEFT-side. Making this explicit alongside the side-mapping would prevent invalid anchor attempts.

@edobry
Copy link
Copy Markdown
Owner

edobry commented Apr 30, 2026

@minsky-reviewer please re-review — fixes pushed addressing prior round's findings.

Three fixes for the round-4 reviewer-bot findings on commit db9241e:

1. Mode-selection deadlock (reviewer.md frontmatter section).
   Round 3 introduced a Mode 1 hard-guard against direct posting plus
   When ambiguous, default to Mode 1. The bot flagged that combination
   could deadlock a Mode 2 review when the parent supplied a task ID
   without a diff path. Replaced the ambiguity rule with an explicit
   disambiguation: the diff-file-path slot is the discriminant. Diff
   path present = Mode 1 regardless of other identifiers; diff path
   absent + task or session ID present = Mode 2 (must post). Only halt
   on misconfiguration when neither is present.

2. oldPath-equals-path doc claim (Renamed files subsection).
   Bot correctly flagged that DiffFile.oldPath is only set for renames;
   on non-renamed files it is undefined, not equal to path. Replaced
   the claim with the accurate statement: for non-renamed files,
   oldPath is absent and both sides use path.

3. Deleted-file handling (new File status subsection in reviewer.md
   plus a new step 2 in SKILL.md anchor-validation block).
   Bot flagged missing explicit guidance for status: deleted (LEFT-only,
   path not oldPath since deletions are not renames) and status: added
   (RIGHT-only). Added a File status subsection enumerating each
   status value's valid sides and path-source. Added a status-permits-
   side check as step 2 in the SKILL.md validation sequence.

Skipping NON-BLOCKINGs: side-derivation-before-lookup (validation
is always given a side; finding emission already produces it),
contiguous-coverage stricter range validation (the same-hunk check
covers the practical case), and pre-event-check on author identity
(the reviewer.md already mentions it; mirroring to SKILL.md is doc-only).
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


The PR significantly improves parsedDiff-driven inline comments with clear Mode 1 vs Mode 2 responsibilities and robust validation guidance. However, there are blocking inconsistencies: the LEFT-side lookup permits using the current path on renames (contradicting the renamed-files rule) and the deleted-file guidance is overly absolute, risking invalid anchors and 422 rejections. Minor clarity gaps also exist around parent-side range validation in Mode 1 aggregation and coverage gate fallbacks. Please tighten LEFT/rename and deleted-file rules and add the noted clarifications before merging.

Findings

  • [BLOCKING] .claude/agents/reviewer.md:1 — LEFT-side anchor lookup allows file.path === path on renames, contradicting later rule that LEFT must use oldPath — risks 422 rejections
    In the Mode 2 Protocol anchor-validation step, the LEFT-side lookup says: “LEFT-side anchor: file.path === path OR file.oldPath === path. For renamed files (DiffFile.oldPath !== DiffFile.path), LEFT anchors must use oldPath per the Renamed files rule below — so the lookup must consider both fields.” Allowing file.path === path to validate a LEFT anchor on a renamed file contradicts the later “Renamed files” section which mandates using oldPath for LEFT-side anchors. This ambiguity can validate a LEFT anchor using the current filename, which GitHub will 422-reject. Tighten the validation to: for renames, accept LEFT only when path === oldPath; for non-renames, oldPath is undefined so path is the only valid match. Please update the LEFT-side lookup bullet to disallow file.path === path on renames.
  • [BLOCKING] .claude/agents/reviewer.md:129 — Deleted files guidance conflicts: claims deletions are not renames and oldPath undefined, but status:"deleted" can be result of rename+delete
    Under “File status (DiffFile.status)”, the bullet for "deleted" states: “Use DiffFile.path (deletions are not renames; oldPath is undefined).” However, earlier the doc models renames explicitly with status: "renamed". In real diffs, a file move+delete can present as a rename in one file and a deletion for another path depending on GitHub API representation. More importantly, the rule here removes necessary nuance: for a pure deletion event within a rename sequence, GitHub expects anchors on the pre-image path. The safe rule is: for status: "deleted", only LEFT-side anchors are valid; use the filename GitHub exposes for that deletion in DiffFile.path, but do not assert that deletions can never have oldPath or that they are “not renames.” The current absolute statement can mislead implementers and cause 422s on repositories/servers that populate oldPath for deletions. Please soften and clarify the guidance and ensure the SKILL.md mirrors it.
  • [BLOCKING] .claude/skills/review-pr/SKILL.md:119 — Deleted file handling still ambiguous: path selection and rename interplay not crisply defined for LEFT-side anchors
    In the Anchor validation block (step 7), the bullet for status: "deleted" says: “only LEFT anchors valid (no post-image). Use DiffFile.path (deletions are not renames).” This mirrors the agents doc’s problematic absolutism. In practice, GitHub’s parsed diff for deletions may include both path and oldPath fields, and the path exposed to comments must match the filename shown in the PR for that deletion. For repositories or providers that set oldPath on delete entries, instructing implementers to always ignore oldPath can lead to mismatched anchors and 422s. Tighten to: “deleted: only LEFT; use the path value present on the deletion entry as shown in the diff UI; if both path and oldPath exist, prefer the one that matches the UI badge (usually path), and validate by checking file.hunks[].lines[].oldLine.” Also explicitly state that RIGHT anchors must be rejected at validation-time to avoid constructing invalid comments.
  • [NON-BLOCKING] .claude/agents/reviewer.md:94 — Missing explicit instruction to reject multi-line ranges spanning different hunks in Mode 1 parent aggregator section
    You correctly state in Mode 2 step 4 that GitHub 422s ranges that span hunks and both endpoints must be within the same DiffHunk. However, in the Mode 1 output/parent-aggregation narrative (lines ~70–110 and the Mode 1 Output section), you don't restate the parent-side obligation to check ranges are within a single hunk when constructing comments[]. Since Mode 1 agents do not have parsedDiff and the parent aggregates across slices, this validation belongs prominently in SKILL.md (parent responsibilities) and as a reminder here under Mode 1. Suggest adding a bullet to the Mode 1 parent aggregator description: “When converting provisional anchors to comments[], verify multi-line ranges do not cross hunks; if they do, demote to body.”
  • [NON-BLOCKING] .claude/skills/review-pr/SKILL.md:77 — Coverage gate lacks enforcement mechanism or fallbacks when agents report partial coverage or fail — risk of deadlock or silent undercoverage
    The Coverage gate mandates 100% coverage before posting but does not specify what to do when dispatched Mode 1 agents time out, crash, or return partial observations. Without a retry/escalation rule, the parent may either violate the gate or deadlock. Suggest adding: retry failed slices once; if still failing, either (a) reassign to remaining agents, or (b) fall back to single-agent Mode 2 for remaining files, with explicit “Coverage: X/Y files” and do not post until 100% is attained. Also log/report when coverage gating prevents posting to avoid silent stalls.

Inline comments

  • .claude/agents/reviewer.md:63 — Nice improvement on the disambiguation rule — this addresses the earlier deadlock risk where Mode 1 was the default. Consider also adding a one-line reminder under Mode 2 Input that a bare PR number should be rejected unless the parent provides a task/session ID, to prevent accidental Mode selection drift.
  • .claude/agents/reviewer.md:115 — The DiffLine interface includes content. Since we now instruct to skip warning-flagged files, should we also caution that content may be truncated for very large lines or binary patches and thus not suitable for textual evidence? Might help prevent agents from relying on content for validation vs. just for context.

edobry added 2 commits May 1, 2026 09:53
Round 5 surfaced 3 BLOCKING + 2 NON-BLOCKING. Of these, two are
substantive and addressed here; the rest are reviewer self-reversal
or out-of-scope and are intentionally not addressed (see below).

Substantive fixes:

1. LEFT-side anchor lookup on renames (R5 BLOCKING #1).
   Prior text said: LEFT-side anchor matches file.path === path OR
   file.oldPath === path, with a follow-up note that renames must
   use oldPath. The OR formulation lets a renamed files LEFT anchor
   incorrectly succeed against the post-rename name. Tightened the
   lookup to be conditional on rename status: renames match oldPath
   only; non-renames match path only. Mirrored in SKILL.md.

2. Mode 1 parent-aggregator same-hunk reminder (R5 NON-BLOCKING #4).
   The same-hunk constraint for multi-line anchors was specified for
   Mode 2 only. Added an explicit reminder that the parent aggregator
   constructing comments from Mode 1 subagent observations must apply
   the same constraint, demoting span-hunks ranges to body.

Intentionally not addressed (reviewer self-reversal — bikeshedding):

3. R5 BLOCKING #2 + #3 reverse R3/R4: round 3 + 4 reviewer accepted
   the deletions are not renames, oldPath undefined framing as the
   correct fix to a prior BLOCKING. Round 5 now flags that exact
   wording as overly absolute. Per memory feedback_reviewer_bot_self_reversal_signal,
   that is the convergence-via-bypass signal. The current wording is
   technically correct for the GitHub diff representation we observe.

4. R5 NON-BLOCKING #5 (coverage-gate fallback when subagents fail) is
   genuinely out of scope for this prompt-shape PR. Tracked separately
   if it becomes a real failure mode.

Status: substantive fixes landed; bypass-merging after this push per
feedback_bot_pr_convergence_via_bypass since the PR has had ~17
findings addressed across 5 rounds and is now in self-reversal territory.
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


Strong progress toward anchor-validated inline comments, but deletion/rename guidance is overly absolute and inconsistent across docs, risking GitHub 422s. reviewer.md asserts deleted files are “not renames” and to use only path; SKILL.md repeats it — both need softening to reflect providers that supply oldPath for deletions and to direct anchoring to the displayed filename. Also, reviewer.md’s “every location-bearing finding MUST be comments[]” conflicts with the PRE-EXISTING carve-out, and the Mode 2 algorithm should explicitly include status→side validation to match SKILL.md. Please address these blocking issues; minor clarity nits and examples are noted inline.

Findings

  • [BLOCKING] .claude/agents/reviewer.md:1 — Deleted-file guidance asserts 'deletions are not renames; oldPath is undefined' — risks wrong anchors and 422s
    In the File status guidance under “deleted”, the text says: “only LEFT-side anchors… Use DiffFile.path (deletions are not renames; oldPath is undefined).” This is overly absolute and contradicts real-world diffs where servers/tooling may populate both path fields for deletions and where rename+delete sequences can appear. The safe rule is: status:"deleted" → only LEFT-side anchors are valid; anchor to the filename GitHub exposes for that deletion entry, and do not assert that oldPath is always undefined or that deletions are “not renames.” The current statement can mislead implementers and cause anchor/path selection errors that 422-reject the entire review. Please soften and clarify this section to avoid forbidding valid oldPath usage and ensure SKILL.md mirrors it.
  • [BLOCKING] .claude/skills/review-pr/SKILL.md:1 — SKILL.md repeats absolute 'deleted files are not renames' rule; path selection for LEFT anchors on deletions remains ambiguous
    In Step 7 Anchor validation (posting section), bullet 2 for status:"deleted" states: “only LEFT anchors valid (no post-image). Use DiffFile.path (deletions are not renames).” This mirrors the agents doc’s problematic absolutism. Some providers include oldPath for deletions, and the display filename for the deletion entry is what GitHub expects in comments[].path. Over-constraining to path-only and declaring 'not renames' risks bad guidance and 422s when oldPath is necessary. Please clarify: for deleted files, only LEFT anchors; use the filename presented for that deletion in parsedDiff; do not assert oldPath is always absent; and include an example of a deletion anchor. Ensure consistency with reviewer.md.
  • [NON-BLOCKING] .claude/agents/reviewer.md:1 — Inconsistency: 'Every location-bearing finding MUST be comments[]' conflicts with later rule that PRE-EXISTING must not be inline
    The 'Structured finding output and comments[]' section states: “Every location-bearing finding MUST be emitted as a comments[] entry,” but the 'Severity classification' section explicitly says PRE-EXISTING findings are never emitted inline and must go in the body. This is contradictory guidance for implementers. Recommend amending the MUST statement to carve out the PRE-EXISTING exception (e.g., “Every location-bearing PR-introduced finding…”) and add PRE-EXISTING to the list of body-reserved categories to keep the rules coherent.
  • [NON-BLOCKING] .claude/agents/reviewer.md:85 — Anchor validation step omits explicit check that chosen side is permitted by DiffFile.status before line lookup
    In Mode 2 Protocol step 4, the bullet sequence validates file match and line existence, and later a separate File status section describes side constraints. However, the step-by-step algorithm does not explicitly instruct to reject RIGHT anchors on deleted files or LEFT on added files before scanning hunks, which could lead to unnecessary iteration or subtle mistakes. Suggest adding an explicit sub-step: verify DiffFile.status permits the chosen side (added→RIGHT only; deleted→LEFT only; modified/renamed→both) prior to line lookup, mirroring SKILL.md’s posting step which includes this check.

Inline comments

  • .claude/agents/reviewer.md:50 — Minor wording nit: “Mode 1 subagents do NOT commit anchored comments[] — section subagents lack parsedDiff (which is whole-PR)” — consider noting explicitly that the parent aggregator must also validate multi-line range endpoints lie within the same hunk to avoid 422s, to mirror the later range constraint guidance.
  • .claude/agents/reviewer.md:164 — Edge case suggestion: for multi-line ranges, also clarify that startLine < line is required by GitHub and that inclusive ranges are expected (both endpoints included). The interface comment mentions it, but the worked example text could reiterate to reduce mistakes.
  • .claude/skills/review-pr/SKILL.md:120 — Consider adding a short worked example JSON for a deletion (LEFT-side) anchor, analogous to the examples in reviewer.md, to reduce ambiguity for operators following the skill doc alone.

@edobry edobry merged commit 28f3ed7 into main May 1, 2026
2 checks passed
@edobry edobry deleted the task/mt-1340 branch May 1, 2026 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

authorship/co-authored Co-authored by human and AI agent

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant