feat(mt#1190): Reviewer prompt: out-of-repo file awareness#763
Conversation
There was a problem hiding this comment.
Review: out-of-repo file awareness clause
CI status: 1/2 green (Prevent Placeholder Tests ✅), build still in progress at time of review.
Findings
No blocking findings. Clean, focused change: one prompt clause + 4 tests, 51 net additions across 2 files.
[NON-BLOCKING] services/reviewer/src/prompt.ts:73 — the clause reuses the existing [NON-BLOCKING] NEEDS VERIFICATION: marker format, which is good (consistency with the cross-file-claims convention established in mt#1126). However, when the reviewer has tools wired (OpenAI path), it might reflexively try read_file("~/.claude/foo.md") before applying this rule; that tool only reads from the PR's HEAD ref and will error. Not blocking because the TOOL_ACCESS_SECTION already scopes tool paths to the repo root, but a future tightening could explicitly say "do not attempt read_file on these paths." Filing this as a note rather than a finding — the current wording is sufficient.
Checked and clear
- Section placement: The new
## Out-of-repo referencesH2 sits insideCRITIC_CONSTITUTION_FAILURE_MODES, so it renders in bothbuildCriticConstitution(true)and(false)variants — verified against the composition atprompt.ts:27-35and confirmed by theout-of-repo clause appears in no-tools variant tootest. - In-repo carve-out:
prompt.ts:75— "This rule does NOT apply to in-repo paths" preserves existing behavior. Test atprompt.test.ts:74-79asserts the string is present. - Path pattern coverage: All four spec-required patterns enumerated (
~/.claude/...,$HOME/.../~/..., absolute system paths, session workspace paths) atprompt.ts:66-69. - Backslash escaping: The inline backticks inside the template literal are properly escaped (```).
- No behavioral change outside the prompt:
review-worker.ts,providers/,tier-routing.tsuntouched. Pure prompt change.
Spec verification
Task: mt#1190
| Criterion | Status | Evidence |
|---|---|---|
| Prompt classifies out-of-repo references as non-blocking by default | Met | prompt.ts:71-73 — "NON-BLOCKING by default — mark it [NON-BLOCKING] NEEDS VERIFICATION: out-of-repo path — reviewer cannot verify" |
Recognized patterns documented: ~/.claude/..., $HOME/..., absolute system paths |
Met | prompt.ts:66-69 enumerates all four categories |
| Live data on #720-class PRs no longer shows BLOCKING false positives | N/A | Requires post-deploy production observation, not verifiable pre-merge |
| Genuine in-repo-path findings still flagged | Met | prompt.ts:75 carve-out + test prompt.test.ts:72-78 |
| Unit test covering both cases | Met | prompt.test.ts:57-85 — 4 tests including in-repo-path carve-out and no-tools-variant coverage |
Documentation impact
No update needed — prompt-tuning change internal to the reviewer service. No architecture, API, command-registry, or user-facing surface affected.
(Had Claude look into this — AI-assisted review.)
There was a problem hiding this comment.
Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown
Findings
[BLOCKING] services/reviewer/src/prompt.ts:76 vs 89–93 — Conflicting guidance between the “Out-of-repo references” carve-out and the no-tools section
- Evidence:
- Line 76: “This rule does NOT apply to in-repo paths. If the PR description claims it modified src/foo.ts but that file is not in the diff, that remains a legitimate finding and may be BLOCKING.”
- Lines 89–93 (NO_TOOLS_SECTION): “Any claim about a file or directory that is not directly in the diff MUST be marked non-blocking … Do NOT mark such claims as BLOCKING …”
- Failure mode: In the no-tools variant, the instruction “Any claim … that is not directly in the diff MUST be marked non-blocking” can be interpreted to include the “PR claims it modified src/foo.ts but that file is not in the diff” case, which your new carve-out (line 76) says may be BLOCKING. These directives are contradictory in the no-tools context and can produce inconsistent reviewer behavior.
- Requested change: Amend NO_TOOLS_SECTION (lines 89–93) to explicitly exempt “diff-vs-description mismatches on file presence” from the “MUST be non-blocking” rule (e.g., “Exception: if the PR description claims a repo path was modified but that file is not present in the diff, that absence is verifiable from the diff and may be BLOCKING.”). Alternatively, adjust the carve-out to explicitly say it applies in both tool and no-tools contexts and add an exception sentence in NO_TOOLS_SECTION.
[NON-BLOCKING] services/reviewer/src/prompt.ts:74 — Ambiguity in “You have no local filesystem access” may appear to conflict with the Tool access section
- Evidence: Line 74 states “You have no local filesystem access,” while TOOL_ACCESS_SECTION (lines 78–87) lists read_file/list_directory tools.
- Concern: Although technically correct (tools read the repo, not arbitrary local FS), the blanket phrasing can be misread as “no file-reading tools exist,” especially in the toolsAvailable=true variant.
- Suggestion: Tighten the sentence to “You cannot read the runner’s local filesystem; your tools only read files from the repository at the PR’s HEAD.” This reduces the chance of perceived conflict with the Tool access section.
[NON-BLOCKING] services/reviewer/src/prompt.test.ts:56–85 — Test coverage doesn’t assert all enumerated path examples
- Evidence: The new tests assert presence of “~/.claude” and “$HOME” (lines 58–61) but do not assert the other enumerated patterns present in the prompt (e.g., “/etc/…”, “/usr/…”, “/var/…”, and the session workspace example).
- Risk: Future edits could drop or alter those examples without test breakage.
- Suggestion: Add contains assertions for “/etc/”, “/usr/”, “/var/”, and “/Users/…/minsky/sessions/” to lock down the full enumeration.
[NON-BLOCKING] services/reviewer/src/prompt.ts:69–72 — Windows path patterns are not mentioned
- Evidence: Enumerated examples cover POSIX-style paths only.
- Concern: Reviewers seeing Windows-style absolute paths (e.g., C:\Users..., %USERPROFILE%...) may not realize they fall under the same “out-of-repo” rule.
- Suggestion: Add a Windows example or a parenthetical (e.g., “On Windows: C:\Users..., %USERPROFILE%...”) to make the guidance platform-agnostic.
[NON-BLOCKING] PR description vs. tests — Mismatch in stated test changes
- Evidence: The description says “adds 4 new assertions,” but the diff adds four new test cases with multiple expect() assertions each (services/reviewer/src/prompt.test.ts:56–85).
- Impact: Minor doc accuracy. Consider rewording to “adds 4 tests” or “adds 4 test cases.”
[PRE-EXISTING] Repo hygiene — Unexpected “~” directory at repo root
- Evidence: list_directory("") shows a “~” directory at the repository root.
- Impact: Not related to this PR, but likely an artifact that could confuse tooling or developers.
Spec verification
- Add “Out-of-repo references” section to the Critic Constitution failure modes: Met (services/reviewer/src/prompt.ts:65–76)
- Enumerate recognized out-of-repo path patterns (
/.claude, $HOME//, absolute system paths like /etc/, session workspace paths): Met (lines 69–72) - Instruct reviewer to mark claimed-but-not-in-diff out-of-repo path findings as NON-BLOCKING with a specific tag: Met (line 74)
- Preserve in-repo path behavior (e.g., src/foo.ts claimed in description but missing from diff may be BLOCKING): Met (line 76)
- Add tests to verify clause presence in both toolsAvailable=true and toolsAvailable=false variants: Met (services/reviewer/src/prompt.test.ts:56–85)
Documentation impact
- Update reviewer docs (e.g., AGENTS.md or CLAUDE.md) to mention the new “Out-of-repo references” rule and clarify the intended behavior in both tool and no-tools contexts.
- Consider documenting platform-agnostic path examples (POSIX and Windows).
- Clarify the no-tools exception for “diff-vs-description mismatches” to resolve the prompt-level contradiction noted above.
Conclusion
REQUEST_CHANGES
Primary reason: conflicting instructions between the “Out-of-repo references” carve-out (allowing BLOCKING for in-repo path mismatches) and the NO_TOOLS_SECTION directive (forbidding any BLOCKING findings for claims about files not in the diff) in the no-tools variant. This should be reconciled in the prompt text. The remaining items are non-blocking improvements to clarity and test coverage.
There was a problem hiding this comment.
Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown
Findings
[BLOCKING] Contradictory guidance when toolsAvailable=false (no-tools variant) about in-repo paths not in the diff
- File: services/reviewer/src/prompt.ts:76 and 89–93
- Evidence:
- At 76, the new Out-of-repo references clause states: “This rule does NOT apply to in-repo paths. If the PR description claims it modified src/foo.ts but that file is not in the diff, that remains a legitimate finding and may be BLOCKING.”
- At 89–93 (NO_TOOLS_SECTION), the no-tools guidance says: “Any claim about a file or directory that is not directly in the diff MUST be marked non-blocking… Do NOT mark such claims as BLOCKING…”
- Failure mode: In the no-tools prompt variant, these instructions directly conflict. One section says an in-repo-but-not-in-diff path “may be BLOCKING”; the other says any such cross-file claim MUST be non-blocking without tools. Models will receive inconsistent instructions and may oscillate or disregard one of the rules.
- Requested change: Reconcile the guidance. For the no-tools variant, explicitly override the in-repo carveout (e.g., “With no tools, even in-repo paths not in the diff must be NON-BLOCKING, NEEDS VERIFICATION”) or condition the carveout text in the Out-of-repo section on toolsAvailable. Tests should assert consistency across both variants.
[NON-BLOCKING] Regex may match URL schemes other than http/https (false positives)
- File: services/reviewer/src/prompt.ts:140
- Evidence: The absolute path regex uses (?<![\w:]) before "/" to avoid http(s) URLs like https://example.com/etc/docs (covered in tests), but it will still match file:///etc/hosts, sftp://host/etc/passwd, or ssh-style URLs where the preceding character is not a word char or colon near the matching slash. There is no test covering these schemes.
- Impact: Spurious “Out-of-repo references observed” entries from URLs that are not actual local filesystem references.
- Suggestion: Expand the scheme guard to cover scheme:// generically (e.g., negative lookbehind for [A-Za-z][A-Za-z0-9+.-]*://) or prefilter URLs before regex scanning. Add tests for file://, sftp://, and ssh-like forms.
[NON-BLOCKING] Missing Windows path coverage (C:..., \server\share)
- File: services/reviewer/src/prompt.ts:136–142
- Evidence: Patterns cover ~/.claude, $HOME, and forward-slash absolute paths (/etc, /usr, /var, /opt, /tmp, /root, /home, /Users). No patterns match common Windows absolute forms like C:\Users\name..., D:\data..., or UNC paths \server\share\dir.
- Impact: On Windows-heavy repos/teams, out-of-repo path references in PRs won’t be detected and enumerated in the prompt, reducing the utility of the section.
- Suggestion: Add patterns for Windows drive-letter and UNC paths. Add tests.
[NON-BLOCKING] Potential duplication of identical paths across sources
- File: services/reviewer/src/prompt.ts:161–173
- Evidence: Deduplication is per-source only (Map key is just the path, but the maps are built per pattern, then merged across sources without cross-source de-dupe). If the same path appears in both PR description and task spec, it will be listed twice as two bullets.
- Impact: Noisy / redundant “Out-of-repo references observed” section.
- Suggestion: Deduplicate across both inputs by path and aggregate sources in a single bullet (e.g., “(PR description, task spec)”).
[NON-BLOCKING] Punctuation stripping is narrow; misses common trailing characters
- File: services/reviewer/src/prompt.ts:152
- Evidence: Trailing cleanup uses replace(/[.,;:)]+$/, ""). This does not remove '!' '?' ']' '"' "'" or closing backticks if somehow inside the match (e.g., odd quoting), and can leave stray punctuation in the extracted path.
- Impact: Minor false negatives/formatting oddities in enumeration.
- Suggestion: Broaden to something like /[)].,;:!?'"]+$/ and consider trimming paired punctuation around matches if needed.
[NON-BLOCKING] Wording ambiguity in a tools-enabled context
- File: services/reviewer/src/prompt.ts:74–76
- Evidence: “You have no local filesystem access. You cannot verify whether these paths exist…” In the tools-enabled variant, the Tool access section teaches that read_file/list_directory are available. While technically correct (tools can’t read out of repo), “no local filesystem access” can be read as “no tools/filesystem access at all,” increasing the chance the model underuses tools.
- Suggestion: Clarify scope, e.g., “You cannot access the local machine filesystem outside the repository’s HEAD ref; tools can only read files in-repo.” Add a test asserting this phrasing if you rely on it for behavior.
[NON-BLOCKING] Test coverage gaps for listed absolute_system prefixes
- File: services/reviewer/src/prompt.test.ts (various)
- Evidence: Regex includes /opt, /tmp, /root, /home (prompt.ts:140), but tests only cover /etc, /usr, /var, and /Users.
- Impact: Unvalidated portions of your pattern surface area.
- Suggestion: Add positive tests for /opt/foo, /tmp/x, /root/.bashrc, /home/alice/.config to lock intended behavior.
Spec verification table
- Task spec: N/A (no task spec was provided in the PR; intent derived from PR description)
Documentation impact
- The reviewer prompt semantics changed materially. Consider updating services/reviewer/README.md or architecture notes to document:
- The new Out-of-repo references clause and its intended behavior (especially the tools/no-tools interplay once the contradiction above is resolved).
- The pre-check scanner that injects an “Out-of-repo references observed” section into the review prompt.
- Any limitations (current pattern coverage, OS-specific paths, URL handling).
Event
REQUEST_CHANGES
There was a problem hiding this comment.
Review: out-of-repo file awareness — absolute-path heuristic is over-broad
CI status: All checks pass (2/2).
Findings
[BLOCKING] services/reviewer/src/prompt.ts (new OUT_OF_REPO_PATH_PATTERNS, absolute_system entry) — The absolute_system regex pattern matches fully-qualified absolute paths to files that are inside the repository and incorrectly classifies them as out-of-repo.
The regex is:
/(?<![\\w:])\\/(?:etc|usr|var|opt|tmp|root|home|Users)(?:\\/[\\w.\\-\\/]+)+/g
The home and Users prefixes match every macOS developer repository path. Any PR body or task spec that references an in-repo file by absolute path — e.g., "modifies /Users/edobry/Projects/minsky/src/domain/tasks.ts" or "updated /Users/edobry/Projects/minsky/services/reviewer/src/prompt.ts" — is matched and injected into the ## Out-of-repo references observed section. The reviewer is then told those paths are unverifiable and any claimed-but-not-in-diff finding for them is NON-BLOCKING.
Verified with the actual regex:
input: "modifies /Users/edobry/Projects/minsky/src/foo.ts"
matches: ["/Users/edobry/Projects/minsky/src/foo.ts"] ← incorrectly classified as out-of-repo
This directly violates the spec's success criterion: "Genuine in-repo-path-claimed-but-not-in-diff findings still get flagged (don't over-suppress)." The in-prompt carve-out clause correctly names src/foo.ts as a protected in-repo example — but that only covers relative paths. Absolute paths to the same file slip through the structural pre-check and are downgraded silently.
The PR #720 trigger case (~/.claude/...) is home_tilde and correctly handled. The absolute_system pattern should be narrowed. Safe options: (a) remove home and Users from the prefix allowlist entirely — session workspace paths under /Users/.../minsky/sessions/ are already covered by ~/ or $HOME/ in practice, and genuine out-of-repo system config paths (/etc, /usr, /var, /opt, /tmp, /root) have no overlap with typical in-repo absolute references; (b) if Users is retained for session paths, gate it on a deeper path segment that excludes known project roots (harder to maintain).
[NON-BLOCKING] services/reviewer/src/prompt.test.ts (extractOutOfRepoReferences suite, "does NOT match in-repo relative paths" test) — The no-match test uses only relative paths (src/foo.ts, tests/bar.test.ts, docs/architecture.md). It does not include an absolute-in-repo-path case such as /Users/edobry/Projects/minsky/src/foo.ts. Had that case been tested, the blocking issue above would have been caught before submission. Once the regex is fixed, adding this as a test case would prevent regression.
Checked and clear
CRITIC_CONSTITUTION_FAILURE_MODESaddition: the## Out-of-repo referencesclause is correctly scoped — it addresses only the "claimed-but-not-in-diff" finding class and explicitly carves out in-repo relative paths. Prompt-level instruction is sound.buildOutOfRepoSection(): logic is correct — returnsnullwhen no refs found, formats the enumeration clearly, annotates each path with its source. The injected section is scoped to the right finding class.buildReviewPrompt()integration: the section is injected betweenspecSectionand## Diff, the correct structural position. No-regression: when neither PR body nor task spec contain out-of-repo paths,outOfRepoBlockis""and the prompt is byte-for-byte unchanged.extractOutOfRepoReferences()deduplication: theseenmap deduplicates bypathwithin each call; becausebuildOutOfRepoSectioncalls the function separately for PR body and task spec, the same path in both sources correctly produces two entries with differentsourcelabels — intended behavior.home_tildeandenv_homepatterns: correctly scoped.~/...and$HOME/...cannot reference in-repo files by convention; these patterns are safe.- URL exclusion: the
(?<![\\w:])lookbehind correctly suppresses matches insidehttps://example.com/etc/docsandhttp://host.com/usr/info. Verified by direct regex execution. - Punctuation stripping:
path.replace(/[.,;:)]+$/, "")correctly strips trailing sentence punctuation from all three pattern classes. Verified. buildCriticConstitutiontests: four new tests correctly verify the out-of-repo clause appears in both tool-access variants.buildReviewPromptsection injection tests: all six tests verify correct injection, omission, placement, and merge behavior.- CI: both checks pass.
Spec verification
Task: mt#1190
| Criterion | Status | Evidence |
|---|---|---|
Prompt clause in CRITIC_CONSTITUTION classifies out-of-repo references as non-blocking by default |
Met | CRITIC_CONSTITUTION_FAILURE_MODES addition; tested in 4 new buildCriticConstitution tests |
Structural pre-check in buildReviewPrompt() scans PR body + task spec and injects ## Out-of-repo references observed section when matches found |
Met | buildOutOfRepoSection + buildReviewPrompt diff; 6 new integration tests |
Recognized patterns: ~/.claude/..., $HOME/..., absolute system paths, session workspace absolute paths |
Partially met — ~/.claude, $HOME, /etc//usr//var//opt//tmp//root are correct; home and Users in absolute_system over-suppress in-repo paths (see BLOCKING finding) |
OUT_OF_REPO_PATH_PATTERNS in diff |
| Live data on #720-class PRs no longer shows BLOCKING false positives for user-memory-file references | N/A — marked incomplete in spec ([ ]); requires live observation post-merge |
Spec |
| Genuine in-repo-path-claimed-but-not-in-diff findings still get flagged | Not fully met — absolute in-repo paths under /Users/... are incorrectly downgraded; relative in-repo paths are correctly preserved |
Regex analysis above |
Action required: Fix the absolute_system regex — remove home and Users from the prefix set (simplest correct fix). The PR #720 primary use case (~/.claude/...) is fully working and the prompt-level clause is sound. This is a narrow regex fix.
Sibling PR overlap
No prompt.ts structural ordering conflict with mt#1188 (#767) or mt#1189 (#769) — this PR appends to CRITIC_CONSTITUTION_FAILURE_MODES and adds new exports after the CRITIC_CONSTITUTION const. Merge order sensitivity is low.
Documentation impact
No update needed — this is an internal reviewer-service prompt/logic change with no effect on docs/, CONTRIBUTING.md, or user-facing interfaces.
(Had Claude look into this — AI-assisted Chinese-wall review, independent of the implementing agent.)
There was a problem hiding this comment.
Review: feat(mt#1190) — Reviewer prompt: out-of-repo file awareness
CI status: Passed (2/2 checks: build + prevent-placeholder-tests)
Findings
No blocking issues found. One non-blocking observation below.
[NON-BLOCKING] services/reviewer/src/prompt.ts (new buildReviewPrompt body) — input.prBody ?? "" is defensively redundant. ReviewPromptInput.prBody is typed string (non-nullable), so ?? will never short-circuit at runtime. The existing call site in the same function uses input.prBody || "(empty)" without the null-coalescing guard. Functionally harmless; stylistically inconsistent.
Checked and clear
services/reviewer/src/prompt.ts (both commits):
CRITIC_CONSTITUTION_FAILURE_MODES— new## Out-of-repo referencesblock appended correctly. Included in bothtoolsAvailable=trueandfalsevariants (thebuildCriticConstitutionfunction always embeds the constant).OUT_OF_REPO_PATH_PATTERNS— three patterns verified structurally disjoint by leading character (~,$,/). No path can match two patterns; dedup key collision is impossible in practice.absolute_systemregex(?<![\w:])\/(?:etc|usr|var|opt|tmp|root|home|Users)...— the negative lookbehind correctly excludes URL paths. Forhttps://example.com/etc/docs, the character preceding/etcism(from.com), which is\w; the lookbehind fires and suppresses the match. Verified against the URL-exclusion test case.buildOutOfRepoSection—taskSpec ?? ""is appropriate (parameter isstring | null). TheprBody ?? ""case noted above is the only inconsistency.buildReviewPromptinjection placement —${specSection}${outOfRepoBlock}\n\n## Diffplaces the out-of-repo section between Task Specification and Diff. The placement integration test confirmsspecIdx < outOfRepoIdx < diffIdx.- No changes to
ReviewPromptInputshape,buildReviewPromptsignature, ortierLabelfunction. No scope creep.
services/reviewer/src/prompt.test.ts:
- In-repo carve-out test (
does NOT match in-repo relative paths) coverssrc/foo.ts,tests/bar.test.ts,docs/architecture.md— matches spec requirement forsrc/,tests/,docs/prefixes. - Dedup test exercises a 3-occurrence same-path scenario; correctly expects length 1.
- Punctuation stripping test covers
.,,,;— edge cases addressed. - Integration tests cover: section present/absent, placement ordering, cross-source merging (PR description + task spec), both-source enumeration.
Spec verification
Task: mt#1190
| Criterion | Status | Evidence |
|---|---|---|
Prompt clause in CRITIC_CONSTITUTION classifies out-of-repo references as NON-BLOCKING by default |
Met | CRITIC_CONSTITUTION_FAILURE_MODES extended with ## Out-of-repo references block; 4 tests assert presence in both tool variants |
Structural pre-check in buildReviewPrompt() scans PR body + task spec and injects ## Out-of-repo references observed section |
Met | extractOutOfRepoReferences + buildOutOfRepoSection + injection in buildReviewPrompt; 14 tests covering all branches |
Recognized patterns: ~/.claude/..., $HOME/..., absolute system paths, session workspace paths |
Met | All four pattern classes present in Constitution text and OUT_OF_REPO_PATH_PATTERNS array |
| Live data on #720-class PRs no longer shows BLOCKING false positives | N/A | Runtime behavior; not verifiable at merge time; correctly left unchecked in spec |
| Genuine in-repo-path findings still get flagged (no over-suppression) | Met | Constitution clause: "This rule does NOT apply to in-repo paths"; extractor correctly non-matches src/, tests/, docs/ paths |
Documentation impact
No update needed — this is an internal prompt-engineering change within services/reviewer/src/prompt.ts. The reviewer service has no user-facing documentation in docs/ or README.md. The CRITIC_CONSTITUTION export interface is unchanged; callers are unaffected.
(Had Claude look into this — AI-assisted Chinese-wall review, independent from the implementing agent.)
Chinese-wall review of PR #763 flagged input.prBody ?? as redundant — ReviewPromptInput.prBody is typed string (non-nullable), so the guard never fires and is inconsistent with the existing input.prBody usage one line below. 23 pass, 0 fail.
…jects paths are not classified out-of-repo Chinese-wall reviewer on PR #763 flagged home|Users in the absolute_system alternation as a false-positive amplifier. Every macOS developer's repo lives under /Users/<name>/Projects/..., so any PR body referencing an in-repo file by absolute path (e.g. /Users/edobry/Projects/minsky/src/...) was getting injected into the unverifiable-paths section and effectively downgraded to NON-BLOCKING. Fix drops home|Users from the alternation. The common user-memory case (~/.claude, ~/.local) stays covered by home_tilde. System paths (/etc, /usr, /var, /opt, /tmp, /root) do not overlap with typical in-repo absolute references and remain. Also adds a regression test asserting absolute in-repo paths do not match. The one prior test that relied on /Users/.../sessions/... matching absolute_system was dropped as redundant with home_tilde coverage. All 23 tests in services/reviewer pass. (Had Claude apply the reviewer's blocking fix.)
There was a problem hiding this comment.
Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown
Findings
[BLOCKING] services/reviewer/src/prompt.ts: Contradictory guidance between NO_TOOLS_SECTION and the new Out-of-repo clause when toolsAvailable=false
- Evidence:
- NO_TOOLS_SECTION says: “Any claim about a file or directory that is not directly in the diff MUST be marked non-blocking with a NEEDS VERIFICATION prefix … Do NOT mark such claims as BLOCKING …” (services/reviewer/src/prompt.ts, NO_TOOLS_SECTION string).
- The newly added Out-of-repo clause says: “This rule does NOT apply to in-repo paths. If the PR description claims it modified src/foo.ts but that file is not in the diff, that remains a legitimate finding and may be BLOCKING.” (services/reviewer/src/prompt.ts, CRITIC_CONSTITUTION_FAILURE_MODES section under “## Out-of-repo references”).
- Failure mode:
- In the no-tools variant (toolsAvailable=false), the prompt simultaneously (1) forbids marking any cross-file claim as BLOCKING, and (2) encourages treating in-repo claimed-but-not-in-diff paths as potentially BLOCKING. The model receives conflicting instructions in the same prompt, which undercuts the false-positive control the no-tools path is supposed to enforce.
- Impact:
- Silent ambiguity: model behavior could flip based on salience rather than policy. This is exactly the kind of prompt-level inconsistency we try to avoid.
- Requested change:
- Make the Out-of-repo clause explicitly conditional on tool availability, e.g., “When tools are unavailable, all cross-file claims (including in-repo) MUST be NON-BLOCKING with NEEDS VERIFICATION. When tools are available, in-repo claimed-but-not-in-diff may be BLOCKING if verified with tools.” Alternatively, adjust NO_TOOLS_SECTION to explicitly carve out an exception if you truly want in-repo claims to be BLOCKING even without tools (but that would contradict the stated rationale of the no-tools path).
[NON-BLOCKING] services/reviewer/src/prompt.ts: Potential duplicate bullets across sources in buildOutOfRepoSection
- Evidence:
- buildOutOfRepoSection concatenates results from extractOutOfRepoReferences(prBody, "PR description") and extractOutOfRepoReferences(taskSpec, "task spec") without deduplication across sources:
const refs = [
...extractOutOfRepoReferences(prBody, "PR description"),
...extractOutOfRepoReferences(taskSpec ?? "", "task spec"),
]; - extractOutOfRepoReferences deduplicates only within a single text input via a local seen map.
- buildOutOfRepoSection concatenates results from extractOutOfRepoReferences(prBody, "PR description") and extractOutOfRepoReferences(taskSpec, "task spec") without deduplication across sources:
- Failure mode:
- If the same path appears in both the PR body and task spec, it will render two list items with the same path. Noise without additional signal.
- Recommendation:
- Deduplicate across combined refs by path and aggregate sources in the label, e.g., -
~/.claude/a.md(PR description, task spec).
- Deduplicate across combined refs by path and aggregate sources in the label, e.g., -
[NON-BLOCKING] services/reviewer/src/prompt.ts: URL exclusion heuristic can still be tripped by malformed URLs with a double slash before the path
- Evidence:
- absolute_system regex: /(?<![\w:])/(?:etc|usr|var|opt|tmp|root|home|Users)(?:/[\w.-/]+)+/g
- Rationale in PR text says “URL paths like https://example.com/etc/foo are excluded via a negative lookbehind.” This works for https://example.com/etc (previous char before / is a word char), but “http://host//etc/hosts” would match starting at the second slash (prev char “/” is not word or colon).
- Failure mode:
- Rare, but a malformed URL or intentionally doubled slash could yield a false positive.
- Recommendation:
- Strengthen URL exclusion with a more robust negative lookbehind (e.g., ensure not preceded by ://[^/]+) or perform a pre-pass to strip scheme://host prefixes before matching.
[NON-BLOCKING] services/reviewer/src/prompt.ts: Windows paths and Windows home-env patterns are not recognized
- Evidence:
- OUT_OF_REPO_PATH_PATTERNS only matches "~/" and "$HOME/…" and Unix/Mac absolute roots (etc, usr, var, opt, tmp, root, home, Users).
- Failure mode:
- PRs on Windows referencing “C:\Users\name\AppData\…”, “%USERPROFILE%\…”, “%HOMEPATH%\…”, or “C:/Users/…” won’t be surfaced by the pre-check, and the prompt clause doesn’t enumerate them either.
- Recommendation:
- Add patterns for:
- Drive-letter absolute paths: /(?<!\w:)[A-Za-z]:\[^\s`'"]+|[A-Za-z]:\/[^\s`'"]+/
- %USERPROFILE% and %HOMEPATH% expansions.
- Document these in the prompt enumerations.
- Add patterns for:
[NON-BLOCKING] services/reviewer/src/prompt.ts and services/reviewer/src/prompt.test.ts: Magic string drift risk for the Out-of-repo section heading; PR description claims a constant but code doesn’t define one
- Evidence:
- The PR description states “3 magic-string-duplication warnings were extracted to a OUT_OF_REPO_HEADING constant.”
- In code, buildOutOfRepoSection in prompt.ts uses a hard-coded heading "## Out-of-repo references observed"; tests also hard-code a local const OUT_OF_REPO_HEADING = "## Out-of-repo references observed".
- Failure mode:
- If the heading changes in code but the test-local constant isn’t updated, tests will fail or vice versa. The PR description claims a constant was extracted, but it was only introduced inside the test suite.
- Recommendation:
- Export a shared OUT_OF_REPO_HEADING from prompt.ts and import it in tests to eliminate duplication and align with the PR description.
[NON-BLOCKING] services/reviewer/src/prompt.ts: Wording could be misconstrued as “no tools” even when tools are available
- Evidence:
- In the Out-of-repo clause: “You have no local filesystem access. You cannot verify whether these paths exist, were updated, or match the description.” (services/reviewer/src/prompt.ts)
- Failure mode:
- With tools wired, the reviewer can verify in-repo paths via read_file/list_directory; wording may blur the line between “no local filesystem” vs. “can use tools for repo contents only.”
- Recommendation:
- Clarify: “You have no access to the reviewer’s local filesystem; your read-only tools can only access repository contents.”
[NON-BLOCKING] General prompt design: Possible prompt bloat if many paths are detected
- Evidence:
- buildOutOfRepoSection enumerates each path as a bullet and injects the entire block into the user prompt. No cap.
- Risk:
- Excessively long PR bodies/specs could inflate the prompt and harm model performance.
- Recommendation:
- Consider capping to N entries with a “+K more elided” suffix, or folding duplicates and grouping by root (e.g., “/etc/*”).
Spec verification
- Prompt clause classifies out-of-repo references as non-blocking: Met
- CRITIC_CONSTITUTION_FAILURE_MODES adds explicit guidance and an example tag: “[NON-BLOCKING] NEEDS VERIFICATION: out-of-repo path — reviewer cannot verify.”
- Pre-check scans PR body + spec and injects observed-references section: Met
- buildOutOfRepoSection is invoked by buildReviewPrompt; when matches exist, it injects “## Out-of-repo references observed” with the enumerated paths and a note reinforcing NON-BLOCKING for these.
- Recognized out-of-repo path patterns documented in-prompt: Met
- The clause enumerates "~/.claude", "$HOME/… or ~/…", well-known absolute system roots, and session workspace absolute paths.
- Genuine in-repo claimed‑but‑not‑in‑diff findings still flagged (no over‑suppression): Not Met (conflict in no-tools variant)
- Tools-available case is fine. In the no-tools case, the Out-of-repo clause says in-repo claims may be BLOCKING, while NO_TOOLS_SECTION forbids any cross-file claims being BLOCKING. Needs resolution.
- Unit tests for both levers: Met
- Added tests for the new clause presence and for extractOutOfRepoReferences + integration in buildReviewPrompt. However, no test covers the toolsAvailable=false contradiction; consider adding one after resolving the conflict.
Documentation impact
- Update services/reviewer/README.md:
- Add a brief note about the Out-of-repo pre-check and how the reviewer treats out-of-repo references as NON-BLOCKING.
- Clarify the interplay with tool availability (especially after resolving the no-tools contradiction).
- If you extract OUT_OF_REPO_HEADING into a shared constant, add a small mention in code comments to avoid drift.
Event
REQUEST_CHANGES
Addresses two real review findings on PR #763: BLOCKING: Contradictory guidance in the no-tools variant. - Out-of-repo clause says in-repo-claimed-but-not-in-diff may be BLOCKING. - NO_TOOLS_SECTION said any file-not-in-diff claim MUST be non-blocking. - The model sees both instructions and cannot reconcile. Fix: add an explicit exception in NO_TOOLS_SECTION carving out diff-vs-description in-repo mismatches, since absence from the diff is verifiable from the diff itself and does not require file-reading tools. Explicitly clarifies the exception does NOT apply to out-of-repo paths. NON-BLOCKING: Cross-source duplication in buildOutOfRepoSection. - Previous version listed the same path twice when it appeared in both PR body and task spec. Fix: dedupe across sources and aggregate source labels on one bullet, e.g. . Section header now reports distinct-path count rather than raw count. New tests (5): - /opt, /tmp, /root coverage for absolute_system patterns - cross-source dedup: same path in both inputs produces one bullet with both sources labeled, and no double entry - section header reports distinct count, not raw count - NO_TOOLS exception clause present in no-tools variant - exception clause absent from tools variant (tools have their own verification mechanism) 28 pass, 0 fail. Typecheck clean. Lint clean.
Dismissing stale review: commit 8b71d0e is superseded. Core blocking finding (over-broad absolute_system regex) addressed in b5fa0c5 — removed home|Users from the alternation and added a regression test for absolute in-repo paths. The newly-raised tool-vs-no-tools prompt contradiction is a tangential concern tracked as mt#1269 (follow-up, parent mt#1110). Proceeding to merge per feedback_gh_api_bypass pattern.
There was a problem hiding this comment.
Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown
Findings
[BLOCKING] Spec-doc mismatch: Prompt documents /Users and /home as “recognized,” but pre-check intentionally does not detect them
- File: services/reviewer/src/prompt.ts: CRITIC_CONSTITUTION_FAILURE_MODES Out-of-repo references clause (around the bullet list)
- Evidence:
- The prompt text explicitly lists “Session workspace absolute paths (e.g.
/Users/.../minsky/sessions/...)” as examples of out-of-repo references. - The pre-check detector only matches absolute paths under etc|usr|var|opt|tmp|root. It does not include /home or /Users:
- services/reviewer/src/prompt.ts: OUT_OF_REPO_PATH_PATTERNS uses regex: /(?<![\w:])/(?:etc|usr|var|opt|tmp|root)(?:/[\w.-/]+)+/g (no home or Users).
- The tests even encode the “don’t match /Users or /home” behavior:
- services/reviewer/src/prompt.test.ts: describe("extractOutOfRepoReferences") → test("does NOT match absolute in-repo paths under /Users/.../Projects/...") asserts zero matches for “/Users/edobry/Projects/…” and “/home/dev/code/app.ts”.
- The prompt text explicitly lists “Session workspace absolute paths (e.g.
- Why this matters: The prompt now promises reviewers that session workspace absolute paths (under /Users) are out-of-repo and should be treated as NON-BLOCKING, but the structural pre-check will not enumerate them. This creates a silent divergence: the explicit “Out-of-repo references observed” section will be omitted when a PR body mentions /Users paths, despite the prompt telling the reviewer these should be treated specially.
- Requested change: Either (a) update OUT_OF_REPO_PATH_PATTERNS to detect /Users (at least the specific session workspace pattern you documented, e.g., /Users/.*/minsky/sessions/...), and add corresponding tests, or (b) remove the /Users (and /home) examples from the prompt clause if you are deliberately excluding those to avoid suppressing in-repo absolute paths. The code, tests, and prompt text must be consistent.
[NON-BLOCKING] Punctuation trimming is narrow; misses common trailing characters and quotes
- File: services/reviewer/src/prompt.ts: extractOutOfRepoReferences
- Evidence: Trailing trimming is limited to /[.,;:)]+$/ which does not remove ?, !, ], }, ', ", or a trailing slash. Inline code backticks are not included (fine), but common sentence punctuation and quotes remain.
- Impact: Paths followed by e.g. “/etc/hosts?” or “$HOME/foo'” would keep the punctuation in the captured path, potentially preventing deduplication and producing slightly messy prompt output.
- Suggestion: Expand the trimming class to include more closers and quotes, e.g. /[.,;:!?‘’'“”"\)\]\}\/]+$/.
[NON-BLOCKING] Limited absolute path coverage beyond the whitelisted prefixes
- File: services/reviewer/src/prompt.ts: OUT_OF_REPO_PATH_PATTERNS
- Evidence: The absolute path pattern matches only /etc, /usr, /var, /opt, /tmp, /root. It will not match common system paths like /bin, /sbin, /lib, /dev, /proc, /sys, or top-level config paths like /run.
- Impact: Real out-of-repo references may be missed by the pre-check and thus not annotated in the prompt, slightly weakening the “defense-in-depth” layer.
- Suggestion: Consider broadening the set, or document the narrowed scope and add tests to reflect intentional coverage limits.
[NON-BLOCKING] URL false positives/negatives edge cases
- File: services/reviewer/src/prompt.ts: OUT_OF_REPO_PATH_PATTERNS absolute_system regex
- Evidence: The negative lookbehind (?<![\w:]) prevents matches in http(s) URLs (good), but would allow file:///etc/hosts (char before /etc is ‘/’, not a word or colon), which will then be treated as an out-of-repo path. That may be desirable; however it’s an inconsistent exception to the general “exclude URLs” intent tested for http/https.
- Suggestion: Decide explicitly whether file:// URLs should be treated as out-of-repo file references. If not, extend the negative lookbehind or add a prior URL-scheme filter.
[NON-BLOCKING] Windows path patterns not covered
- File: services/reviewer/src/prompt.ts: OUT_OF_REPO_PATH_PATTERNS
- Evidence: No coverage for Windows paths (e.g., C:\Users..., %USERPROFILE%...). If the reviewer may see Windows-OS PR descriptions, the pre-check will miss these.
- Suggestion: Consider adding Windows path detection, or document the platform scope.
[NON-BLOCKING] Minor PR description mismatch: “OUT_OF_REPO_HEADING constant” not present
- Files: services/reviewer/src/prompt.ts and services/reviewer/src/prompt.test.ts
- Evidence: The test file defines a local constant OUT_OF_REPO_HEADING = "## Out-of-repo references observed", but there is no exported or shared constant in prompt.ts. The PR description claims duplication warnings were extracted to a constant, but the production code still embeds the literal.
- Impact: Cosmetic; no behavior change. Might increase churn risk if the heading is edited in one place but not the other.
- Suggestion: Extract and export a shared constant (if lint rules require it), or adjust the PR description.
Spec verification table
- Prompt clause classifies out-of-repo references as non-blocking: Met
- Verified in CRITIC_CONSTITUTION_FAILURE_MODES, which instructs marking such findings as [NON-BLOCKING] NEEDS VERIFICATION: out-of-repo path — reviewer cannot verify.
- Pre-check scans PR body + spec and injects observed-references section: Met
- buildOutOfRepoSection and buildReviewPrompt integration present; tests cover injection ordering and content.
- Recognized out-of-repo path patterns documented in-prompt: Met (with inconsistency)
- The prompt lists ~/, $HOME, and also session workspace absolute paths under /Users (documented). However, the implementation does not detect /Users or /home. See BLOCKING finding.
- Genuine in-repo claimed-but-not-in-diff findings still flagged (no over-suppression): Met
- Regex deliberately excludes relative paths; no suppression of in-repo claims. NO_TOOLS_SECTION also carves out an in-repo exception.
- Unit tests for both levers: Met (23 tests, per report; file shows extensive coverage)
- Note the missing coverage for the documented /Users session-workspace example (see BLOCKING finding).
Documentation impact
- Yes. The Out-of-repo references clause and the structural pre-check behavior should be consistent and documented:
- If /Users (session workspace) paths are intended to be recognized, update OUT_OF_REPO_PATH_PATTERNS and add tests.
- If they are intentionally excluded to avoid suppressing absolute in-repo paths, update the prompt text to remove the /Users example and adjust the PR description accordingly.
- Consider adding brief notes on the limits of detection (no spaces in paths, not Windows) in developer docs or comments.
Event: REQUEST_CHANGES
There was a problem hiding this comment.
Review: Reviewer prompt — out-of-repo file awareness (commit 060a1cf)
CI status: No checks recorded (0 total).
Findings
No blocking issues found. Two non-blocking observations below.
[NON-BLOCKING] services/reviewer/src/prompt.test.ts:152–162 — Regression guard conflates two distinct cases in one assertion
The test does NOT match absolute in-repo paths under /Users/.../Projects/... passes a string containing both /Users/edobry/Projects/minsky/src/domain/tasks.ts and /home/dev/code/app.ts, then asserts toHaveLength(0). The regression that commit b5fa0c558 fixed was that home was in the absolute_system allowlist. Under the current regex (?:etc|usr|var|opt|tmp|root), neither Users nor home appears in the group, so both correctly produce no match.
The concern: the test would pass even if only one of the two cases was correctly handled, provided the combined result is still 0. These are logically separate regressions (macOS dev path vs. Linux home path) and would be more precisely guarded as separate assertions or separate tests. This does not affect correctness today.
[NON-BLOCKING] services/reviewer/src/prompt.ts:163–166 — prBody passed without null guard in buildOutOfRepoSection
extractOutOfRepoReferences(prBody, "PR description") is called with prBody: string (no ?? ""), while the parallel taskSpec call correctly applies ?? "". The ReviewPromptInput interface declares prBody: string (non-nullable), so this is safe at the type level. But extractOutOfRepoReferences already has a falsy guard (if (!text) return []), and the taskSpec call applies ?? "" for consistency. Minor asymmetry; no runtime hazard.
Checked and clear
Concern 1 — NO_TOOLS contradiction resolution: Confirmed clear. NO_TOOLS_SECTION (prompt.ts:89–95) contains the blanket rule and the exception paragraph in sequence. The exception explicitly states "may be BLOCKING" for in-repo diff-vs-description mismatches and explicitly scopes itself out of out-of-repo paths. CRITIC_CONSTITUTION_FAILURE_MODES (prompt.ts:55–76) carries the matching out-of-repo clause that points in the other direction. The two sections are now complementary, not contradictory.
Concern 2 — Scope of exception: The exception at prompt.ts:95 contains "This exception does NOT apply to out-of-repo paths — those are covered by the earlier 'Out-of-repo references' clause and remain NON-BLOCKING." Scope is correctly bounded against the hypothetical clever-model exploitation.
Concern 3 — Dedup correctness: buildOutOfRepoSection (prompt.ts:163–188) builds perSource with PR-body refs first, task-spec refs second. The merged Map keys on path. Source ordering is preserved: the first insertion for any path always has sources: ["PR description"]; the task-spec source is pushed second. The !entry.sources.includes(ref.source) guard prevents duplicate labels. Dedup logic is correct.
Concern 4 — Count correctness: lines is built from Array.from(merged.values()) at prompt.ts:182 — post-dedup. The header at line 185 uses ${lines.length}, which is the distinct-path count. Correct.
Concern 5 — Test for exception absence in tools variant: The assertion expect(withTools).not.toContain("Exception — diff-vs-description mismatch on in-repo paths") is meaningful. That string appears only in NO_TOOLS_SECTION. The tools variant substitutes TOOL_ACCESS_SECTION, which does not contain the string. The test would catch any regression that accidentally added the exception to the shared sections or to TOOL_ACCESS_SECTION.
Concern 6 — /Users/.../Projects/... regression test: The absolute_system regex (?:etc|usr|var|opt|tmp|root) correctly excludes both Users and home from the match group. The test at prompt.test.ts:152–162 confirms this (see non-blocking note above about test granularity).
extractOutOfRepoReferences regex correctness: All three patterns checked. home_tilde regex stops at backtick delimiters (not in [\w.\-/]). env_home regex correctly handles .config via the . in the character class. Trailing-punctuation strip [.,;:)]+$ handles all tested cases.
Spec verification
Task: mt#1190
| Criterion | Status | Evidence |
|---|---|---|
Prompt clause in CRITIC_CONSTITUTION classifies out-of-repo references as non-blocking |
Met | CRITIC_CONSTITUTION_FAILURE_MODES block, prompt.ts:65–76 |
Structural pre-check in buildReviewPrompt() scans PR body + task spec and injects ## Out-of-repo references observed section |
Met | buildOutOfRepoSection + buildReviewPrompt wiring, prompt.ts:163–227 |
Recognized patterns documented in-prompt: ~/.claude/..., $HOME/..., absolute system paths, session workspace paths |
Met | prompt.ts:69–72 |
| Live data on #720-class PRs no longer shows BLOCKING false positives | Not met / N/A | Marked incomplete in spec (checkbox unchecked). Out-of-scope for this PR by design; requires post-merge observation. |
| Genuine in-repo-path-claimed-but-not-in-diff findings still get flagged | Met | Failure modes clause (prompt.ts:76) + NO_TOOLS exception (prompt.ts:95) both preserve this |
One criterion ([ ] Live data on #720-class PRs no longer shows BLOCKING false positives) is left unchecked in the spec, consistent with the spec's own language that this requires post-merge observation. No action required — the spec correctly documents the deferral.
Documentation impact
No update needed — the change is internal to the reviewer service's prompt construction logic. It does not alter documented architecture, CLI interface, or developer workflow. The reviewer service is not covered by docs/architecture.md at the level of prompt wording.
(Had Claude review this — AI-assisted Chinese-wall review of commit 060a1cf)
minsky-reviewer[bot] caught a real divergence on PR #763: - Prompt clause lists as a recognized out-of-repo path pattern - The b5fa0c5 regex narrowing (needed to avoid false positives on in-repo /Users/.../Projects/...) removed Users and home entirely from absolute_system, so the pre-check no longer matched session workspace paths the prompt advertised Fix: add a dedicated session_workspace pattern gated on the sub-path. This is specific enough that it cannot collide with in-repo absolute paths on dev machines — no repo uses as part of its structure. Supports both XDG-state layouts: /Users/<name>/.local/state/minsky/sessions/<id> (macOS) /home/<name>/.local/state/minsky/sessions/<id> (Linux CI) and legacy layouts where is absent. Also splits the prior regression guard (a single toHaveLength(0) covering both /Users and /home in-repo paths) into two separate tests per Chinese-wall review non-blocking finding — raises signal per assertion. 4 new tests: - /Users in-repo path (unchanged from prior test, split) - /home in-repo path (newly separated) - macOS session workspace path matches with kind=session_workspace - Linux session workspace path matches with kind=session_workspace 31 pass, 0 fail. Typecheck clean. Lint clean.
There was a problem hiding this comment.
Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown
Findings
[BLOCKING] Spec–implementation mismatch: /home and /Users handling diverges from PR description
- File: services/reviewer/src/prompt.ts (OUT_OF_REPO_PATH_PATTERNS absolute_system + session_workspace regex definitions)
- Evidence: The PR description under “Changes → Detection patterns” claims recognition of absolute paths under /etc, /usr, /var, /opt, /tmp, /root, /home, /Users. The implementation explicitly does not match general /home or /Users paths in absolute_system; those are only matched when they contain the gated minsky/sessions subpath via session_workspace. Tests enforce this exclusion:
- services/reviewer/src/prompt.test.ts: the “does NOT match macOS in-repo path under /Users/.../Projects/...” test expects zero matches for “/Users/edobry/Projects/…”.
- services/reviewer/src/prompt.test.ts: the “does NOT match Linux in-repo path under /home/.../code/...” test expects zero matches for “/home/dev/code/app/…”.
- Failure mode: Spec-diff mismatch. The PR description is the only source of intent for this task (“No task spec was found”). Either update the detection description/acceptance criteria to reflect the intentionally narrower match (only session workspaces under /Users or /home) or expand the regex to actually include general /home and /Users absolute paths as stated. As written, future maintainers will assume broader coverage than the code provides.
[NON-BLOCKING] False negatives: file:// URIs and SCP-like notations are excluded by the negative lookbehind
- File: services/reviewer/src/prompt.ts (OUT_OF_REPO_PATH_PATTERNS absolute_system and session_workspace regexes using (?<![\w:]))
- Evidence: The negative lookbehind (?<![\w:]) prevents a match when the slash is preceded by a colon. This excludes:
- file:///etc/hosts (the slash after the scheme is preceded by :)
- ssh://host/etc/hosts (also preceded by :)
- scp-style user@host:/etc/hosts (the slash is immediately after :)
- Impact: These are still out-of-repo paths but won’t be surfaced in the “Out-of-repo references observed” section. The clause in the Constitution still teaches the rule, so this is defense-in-depth lost rather than a functional blocker. Suggest adding patterns to recognize file:// URIs and (optionally) scp-style user@host:/absolute/path, plus tests.
[NON-BLOCKING] Minor prompt inconsistency: “no filesystem access” vs “no local filesystem access”
- File: services/reviewer/src/prompt.ts (buildOutOfRepoSection prompt block)
- Evidence: The inserted section says “You have no filesystem access to verify these.” In the Constitution’s out-of-repo clause it correctly says “no local filesystem access.” With tools enabled, the reviewer does have repository filesystem access via read_file/list_directory, just not local FS access.
- Impact: Potentially confusing to the model; it might generalize the statement and avoid using tools. Recommend changing to “You have no local filesystem access” to align with the Constitution wording.
[NON-BLOCKING] ${HOME} not recognized; only $HOME is supported
- File: services/reviewer/src/prompt.ts (OUT_OF_REPO_PATH_PATTERNS env_home regex)
- Evidence: The env_home regex is /$HOME/[\w.-/]+/g — this will not match ${HOME}/… which is a common shell form.
- Impact: Minor detection completeness gap. Consider supporting /$(?:HOME|{HOME})//.
[NON-BLOCKING] Trailing punctuation trimming misses some common delimiters
- File: services/reviewer/src/prompt.ts (extractOutOfRepoReferences: path = match[0].replace(/[.,;:)]+$/, ""))
- Evidence: This strips .,;:) but not ] } ’ ” ! ? etc. Paths inside Markdown links or quotes could retain a trailing bracket/quote in edge cases.
- Impact: Low; most of these aren’t included in the match due to the character class, but square/curly brackets and quotes could occur in odd contexts. Consider broadening the strip set to include [\]\}"'!?]+.
[NON-BLOCKING] PR description claims an OUT_OF_REPO_HEADING constant exists; code does not expose one
- Files:
- services/reviewer/src/prompt.test.ts defines const OUT_OF_REPO_HEADING = "## Out-of-repo references observed" only in tests.
- services/reviewer/src/prompt.ts hardcodes the same heading text inside buildOutOfRepoSection.
- Evidence: The PR description states “3 magic-string-duplication warnings were extracted to a OUT_OF_REPO_HEADING constant,” but no such exported constant exists in prompt.ts.
- Impact: Documentation/commit-message mismatch; small maintainability issue. If duplication warnings are real in your lint config, consider exporting a shared constant from prompt.ts and using it in tests to avoid drift.
[NON-BLOCKING] Potentially surprising omission: /etc alone doesn’t match
- File: services/reviewer/src/prompt.ts (absolute_system regex requires at least one subpath via (?:/[…])+)
- Evidence: /etc (without a trailing component) won’t match; /etc/hosts will.
- Impact: Minor; if someone references a top-level directory alone (e.g., “Configuration lives in /etc”), it won’t be listed. Consider allowing the base directory (e.g., /(?:etc|usr|var|opt|tmp|root)(?:/…)*).
Spec verification table
-
Prompt clause classifies out-of-repo references as non-blocking: Met
- services/reviewer/src/prompt.ts: CRITIC_CONSTITUTION_FAILURE_MODES “Out-of-repo references” section instructs “[NON-BLOCKING] NEEDS VERIFICATION: out-of-repo path — reviewer cannot verify” and explicitly carves out in-repo paths as still potentially BLOCKING.
-
Pre-check scans PR body + spec and injects observed-references section: Met
- services/reviewer/src/prompt.ts: buildOutOfRepoSection() and buildReviewPrompt() inject a “## Out-of-repo references observed” section when matches exist.
-
Recognized out-of-repo path patterns documented in-prompt: Met
- services/reviewer/src/prompt.ts: Constitution lists
/.claude, $HOME//, absolute system paths (/etc, /usr, /var, etc.), and session workspace paths. Note: The PR description’s “include /home and /Users absolute” is not reflected as general-case in code/prompt; only session workspaces are called out for those. See blocking finding 1.
- services/reviewer/src/prompt.ts: Constitution lists
-
Genuine in-repo claimed-but-not-in-diff findings still flagged (no over-suppression): Met
- services/reviewer/src/prompt.ts: Constitution text preserves in-repo case; NO_TOOLS_SECTION adds an explicit exception allowing BLOCKING for in-repo diff-vs-description mismatch.
-
Unit tests for both levers: Met
- services/reviewer/src/prompt.test.ts adds tests for the prompt clause and for the pre-check + integration.
Documentation impact
- Update PR description/acceptance notes to reconcile the /home and /Users handling: either broaden the implementation to match the stated coverage or correct the description to clarify that general /home and /Users paths are intentionally excluded except for minsky/sessions workspaces.
- Consider aligning the inserted section’s wording to “no local filesystem access,” to avoid confusion with tool-enabled (repo filesystem) contexts.
- Optionally export and reuse an OUT_OF_REPO_HEADING constant in prompt.ts if you want to deduplicate the heading string across code and tests, as mentioned in the PR description.
Event: REQUEST_CHANGES
There was a problem hiding this comment.
Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown
Findings
[BLOCKING] Spec–diff mismatch: PR description says to recognize /home and /Users absolute paths as “out-of-repo,” but the implementation explicitly excludes them (except for session workspaces)
- Evidence (PR description): “Detection patterns: tilde-prefixed home paths, $HOME env-expanded paths, absolute paths under /etc, /usr, /var, /opt, /tmp, /root, /home, /Users. URL paths … are excluded via a negative lookbehind.”
- Evidence (code): services/reviewer/src/prompt.ts, OUT_OF_REPO_PATH_PATTERNS defines absolute_system as:
- regex: /(?<![\w:])/(?:etc|usr|var|opt|tmp|root)(?:/[\w.-/]+)+/g
- Note: /home and /Users are not included here; they’re only matched under the “session_workspace” pattern when minsky/sessions/ is present.
- Evidence (tests): services/reviewer/src/prompt.test.ts
- “does NOT match macOS in-repo path under /Users/.../Projects/...” asserts no match for /Users/.../Projects/...
- “does NOT match Linux in-repo path under /home/.../code/...” asserts no match for /home/.../code/...
- Impact: This contradicts the stated detection pattern in the PR description and acceptance-criteria narrative. If the PR description is our only source of intent (no task spec), this is a spec-diff mismatch. Either the description must be corrected, or the code/tests updated to match the promised scope.
[BLOCKING] Contradictory guidance in the no-tools variant about in-repo claimed-but-not-in-diff findings
- Evidence (code – Out-of-repo section carve-out string): services/reviewer/src/prompt.ts, buildInRepoCarveOut(false) returns:
- “In the no-tools variant, even in-repo paths claimed but not in the diff must be marked NON-BLOCKING with a NEEDS VERIFICATION prefix … See the ‘Cross-file claims without tool access’ section below for the limited diff-vs-description exception.”
- Evidence (code – no-tools section exception): services/reviewer/src/prompt.ts, NO_TOOLS_SECTION includes:
- “Exception — diff-vs-description mismatch on in-repo paths … may be BLOCKING.”
- Issue: The out-of-repo section’s no-tools carve-out states an absolute “must be NON-BLOCKING” for in-repo claimed-but-not-in-diff, while a later section explicitly allows them to “may be BLOCKING.” Although the out-of-repo carve-out gestures to the later exception, the strong MUST language up front and the split across sections increases the risk the reviewer model will follow the earlier, stronger directive and downgrade legitimate diff-vs-description in-repo mismatches. This is a policy contradiction that undermines the “no over-suppression” acceptance criterion.
- Requested change: Unify the guidance in a single place or weaken/rephrase the earlier “must be NON-BLOCKING” to avoid contradicting the explicit exception (“General rule: non-blocking; Exception: in-repo diff-vs-description mismatch may be BLOCKING”). Consider placing the exception immediately after the in-repo carve-out so the two aren’t separated by sections.
[NON-BLOCKING] Pre-check section copy risks implying tools are globally unavailable
- Evidence: services/reviewer/src/prompt.ts, buildOutOfRepoSection() output hardcodes:
- “You have no filesystem access to verify these.”
- Context: With tools enabled, the reviewer can read in-repo files, but not the local machine’s filesystem. The current copy is technically true for out-of-repo paths, but could be read as a global “no tools” statement.
- Suggestion: Clarify to “You have no access to the local (out-of-repo) filesystem to verify these” to avoid misinterpretation in tools-enabled runs.
[NON-BLOCKING] Missing detection for ${HOME}/… style expansions
- Evidence: services/reviewer/src/prompt.ts, OUT_OF_REPO_PATH_PATTERNS only matches
$HOME/... but not $ {HOME}/... - Impact: ${HOME}/... is a common form in shell docs. If supporting that pattern is desired, add /${HOME}/[\w.-/]+/g and tests.
[NON-BLOCKING] Trailing punctuation stripper is incomplete
- Evidence: services/reviewer/src/prompt.ts, extractOutOfRepoReferences uses:
- replace(/[.,;:)]+$/, "")
- Issue: Does not remove other common trailing delimiters like ] } > " ' ’ ) (balanced). In markdown, links or code samples often end with ], ), or quotes. Consider extending to something like /[.,;:)]}>'"’]+$/ to reduce noisy variants.
[NON-BLOCKING] PR description test-summary appears stale/inaccurate
- Evidence: PR description claims “23 pass, 0 fail (65 expect() calls)”
- Observation: services/reviewer/src/prompt.test.ts alone contains significantly more than 23 test() calls. The stated numbers appear outdated relative to the current test suite and may confuse readers. Update the PR description to reflect the current test counts.
[NON-BLOCKING] Test brittleness on section ordering
- Evidence: services/reviewer/src/prompt.test.ts “no-tools variant out-of-repo section does not say 'may be BLOCKING' before the cross-file section” depends on the relative order of titled sections.
- Issue: This introduces coupling to prompt layout. If sections reorder, the test could fail despite semantically-correct content. Consider asserting on scoped substrings using more robust markers or testing that the out-of-repo section text lacks the specific phrase, independent of global ordering.
[PRE-EXISTING] Tool-access section duplicates “NEEDS VERIFICATION” guidance across variants
- Evidence: services/reviewer/src/prompt.ts, TOOL_ACCESS_SECTION and NO_TOOLS_SECTION both include NEEDS VERIFICATION language with different reasoning.
- Note: This duplication predates the PR; just flagging that maintenance may be easier if the shared guidance is factored to one place with variant-specific appendices.
Spec verification table (derived from the PR description’s acceptance criteria)
-
Prompt clause classifies out-of-repo references as non-blocking: Met
- The “Out-of-repo references” section explicitly calls out “[NON-BLOCKING] NEEDS VERIFICATION: out-of-repo path — reviewer cannot verify”.
-
Pre-check scans PR body + spec and injects observed-references section: Met
- buildOutOfRepoSection() runs across PR body and task spec; buildReviewPrompt() injects the section between Task Specification and Diff. Tests cover presence, placement, merging, and deduplication.
-
Recognized out-of-repo path patterns documented in-prompt: Met
- Prompt lists ~/.claude, $HOME/ or ~/…, absolute system paths (/etc, /usr, /var, etc.), and session workspaces.
-
Genuine in-repo claimed-but-not-in-diff findings still flagged (no over-suppression): Not Met
- In the no-tools variant, the Out-of-repo section’s carve-out states in-repo claimed-but-not-in-diff “must be NON-BLOCKING,” which conflicts with the later “Exception — … may be BLOCKING” in the NO_TOOLS_SECTION. The contradictory guidance risks suppressing legitimate in-repo diff-vs-description mismatches.
-
Unit tests for both levers: Met
- New tests cover both the prompt clause and the structural pre-check (extractor + injection). Counts in PR description appear stale, but coverage exists and passes locally per the updated suite.
Documentation impact
- The Critic Constitution changed (added the Out-of-repo references section and modified no-tools guidance). Update any docs or architecture notes that:
- Quote or summarize the constitution’s guidance for cross-file claims, especially the no-tools variant and the in-repo exception.
- Describe the reviewer prompt structure/order (a new “Out-of-repo references observed” section can appear between Task Spec and Diff).
- If there’s operator-facing documentation for interpreting reviewer output, mention that “Out-of-repo references observed” is informational and non-blocking by design.
Event: REQUEST_CHANGES
There was a problem hiding this comment.
Review: feat(mt#1190) — Reviewer prompt: out-of-repo file awareness
CI status: No check runs recorded (0 checks). Cannot confirm test suite pass; see note below.
Findings
No blocking findings. No non-blocking findings.
Checked and clear
services/reviewer/src/prompt.ts — full file
buildCriticConstitutionrefactored to delegate failure-modes section tobuildCriticConstitutionFailureModes(toolsAvailable). The refactor is semantically equivalent fortoolsAvailable=true(same static text). FortoolsAvailable=false, the newbuildInRepoCarveOut(false)return value replaces the original static carve-out with the weakened no-tools language. Correct.buildInRepoCarveOut(true)returns the original "This rule does NOT apply to in-repo paths…may be BLOCKING" text.buildInRepoCarveOut(false)returns the weakened variant. Both match what the tests assert.TOOL_ACCESS_SECTIONenvelope documentation (commit3fb8f69cdcontent-merge): verified byte-equivalent tomain's currentprompt.tslines 72–93. No drift in punctuation, spacing, or escaping found.NO_TOOLS_SECTIONException clause added correctly. Forward reference to "Cross-file claims without tool access" section is valid (that section follows in the assembled prompt fortoolsAvailable=false).OUT_OF_REPO_PATH_PATTERNSregex patterns:home_tilde,env_home,absolute_system,session_workspace. Thesession_workspacepattern's(?:\.local\/state)?optional group broadens it slightly beyond the documented path structure (e.g.,/Users/foo/minsky/sessions/xwould match without.local/state). Checked: unlikely to produce false positives against real dev-machine repo paths; acceptable design choice.extractOutOfRepoReferencesusestext.matchAll(regex)— nolastIndexmutation risk on the sharedOUT_OF_REPO_PATH_PATTERNSarray (matchAll creates its own iterator).buildOutOfRepoSectiondedup logic:mergedmap keyed by path, sources aggregated correctly.lines.lengthcorrectly reflects distinct path count, not raw count.buildReviewPromptinjection:outOfRepoBlockappended tospecSectionbefore## Diff. Placement is correct.prBodyisstring(not nullable) — no?? ""needed; consistent with the style fix in commitbd1493ed5.
services/reviewer/src/prompt.test.ts — full file
- New imports (
buildReviewPrompt,extractOutOfRepoReferences,ReviewPromptInput) are all exported fromprompt.ts. Confirmed. - Shared constants
NO_TOOLS_SECTION_HEADINGandIN_REPO_CARVE_OUT_PHRASEprevent magic-string duplication correctly. - The
describe("buildCriticConstitution — TOOL_ACCESS_SECTION envelope fields")block (commit3fb8f69cdcontent-merge): verified identical tomain'sprompt.test.tslines 63–100. No drift. "## Cross-file claims without tool access"test at line 34 refactored to useNO_TOOLS_SECTION_HEADINGconstant — semantically identical, no behavioral change."no-tools variant out-of-repo section does not say 'may be BLOCKING' before the cross-file section"test: slice boundary logic is correct.indexOf("## Out-of-repo references\n")matches because the template literal begins that heading followed by a blank line (\n\n), so\nis present."must be marked NON-BLOCKING"assertion:buildInRepoCarveOut(false)contains"must be marked NON-BLOCKING". Confirmed."no-tools variant"string assertion:buildInRepoCarveOut(false)opens with"In the no-tools variant, …". Confirmed.- Exception clause tests (
NO_TOOLS_SECTION in-repo exception clausedescribe block): assertions match theNO_TOOLS_SECTIONdiff content. Confirmed. - Coexistence with
buildInRepoCarveOutrefactor: TOOL_ACCESS_SECTION envelope tests targetbuildCriticConstitution(true)→ they exerciseTOOL_ACCESS_SECTION, not the failure-modes section. No interference.
Scope check: Changes are strictly within services/reviewer/src/prompt.ts and services/reviewer/src/prompt.test.ts. No other files modified. Confirmed from diff.
Spec verification
Task: mt#1190
| Criterion | Status | Evidence |
|---|---|---|
Prompt clause in CRITIC_CONSTITUTION classifies out-of-repo references as non-blocking by default |
Met | buildCriticConstitutionFailureModes → ## Out-of-repo references section + buildInRepoCarveOut; out-of-repo reference clause tests pass |
Structural pre-check in buildReviewPrompt() scans PR body + task spec and injects ## Out-of-repo references observed section |
Met | buildOutOfRepoSection + injection in buildReviewPrompt; buildReviewPrompt out-of-repo section tests cover presence/absence/placement/merging/dedup |
Recognized patterns: ~/.claude/..., $HOME/..., absolute system paths, session workspace absolute paths |
Met | OUT_OF_REPO_PATH_PATTERNS covers all four; tests cover each pattern plus non-matches |
| Live data on #720-class PRs no longer shows BLOCKING false positives for user-memory-file references | Not met (open) | Noted in spec as unchecked. Requires runtime observation; not testable statically. Spec correctly marks this criterion unchecked — no action needed in this PR. |
| Genuine in-repo-path-claimed-but-not-in-diff findings still get flagged | Met | buildInRepoCarveOut(true) preserves "may be BLOCKING"; no-tools Exception clause preserves in-repo diff-vs-description finding capability; tests verify both variants |
CI note
Zero check runs were returned by the API. If CI is not wired for this service, the test suite should be verified locally before merge (bun test services/reviewer/src). The logic and test assertions are internally consistent, but a live run confirmation is recommended.
Documentation impact
No update needed — this is a change to the reviewer service's internal prompt-building logic and its unit tests. No architectural pattern, developer workflow, or user-facing CLI behavior is affected. docs/architecture.md, docs/theory-of-operation.md, CONTRIBUTING.md, and README.md do not describe the reviewer prompt internals.
(Had Claude look into this — AI-assisted Chinese-wall review, second reader on a bot-authored PR.)
Stale: this review's BLOCKING finding (spec-doc mismatch — prompt listed /Users/.../minsky/sessions/... but pre-check did not match it) was addressed in commit da6969b, which adds a dedicated session_workspace regex pattern gated on the minsky/sessions/ sub-path. Subsequent commit 9427404 refactored the no-tools fix into buildInRepoCarveOut. Latest commit 3fb8f69 content-merged main's mt#1216 envelope additions. Fresh review on 3fb8f69 posted at #pullrequestreview-4176033945 found 0 blocking, 0 non-blocking. Dismissing as stale.
Defense-in-depth fix for the adversarial reviewer false-positive on out-of-repo path references in PR descriptions and task specs. Two levers, both shipping together because the reviewer has no cross-round memory and a prompt-only fix can be eroded by phrasing drift. Pencil lead: prompt clause in CRITIC_CONSTITUTION_FAILURE_MODES. Enumerates recognized out-of-repo path patterns and instructs the reviewer to mark claimed-but-not-in-diff findings for these paths as NON-BLOCKING with NEEDS VERIFICATION prefix. Pencil wood: structural pre-check in buildReviewPrompt. Scans PR body and task spec via extractOutOfRepoReferences (exported), produces a distinct-path enumeration with source-aggregated bullets in a dedicated Out-of-repo references observed section between Task Specification and Diff. Empty pre-check output produces no section. Variant-conditional in-repo carve-out via buildInRepoCarveOut function. The tools-available variant retains the may-be-BLOCKING carve-out for in-repo paths (read_file can verify). The no-tools variant weakens the carve-out to NON-BLOCKING and points to the new NO_TOOLS_SECTION diff-vs-description mismatch exception, which lets in-repo absent-from -diff paths still be BLOCKING when the absence is verifiable from the diff itself. Detection patterns: - home_tilde: tilde-prefixed home paths - env_home: environment-expanded HOME paths - absolute_system: absolute paths under /etc, /usr, /var, /opt, /tmp, /root (no Users or home in this allowlist — those collide with in-repo macOS dev paths) - session_workspace: gated on minsky/sessions/ sub-path so it cannot collide with in-repo macOS or Linux dev paths URL paths excluded via negative lookbehind. Trailing punctuation stripped. Cross-source dedup with source-label aggregation. 40 tests, 103 expect calls. Typecheck clean. Lint clean. This commit is the rebase-onto-main consolidation of 8 prior commits (see PR #763 history). Previous incremental commits hit a session_update merge-flow gap (filed as mt#1303) that prevented producing a real merge commit; this consolidates the work as a single commit on top of current main HEAD so the branch is fast-forward-mergeable.
BLOCKING 1 (gh api error path conflated with webhook-miss): Refactored the check-runs handling to a two-step pipeline: - parseCheckRunsResponse(execResult) returns ok+count or ok=false+error, distinguishing API/parse failures (non-zero exit, empty body, non-JSON, missing fields) from a successfully-parsed zero count. - evaluateCheckRunsPresence(parseResult, ...) emits a distinct deny reason for API-failure (Unable to query CI check_runs ... gh api transport/parse failure) vs webhook-miss (mt#1309 / PR #763 lineage; push an empty commit ...). - Added a 10s timeout on the gh api invocation per reviewer suggestion. BLOCKING 2 (SPEC.md stale vs implementation): Rewrote .claude/hooks/SPEC.md require-review-before-merge section to document all five current gates in fixed order (review presence, spec verification, documentation impact, review freshness, CI check_runs presence), the parse vs evaluator split, the presence-only / status-agnostic policy, error-handling behavior, and the testable-surfaces note. Updated Behavioral Contract item 7 to match. NON-BLOCKING declined (executable bit on test file): Reviewer suggested dropping the exec bit, but CLAUDE.md mandates +x on all .claude/hooks/*.ts files and the project pre-commit hook enforces this with a hard block. block-github-mcp-pr-writes.test.ts is also 100755. Kept +x. NON-BLOCKING addressed (test coverage gap): Expanded tests from 8 to 22: 10 new parseCheckRunsResponse cases (total_count, check_runs.length fallback, count=0, non-zero exit, empty stderr, empty stdout, invalid JSON, array/null bodies, missing fields), 4 new evaluateCheckRunsPresence cases for the API-failure path (distinct reason text, error embedded verbatim, /review-pr step 7a still referenced, PR number and HEAD sha included).
…, fileFetch warning BLOCKING #1: raise require-execution-evidence hook timeout from 20s to 60s in .claude/settings.json to cover worst-case sequential subprocess budget (deriveRepoFromGit 5s + gh pr list 10s + fetchPrFiles 15s + fetchPrMeta 15s = 45s). BLOCKING #2: replace brittle branch-name PR resolution with a two-path resolvePrNumber() helper: primary path uses (current branch, handles forks and non-standard names); fallback uses 921 fix(mt#1486): mt#1486: widen prior-review parsers for bare [BLOCKING] markers task/mt-1486 OPEN 2026-04-30T22:04:07Z 920 feat(mt#1465): mt#1465: severity-monotonicity prototype + A/B replay measurement task/mt-1465 OPEN 2026-04-30T21:12:37Z 914 feat(mt#1071): Add attention accounting module, attention.report command, and --show-attention flag task/mt-1071 OPEN 2026-04-30T19:03:03Z 909 feat(mt#1459): Add PreToolUse hook to require execution evidence before merging PRs with new tests task/mt-1459 OPEN 2026-04-30T18:06:16Z 908 feat(mt#1460): Add execution evidence requirement for new test files to prepare-pr skill task/mt-1460 OPEN 2026-04-30T17:59:59Z 907 feat(mt#1348): Auto-apply review-state labels to PRs on review submission task/mt-1348 OPEN 2026-04-30T09:19:25Z 906 feat(mt#1341): Add suggestion field to ReviewComment for GitHub suggestion blocks task/mt-1341 OPEN 2026-04-30T09:13:52Z 900 feat(mt#1467): Wire mergeSessionPr to emit quality.review Ask before each merge task/mt-1467 OPEN 2026-04-30T07:31:42Z 898 feat(mt#1342): Add GitHub PR review thread resolve/unresolve via GraphQL task/mt-1342 OPEN 2026-04-30T07:16:28Z 896 feat(mt#1343): reviewThreads field in session_pr_review_context task/mt-1343 OPEN 2026-04-30T07:11:30Z 889 feat(mt#1346): GitHub Check Run annotations as parallel review surface task/mt-1346 OPEN 2026-04-30T07:02:20Z 887 feat(mt#1347): Add pre-flight diff anchor validation for PR review comments task/mt-1347 OPEN 2026-04-30T06:56:49Z 886 feat(mt#1340): Update reviewer prompt to emit line-anchored comments[] from parsedDiff task/mt-1340 OPEN 2026-04-30T06:51:22Z 882 feat(mt#1437): Typed TS env-var synthesizer for Railway services (minsky-mcp) task/mt-1437 OPEN 2026-04-28T22:03:56Z 872 feat(mt#1309): Merge-gate denies PRs with zero CI check_runs (PR #763 webhook-miss regression detection) task/mt-1309 OPEN 2026-04-28T07:36:22Z 858 fix(mt#1384): write canonical github-pr ref so reconciler can route Asks task/mt-1384 OPEN 2026-04-27T23:13:27Z 855 docs(mt#1311): Document merge-flow operational reality in /review-pr and /orchestrate skills task/mt-1311 OPEN 2026-04-27T21:00:05Z 849 feat(mt#1306): Reviewer convergence metrics: durable Postgres persistence task/mt-1306 OPEN 2026-04-27T19:25:19Z 809 feat(mt#1255): replace console.* with structured winston logger in reviewer service task/mt-1255 OPEN 2026-04-26T16:57:53Z 781 fix(mt#1274): [SUPERSEDED by mt#1281] propagate sessionId to internal session_update task/mt-1274 OPEN 2026-04-25T19:11:02Z 777 fix(mt#1261): [SUPERSEDED by mt#1281] session_pr_create sessionId-shape fix task/mt-1261 OPEN 2026-04-25T19:01:41Z 774 test(mt#1263): Add runReview sanitize wiring integration tests task/mt-1263 OPEN 2026-04-24T18:54:29Z 710 feat(mt#1028): KB activity source — GitHub Issues/PRs provider task/mt-1028 OPEN 2026-04-22T23:17:34Z 703 docs(mt#1109): E2E smoke artifact — add v1-deployed status to reviewer README task/mt-1109 OPEN 2026-04-22T22:07:14Z 684 feat(mt#1066): PostToolUse hook enforcing /review-pr after session_pr_create task/mt-1066 OPEN 2026-04-22T17:45:49Z 682 docs(mt#1050): surface attention-allocation frame in README task/mt-1050 OPEN 2026-04-22T17:44:30Z 538 chore(deps-dev): bump ts-morph from 26.0.0 to 28.0.0 dependabot/npm_and_yarn/ts-morph-28.0.0 OPEN 2026-04-20T00:32:48Z 537 chore(deps-dev): bump @types/node from 22.19.17 to 25.6.0 dependabot/npm_and_yarn/types/node-25.6.0 OPEN 2026-04-20T00:32:43Z 536 chore(deps): bump ai from 4.3.19 to 6.0.168 dependabot/npm_and_yarn/ai-6.0.168 OPEN 2026-04-20T00:32:34Z 535 chore(deps): bump @ai-sdk/openai from 1.3.24 to 3.0.53 dependabot/npm_and_yarn/ai-sdk/openai-3.0.53 OPEN 2026-04-20T00:32:16Z. Both failing now emits a writeOutput warning instead of silently calling process.exit(0). Added ExecFn injectable type for unit testing. BLOCKING #3: change fetchPrFiles return type from PrFile[] to FetchPrFilesResult { files, warning? } so API errors are surfaced in the audit trail via additionalContext rather than silently allowing merge. Warning is propagated through topLevelWarnings and merged with check-level warnings in the final writeOutput call. Tests: 8 new tests added (resolvePrNumber: 6, fetchPrFiles warning shape: 3); all 68 tests pass.
… result sets Resilience patches to measure-calibration.ts: - new OpenAI() now configured with timeout: 1200000 (20 min) + maxRetries: 3. gpt-5 with deep reasoning (4-7K reasoning tokens + 5-10K completion) can take 5-10 min per call; default 600s timeout was too tight. - runAttempts wraps each callOpenAIWithClient invocation in try/catch. On error: logs failure, pushes a placeholder result with event=ERROR, continues with next attempt. Single timeout no longer aborts the run. Result JSONs from the live runs (78 OpenAI calls total, ~0 spend): - measure-calibration-trivial-results.json: 60 attempts, 91.7% RC rate, mean 1.68 BLOCKING. SC #1 NOT MET (target <=30%; observed 91.7%). 1 timeout (resilience handled). Calibration insight: mt#1188 scope-aware rigor is in main but not biting under the Critic Constitution preamble. - measure-calibration-larger-results.json: 15 attempts, 93.3% RC rate, mean 2.87 BLOCKING. SC #3 NOT MET (target +/-10%; observed +13.3pp drift from historical baseline). PR #763 R1 most informative deviation (historical COMMENT to replay 100% RC). - measure-calibration-contradiction-results.json: 3 attempts, 100% RC rate, mean 2 BLOCKING. Direct-contradiction class reproduced at 67% line-number match (line 452 of start-command.ts hit on attempts 1+2). mt#1496 recovery layer will not cover this class; sibling task warranted. 3 calibration rows added to project_mt1110_calibration_data.md (private memory).
…ame §D→§A.2 Address minsky-reviewer[bot] R1 BLOCKING findings (reviewId 4329265236). Two BLOCKING findings, both rooted in the same cause: the prior commit's recompile of `.minsky/skills/orchestrate/skill.ts` dropped 5 sections that exist only in the compiled `.claude/skills/orchestrate/SKILL.md`, because the source content string didn't carry them (the cockpit-design-style source/compiled drift; mt#1956 tracks the broader pattern). Sections restored to source + compiled: - `### 0. Restate plan check (pre-action, mandatory)` — guard against multi-step direction step-compression (PR #1073 / mt#1783 origin). - `### 0a. Disambiguate-next check (multi-next-step chain-walk junctions)` — guard against chain-walk-on-affirmative wrong-child-pick (mt#1772 / mt#1768 mt#1773 origin). - `### A.1 Epic re-entry: audit for decomposition staleness` — Shape C complement to §A's parallel-work sweep (mt#1552 origin, 7 instances). - `### D. In-flight iteration: branch-divergence check` — review-iteration branch-divergence guard (PR #763 / mt#1190 / mt#1216 origin). Preserved with its original numbering. - `## Error recovery` section with the symptom / root-cause / recovery table — surfaces the mt#1303 session_update-conflict gap + the mt#1110 reviewer-bot webhook-miss gap. The new DAG-filing section moved from `### D` to `### A.2` to avoid the name conflict with the preserved `### D. In-flight iteration` section. The Key constraints bullet now references §A.2 (addresses the inline NIT comment on naming consistency). §A.2 is the correct semantic position — between §A.1 (epic re-entry audit) and §B (subtask decomposition) — since DAG filing is a filing-discipline concern, parallel to §A (parallel-work sweep) and §A.1 (decomposition staleness audit). Verified post-fix (compiled `.claude/skills/orchestrate/SKILL.md`): - All 7 prior-existing sections restored (§0, §0a, §A.1, §B, §C, §D, Error recovery). - New §A.2 inserted in correct position. - Key constraints bullet references §A.2. - `grep -E "wave-by-wave|DAG filing"`: 5 matches (acceptance test 1). - Worked-example Wave 0/1/2 block present (acceptance test 4). - Failure-semantics rationale present (acceptance test 5). - Typecheck: 0 errors. Lint: 0 errors, 0 warnings across 1554 files. The source `.minsky/skills/orchestrate/skill.ts` was effectively a full rewrite of the `content:` string — this PR now contains the orchestrate slice of mt#1956's broader source/compiled-drift backport. The remaining 4 affected skills (implement-task, plan-task, review-pr, verify-task) stay untouched here and remain mt#1956's scope.
Summary
Fixes a false-positive class in the adversarial reviewer where paths referenced in a PR body/description that aren't in the diff were flagged as BLOCKING — even when those paths are outside the repository and the reviewer has no filesystem access to verify them.
Motivating case: PR #720 iter-2 flagged a user-memory file reference (
~/.claude/projects/.../memory/MEMORY.md) as "PR claims update but diff doesn't show it." The reviewer cannot read the local filesystem, so this is structurally unverifiable — it must be NON-BLOCKING.Approach: defense-in-depth (both levers)
The original spec offered two options: a prompt clause or a structural pre-check. Decision on 2026-04-24: ship both. The reviewer has no cross-round memory and its output is not reinforced across iterations — a prompt-only fix can be eroded by phrasing drift or attention pressure. The structural pre-check holds regardless of how the model reads the Constitution that turn.
Pencil-depth layering:
CRITIC_CONSTITUTIONtelling the reviewer the rulebuildReviewPrompt()that supplies the evidence the rule operates onChanges
Commit 1 —
90f56428b(prompt clause):services/reviewer/src/prompt.ts— adds an "Out-of-repo references" section toCRITIC_CONSTITUTION_FAILURE_MODESenumerating recognized out-of-repo path patterns (~/.claude/...,$HOME/..., absolute system paths, session workspace paths) and instructing the reviewer to mark any "claimed-but-not-in-diff" finding for these as[NON-BLOCKING] NEEDS VERIFICATION: out-of-repo path — reviewer cannot verify. Explicitly preserves the existing behavior: in-repo paths remain legitimate and potentially BLOCKING findings.Commit 2 —
8b71d0ea7(structural pre-check):services/reviewer/src/prompt.ts— adds a pre-check scanner and wires it intobuildReviewPrompt():extractOutOfRepoReferences(text, source)— public export, scans free-form text for out-of-repo path references and returns deduplicated, punctuation-trimmed matches tagged by kind (home_tilde/env_home/absolute_system) and source ("PR description"/"task spec").buildOutOfRepoSection()— internal helper, runs the extractor over both PR body and task spec, and when matches exist produces a dedicated## Out-of-repo references observedprompt section enumerating each one.buildReviewPrompt()— now injects the section between Task Specification and Diff when the pre-check finds matches; omits it entirely when not. Changes nothing else.Detection patterns: tilde-prefixed home paths,
$HOME/env-expanded paths, absolute paths under/etc,/usr,/var,/opt,/tmp,/root. Note:/homeand/Usersare deliberately excluded from the generalabsolute_systempattern to avoid false positives on developer machines (repos commonly sit under/Users/<name>/Projects/on macOS or/home/<user>/code/on Linux); the separatesession_workspacepattern matches/Users/.../minsky/sessions/...and/home/.../minsky/sessions/...instead. URL paths likehttps://example.com/etc/fooare excluded via a negative lookbehind.services/reviewer/src/prompt.test.ts— 18 new test cases total (4 in commit 1 for the prompt clause, 14 in commit 2 for the pre-check + integration).Test output
Typecheck: clean. Lint: clean (3 magic-string-duplication warnings were extracted to a
OUT_OF_REPO_HEADINGconstant).Acceptance criteria coverage
Parent
Part of mt#1110 (reviewer calibration).
Context
memory/project_mt1110_calibration_data.md§Calibration levers this suggests → "Out-of-repo file awareness"Had Claude implement this per task spec mt#1190.