fix(pr-review): per-file diff budget with manifest, replacing byte-slice truncation by simonrosenberg · Pull Request #234 · OpenHands/extensions

simonrosenberg · 2026-05-13T13:34:52Z

Closes #233.

Why

The PR review bot's diff truncation (MAX_TOTAL_DIFF = 100000 + naive byte slice) silently dropped any patch whose bytes landed past 100K. Because the GitHub raw-diff endpoint returns files alphabetically, this systematically punished late-alphabet frontend files and all openhands/**/*.py backend changes on large PRs. The bot then confidently flagged dropped files as "missing from the diff" — see OpenHands/OpenHands#14401, where it blocked approval four times on a file that was actually in the PR but lived past byte 105,151 of a 190KB diff.

What changed

agent_script.py — replace the raw-diff API + byte slice with /pulls/{n}/files + per-file budget:

New get_pr_files(): paginated fetch of structured per-file objects (filename, status, additions, deletions, patch).
New format_files_manifest(): lists every file in the PR with status and +/- counts. Invariant: every file in the input appears exactly once, regardless of budget.
New format_patches(): emits each file's patch up to MAX_PER_FILE_PATCH = 8000 chars; when the running total would exceed MAX_TOTAL_DIFF = 100000 chars, later files get a header-only stub with an explicit [patch omitted: ...] marker. Per-file truncation is annotated inline with [patch abbreviated: ...]. Never silently drops a file.
Caps unchanged from the prior code (100K total, 8K per file). With the manifest + per-file budget, the cap no longer drops whole files — abbreviated patches and the "read from workspace" skill rule cover the rest.

prompt.py — add {files_manifest} slot above the ```diff fence, with explanatory text telling the agent that abbreviated/omitted patches are still in the PR and should be read from the workspace.

code-review/SKILL.md — new GROUNDING section requiring: (a) check the manifest + read the file from workspace before flagging anything as missing; (b) verify inline-comment line numbers with sed -n / view rather than counting from @@ hunk headers (which LLMs do unreliably).

tests/test_pr_review_diff_payload.py — 10 new tests, including a regression test that mirrors the PR #14401 shape (small file behind two large ones).

Test plan

uv run --group test pytest tests/test_pr_review_diff_payload.py tests/test_pr_review_prompt.py tests/test_pr_review_review_context.py — 23/23 pass.
Full suite (rebased on fix(ci): catch up with openhands-sdk v1.22.0 import path changes #235): 269/269 pass.
CI green: test, sync-sdk-skill, sync-extensions, validate-claude-code all pass.
Live verification on a large PR after merge — confirm manifest renders and large files no longer trigger "file missing" findings.

Out of scope

Line-number hallucinations — the skill change addresses this with a "verify before commenting" rule, but a deeper fix (e.g. inline line-number annotations in patches) is a separate PR.
Cross-round dedup of false positives — the existing soft "don't repeat" instruction in prompt.py is unchanged. Improvement needs measurement, not just more prompting.

A matching README fix lands as a separate PR in software-agent-sdk (#3239).

🤖 Generated with Claude Code

all-hands-bot

🟢 Good taste - Elegant solution to a real, documented bug.

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM

High-impact area (core diff handling logic) but well-tested with comprehensive regression coverage. The structured file API eliminates the silent-drop failure mode entirely. Live verification pending (acknowledged in PR description).

VERDICT:
✅ Worth merging: Solves the exact problem described, clean architecture, proper tests.

KEY INSIGHT:
Switching from byte-slice truncation to structured per-file API with explicit markers converts a silent failure into visible, actionable information for the agent.

all-hands-bot · 2026-05-13T13:37:17Z

+
+        patch = file.get("patch") or ""
+        remaining_budget = max_total - total
+        if remaining_budget <= len(header):


🟡 Suggestion: Minor observation - when remaining_budget <= len(header), appending header + "[patch omitted: ...]" can cause total to slightly exceed max_total by the marker length.

This is acceptable since the marker is essential (agent needs to know the patch was omitted), but worth documenting. Consider adding a comment:

# Note: This may push total slightly over max_total, but the marker # is essential so the agent knows to read from workspace

…ice truncation (#233) Switch agent_script.py from the raw-diff endpoint to `/pulls/{n}/files` and feed the agent a Files Changed manifest plus per-file budgeted patches. Update the code-review skill to require manifest + workspace checks before flagging a file as missing, and to ground line numbers against the file before posting inline comments. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The cap bump from the previous commit was scope creep. With per-file budgeting and the manifest in place, files past the cap are already visible to the agent and reachable from the workspace — the bug is fixed without raising the cap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

xingyaoww

Thanks!

all-hands-bot

🟡 Acceptable - Solid fix for a real bug, clean architecture, comprehensive tests. One critical documentation error must be corrected before merge.

[CRITICAL ISSUES]

See inline comment on line 92.

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM

High-impact area (core diff handling logic) but well-tested with comprehensive regression coverage including the exact failure mode from issue #233. The structured API eliminates the silent-drop bug entirely. Documentation mismatch must be resolved but doesn't affect the implementation quality.

VERDICT:
✅ Worth merging after fix: Solves a documented production bug with an elegant solution. Fix the MAX_TOTAL_DIFF documentation discrepancy and this is ready.

KEY INSIGHT:
Switching from byte-slice truncation to structured per-file budgeting with explicit markers transforms a silent failure mode into a visible, debuggable one - this is exactly the kind of defensive engineering that prevents future "file missing" false positives.

all-hands-bot

🟢 Good taste - Elegant fix for a real, documented bug (issue #233). The structured file API eliminates the root cause (byte-slicing a format that isn't byte-alignable), and the manifest + per-file budget with explicit markers is clean design. Comprehensive test coverage including the exact regression scenario. The unresolved thread about marker length is fine as-is — the agent needs that text.

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM

High-impact area (core diff handling logic) but well-tested with regression coverage for the exact failure mode. The structured API eliminates silent file drops entirely. Live verification pending (acknowledged in PR description) is acceptable given urgency and test coverage.

VERDICT:
✅ Worth merging - Solves the documented problem, clean architecture, comprehensive tests.

KEY INSIGHT:
Switching from byte-slicing raw diff to per-file structured data eliminates an entire class of bugs where arbitrary truncation points corrupt patches or silently drop files.

simonrosenberg mentioned this pull request May 13, 2026

docs(pr-review): describe new manifest-based diff payload accurately OpenHands/software-agent-sdk#3239

Open

1 task

all-hands-bot approved these changes May 13, 2026

View reviewed changes

simonrosenberg self-assigned this May 13, 2026

simonrosenberg mentioned this pull request May 13, 2026

fix(ci): catch up with openhands-sdk v1.22.0 import path changes #235

Merged

2 tasks

Debug Agent and others added 2 commits May 13, 2026 17:55

simonrosenberg force-pushed the fix-pr-review-diff-truncation branch from b1c63b4 to 80f647b Compare May 13, 2026 15:55

xingyaoww approved these changes May 13, 2026

View reviewed changes

xingyaoww requested a review from all-hands-bot May 13, 2026 19:13

all-hands-bot requested changes May 13, 2026

View reviewed changes

Comment thread plugins/pr-review/scripts/agent_script.py

simonrosenberg requested a review from all-hands-bot May 15, 2026 09:21

all-hands-bot approved these changes May 15, 2026

View reviewed changes

simonrosenberg merged commit 0275b6b into main May 15, 2026
8 of 9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(pr-review): per-file diff budget with manifest, replacing byte-slice truncation#234

fix(pr-review): per-file diff budget with manifest, replacing byte-slice truncation#234
simonrosenberg merged 2 commits into
mainfrom
fix-pr-review-diff-truncation

simonrosenberg commented May 13, 2026 •

edited

Loading

Uh oh!

all-hands-bot left a comment

Uh oh!

all-hands-bot May 13, 2026

Uh oh!

xingyaoww left a comment

Uh oh!

all-hands-bot left a comment

Uh oh!

Uh oh!

all-hands-bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

simonrosenberg commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

What changed

Test plan

Out of scope

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

all-hands-bot May 13, 2026

Choose a reason for hiding this comment

Uh oh!

xingyaoww left a comment

Choose a reason for hiding this comment

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

simonrosenberg commented May 13, 2026 •

edited

Loading