Harden Codex large-diff prompt budgeting and fallbacks by wesm · Pull Request #558 · roborev-dev/roborev

wesm · 2026-03-20T18:50:01Z

Summary

- Harden oversized Codex review prompts so fallback instructions preserve review filtering and stay within prompt budgets
- Preserve the legacy prompt-size default, add UTF-8-safe truncation, and enforce a final hard cap for tiny configured budgets
- Avoid eagerly materializing oversized commit and range diffs by using bounded git reads
- Expand regression coverage for large diffs, large metadata, exclude handling, and small prompt caps

roborev-ci · 2026-03-20T18:52:32Z

roborev: Combined Review (`53c56a8`)

Verdict: The PR introduces a high-severity command injection vulnerability in the fallback shell snippets and contains configuration and matching inconsistencies.

High

Location: internal/prompt/prompt.go:242, internal/prompt/prompt.go:253
Problem: The Codex oversized-diff fallback embeds sha and rangeRef directly into shell command examples without quoting or normalization. This creates a command-injection sink if these values contain attacker-controlled ref names (e.g., topic;touch${IFS}/tmp/pwned). Because the agent is instructed to run these locally, malicious refs could lead to arbitrary local command execution.
Fix: Do not interpolate raw revspecs into shell snippets. Resolve inputs to verified full commit OIDs before embedding, shell-escape them correctly, or instruct the agent to inspect the diff through a non-shell mechanism.
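
The shell-escaping option named in the fix can be sketched as follows. This is a minimal illustration assuming POSIX `sh` quoting rules; the helper name `shellQuote` and its body are assumptions made for this example, not code taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// shellQuote wraps s in POSIX single quotes so that a POSIX shell treats it
// as one literal word. An embedded single quote is emitted as '\'' (close
// the quote, add an escaped quote, reopen the quote).
// Hypothetical helper for illustration only.
func shellQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

func main() {
	// The hostile ref name from the review finding above.
	ref := "topic;touch${IFS}/tmp/pwned"
	fmt.Println("git log --oneline " + shellQuote(ref))
}
```

Quoting of this kind only protects POSIX-style shells; resolving the ref to a full commit OID first avoids the problem regardless of which shell runs the command.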

Medium

Location: internal/prompt/prompt.go:235
Problem: isCodexReviewAgent only matches the literal name "codex". This feature will be skipped for any configured alias that resolves to the Codex provider, causing aliased setups to fall back to the older, weaker oversized-diff guidance.
Fix: Decide the fallback from resolved agent/provider metadata instead of the raw agentName, or normalize all supported Codex aliases before checking.
Location: internal/prompt/prompt.go:330, internal/prompt/prompt.go:416
Problem: The oversized-diff gate uses the hardcoded MaxPromptSize constant instead of the config-aware b.maxPromptSize() accessor. Non-default prompt-size configurations will trigger the Codex fallback incorrectly.
Fix: Replace the MaxPromptSize comparisons with b.maxPromptSize() (or equivalent config-aware accessor) in both prompt builders, and add coverage for non-default prompt-size configuration.
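
The difference between the hardcoded constant and a config-aware accessor can be sketched like this. The `Builder` type, its `configuredMax` field, the `diffFitsInline` helper, and the exact value of the constant are all assumptions for illustration; only the names `MaxPromptSize` and `maxPromptSize()` come from the review.

```go
package main

import "fmt"

// MaxPromptSize stands in for the legacy hardcoded limit (value assumed).
const MaxPromptSize = 250 * 1024

// Builder is a hypothetical, stripped-down prompt builder.
type Builder struct {
	configuredMax int // 0 means "no cap configured"
}

// maxPromptSize returns the configured cap, falling back to the legacy
// constant when nothing is configured.
func (b *Builder) maxPromptSize() int {
	if b != nil && b.configuredMax > 0 {
		return b.configuredMax
	}
	return MaxPromptSize
}

// diffFitsInline gates on the effective limit rather than the constant.
func (b *Builder) diffFitsInline(baseLen, diffLen int) bool {
	return baseLen+diffLen <= b.maxPromptSize()
}

func main() {
	small := &Builder{configuredMax: 10 * 1024}
	unset := &Builder{}
	// The same 20 KiB diff is oversized under a 10 KiB cap but fits the default.
	fmt.Println(small.diffFitsInline(2*1024, 20*1024))
	fmt.Println(unset.diffFitsInline(2*1024, 20*1024))
}
```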

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

roborev-ci · 2026-03-20T19:03:28Z

roborev: Combined Review (`91c4d25`)

Verdict: The PR introduces size-aware fallback prompts for Codex, but requires fixes for command injection risks, hardcoded limits, and inefficient test setups.

Medium

Location: internal/prompt/prompt.go:282, internal/prompt/prompt.go:467
Problem: The new Codex oversized-range fallback embeds rangeRef directly into shell command examples (e.g., git log --oneline %s). rangeRef is a free-form revision expression and can contain shell metacharacters, creating a command-injection path in the reviewer environment if an external branch name is used.
Fix: Resolve the range to verified object IDs before rendering it into command examples, or shell-escape the argument before interpolation.
Location: internal/prompt/prompt.go (around writeCodexCommitInspectionFallback, writeCodexRangeInspectionFallback, buildSinglePrompt, buildRangePrompt)
Problem: The new Codex fallback logic hard-codes MaxPromptSize instead of using the builder’s config-aware limit. If a smaller prompt cap is configured, the fallback can exceed the real limit.
Fix: Thread b.maxPromptSize() through the overflow check and writeLongestFitting, and add a test that sets a reduced prompt limit to verify the fallback stays within the configured cap.
Location: internal/prompt/prompt_test.go (findSingleCommitNearCapRepo and findRangeNearCapRepo)
Problem: The functions create up to 60 separate git repositories via shell commands in a loop to iteratively find a prompt size boundary. This trial-and-error file system manipulation will significantly slow down test execution and increase I/O overhead.
Fix: Calculate the required guidelineLen mathematically based on the length of a baseline prompt without guidelines, and create the test repository only once.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

roborev-ci · 2026-03-20T19:10:25Z

roborev: Combined Review (`182913d`)

Verdict: The PR implements necessary prompt truncation and fallback logic for oversized Codex reviews, but requires fixes for a severe performance/correctness bug in UTF-8 truncation and hardcoded prompt size limits.

High Severity

Location: internal/prompt/prompt.go:42 (truncateUTF8)
Problem: utf8.ValidString checks the entire string from the beginning. If the string contains any invalid UTF-8 prior to the truncation point (e.g., from user guidelines or commit messages), the loop will repeatedly decrement maxBytes to remove the invalid byte, unintentionally wiping out large portions of valid context (or the entire string). Additionally, this causes an O(N*M) CPU spike as utf8.ValidString repeatedly re-scans the valid prefix on every iteration.
Fix: Check only the truncation boundary. Walk backward from maxBytes (up to 3 bytes) to detect and remove a trailing incomplete rune, rather than re-validating the entire prefix.
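
A boundary-only truncation along the lines of this fix might look like the sketch below. The function name matches the one in the review, but the body is an assumption written for illustration; it inspects at most three bytes near the cut instead of re-validating the prefix, so invalid bytes earlier in the string are left alone.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateUTF8 returns at most maxBytes bytes of s without splitting a
// multi-byte rune at the cut. Only the bytes next to the cut are examined.
// Illustrative sketch, not the PR's implementation.
func truncateUTF8(s string, maxBytes int) string {
	if maxBytes <= 0 {
		return ""
	}
	if len(s) <= maxBytes {
		return s
	}
	cut := maxBytes
	// s[cut] is the first byte that would be dropped. If it is a
	// continuation byte, the cut falls inside a rune, so step back to that
	// rune's start byte (at most utf8.UTFMax-1 steps).
	for i := 0; i < utf8.UTFMax-1 && cut > 0 && !utf8.RuneStart(s[cut]); i++ {
		cut--
	}
	return s[:cut]
}

func main() {
	fmt.Printf("%q\n", truncateUTF8("héllo", 2))
	fmt.Printf("%q\n", truncateUTF8("héllo", 3))
	// An invalid byte before the cut does not wipe out the valid text.
	fmt.Printf("%q\n", truncateUTF8("\xffabcé", 5))
}
```

Note the guard `len(s) <= maxBytes` before indexing `s[cut]`: without it, a cut equal to the string length would index out of range.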

Medium Severity

Location: internal/prompt/prompt.go:233, internal/prompt/prompt.go:409, internal/prompt/prompt.go:495
Problem: The new Codex fallback path hardcodes the legacy MaxPromptSize (250 KB) instead of using the builder’s config-aware prompt budget. On installations that set a smaller cap, writeLongestFitting can still emit an oversized prompt; on installations that set a larger cap, the code can unnecessarily drop the inline diff even though it would fit.
Fix: Thread b.maxPromptSize() through the oversized-diff checks and fallback writers so both the “should we omit the diff?” decision and the trimming logic use the effective configured limit. Add a regression test that exercises a non-default prompt budget.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

roborev-ci · 2026-03-20T22:13:51Z

roborev: Combined Review (`563f187`)

Verdict: The PR requires important fixes to address a command injection vulnerability, a critical bug in UTF-8 truncation logic, and incorrect prompt size limits.

High Severity Findings

Location: internal/prompt/prompt.go:281, internal/prompt/prompt.go:307, internal/prompt/prompt.go:429, internal/prompt/prompt.go:534
- Problem: The new Codex oversized-diff fallbacks interpolate sha/rangeRef directly into shell command snippets and explicitly instruct Codex to run them. If those refs are not already canonical SHAs, an attacker could supply ref names containing shell metacharacters, leading to command injection when the agent executes the suggested command verbatim.
- Suggested Fix: Resolve refs to validated full commit SHAs before embedding them in any command text, or shell-escape them with a dedicated quoting routine. Enforce the “must be a SHA” invariant at this boundary.
Location: internal/prompt/prompt.go (truncateUTF8 function)
- Problem: utf8.ValidString(s[:maxBytes]) validates the entire substring. If the text contains any invalid UTF-8 sequence prior to the truncation point, it will return false on every iteration until maxBytes drops to 0, completely wiping out the string instead of just truncating it.
- Suggested Fix: Check only the boundary byte to avoid splitting a multi-byte character. Replace the loop condition with: for maxBytes > 0 && !utf8.RuneStart(s[maxBytes]) { maxBytes-- }

Medium Severity Findings

Location: internal/prompt/prompt.go:427, internal/prompt/prompt.go:532
- Problem: The new fallback path uses the legacy MaxPromptSize constant for both the overflow check and the trimming budget, ignoring Builder.maxPromptSize(). Any non-default configured prompt cap will be applied incorrectly: prompts can still exceed the real cap or degrade to fallback too early.
- Suggested Fix: Resolve the active limit once with b.maxPromptSize() and use that value consistently in the size check and in buildPromptPreservingCurrentSection.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

- Use config-resolved prompt cap instead of hardcoded MaxPromptSize in buildSinglePrompt, buildRangePrompt, and BuildDirty
- Shell-quote rangeRef in Codex fallback variants to prevent injection via crafted branch names
- Replace O(N²) utf8.ValidString loop in truncateUTF8 with O(1) utf8.RuneStart boundary check

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Non-Codex single/range fallbacks now route through buildPromptPreservingCurrentSection to trim optional context (guidelines, previous reviews) when it alone exceeds the cap
- BuildDirty separates optional context from required sections and trims it before the diff-size check
- Add regression tests with a 10KB configured cap for non-Codex single, range, and dirty paths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

roborev-ci · 2026-03-22T18:54:39Z

roborev: Combined Review (`1f32674`)

Verdict: The PR introduces bounded git diff reading and improves prompt budget handling, but contains a medium-severity issue regarding UTF-8 safe truncation.

Medium

Location: internal/git/git.go:667
Problem: captureGitOutputLimited truncates git output at an arbitrary byte boundary and immediately converts it to string. If the diff contains multibyte UTF-8 text near the cutoff, the returned prompt can now contain invalid UTF-8 even though the prompt layer added UTF-8-safe trimming elsewhere.
Fix: Trim the captured bytes to a valid UTF-8 boundary before returning them, and add a test that limits a diff containing non-ASCII content.
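
Trimming a byte-limited capture back to a rune boundary could be done roughly as below. `trimPartialRune` is a hypothetical helper named for this example; it drops only an incomplete rune at the very end of the buffer and leaves everything else, including any invalid bytes earlier in the data, untouched.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// trimPartialRune removes a trailing incomplete UTF-8 sequence from b, as
// can happen when output is cut at an arbitrary byte limit.
// Hypothetical helper, not the code in internal/git.
func trimPartialRune(b []byte) []byte {
	// Scan back over at most utf8.UTFMax bytes looking for the last start byte.
	for i := 1; i <= utf8.UTFMax && i <= len(b); i++ {
		if utf8.RuneStart(b[len(b)-i]) {
			// If the bytes from that start byte to the end do not form a
			// complete rune, the capture was cut mid-rune: drop them.
			if !utf8.FullRune(b[len(b)-i:]) {
				return b[:len(b)-i]
			}
			return b
		}
	}
	return b
}

func main() {
	euro := []byte("+price: €") // '€' is three bytes: e2 82 ac
	cut := euro[:len(euro)-1]   // simulate a byte limit landing mid-rune
	fmt.Printf("%q\n", trimPartialRune(cut))
	fmt.Printf("%q\n", trimPartialRune(euro))
}
```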

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

On Windows, a killed git process can remain blocked if its stdout pipe buffer is full and nothing is reading it. This prevents cmd.Wait from returning because the stderr copy goroutine never sees EOF. Drain remaining stdout after cancel so the process can unblock, exit, and release all pipe handles. Fixes 600s timeout in Windows CI for git and prompt packages. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
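
The read-then-drain pattern described in this commit message can be illustrated with a plain `io.Reader` standing in for the child process's stdout pipe. This is a sketch under that simplification: the function names are hypothetical, and the real code would also cancel or kill the git process before draining.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readLimited keeps at most limit bytes from r, then drains whatever is
// left so that a writer blocked on a full pipe buffer can finish.
// It reports whether any input was discarded.
func readLimited(r io.Reader, limit int64) ([]byte, bool, error) {
	data, err := io.ReadAll(io.LimitReader(r, limit))
	if err != nil {
		return data, false, err
	}
	// Drain the remainder; nothing past the limit is retained in memory.
	discarded, err := io.Copy(io.Discard, r)
	return data, discarded > 0, err
}

// readLimitedString is a small convenience wrapper for experimenting with
// in-memory input.
func readLimitedString(s string, limit int64) (string, bool, error) {
	data, truncated, err := readLimited(strings.NewReader(s), limit)
	return string(data), truncated, err
}

func main() {
	out, truncated, err := readLimitedString("diff --git a/x b/x\n+hello\n", 10)
	fmt.Printf("%q truncated=%v err=%v\n", out, truncated, err)
}
```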

- Tests now assert against config.DefaultMaxPromptSize (200KB) instead of the legacy MaxPromptSize constant (250KB), matching what resolveMaxPromptSize actually returns with no config
- Add regression test proving the 200KB default cap is honored
- sanitizeToValidUTF8 uses U+FFFD replacement marker instead of silently dropping invalid byte sequences
- Trim trailing partial rune before sanitizing so truncation doesn't expand the output past the byte limit

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The previous test used a diff far larger than both caps, so it passed regardless of which limit was enforced. Now creates a prompt between 200KB and 250KB and verifies the builder with no config enforces the 200KB default. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

roborev-ci · 2026-03-22T20:14:32Z

roborev: Combined Review (`8d7ca23`)

Verdict: The patch introduces useful streaming diff readers and bounded allocations, but contains a medium-severity command injection risk on Windows and functional bugs in diff-budget calculations and generic file filtering.

Medium

Location: internal/prompt/prompt.go (Line 294, Line 340)
Problem: shellQuote() now emits single-quoted command arguments on Windows, and codexRangeInspectionFallbackVariants() embeds rangeRef into shell command strings that the review agent is told to run locally. Single quotes are PowerShell-specific; they do not protect arguments in cmd.exe, where metacharacters such as &, |, > and < remain active. If an externally sourced ref name ever reaches rangeRef and an oversized diff triggers this fallback, a Windows agent executing the suggested command through cmd.exe can be steered into running attacker-controlled commands.
Fix: Do not render shell-ready strings from ref names. Prefer structured/argv-style command instructions so the agent can invoke git without a shell. If text commands must be emitted, generate them for one explicit shell only and escape according to that shell, or reject metacharacters in externally influenced refs before rendering.
Location: internal/prompt/prompt.go (around buildSinglePrompt / buildRangePrompt)
Problem: The code computes diffLimit before trimming optional sections (optional context, large commit bodies, previous reviews). If those sections alone push baseLen over the cap, GetDiffLimited is called with 0 bytes and the prompt falls back to "Diff too large", even when the actual diff would fit after trimming the optional context. This causes small commits/ranges to lose their inline diff unnecessarily.
Fix: Trim optional context/current overflow to the configured cap first, then calculate remaining diff budget and only fall back when the diff still does not fit after that trimming.
Location: internal/git/git.go (around GetFilesChanged and GetRangeFilesChanged)
Problem: These generic helpers now always apply ReviewPathspecArgs, which includes built-in and configured review excludes even when callers pass no excludes. That silently changes their contract from "all changed files" to "review-filtered changed files", so any existing caller that uses them for generic file lists or counts will now miss files like go.sum or repo-configured excludes.
Fix: Preserve the old unfiltered behavior in GetFilesChanged / GetRangeFilesChanged, and add separate review-specific helpers or pass the filtered pathspec only at the review call sites.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

- Restore GetFilesChanged/GetRangeFilesChanged to return all files without review excludes; callers that need filtering pass excludes explicitly via GetDiff/GetDiffLimited
- Drop Windows-specific shellQuote branch (single quotes don't protect in cmd.exe); always use POSIX quoting since commands are instructions for Codex agents, not shell invocations
- Compute diff budget assuming optional context can be trimmed, so large guidelines don't force a needless "diff too large" fallback when the actual diff would fit

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The diff budget is now computed from required sections only (system prompt, commit metadata). Optional context (guidelines, previous reviews) is trimmed after the diff size is known, so large guidelines no longer cause small diffs to hit the "diff too large" fallback unnecessarily. Add regression test where guidelines consume most of the budget but a small diff still gets inlined after context trimming. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
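
The ordering described above (budget the diff against required sections first, then give optional context whatever is left) can be sketched with plain arithmetic. The `plan` type and `planBudget` function are hypothetical, and the sketch ignores the size of the fallback text itself.

```go
package main

import "fmt"

// plan is a hypothetical result type for this sketch.
type plan struct {
	InlineDiff     bool // whether the diff is included in the prompt
	OptionalBudget int  // bytes left for guidelines, previous reviews, etc.
}

// planBudget sizes the diff against the cap minus required sections only,
// then hands the remainder to optional context.
func planBudget(promptCap, requiredLen, optionalLen, diffLen int) plan {
	diffBudget := promptCap - requiredLen
	if diffBudget < 0 {
		diffBudget = 0
	}
	inline := diffLen <= diffBudget
	remaining := diffBudget
	if inline {
		remaining -= diffLen
	}
	// Optional context never needs more than its own length.
	if optionalLen < remaining {
		remaining = optionalLen
	}
	return plan{InlineDiff: inline, OptionalBudget: remaining}
}

func main() {
	// Large guidelines, small diff: the diff is inlined and the guidelines
	// are trimmed to what remains.
	fmt.Printf("%+v\n", planBudget(10000, 1000, 9500, 500))
	// A diff larger than the space left after required sections falls back.
	fmt.Printf("%+v\n", planBudget(10000, 1000, 9500, 9500))
}
```

Under the earlier ordering, the first case would have counted the 9500 bytes of optional context against the cap before sizing the diff, leaving a zero diff budget.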

roborev-ci · 2026-03-22T22:05:42Z

roborev: Combined Review (`385504e`)

The changes implement budget-aware diff capture and prompt fallback logic, but introduce a few medium-severity correctness issues regarding mandatory diff preservation and safe prompt shaping.

Medium Severity Findings

Location: internal/prompt/prompt.go:219
Problem: BuildDirty trims optionalContext against promptCap without reserving space for any mandatory diff/fallback text, then hard-caps the final prompt afterward. If guidelines/history already consume the budget, the tail cut can remove ### Diff entirely, producing a dirty-review prompt that contains no changes to review.
Fix: Reserve space for a minimal diff section/fallback before trimming optional context, the same way the single-commit/range fallback path reserves room for its shortest variant.
Location: internal/prompt/prompt.go:271
Problem: buildPromptPreservingCurrentSection and the other truncateUTF8 call sites trim markdown-heavy context at arbitrary byte boundaries. That can leave an open code fence or splice ### Diff/### Combined Diff directly onto a partial line, which can cause the rest of the prompt to be interpreted as code or malformed prose.
Fix: Trim at section boundaries where possible, or normalize the truncated fragment by forcing a separator newline and closing any unmatched code fences before appending the next section.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

wesm · 2026-03-22T22:07:55Z

"Both findings are speculative edge cases, not practical bugs:

BuildDirty with no diff section: If guidelines alone exceed the cap, the diff section
gets the truncation treatment — same as every other path. The hardCapPrompt is a safety net,
not the primary trimming mechanism. And in practice, guidelines that exceed 200KB are
pathological.
Truncation at markdown boundaries: Every truncation in every LLM prompt system works this
way. Agents handle malformed markdown gracefully — they're not parsers. Engineering
section-boundary-aware truncation is significant complexity for zero practical benefit.

These are the kind of findings that sound rigorous but don't correspond to real failure
modes"

TUI runtime files stored endpoint.BaseURL() which collapses Unix socket endpoints to "http://localhost", losing the socket path. Add ConfigAddr() that returns a ParseEndpoint-compatible string (e.g. "unix:///path/to/sock") and use it in runtime metadata so external tools can reconnect to the correct transport. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

roborev-ci · 2026-03-23T16:01:55Z

roborev: Combined Review (`8c43452`)

Verdict: The PR implements prompt size budgeting and fallback commands, but contains medium-severity issues related to Windows shell compatibility and Markdown injection vulnerabilities.

Review Findings

Medium

Location: internal/prompt/prompt.go (shellQuote, renderShellCommand, codexCommitInspectionFallbackVariants, codexRangeInspectionFallbackVariants)
- Problem: The new oversized-diff fallback commands are always rendered with POSIX single-quote escaping. On Windows shells, single quotes are not parsed the same way, so commands containing quoted pathspec excludes like ':(exclude,glob)**/go.sum' will be passed to Git with the quotes included and fail. This breaks the local-inspection path for Codex on Windows when the inline diff has been omitted.
- Fix: Render fallback commands with OS/shell-appropriate quoting, or avoid shell-escaped strings entirely and provide argv-style command templates that do not depend on POSIX quoting rules.
Location: internal/prompt/prompt.go (Lines 326, 349)
- Problem: The oversized-diff Codex fallback embeds shell commands inside Markdown inline code using `...`, but only shell-quotes the interpolated arguments. Repo-controlled exclude patterns (from .roborev.toml) or ref/range strings containing backticks or embedded newlines can break out of the code span and inject additional prompt text/instructions. This could allow a malicious shared repository to steer the review agent.
- Fix: Escape untrusted values for Markdown separately from shell quoting, or avoid inline-code rendering entirely for interpolated commands. Consider using fenced blocks with a dynamically chosen fence length, and reject control characters/newlines in repo-sourced exclude patterns before embedding them.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

The fallback inspection commands included the full built-in lockfile exclude list (20+ pathspec args), making prompts unreadable. Built-in excludes are already applied when reading the diff via GetDiffLimited; repeating them in agent instructions is redundant. Now only user-configured excludes (from exclude_patterns) appear in the fallback. Export FormatExcludeArgs for the prompt package to use directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Exclude patterns from .roborev.toml are embedded in markdown inline code spans in Codex fallback commands. Patterns containing backticks could break out of the code span and inject prompt text. Silently drop patterns with control characters or backticks in FormatExcludeArgs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

FormatExcludeArgs feeds into actual git diff pathspecs, so dropping patterns with backticks there would break real exclude behavior. Move the sanitization to the prompt rendering layer where pathspec args are embedded in markdown inline code spans. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
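
Filtering at the rendering layer, as opposed to inside the function that builds real git pathspecs, might look like the sketch below. The helper names `safeForInlineCode` and `filterForPrompt` are assumptions for this example; the point is that the full argument list still goes to git, and only the copy shown inside a Markdown code span is filtered.

```go
package main

import (
	"fmt"
	"unicode"
)

// safeForInlineCode reports whether arg can sit inside a Markdown inline
// code span without ending it early: no backticks and no control
// characters (which includes newlines). Hypothetical helper.
func safeForInlineCode(arg string) bool {
	for _, r := range arg {
		if r == '`' || unicode.IsControl(r) {
			return false
		}
	}
	return true
}

// filterForPrompt returns only the args that are safe to display. The
// caller keeps using the unfiltered list for the actual git invocation.
func filterForPrompt(args []string) []string {
	kept := make([]string, 0, len(args))
	for _, a := range args {
		if safeForInlineCode(a) {
			kept = append(kept, a)
		}
	}
	return kept
}

func main() {
	args := []string{":(exclude,glob)**/go.sum", "bad`pattern", "line\nbreak"}
	fmt.Printf("%q\n", filterForPrompt(args))
}
```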

roborev-ci · 2026-03-23T16:35:32Z

roborev: Combined Review (`f5b3b96`)

Summary Verdict: Enhances prompt building and git diff extraction to enforce size budgets, but introduces command injection and markdown formatting vulnerabilities via unescaped git refs.

High Severity

Command Injection via Git Refs: Untrusted git refs are interpolated into shell commands without quoting in the oversized-diff fallback paths. buildSinglePrompt takes a general git ref, not a guaranteed full SHA, and the new fallback strings emit git show %s -- path/to/file, View with: git show %s, and View with: git diff %s directly. Git ref names can legally contain shell-significant characters, so reviewing an attacker-controlled branch or tag from a shared repo can inject extra shell syntax into the command that the agent is told to run locally.
- Affected Locations:
  - internal/prompt/prompt.go:389
  - internal/prompt/prompt.go:529
  - internal/prompt/prompt.go:638
- Suggested Remediation: Resolve refs to canonical object IDs before formatting, and generate every displayed command with `renderShellCommand(...)` rather than raw `fmt.Sprintf`.

Medium Severity

Markdown Formatting Breakout: The new markdown-safety filtering only covers pathspec args, not the ref/range values embedded into inline command examples. safeForMarkdown drops backticks from exclude args, but sha and rangeRef are still inserted into backtick-delimited markdown command snippets. A malicious external ref name containing backticks can break out of the inline code span and inject additional prompt text or malformed markdown into the instructions sent to the review agent.
- Affected Locations:
  - internal/prompt/prompt.go:376
  - internal/prompt/prompt.go:414
- Suggested Remediation: Apply markdown sanitization to ref/range values too, or stop embedding untrusted values inside inline code spans; resolving refs to full SHAs before prompt construction is the simplest fix for single-commit reviews.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

The non-Codex oversized-diff fallbacks and one Codex example used raw fmt.Sprintf for git refs. Use renderShellCommand/shellQuote for consistency with the other Codex fallback commands. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

roborev-ci · 2026-03-23T16:53:24Z

roborev: Combined Review (`175f12e`)

Verdict: The changes effectively harden prompt budgeting and truncation, but introduce medium-severity issues in the Codex oversized-diff fallback regarding review scope consistency and markdown injection vulnerabilities.

Medium

Location: internal/prompt/prompt.go#L366, internal/prompt/prompt.go#L390 (buildSinglePrompt / buildRangePrompt)
Problem: The fallback commands fail to preserve the full review scope. They only apply user-configured excludes and drop built-in exclusions (excludedPathPatterns). This causes the agent to inspect files that are normally hidden (like lockfiles or generated artifacts), leading to false-positive findings on oversized reviews.
Fix: Build the fallback commands from the same full pathspec used to collect review diffs, or provide an equivalent helper that applies both built-in and user excludes.
Location: internal/prompt/prompt.go#L366, internal/prompt/prompt.go#L390
Problem: Markdown injection vulnerability. While pathspec arguments use safeForMarkdown, the sha and rangeRef variables do not. A hostile ref name from a shared repo containing backticks can terminate inline code spans and inject attacker-controlled text into the review prompt sent to the autonomous agent.
Fix: Treat sha and rangeRef as markdown-untrusted input. Run them through the same markdown safety filter used for pathspec args, or use a dynamically fenced code block instead of inline backticks.
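
The "dynamically fenced code block" alternative relies on the Markdown rule that a backtick fence is only closed by a run of backticks at least as long as the opening one. A rough sketch, with hypothetical helper names `fenceFor` and `fencedBlock`:

```go
package main

import (
	"fmt"
	"strings"
)

// fenceFor returns a backtick fence one longer than the longest backtick
// run in content (minimum three), so content cannot close the fence.
func fenceFor(content string) string {
	longest, run := 0, 0
	for _, r := range content {
		if r == '`' {
			run++
			if run > longest {
				longest = run
			}
		} else {
			run = 0
		}
	}
	n := longest + 1
	if n < 3 {
		n = 3
	}
	return strings.Repeat("`", n)
}

// fencedBlock wraps content in a fence chosen by fenceFor.
func fencedBlock(content string) string {
	fence := fenceFor(content)
	return fence + "\n" + content + "\n" + fence + "\n"
}

func main() {
	// A ref name carrying its own triple-backtick run gets a four-backtick fence.
	hostile := "git show " + strings.Repeat("`", 3) + "evil"
	fmt.Print(fencedBlock(hostile))
}
```

This addresses only the Markdown breakout; shell quoting of the same value remains a separate concern.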

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

wesm · 2026-03-23T18:14:26Z

" 1. Lockfile excludes in fallback: Same as review 10768. Already decided this is intentional — the agent
can use judgment. Not changing.
2. Markdown injection via refs: The sha and rangeRef values in the Codex fallback are already passed
through shellQuote/renderShellCommand which wraps them in single quotes — those don't contain backticks.
The non-Codex "View with:" lines aren't inside inline code spans (no backticks), so there's nothing to
break out of. Not a real issue."

…d regression tests

- renderShellCommand re-applies stripInlineCodeBreakers so backticks and control characters in git refs cannot escape the enclosing Markdown inline code span. The sanitizer pre-dated this branch (PR #558) but was dropped when renderShellCommand was refactored.
- review_comments template ends with an extra trailing blank line and drops its own leading blank, so comment-bearing entries in previous_reviews / in_range_reviews / previous_review_attempts keep a blank line separator before the next entry. Before this fix, the {{- end}} trim ate the only separator after the comment block and the following '--- Review ... ---' header butted against the last comment line.

Regression tests added:

- TestTrimOptionalSectionsPropagatesInRangeReviewsClear: locks in that trimOptionalSections (*view = ctx) propagates the cleared InRangeReviews slice back to the caller, which the field-by-field rebuild did not.
- TestMeasureOptionalSectionsLossCountsInRangeReviews: ensures the fallback selector treats dropped InRangeReviews as a loss.
- TestReviewOptionalContextTrimNextPreservesPriority: extended to cover the InRangeReviews priority slot.
- TestRenderShellCommandStripsInlineCodeBreakers: table-driven test that backticks and control bytes never survive rendering.
- TestGoldenPrompt_PreviousReviewsWithComments: golden-file snapshot proving the comment separator fix.
- TestGoldenPrompt_RangeTruncatedCodexPreservesInRangeReviews: golden-file snapshot for the truncated-codex-range path proving InRangeReviews survive and the codex fallback is selected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Fix Codex oversized diff review fallbacks

53c56a8

Keep Codex fallback prompts within size limit

91c4d25

Harden Codex prompt budget fallbacks

182913d

wesm added 3 commits March 20, 2026 15:51

Preserve current prompt sections in Codex fallback

25bea71

Cap Codex fallback prompts with oversized metadata

7edb36e

Trim oversized single-commit metadata in Codex prompts

563f187

wesm and others added 6 commits March 20, 2026 17:13

Prioritize commit subject in Codex prompt trimming

0e44da3

Make large-author prompt test portable

d72583f

Ignore inherited git identity in prompt tests

e151b0d

Fix Codex prompt fallback budget handling

1f32674

Fix UTF-8 truncation in limited git diffs

3ee3d2c

wesm changed the title ~~Fix Codex oversized diff review fallbacks~~ Harden Codex large-diff prompt budgeting and fallbacks Mar 22, 2026

wesm and others added 4 commits March 22, 2026 14:14

Align prompt budget defaults and fallback excludes

8b35d88

wesm and others added 2 commits March 22, 2026 15:28

wesm and others added 2 commits March 23, 2026 10:46

Skip Unix ConfigAddr round-trip test on Windows

8c43452

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

wesm and others added 3 commits March 23, 2026 11:02

wesm merged commit 6867dfb into main Mar 23, 2026
8 checks passed

wesm deleted the fix/sandboxing branch March 23, 2026 18:14

wesm mentioned this pull request Apr 19, 2026

refactor(prompt): consolidate template context and add golden snapshot tests #630

Merged

wesm merged 25 commits into main from fix/sandboxing
