fix(agents): fallback retry image safety and partial execution detection #57959

Open
mitchmcalister wants to merge 6 commits into openclaw:main from mitchmcalister:fix/fallback-image-safety-partial-execution

Conversation

@mitchmcalister
Contributor

Summary

  • Failure-mode-aware image handling: Replace unconditional image stripping on fallback retry with resolveRetryImages — only strip on format errors (model can't handle images) or cross-provider retries (privacy boundary)
  • Cross-provider image privacy: Strip user-supplied images when fallback targets a different provider, preventing unintended third-party disclosure
  • Prompt-detected image bypass fix: Add suppressPromptImageDetection to the embedded runner so detectAndLoadPromptImages is skipped on cross-provider retries — closes the gap where prompt-referenced local images bypassed the image strip
  • Partial tool execution detection: Thread previousFailureReason and previousPartialExecution through ModelFallbackRunOptions so the retry callback knows what the prior attempt did; warn fallback models about already-executed tools to prevent replaying non-idempotent operations

Builds on #55632 (prompt preservation for new sessions). Supersedes #44188. Closes #43481, #43492.

Design boundary

All changes stay at the runAgentAttempt / runWithModelFallback boundary. The pi-runner internals are not modified beyond accepting the suppressPromptImageDetection flag. This is intentional — the runner's internal loops (profile rotation, cooldown skips) are treated as opaque. See #44188 for the full design discussion.

Changes

failover-error.ts

  • New PartialExecution type with toolNames and didSendViaMessagingTool
  • FailoverError carries optional partialExecution field
  • sanitizeToolNames static method (caps at 20 entries, strips non-alphanumeric, truncates to 100 chars)
  • describeFailoverError returns partialExecution when present
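A minimal sketch of how these pieces could fit together. Only the field names `toolNames`, `didSendViaMessagingTool`, and `partialExecution` come from the description above; the constructor signature, the `reason` field, and the shape returned by `describeFailoverError` are assumptions for illustration, not the PR's actual code.

```typescript
// Field names follow the PR description; everything else is a hypothetical shape.
interface PartialExecution {
  toolNames: string[];
  didSendViaMessagingTool: boolean;
}

interface FailoverErrorOptions {
  reason?: string;
  partialExecution?: PartialExecution;
}

class FailoverError extends Error {
  readonly reason: string | undefined;
  readonly partialExecution: PartialExecution | undefined;

  constructor(message: string, opts: FailoverErrorOptions = {}) {
    super(message);
    // Keep `instanceof` working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "FailoverError";
    this.reason = opts.reason;
    this.partialExecution = opts.partialExecution;
  }
}

interface DescribedFailure {
  message: string;
  reason?: string;
  partialExecution?: PartialExecution;
}

// Only includes `partialExecution` in the result when the error carries one.
function describeFailoverError(err: unknown): DescribedFailure {
  if (err instanceof FailoverError) {
    return {
      message: err.message,
      ...(err.reason !== undefined ? { reason: err.reason } : {}),
      ...(err.partialExecution ? { partialExecution: err.partialExecution } : {}),
    };
  }
  return { message: err instanceof Error ? err.message : String(err) };
}
```

Omitting the key entirely (rather than setting it to `undefined`) keeps the "includes/omits partialExecution" distinction easy to assert in tests.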

model-fallback.ts

  • ModelFallbackRunOptions extended with previousFailureReason and previousPartialExecution
  • Fallback loop accumulates failure reason and merges partial execution across attempts (union tool names, OR messaging flag)
  • Threaded into runOptions for next candidate
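The accumulate-and-thread behavior described above can be sketched roughly as follows. The loop structure, the `runWithFallback` name, and the `AttemptFailure` shape are hypothetical; only `previousFailureReason`, `previousPartialExecution`, and the merge rule (union of tool names, OR of the messaging flag) are taken from the description.

```typescript
interface PartialExecution {
  toolNames: string[];
  didSendViaMessagingTool: boolean;
}

interface RunOptions {
  previousFailureReason?: string;
  previousPartialExecution?: PartialExecution;
}

// Hypothetical shape of what a failed attempt reports.
interface AttemptFailure {
  reason?: string;
  partialExecution?: PartialExecution;
}

// Union tool names (first-seen order preserved) and OR the messaging flag.
function mergePartialExecution(
  prev: PartialExecution | undefined,
  next: PartialExecution,
): PartialExecution {
  if (!prev) return next;
  return {
    toolNames: Array.from(new Set(prev.toolNames.concat(next.toolNames))),
    didSendViaMessagingTool: prev.didSendViaMessagingTool || next.didSendViaMessagingTool,
  };
}

// Hypothetical fallback loop: each candidate sees what earlier attempts did.
async function runWithFallback<T>(
  candidates: string[],
  run: (model: string, opts: RunOptions) => Promise<T>,
): Promise<T> {
  let lastFailureReason: string | undefined;
  let merged: PartialExecution | undefined;
  for (const model of candidates) {
    try {
      return await run(model, {
        ...(lastFailureReason !== undefined ? { previousFailureReason: lastFailureReason } : {}),
        ...(merged ? { previousPartialExecution: merged } : {}),
      });
    } catch (err) {
      const described = (err ?? {}) as AttemptFailure;
      lastFailureReason = described.reason ?? "unknown";
      if (described.partialExecution) {
        merged = mergePartialExecution(merged, described.partialExecution);
      }
    }
  }
  throw new Error(`all candidates failed (last reason: ${lastFailureReason ?? "none"})`);
}
```

On the first candidate both options are omitted, which matches the "omits when absent" case in the test plan.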

attempt-execution.ts

  • resolveRetryImages: failure-mode-aware image handling (format → strip, cross-provider → strip, same-provider non-format → preserve)
  • buildPartialExecutionSystemContext: system prompt warning about already-executed tools (same-provider only, CWE-200)
  • runAgentAttempt extended with primaryProvider, previousFailureReason, previousPartialExecution
  • Both CLI and embedded paths use resolveRetryImages and effectiveExtraSystemPrompt
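The three-way image policy can be illustrated with a short sketch. The parameter object, the `ImageInput` type, the returned `suppressPromptImageDetection` flag, and the `"format"` reason string are assumptions; the PR's real signature may differ. The sketch also assumes an undefined `primaryProvider` is treated as same-provider, which the description does not spell out.

```typescript
// Hypothetical image shape, for illustration only.
interface ImageInput {
  mimeType: string;
  data: string;
}

interface RetryImageParams {
  images?: ImageInput[];
  provider: string; // provider used for this attempt
  primaryProvider?: string; // provider of the original attempt
  previousFailureReason?: string; // undefined on the first attempt
}

interface RetryImageDecision {
  images: ImageInput[] | undefined;
  suppressPromptImageDetection: boolean;
}

function resolveRetryImages(p: RetryImageParams): RetryImageDecision {
  // First attempt: nothing has failed yet, so pass images through.
  if (p.previousFailureReason === undefined) {
    return { images: p.images, suppressPromptImageDetection: false };
  }
  // Cross-provider retry: privacy boundary, strip and also skip prompt detection.
  // Assumption: an unknown primary provider is treated as same-provider.
  const crossProvider = p.primaryProvider !== undefined && p.provider !== p.primaryProvider;
  if (crossProvider) {
    return { images: undefined, suppressPromptImageDetection: true };
  }
  // Format error: the model could not handle the images, so strip them.
  if (p.previousFailureReason === "format") {
    return { images: undefined, suppressPromptImageDetection: false };
  }
  // Same provider, non-format failure (e.g. rate_limit): preserve images.
  return { images: p.images, suppressPromptImageDetection: false };
}
```

Checking the cross-provider case before the failure reason means a cross-provider retry strips images regardless of why the previous attempt failed.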

Embedded runner (params.ts, run.ts, attempt.ts)

  • New suppressPromptImageDetection param gates detectAndLoadPromptImages
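To show why the flag matters, here is a rough sketch of the gate. `detectPromptImagePaths` is a hypothetical stand-in for `detectAndLoadPromptImages` (the real function also loads the files), and `collectAttemptImages` and `AttemptParams` are invented names for illustration.

```typescript
interface AttemptParams {
  prompt: string;
  images?: string[]; // explicitly supplied image paths (hypothetical representation)
  suppressPromptImageDetection?: boolean;
}

// Hypothetical stand-in for detectAndLoadPromptImages: finds image-like paths in the prompt.
function detectPromptImagePaths(prompt: string): string[] {
  return prompt.match(/[\w./~-]+\.(?:png|jpe?g|gif|webp)\b/gi) ?? [];
}

function collectAttemptImages(params: AttemptParams): string[] {
  const explicit = params.images ?? [];
  // When suppressed, prompt-referenced images are never detected or loaded,
  // so stripping `images` upstream cannot be bypassed via the prompt text.
  if (params.suppressPromptImageDetection) return explicit;
  return explicit.concat(detectPromptImagePaths(params.prompt));
}
```

Without the gate, a prompt such as `describe ./shots/error.png please` would cause the image to be re-attached on a cross-provider retry even though the caller had already stripped `images`.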

agent-command.ts

  • Passes primaryProvider, previousFailureReason, previousPartialExecution from fallback runOptions into runAgentAttempt

Test plan

  • resolveRetryImages — 6 tests: first attempt, format error, rate_limit, cross-provider, undefined provider, undefined images
  • buildPartialExecutionSystemContext — 3 tests: tool listing, messaging warning, empty tools
  • FailoverError.sanitizeToolNames — 3 tests: cap at 20, truncate to 100, filter empty
  • FailoverError.partialExecution — 2 tests: carries when provided, undefined when absent
  • describeFailoverError — 2 tests: includes/omits partialExecution
  • runWithModelFallback — 4 tests: threads previousFailureReason, threads previousPartialExecution, merges across attempts, omits when absent
  • pnpm check — clean (0 warnings, 0 errors)
  • pnpm build — clean (no type errors, no INEFFECTIVE_DYNAMIC_IMPORT)

🤖 Generated with Claude Code

@greptile-apps
Contributor

greptile-apps bot commented Mar 30, 2026

Greptile Summary

This PR introduces failure-mode-aware image handling on fallback retries, cross-provider image privacy enforcement, prompt-image-detection suppression for cross-provider retries, and partial-execution context threading — all at the runAgentAttempt/runWithModelFallback boundary without touching runner internals beyond the new suppressPromptImageDetection flag.

Key changes:

  • resolveRetryImages replaces the unconditional image strip with a three-way policy: preserve on non-format same-provider retries, strip on format errors, strip on cross-provider retries.
  • suppressPromptImageDetection closes the gap where prompt-referenced local images bypassed the cross-provider strip in the embedded runner.
  • PartialExecution and partialExecution on FailoverError carry per-attempt tool execution state; model-fallback.ts merges it across attempts and threads it into each subsequent run callback.
  • buildPartialExecutionSystemContext embeds a same-provider-only warning about already-executed tools into the effective system prompt.

Issues found:

  • FailoverError.sanitizeToolNames is added and tested as a guard against prompt-injection of model-generated tool names, but it is never called in production code — buildPartialExecutionSystemContext embeds raw toolNames directly into the system prompt string. The sanitization layer provides no actual protection until it is wired up.
  • mergedPartialExecution.toolNames is built by plain array concatenation; when the same tool appears in multiple failed attempts the system prompt will list it multiple times (e.g. "bash, bash, web_search"), which is confusing and erodes the effective 20-entry cap.

Confidence Score: 4/5

  • Safe to merge after addressing the sanitizeToolNames gap; the cross-provider privacy and image-handling logic is sound.
  • One P1 finding: sanitizeToolNames is defined specifically to prevent prompt injection of tool names into system prompts, but buildPartialExecutionSystemContext never calls it, rendering the sanitization inert. Until it is wired up, any caller that populates partialExecution.toolNames from untrusted/model-generated data gets no protection. One P2 finding (duplicate tool names) is non-blocking but should be fixed before production traffic exercises the multi-attempt merge path.
  • src/agents/failover-error.ts (sanitizeToolNames not called), src/agents/model-fallback.ts (toolName deduplication in merge)

Reviews (1): Last reviewed commit: "fix(agents): add suppressPromptImageDete..."

Comment on lines +50 to 55 (src/agents/failover-error.ts)

```ts
static sanitizeToolNames(names: string[]): string[] {
  return names
    .map((n) => n.replace(/[^a-zA-Z0-9_-]/g, "").slice(0, 100))
    .filter(Boolean)
    .slice(0, 20);
}
```

P1 sanitizeToolNames is never called in production code

sanitizeToolNames is documented as "Sanitize tool names from model output before embedding in system prompts," but buildPartialExecutionSystemContext embeds partialExecution.toolNames directly into the system prompt string without ever calling this method. As-is, the sanitization layer added by this PR provides no actual protection.

If partialExecution.toolNames is ever populated from model-generated output (the docstring explicitly calls this out), unsanitized content would flow straight into the system prompt. The fix is to call sanitizeToolNames inside buildPartialExecutionSystemContext before building toolList:

```ts
// In buildPartialExecutionSystemContext:
const safeNames = FailoverError.sanitizeToolNames(partialExecution.toolNames);
if (safeNames.length === 0) return undefined;
const toolList = safeNames.join(", ");
```

Alternatively, apply it at the point where the FailoverError is constructed with partialExecution, but enforcing it at the point of use is harder to miss.
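For illustration, a self-contained sketch of the point-of-use variant. The sanitizer body is copied from the snippet under review; the warning wording, the standalone function form, and the parameter shape are assumptions rather than the PR's actual text.

```typescript
interface PartialExecution {
  toolNames: string[];
  didSendViaMessagingTool: boolean;
}

// Same logic as FailoverError.sanitizeToolNames in the diff, as a free function.
function sanitizeToolNames(names: string[]): string[] {
  return names
    .map((n) => n.replace(/[^a-zA-Z0-9_-]/g, "").slice(0, 100))
    .filter(Boolean)
    .slice(0, 20);
}

// Hypothetical wording; only the sanitize-before-join step is the point here.
function buildPartialExecutionSystemContext(
  partialExecution: PartialExecution | undefined,
): string | undefined {
  if (!partialExecution) return undefined;
  const safeNames = sanitizeToolNames(partialExecution.toolNames);
  if (safeNames.length === 0) return undefined;
  const lines = [
    `A previous attempt already executed these tools: ${safeNames.join(", ")}.`,
    "Do not repeat non-idempotent operations.",
  ];
  if (partialExecution.didSendViaMessagingTool) {
    lines.push("A message was already sent to the user; do not send it again.");
  }
  return lines.join(" ");
}
```

With this in place, `sanitizeToolNames(["web_search", "!!!", "rm -rf /"])` yields `["web_search", "rm-rf"]`: the all-punctuation entry is dropped and the shell-like entry is collapsed to its permitted characters. Note that the sanitizer only strips characters and does not reject names, so a name containing injected prose would survive as one run-together alphanumeric token.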

Comment on lines +819 to +829 (src/agents/model-fallback.ts)

```ts
        mergedPartialExecution = mergedPartialExecution
          ? {
              toolNames: [
                ...mergedPartialExecution.toolNames,
                ...described.partialExecution.toolNames,
              ],
              didSendViaMessagingTool:
                mergedPartialExecution.didSendViaMessagingTool ||
                described.partialExecution.didSendViaMessagingTool,
            }
          : described.partialExecution;
```

P2 Duplicate tool names when the same tool appears in multiple failed attempts

The merge concatenates toolNames arrays without deduplication. If the same tool (e.g. bash) was executed in both the first and second failed attempts, the accumulated list will contain "bash, bash, …" and the resulting system prompt will say "already executed these tools: bash, bash, web_search." This is misleading and inflates the list toward the 20-entry cap that sanitizeToolNames enforces.

Suggested change

```suggestion
        mergedPartialExecution = mergedPartialExecution
          ? {
              toolNames: [
                ...new Set([
                  ...mergedPartialExecution.toolNames,
                  ...described.partialExecution.toolNames,
                ]),
              ],
              didSendViaMessagingTool:
                mergedPartialExecution.didSendViaMessagingTool ||
                described.partialExecution.didSendViaMessagingTool,
            }
          : described.partialExecution;
```

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 29cf1c59ee

```ts
        });
        // Accumulate failure context for the next candidate's run callback.
        lastFailureReason = described.reason ?? "unknown";
        if (described.partialExecution) {
```

P1: Populate partialExecution before relying on fallback merge

runWithModelFallback now merges described.partialExecution for later retries, but in the current non-test codebase no new FailoverError(...) call sets partialExecution (the throw sites in the embedded/CLI runners still omit it), so this branch never runs in production. That leaves previousPartialExecution unset on retries, meaning the new replay-avoidance system prompt is never emitted and non-idempotent tools can still be replayed after a failed attempt.

@mitchmcalister
Contributor Author

Addressed in c39c5cc:

Greptile P1 (sanitizeToolNames not called): Fixed — buildPartialExecutionSystemContext now calls FailoverError.sanitizeToolNames() before joining tool names into the system prompt. Added test for names that sanitize to empty.

Greptile P2 (duplicate tool names): Fixed — merge now uses new Set([...]) for deduplication. Updated the multi-attempt merge test to use overlapping tool names confirming dedup.

Codex P1 (partialExecution never populated in production): Acknowledged — this is by design. The plumbing is ready for when the embedded/CLI runners start populating partialExecution on FailoverError. We intentionally don't modify the pi-runner internals (see Design Boundary in the PR description). The throw sites are the runner's responsibility; this PR provides the consumption path.
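For context on what the producing side might eventually look like, here is a purely hypothetical sketch of a runner wrapping a failure with the tools it had already executed. `toFailoverError`, its parameters, and the messaging-tool set are invented for illustration; nothing here reflects the actual pi-runner throw sites, which this PR leaves untouched.

```typescript
interface PartialExecution {
  toolNames: string[];
  didSendViaMessagingTool: boolean;
}

// Minimal stand-in for the PR's FailoverError (assumed constructor shape).
class FailoverError extends Error {
  readonly partialExecution: PartialExecution | undefined;

  constructor(message: string, partialExecution?: PartialExecution) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "FailoverError";
    this.partialExecution = partialExecution;
  }
}

// Hypothetical helper: attach executed-tool state only when something actually ran.
function toFailoverError(
  cause: unknown,
  executedTools: string[],
  messagingToolNames: ReadonlySet<string>,
): FailoverError {
  const message = cause instanceof Error ? cause.message : String(cause);
  if (executedTools.length === 0) return new FailoverError(message);
  return new FailoverError(message, {
    toolNames: executedTools.slice(),
    didSendViaMessagingTool: executedTools.some((name) => messagingToolNames.has(name)),
  });
}
```

Until something along these lines exists in the runners, `previousPartialExecution` stays undefined on every retry and the warning prompt is never emitted, which is the gap the Codex comment points at.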

Labels

agents (Agent runtime and tooling), size: L

Development

Successfully merging this pull request may close these issues.

Cross-provider fallback should respect provider trust boundaries for images/attachments