Serialize stored shell tool calls correctly on subsequent requests by ScriptSmith · Pull Request #41 · ScriptSmith/hadrian

ScriptSmith · 2026-05-31T09:56:10Z

No description provided.

greptile-apps · 2026-05-31T09:59:59Z

Greptile Summary

This PR fixes a serialization bug where shell_call / shell_call_output items reconstructed from previous_response_id history were forwarded verbatim to function-mode providers, causing OpenAI-compatible upstreams to reject the request (array output field where a string is expected) and Anthropic/Bedrock/Vertex to silently drop the tool results.

Adds rewrite_shell_history_to_function_calls (called unconditionally at the top of preprocess_shell_tools, before the early-return for no-tools, so continuations that no longer re-declare the shell tool still get their history normalized) to rewrite stored hosted-shell items to the function_call / function_call_output pair the model originally exchanged.
Adds render_shell_output_text to flatten the array output chunks back into the plain-text exit_code/stdout/stderr blob the live tool loop sends, and updates provider documentation to describe this rewrite contract.
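The rewrite described above can be pictured with a small sketch. It is not the code from this PR: the `InputItem` enum, its fields, the helper name `rewrite_shell_history`, and the use of `Debug` formatting in place of real JSON serialization are all assumptions made for illustration. Only the item kinds (`shell_call`, `shell_call_output`, `function_call`, `function_call_output`) and the idea of turning an array output into a string come from the summary.

```rust
// Hypothetical, simplified item model. The real hadrian types are not shown
// in this PR page and certainly carry more fields.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum InputItem {
    Message { text: String },
    ShellCall { call_id: String, commands: Vec<String> },
    ShellCallOutput { call_id: String, chunks: Vec<String> },
    FunctionCall { call_id: String, name: String, arguments: String },
    FunctionCallOutput { call_id: String, output: String },
}

/// Swap stored hosted-shell items for the function-call pair, in place.
/// Every other item is left untouched.
fn rewrite_shell_history(items: &mut [InputItem]) {
    for item in items.iter_mut() {
        let replacement = match item {
            InputItem::ShellCall { call_id, commands } => Some(InputItem::FunctionCall {
                call_id: call_id.clone(),
                name: "shell".to_string(),
                // Assumption: Debug formatting stands in for a JSON encoder here.
                arguments: format!("{{\"commands\":{:?}}}", commands),
            }),
            InputItem::ShellCallOutput { call_id, chunks } => {
                Some(InputItem::FunctionCallOutput {
                    call_id: call_id.clone(),
                    // Array output collapsed into one string.
                    output: chunks.concat(),
                })
            }
            _ => None,
        };
        if let Some(new_item) = replacement {
            *item = new_item;
        }
    }
}
```

Because the pass only looks at the input items, it can run before any check on whether the request declares a shell tool, which is the ordering the summary calls out for continuations that no longer re-declare the tool.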

Confidence Score: 5/5

Safe to merge — the rewrite is well-scoped, only touches function-mode provider paths, and is guarded by an existing openai_keep_native_shell gate for native passthrough; a focused integration test covers the round-trip.

The core logic is straightforward: iterate input items, swap two variant types, flatten array output to a string. The output text format matches the live executor for the common single-chunk case. The only discrepancy found is cosmetic — timed-out calls reconstruct exit_code 124 instead of -1 — which is unlikely to affect downstream model behavior.
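As a rough illustration of the "flatten array output to a string" step, the sketch below joins per-chunk output into one text blob. The `OutputChunk` struct, the function name `flatten_output`, the exact `exit_code:/stdout:/stderr:` layout, the choice to report the last chunk's exit code, and the fallback of `0` for an empty array are all assumptions; the review only states that the text carries exit_code, stdout, and stderr and matches the live executor for a single chunk.

```rust
// Hypothetical chunk shape; the stored shell_call_output format is not
// shown in this PR page.
struct OutputChunk {
    stdout: String,
    stderr: String,
    exit_code: i32,
}

/// Flatten an array of output chunks into one plain-text blob.
fn flatten_output(chunks: &[OutputChunk]) -> String {
    // Concatenate the streams in chunk order.
    let stdout: String = chunks.iter().map(|c| c.stdout.as_str()).collect();
    let stderr: String = chunks.iter().map(|c| c.stderr.as_str()).collect();
    // Assumption: the last chunk's exit code represents the whole call,
    // and an empty array is reported as 0.
    let exit_code = chunks.last().map(|c| c.exit_code).unwrap_or(0);
    format!("exit_code: {exit_code}\nstdout:\n{stdout}\nstderr:\n{stderr}")
}
```

With a single chunk the result is just that chunk's fields in the fixed layout, which is the case the review describes as matching the live executor.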

The render_shell_output_text timeout branch in src/services/shell_tool.rs is worth a second look given the exit-code mismatch with the live executor path.

Important Files Changed

| Filename | Overview |
|---|---|
| src/services/shell_tool.rs | Adds rewrite_shell_history_to_function_calls and render_shell_output_text to convert stored shell_call/shell_call_output history items into function_call/function_call_output for function-mode providers; minor fidelity discrepancy for timed-out calls (exit code 124 vs -1 in reconstructed text) |
| src/services/responses_chain.rs | Doc-comment only: explains that ShellCall/ShellCallOutput items are replayed verbatim here and normalized by preprocess_shell_tools downstream; no logic changes |
| agent_instructions/adding_provider.md | Documentation update explaining the shell history rewrite contract for provider implementors; accurately describes the rewrite behavior and its motivation |

Sequence Diagram

sequenceDiagram
    participant Client
    participant Hadrian
    participant Provider

    Note over Client,Provider: Turn 1 (function mode)
    Client->>Hadrian: POST /responses (shell tool declared)
    Hadrian->>Provider: "function_call { name: shell, arguments: {...} }"
    Provider-->>Hadrian: shell executes
    Hadrian-->>Client: shell_call + shell_call_output (array output, persisted)

    Note over Client,Provider: Turn 2 - continuation (previous_response_id)
    Client->>Hadrian: POST /responses (previous_response_id)
    Hadrian->>Hadrian: output_item_to_input() replays shell_call/shell_call_output verbatim
    Hadrian->>Hadrian: preprocess_shell_tools() calls rewrite_shell_history_to_function_calls()
    Note right of Hadrian: ShellCall to OutputFunctionCall, ShellCallOutput to FunctionCallOutput, array output flattened to string
    Hadrian->>Provider: function_call + function_call_output (string output)
    Provider-->>Client: next model response

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
src/services/shell_tool.rs:838-843
**Timeout exit-code mismatch in reconstructed history**

The live executor writes `exit_code: {exit_for_report}` into the continuation text blob, where `exit_for_report = final_exit.unwrap_or(-1)`. A killed/timeout call has `final_exit = None`, so the model originally saw `exit_code: -1`. On continuation, `render_shell_output_text` maps `ShellCallOutcome::Timeout` to `124` — a `timeout(1)` sentinel — so the reconstructed history tells the model `exit_code: 124` rather than what it saw. The comment ("matches how the live loop reports a killed call") is therefore inaccurate. Most models tolerate any non-zero exit code for a failed call, so this is unlikely to change behavior, but the infidelity is worth noting for future multi-turn debugging.
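One way to make the mismatch concrete, and one possible way to close it, is sketched below. Only `ShellCallOutcome::Timeout`, `final_exit.unwrap_or(-1)`, and the values `-1` and `124` come from the review text; the `Exit` variant and both function names are hypothetical, and this is not necessarily what the "Review fixes" commit did.

```rust
// Hypothetical outcome type: only the Timeout variant is named in the review.
#[allow(dead_code)]
enum ShellCallOutcome {
    Exit(i32),
    Timeout,
}

/// What the live loop reports, per the review: a missing exit status
/// (killed or timed-out call) becomes -1.
fn live_exit_for_report(final_exit: Option<i32>) -> i32 {
    final_exit.unwrap_or(-1)
}

/// Exit code to write into reconstructed history. Mapping Timeout to -1
/// keeps the replayed text identical to what the model first saw;
/// the reviewed code mapped it to 124 instead.
fn reconstructed_exit_code(outcome: &ShellCallOutcome) -> i32 {
    match outcome {
        ShellCallOutcome::Exit(code) => *code,
        ShellCallOutcome::Timeout => -1,
    }
}
```

Under these assumptions, a timed-out call yields the same `exit_code` value on both the live path and the replay path, so the comment about matching the live loop would hold.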

Reviews (3): Last reviewed commit: "Review fixes"

ScriptSmith · 2026-05-31T10:13:54Z

@greptile-apps

Serialize stored shell tool calls correctly on subsequent requests

d326a85

greptile-apps Bot reviewed May 31, 2026

Comment thread src/services/shell_tool.rs

Comment thread src/services/shell_tool.rs Outdated

Review fixes

4e56c71

ScriptSmith merged commit 9b9efe1 into main May 31, 2026
20 checks passed

ScriptSmith deleted the shell-persistence branch May 31, 2026 10:30

ScriptSmith merged 2 commits into main from shell-persistence
