Refactor: Unify special tool handling for parallel execution #64

@FL4TLiN3

Summary

Special tools (think, readPdfFile, readImageFile) currently block parallel tool call execution. When an LLM returns multiple tool calls including a special tool, only the special tool is executed and others are dropped.

Background

  • Past constraint: Tool results couldn't include base64-encoded PDF/image data directly
  • Current state: Vercel AI SDK 5.0+ and Anthropic API (Feb 2025) now support PDF/images in tool results
  • Perstack version: Using AI SDK v5.0.104, which includes this support

Current Implementation Issues

  1. calling-tool.ts: Special tools (think, readPdfFile, readImageFile) are prioritized and return early, dropping other parallel tool calls
  2. resolvingPdfFileLogic: Adds PDF to userMessage instead of toolMessage (legacy workaround)
  3. createToolMessage: Type doesn't include FileInlinePart for PDFs
  4. toolResultPartToCoreToolResultPart: No handling for fileInlinePart

Proposed Changes

1. Update message handling for PDF support

  • Extend createToolMessage type to support FileInlinePart
  • Update toolResultPartToCoreToolResultPart to convert fileInlinePart into a media part of the tool result
  • Modify resolvingPdfFileLogic to include PDF in toolMessage (not userMessage)
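A minimal sketch of the conversion described above, for illustration only. The part shapes (TextPart, FileInlinePart and their field names) and the function name toToolResultOutput are assumptions, not Perstack's actual definitions. The output shape is modeled locally on the AI SDK 5 "content" tool result output as I understand it, and should be checked against the installed SDK version.

```typescript
// Hypothetical internal part shapes; the real Perstack types may differ.
type TextPart = { type: "textPart"; text: string };
type FileInlinePart = { type: "fileInlinePart"; encodedData: string; mimeType: string };
type ToolResultContent = TextPart | FileInlinePart;

// Local model of a "content" tool result output (assumed to match AI SDK 5).
type OutputItem =
  | { type: "text"; text: string }
  | { type: "media"; data: string; mediaType: string };
type ToolResultOutput = { type: "content"; value: OutputItem[] };

// Convert internal parts so a PDF travels inside the tool message
// instead of being attached to a follow-up user message.
function toToolResultOutput(contents: ToolResultContent[]): ToolResultOutput {
  const value = contents.map((part): OutputItem => {
    switch (part.type) {
      case "textPart":
        return { type: "text", text: part.text };
      case "fileInlinePart":
        // Base64 payload and MIME type are passed through unchanged.
        return { type: "media", data: part.encodedData, mediaType: part.mimeType };
    }
  });
  return { type: "content", value };
}
```

With a shape like this, readPdfFile results need no special path: the base64 payload is just one more item in the tool result.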

2. Simplify tool execution flow

Before:

attemptCompletion → single execution (others ignored)
think → single execution (others ignored)
readPdfFile → single execution (others ignored)
readImageFile → single execution (others ignored)
MCP tools → parallel execution
delegate/interactive → sequential after MCP

After:

attemptCompletion → single execution (others ignored, completion means done)
All other MCP tools (think, readPdfFile, readImageFile, etc.) → parallel execution
delegate/interactive → sequential after MCP parallel execution
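The "After" flow could be sketched roughly as follows. This is not the actual calling-tool.ts code: the ToolCall shape, the kind field, and the names planToolCalls and executePlan are hypothetical, chosen only to show the dispatch order.

```typescript
// Hypothetical tool call shape; "kind" is an assumed classification field.
type ToolCall = { id: string; toolName: string; kind: "mcp" | "delegate" | "interactive" };

type ExecutionPlan =
  | { mode: "completion"; call: ToolCall }
  | { mode: "batch"; parallel: ToolCall[]; sequential: ToolCall[] };

// Decide how a set of tool calls from one LLM response should run.
function planToolCalls(calls: ToolCall[]): ExecutionPlan {
  // attemptCompletion still wins alone: completion means the run is done.
  const completion = calls.find((c) => c.toolName === "attemptCompletion");
  if (completion) return { mode: "completion", call: completion };
  return {
    mode: "batch",
    // think, readPdfFile and readImageFile are treated like any other MCP tool.
    parallel: calls.filter((c) => c.kind === "mcp"),
    sequential: calls.filter((c) => c.kind !== "mcp"),
  };
}

// Run the plan: MCP tools concurrently, then delegate/interactive one by one.
async function executePlan(
  plan: ExecutionPlan,
  run: (call: ToolCall) => Promise<string>,
): Promise<string[]> {
  if (plan.mode === "completion") return [await run(plan.call)];
  const results = await Promise.all(plan.parallel.map((c) => run(c)));
  for (const call of plan.sequential) results.push(await run(call));
  return results;
}
```

Keeping the planning step synchronous and separate from execution makes the ordering rules easy to unit test without mocking tool runners.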

3. Consider state machine simplification

With unified tool result handling, the ResolvingPdfFile, ResolvingImageFile, and ResolvingThought states could potentially be consolidated into ResolvingToolResult.
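To make the possible consolidation concrete, here is a small sketch contrasting the two routings. The state names come from this issue, but the functions legacyNextState and unifiedNextState are hypothetical and do not reflect how the actual state machine is wired.

```typescript
// State names taken from the issue; the routing functions are illustrative only.
type LegacyState =
  | "ResolvingPdfFile"
  | "ResolvingImageFile"
  | "ResolvingThought"
  | "ResolvingToolResult";

// Today (as described above): each special tool gets its own resolving state.
function legacyNextState(toolName: string): LegacyState {
  switch (toolName) {
    case "readPdfFile":
      return "ResolvingPdfFile";
    case "readImageFile":
      return "ResolvingImageFile";
    case "think":
      return "ResolvingThought";
    default:
      return "ResolvingToolResult";
  }
}

// After unification: every tool result flows through the same state.
function unifiedNextState(_toolName: string): "ResolvingToolResult" {
  return "ResolvingToolResult";
}
```

Whether the three special states can really be dropped depends on what else they do besides message construction, which is why this is listed as an evaluation task.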

Tasks

  • Extend createToolMessage type to include FileInlinePart
  • Update toolResultPartToCoreToolResultPart for fileInlinePart
  • Refactor resolvingPdfFileLogic to use toolMessage
  • Modify calling-tool.ts to parallelize all MCP tools except attemptCompletion
  • Update tests
  • Evaluate state machine simplification (may be separate PR)
