
Feedback: Tool call ID instability across chunks in the streaming API #9

@aliou


Hello!

I wanted to try the Poolside models via pi and wrote a custom extension to define Poolside as a provider. However, I noticed that tool calls were behaving unexpectedly. After investigating with an agent (see below), the cause turned out to be that the tool call IDs are not stable between chunks.

In the pool agent, the issue doesn't seem to happen, most likely because IDs are ignored and the tool call index is used as the identifier (see below for why we assume this, based on the Charm code).

While a workaround is possible, would it be possible on your end to keep the tool call ID stable across chunks? No worries if not, I get that you most likely want your models used via your harness.

Let me know if you need more details!

Investigation summary by zai/glm-5.1

Poolside Streaming API: Tool Call ID Instability Across Chunks

Summary

Poolside's /chat/completions streaming endpoint emits two different id values for a single logical tool call when streaming tool_calls deltas. The first chunk carries one ID (alongside the function name), and all subsequent chunks carry a different ID. While the OpenAI Chat Completions streaming specification does not explicitly mandate that id remain consistent across chunks, OpenAI's own API always sends the id only on the first chunk and omits it on subsequent chunks for the same tool call. Poolside's dual-ID behavior is unusual and breaks downstream consumers that use id as a stable key for tool call block lookup.

Observed Behavior

When streaming a tool call from Poolside's laguna-m.1 model, the SSE chunks look like this:

Chunk 1:
  delta.tool_calls[0]: {
    index: 0,
    id: "chatcmpl-tool-9859572ef05bcba3",
    function: { name: "read", arguments: "" }
  }

Chunk 2:
  delta.tool_calls[0]: {
    index: 0,
    id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c",
    function: { name: null, arguments: "{\"path\": \"package" }
  }

Chunk 3:
  delta.tool_calls[0]: {
    index: 0,
    id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c",
    function: { name: null, arguments: ".json" }
  }

...final chunk:
  delta.tool_calls[0]: {
    index: 0,
    id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c",
    function: { name: null, arguments: "}" }
  },
  delta.content: "\n",
  finish_reason: "tool_calls"

Two distinct IDs appear at index: 0 for what is clearly a single logical tool call. The first ID (9859...) appears once and carries the function name. The second ID (0460...) appears in all subsequent chunks and carries the incremental arguments. The second ID is the one that would be considered "canonical" since it appears consistently across the majority of the stream.

What the OpenAI Spec Says

The OpenAI Chat Completions streaming specification defines the following fields on delta.tool_calls[]:

  • index (number, required): identifies which tool call slot the delta belongs to. This is the spec's designated mechanism for correlating tool call deltas across streaming chunks.
  • id (optional string): "The ID of the tool call." No further constraints are documented.

Notably, the spec does not explicitly state that id must be consistent across chunks for the same tool call. Contrast this with the top-level ChatCompletionChunk.id field, which explicitly states: "A unique identifier for the chat completion. Each chunk has the same ID." No such language exists for delta.tool_calls[].id.

So strictly speaking, Poolside's dual-ID behavior does not violate a documented spec requirement. However, it is inconsistent with how OpenAI's own API behaves in practice:

  • OpenAI's observed behavior: The id appears only on the first chunk for a tool call and is omitted from all subsequent chunks for the same tool call. When present, it is always the same value.
  • Poolside's observed behavior: Two different id values appear at the same index for a single tool call. The first chunk carries one ID; all subsequent chunks carry a different ID.

This difference matters because the id field serves as the tool call identifier that consumers must use when constructing the tool_call_id on the subsequent tool role message. If two different IDs appear for the same tool call, the consumer must decide which one is canonical, and consumers that use id as a stable lookup key will break.
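
To make the stakes concrete, here is a minimal sketch of the follow-up messages a consumer sends after running the tool. The message shapes follow the Chat Completions format. The interface names, the helper toolResultFor, the completed arguments string, and the placeholder output are assumptions for illustration. The sketch settles on the second ID because our workaround does; the stream itself does not say which ID is authoritative.

```typescript
// Minimal message shapes for a Chat Completions tool round trip.
interface AssistantToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface AssistantMessage {
  role: "assistant";
  content: string | null;
  tool_calls: AssistantToolCall[];
}

interface ToolMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}

// Hypothetical helper: build the tool result message for a completed call.
// The tool_call_id must match the id recorded on the assistant message.
function toolResultFor(call: AssistantToolCall, output: string): ToolMessage {
  return { role: "tool", tool_call_id: call.id, content: output };
}

// Assumption: the consumer kept the ID seen from the second chunk onward,
// and the arguments string is the fully accumulated JSON.
const readCall: AssistantToolCall = {
  id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c",
  type: "function",
  function: { name: "read", arguments: '{"path": "package.json"}' },
};

const assistantMessage: AssistantMessage = {
  role: "assistant",
  content: null,
  tool_calls: [readCall],
};

// Placeholder output; a real consumer would put the file contents here.
const toolMessage = toolResultFor(readCall, "<contents of package.json>");
```

A consumer that recorded the first ID on the assistant message but the second ID on the tool message (or the reverse) would send a tool_call_id that matches no tool call in the history.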

Impact on Consumers

This behavior breaks any consumer that uses id (rather than index) as the primary key for tracking tool call identity across streaming chunks. Two concrete examples:

1. Pi's built-in OpenAI-completions handler (TypeScript)

Pi tracks tool call blocks via two maps: one keyed by streamIndex (the index field) and one keyed by toolCallId (the id field). When the second chunk arrives with a new ID at the same index, the handler finds the existing block by streamIndex, then registers the block under the new toolCallId in the second map. The first ID entry becomes stale but is never cleaned up. Combined with the content: "\n" that Poolside emits alongside the final tool_calls delta, this leads the handler to create a phantom tool call block with an empty name and empty arguments, which then gets fed back into the conversation history on subsequent turns, degrading model behavior.

2. Any consumer using id as a primary key

If a consumer builds a Map<id, ToolCallBlock> to accumulate streaming deltas (a natural approach given that id is the tool call identifier), the second chunk's new ID will create a separate entry. This produces two tool call objects from a single logical invocation: one with only a name (and no arguments), and one with only arguments (and no name). Neither is usable on its own.
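
The failure mode can be reproduced without any network access. The following is a minimal sketch, not Pi's actual handler: the type names and accumulateById are hypothetical, and the input contains only the four chunks shown under Observed Behavior, so the accumulated arguments string is deliberately incomplete.

```typescript
// Hypothetical delta shape, reduced to the fields shown in the chunks above.
interface IdKeyedDelta {
  index: number;
  id?: string;
  function?: { name?: string | null; arguments?: string };
}

interface IdKeyedBlock {
  id: string;
  name: string;
  arguments: string;
}

// Naive accumulator keyed by id: the pattern that breaks on Poolside's stream.
function accumulateById(deltas: IdKeyedDelta[]): IdKeyedBlock[] {
  const blocks = new Map<string, IdKeyedBlock>();
  for (const delta of deltas) {
    if (!delta.id) continue; // nothing to key on in this naive version
    let block = blocks.get(delta.id);
    if (!block) {
      block = { id: delta.id, name: "", arguments: "" };
      blocks.set(delta.id, block);
    }
    block.name += delta.function?.name ?? "";
    block.arguments += delta.function?.arguments ?? "";
  }
  return Array.from(blocks.values());
}

const FIRST_ID = "chatcmpl-tool-9859572ef05bcba3";
const SECOND_ID = "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c";

// Only the chunks shown in the report; the elided ones are left out.
const observedDeltas: IdKeyedDelta[] = [
  { index: 0, id: FIRST_ID, function: { name: "read", arguments: "" } },
  { index: 0, id: SECOND_ID, function: { name: null, arguments: '{"path": "package' } },
  { index: 0, id: SECOND_ID, function: { name: null, arguments: ".json" } },
  { index: 0, id: SECOND_ID, function: { name: null, arguments: "}" } },
];

const splitBlocks = accumulateById(observedDeltas);
console.log(splitBlocks.length); // 2: one name-only block, one arguments-only block
```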

Our Workaround

We implemented a custom streaming handler that uses only the index field as the key for tool call block identity, ignoring id for lookup purposes. This correctly merges all chunks at the same index into a single tool call block regardless of ID changes. The approach is:

// Key by stream index only -- the stable identifier in Poolside's stream
const key = toolCall.index !== undefined
  ? `index:${toolCall.index}`
  : `id:${toolCall.id ?? toolBlocksByKey.size}`;

We also corrected the ID assignment to prefer the last-seen ID (which is the canonical one in Poolside's stream) rather than the first:

// Use the latest ID, not just the first -- Poolside's canonical ID
// appears in the second chunk onward
if (toolCall.id) toolBlock.id = toolCall.id;
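
Put together, the two fragments above amount to an accumulator along the following lines. This is a simplified sketch rather than our full handler: the type names and accumulateByIndex are hypothetical, and the sample input contains only the chunks shown under Observed Behavior, so the arguments string is not complete JSON.

```typescript
// Hypothetical delta shape, reduced to the fields relevant to merging.
interface IndexedDelta {
  index?: number;
  id?: string;
  function?: { name?: string | null; arguments?: string };
}

interface MergedToolCall {
  id: string;
  name: string;
  arguments: string;
}

// Merge deltas by stream index, falling back to id only when index is absent.
function accumulateByIndex(deltas: IndexedDelta[]): MergedToolCall[] {
  const toolBlocksByKey = new Map<string, MergedToolCall>();
  for (const toolCall of deltas) {
    const key = toolCall.index !== undefined
      ? `index:${toolCall.index}`
      : `id:${toolCall.id ?? toolBlocksByKey.size}`;

    let toolBlock = toolBlocksByKey.get(key);
    if (!toolBlock) {
      toolBlock = { id: "", name: "", arguments: "" };
      toolBlocksByKey.set(key, toolBlock);
    }

    // Keep the latest non-empty ID rather than the first one seen.
    if (toolCall.id) toolBlock.id = toolCall.id;
    toolBlock.name += toolCall.function?.name ?? "";
    toolBlock.arguments += toolCall.function?.arguments ?? "";
  }
  // Map iteration follows insertion order, so calls come out in arrival order.
  return Array.from(toolBlocksByKey.values());
}

const streamedDeltas: IndexedDelta[] = [
  { index: 0, id: "chatcmpl-tool-9859572ef05bcba3", function: { name: "read", arguments: "" } },
  { index: 0, id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c", function: { name: null, arguments: '{"path": "package' } },
  { index: 0, id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c", function: { name: null, arguments: ".json" } },
  { index: 0, id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c", function: { name: null, arguments: "}" } },
];

const mergedCalls = accumulateByIndex(streamedDeltas);
console.log(mergedCalls.length); // 1
```

The same code also handles OpenAI-style streams, where id is present only on the first chunk: later deltas carry no id, so the first one is simply kept.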

Why This May Not Affect Poolside's Own Harness (pool CLI)

We did not have access to the source code of Poolside's pool CLI (it is a compiled Go binary, not open source). However, based on binary analysis, the pool binary links against charmbracelet/openai-go, which is a fork of openai/openai-go -- the official OpenAI Go SDK. The Charm fork retains the same streaming accumulator implementation as upstream.

The relevant code in openai-go's ChatCompletionAccumulator.accumulateDelta() handles tool calls like this:

for j := range delta.Delta.ToolCalls {
    deltaTool := &delta.Delta.ToolCalls[j]
    toolIndex := clampToZero(deltaTool.Index)

    choice.Message.ToolCalls = expandToFit(choice.Message.ToolCalls, toolIndex)
    tool := &choice.Message.ToolCalls[toolIndex]

    if deltaTool.ID != "" {
        tool.ID = deltaTool.ID   // <-- OVERWRITES with the latest non-empty ID
    }
    if deltaTool.Type != "" {
        tool.Type = deltaTool.Type
    }
    tool.Function.Name += deltaTool.Function.Name
    tool.Function.Arguments += deltaTool.Function.Arguments
}

This accumulator uses toolIndex (the index field) as the sole mechanism for identifying which tool call slot a delta belongs to. It addresses tool calls by their position in the ToolCalls array, not by ID. The id field is handled with a simple overwrite: tool.ID = deltaTool.ID. This means:

  1. Chunk 1 arrives at index:0 with id:"9859...". The accumulator creates ToolCalls[0] and sets ID = "9859...".
  2. Chunk 2 arrives at index:0 with id:"0460...". The accumulator finds ToolCalls[0] by index and overwrites ID = "0460...".
  3. All subsequent chunks continue to overwrite with the same "0460..." ID, which is a no-op.

The result is a single ToolCalls[0] entry with the correct final ID (0460...), accumulated name, and accumulated arguments. No phantom or duplicate tool calls are created. The dual-ID behavior is silently absorbed.

Important caveat: We are not certain that pool uses the Charm accumulator directly for its streaming ingestion. The binary contains references to charmbracelet, bubbletea, and openai-com, which strongly suggests it does, but we cannot confirm the exact call path without source code. It is also possible that Poolside wrote custom streaming code that happens to handle this the same way (e.g., keying by index and overwriting IDs). We can only confirm that the Charm accumulator -- which appears to be linked into the binary -- would handle it correctly. This is a hypothesis, not a confirmed explanation.

Suggested Improvements

Two improvements would make Poolside's streaming output easier for third-party consumers to handle:

  1. Stabilize the id field: Ensure all streaming chunks for a given tool call carry the same id, or follow OpenAI's pattern of sending id only on the first chunk and omitting it on subsequent chunks. This would eliminate the ambiguity for any consumer that uses id as a lookup key.
  2. Avoid emitting content alongside tool_calls on the final chunk: The delta.content: "\n" that appears alongside the final tool_calls delta can cause consumers to create an unwanted text block. In our observation, OpenAI's API does not emit content on chunks that also carry tool_calls deltas.

Reproduction

A minimal reproduction requires only a streaming request with a tool definition:

curl -s -N -X POST https://inference.poolside.ai/v1/chat/completions \
  -H "Authorization: Bearer $POOLSIDE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "poolside/laguna-m.1",
    "messages": [{"role": "user", "content": "Read the file package.json"}],
    "max_tokens": 500,
    "stream": true,
    "stream_options": {"include_usage": true},
    "tools": [{
      "type": "function",
      "function": {
        "name": "read",
        "description": "Read a file",
        "parameters": {
          "type": "object",
          "properties": {"path": {"type": "string"}},
          "required": ["path"]
        }
      }
    }]
  }'

Filter for tool call chunks to see the dual IDs:

... | grep -v '"tool_calls":null'

The output will show two distinct IDs at the same index:0 for a single tool call.
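
For a less manual check than reading grep output, the stream can be scanned for distinct IDs per index. The sketch below assumes the response is a standard SSE body with "data: " lines, OpenAI-style choices[].delta.tool_calls[] payloads, and a "data: [DONE]" terminator. The names collectToolCallIds, parseChunk, and sseLine are hypothetical, and the sample stream is synthetic, modeled on the chunks shown earlier rather than captured output.

```typescript
interface SseToolCallDelta {
  index?: number;
  id?: string | null;
}

interface SseChunk {
  choices?: { delta?: { tool_calls?: SseToolCallDelta[] | null } }[];
}

// Parse one SSE payload, returning null for anything that is not a JSON object.
function parseChunk(payload: string): SseChunk | null {
  try {
    const value: unknown = JSON.parse(payload);
    return typeof value === "object" && value !== null ? (value as SseChunk) : null;
  } catch {
    return null;
  }
}

// Collect the distinct tool call IDs seen at each stream index.
function collectToolCallIds(sseText: string): Map<number, string[]> {
  const idsByIndex = new Map<number, string[]>();
  for (const line of sseText.split(/\r?\n/)) {
    if (line.indexOf("data:") !== 0) continue;
    const payload = line.slice("data:".length).trim();
    if (payload === "" || payload === "[DONE]") continue;

    const chunk = parseChunk(payload);
    if (!chunk) continue;

    for (const choice of chunk.choices ?? []) {
      for (const toolCall of choice.delta?.tool_calls ?? []) {
        if (!toolCall.id) continue;
        const index = toolCall.index ?? 0; // assumption: treat a missing index as slot 0
        const seen = idsByIndex.get(index) ?? [];
        if (seen.indexOf(toolCall.id) === -1) seen.push(toolCall.id);
        idsByIndex.set(index, seen);
      }
    }
  }
  return idsByIndex;
}

// Synthetic stream for illustration only.
const sseLine = (toolCall: object): string =>
  "data: " + JSON.stringify({ choices: [{ delta: { tool_calls: [toolCall] } }] });

const sampleStream = [
  sseLine({ index: 0, id: "chatcmpl-tool-9859572ef05bcba3", function: { name: "read", arguments: "" } }),
  "",
  sseLine({ index: 0, id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c", function: { name: null, arguments: ".json" } }),
  "",
  "data: [DONE]",
].join("\n");

const idsAtIndexZero = collectToolCallIds(sampleStream).get(0) ?? [];
console.log(idsAtIndexZero.length); // 2
```

Saving the curl output to a file and passing its contents to collectToolCallIds gives the same report for a real response; any index with more than one ID exhibits the behavior described here.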
