Feedback: Tool call ID instability across chunks in the streaming API

Hello! 

I wanted to try the Poolside models via [pi](https://pi.dev) and wrote a custom extension to define Poolside as a provider. However, I noticed that tool calls were behaving unexpectedly. After investigating with an agent (see below), the cause appears to be that the tool call ids are not stable between chunks.

In the `pool` agent, the issue doesn't seem to happen, most likely because ids are ignored and the tool call index is used as the identifier (see below for the reasoning behind this assumption, which is based on the Charm code).

While a workaround is possible, would it be possible on your end to keep the tool call id stable across chunks? No worries if not, I get that you most likely want your models used via your harness.

Let me know if you need more details!


<details><summary>Investigation summary by zai/glm-5.1</summary>
<p>

# Poolside Streaming API: Tool Call ID Instability Across Chunks

## Summary

Poolside's `/chat/completions` streaming endpoint emits two different `id` values for a single logical tool call when streaming `tool_calls` deltas. The first chunk carries one ID (alongside the function name), and all subsequent chunks carry a different ID. While the OpenAI Chat Completions streaming specification does not explicitly mandate that `id` remain consistent across chunks, OpenAI's own API always sends the `id` only on the first chunk and omits it on subsequent chunks for the same tool call. Poolside's dual-ID behavior is unusual and breaks downstream consumers that use `id` as a stable key for tool call block lookup.

## Observed Behavior

When streaming a tool call from Poolside's `laguna-m.1` model, the SSE chunks look like this:

```
Chunk 1:
  delta.tool_calls[0]: {
    index: 0,
    id: "chatcmpl-tool-9859572ef05bcba3",
    function: { name: "read", arguments: "" }
  }

Chunk 2:
  delta.tool_calls[0]: {
    index: 0,
    id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c",
    function: { name: null, arguments: "{\"path\": \"package" }
  }

Chunk 3:
  delta.tool_calls[0]: {
    index: 0,
    id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c",
    function: { name: null, arguments: ".json" }
  }

...final chunk:
  delta.tool_calls[0]: {
    index: 0,
    id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c",
    function: { name: null, arguments: "}" }
  },
  delta.content: "\n",
  finish_reason: "tool_calls"
```

Two distinct IDs appear at `index: 0` for what is clearly a single logical tool call. The first ID (`9859...`) appears once and carries the function name. The second ID (`0460...`) appears in all subsequent chunks and carries the incremental arguments. The second ID is the one that would be considered "canonical" since it appears consistently across the majority of the stream.
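
The observation can be checked mechanically. The following is a minimal sketch, not part of any SDK: `ToolCallDelta` and `distinctIdsByIndex` are hypothetical names, and the input is limited to the `index` and `id` values from the chunks shown above.

```typescript
// Minimal shape of a streamed tool call delta, limited to the fields used here.
interface ToolCallDelta {
  index: number;
  id?: string;
}

// Collect every distinct id seen at each index, in order of first appearance.
function distinctIdsByIndex(deltas: ToolCallDelta[]): Map<number, string[]> {
  const seen = new Map<number, string[]>();
  for (const delta of deltas) {
    if (!delta.id) continue; // deltas without an id carry no information here
    const ids = seen.get(delta.index) ?? [];
    if (!ids.includes(delta.id)) ids.push(delta.id);
    seen.set(delta.index, ids);
  }
  return seen;
}

// The ids from the chunks shown above.
const observed: ToolCallDelta[] = [
  { index: 0, id: "chatcmpl-tool-9859572ef05bcba3" },
  { index: 0, id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c" },
  { index: 0, id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c" },
];

const idsAtIndex0 = distinctIdsByIndex(observed).get(0) ?? [];
console.log(idsAtIndex0.length); // 2: more than one id for a single tool call
```

Any index that maps to more than one id indicates the dual-ID behavior.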

## What the OpenAI Spec Says

The OpenAI Chat Completions streaming specification defines the following fields on `delta.tool_calls[]`:

- **`index`** (number, required): identifies which tool call slot the delta belongs to. This is the spec's designated mechanism for correlating tool call deltas across streaming chunks.
- **`id`** (optional string): "The ID of the tool call." No further constraints are documented.

Notably, the spec does not explicitly state that `id` must be consistent across chunks for the same tool call. Contrast this with the top-level `ChatCompletionChunk.id` field, which explicitly states: *"A unique identifier for the chat completion. Each chunk has the same ID."* No such language exists for `delta.tool_calls[].id`.

So strictly speaking, Poolside's dual-ID behavior does not violate a documented spec requirement. However, it is inconsistent with how OpenAI's own API behaves in practice:

- **OpenAI's observed behavior**: The `id` appears only on the first chunk for a tool call and is omitted from all subsequent chunks for the same tool call. When present, it is always the same value.
- **Poolside's observed behavior**: Two different `id` values appear at the same `index` for a single tool call. The first chunk carries one ID; all subsequent chunks carry a different ID.

This difference matters because the `id` field serves as the tool call identifier that consumers must use when constructing the `tool_call_id` on the subsequent `tool` role message. If two different IDs appear for the same tool call, the consumer must decide which one is canonical, and consumers that use `id` as a stable lookup key will break.
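
To make the dependency concrete, here is a small sketch of how a consumer might build the `tool` role reply. The message shapes follow the OpenAI Chat Completions format; `toolResultFor` is a hypothetical helper, and the choice of the last-seen ID is an assumption. We have not established which ID Poolside's backend expects, or whether it validates the value at all.

```typescript
// Shapes follow the OpenAI Chat Completions message format.
interface AssistantToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface ToolResultMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}

// Hypothetical helper: build the tool-role reply for a completed tool call.
function toolResultFor(toolCall: AssistantToolCall, output: string): ToolResultMessage {
  return { role: "tool", tool_call_id: toolCall.id, content: output };
}

// Assumption: the accumulated call kept the last id seen in the stream.
const accumulatedCall: AssistantToolCall = {
  id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c",
  type: "function",
  function: { name: "read", arguments: "{\"path\": \"package.json\"}" },
};

const toolReply = toolResultFor(accumulatedCall, "(file contents)");
console.log(toolReply.tool_call_id === accumulatedCall.id); // true
```

Whichever ID the consumer settles on must be used both in the assistant message's `tool_calls` entry and in the matching `tool` message, which is why an ambiguous ID is a practical problem.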

## Impact on Consumers

This behavior breaks any consumer that uses `id` (rather than `index`) as the primary key for tracking tool call identity across streaming chunks. Two concrete examples:

### 1. Pi's built-in OpenAI-completions handler (TypeScript)

Pi tracks tool call blocks via two maps: one keyed by `streamIndex` (the `index` field) and one keyed by `toolCallId` (the `id` field). When the second chunk arrives with a new ID at the same index, the handler finds the existing block by `streamIndex`, then registers the block under the new `toolCallId` in the second map. The first ID entry becomes stale but is never cleaned up. Combined with the `content: "\n"` that Poolside emits alongside the final `tool_calls` delta, this leads the handler to create a phantom tool call block with an empty name and empty arguments, which then gets fed back into the conversation history on subsequent turns, degrading model behavior.

### 2. Any consumer using `id` as a primary key

If a consumer builds a `Map<id, ToolCallBlock>` to accumulate streaming deltas (a natural approach given that `id` is the tool call identifier), the second chunk's new ID will create a separate entry. This produces two tool call objects from a single logical invocation: one with only a name (and no arguments), and one with only arguments (and no name). Neither is usable on its own.
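
A minimal sketch of this failure mode follows. `accumulateById` is a hypothetical consumer written for illustration (it is not Pi's code), fed with the first three chunks from the observed stream.

```typescript
interface StreamedToolCall {
  index: number;
  id?: string;
  function?: { name?: string | null; arguments?: string };
}

interface ToolCallBlock {
  id: string;
  name: string;
  arguments: string;
}

// Hypothetical consumer that keys accumulation on `id` instead of `index`.
function accumulateById(deltas: StreamedToolCall[]): Map<string, ToolCallBlock> {
  const blocks = new Map<string, ToolCallBlock>();
  for (const delta of deltas) {
    if (!delta.id) continue; // this naive sketch cannot place id-less deltas
    let block = blocks.get(delta.id);
    if (!block) {
      block = { id: delta.id, name: "", arguments: "" };
      blocks.set(delta.id, block);
    }
    block.name += delta.function?.name ?? "";
    block.arguments += delta.function?.arguments ?? "";
  }
  return blocks;
}

// Chunks 1 to 3 from the observed stream.
const observedChunks: StreamedToolCall[] = [
  { index: 0, id: "chatcmpl-tool-9859572ef05bcba3", function: { name: "read", arguments: "" } },
  { index: 0, id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c", function: { name: null, arguments: "{\"path\": \"package" } },
  { index: 0, id: "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c", function: { name: null, arguments: ".json" } },
];

const splitBlocks = accumulateById(observedChunks);
console.log(splitBlocks.size); // 2: one logical call became two blocks
```

The first block ends up with the name `read` and no arguments; the second holds the arguments and an empty name.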

## Our Workaround

We implemented a custom streaming handler that uses only the `index` field as the key for tool call block identity, ignoring `id` for lookup purposes. This correctly merges all chunks at the same `index` into a single tool call block regardless of ID changes. The approach is:

```typescript
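// Key by stream index when present -- the stable identifier in Poolside's
// stream. Fall back to the id (or the current block count) only if the
// delta carries no index.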
const key = toolCall.index !== undefined
  ? `index:${toolCall.index}`
  : `id:${toolCall.id ?? toolBlocksByKey.size}`;
```

We also corrected the ID assignment to prefer the last-seen ID (which is the canonical one in Poolside's stream) rather than the first:

```typescript
// Use the latest ID, not just the first -- Poolside's canonical ID
// appears in the second chunk onward
if (toolCall.id) toolBlock.id = toolCall.id;
```
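
Put together, the two fragments above amount to something like the following self-contained sketch. It is a simplified illustration rather than our actual handler: `mergeByIndex` and the type names are hypothetical, it assumes every delta carries an `index`, and the single `"\""` delta is a stand-in for the chunks elided from the observed stream.

```typescript
interface ToolCallChunk {
  index: number;
  id?: string;
  function?: { name?: string | null; arguments?: string };
}

interface MergedToolCall {
  id: string;
  name: string;
  arguments: string;
}

// Merge deltas by `index`; the most recent non-empty id wins.
function mergeByIndex(deltas: ToolCallChunk[]): MergedToolCall[] {
  const byIndex = new Map<number, MergedToolCall>();
  for (const delta of deltas) {
    let merged = byIndex.get(delta.index);
    if (!merged) {
      merged = { id: "", name: "", arguments: "" };
      byIndex.set(delta.index, merged);
    }
    if (delta.id) merged.id = delta.id; // last-seen id overwrites earlier ones
    merged.name += delta.function?.name ?? "";
    merged.arguments += delta.function?.arguments ?? "";
  }
  // Return the calls ordered by their stream index.
  return Array.from(byIndex.entries())
    .sort((a, b) => a[0] - b[0])
    .map((entry) => entry[1]);
}

const secondId = "chatcmpl-tool-0460a8a39855429abf6421a88c33d18c";
const poolsideChunks: ToolCallChunk[] = [
  { index: 0, id: "chatcmpl-tool-9859572ef05bcba3", function: { name: "read", arguments: "" } },
  { index: 0, id: secondId, function: { name: null, arguments: "{\"path\": \"package" } },
  { index: 0, id: secondId, function: { name: null, arguments: ".json" } },
  // Assumption: stand-in for the chunks elided from the observed stream.
  { index: 0, id: secondId, function: { name: null, arguments: "\"" } },
  { index: 0, id: secondId, function: { name: null, arguments: "}" } },
];

const mergedCalls = mergeByIndex(poolsideChunks);
console.log(mergedCalls.length); // 1
console.log(mergedCalls[0].id === secondId); // true
console.log(JSON.parse(mergedCalls[0].arguments).path); // package.json
```

Keying on `index` keeps the name and the arguments in one block, and overwriting the ID on every non-empty value leaves the block with the ID that appears in the rest of the stream.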

## Why This May Not Affect Poolside's Own Harness (`pool` CLI)

We did not have access to the source code of Poolside's `pool` CLI (it is a compiled Go binary, not open source). However, based on binary analysis, the `pool` binary links against `charmbracelet/openai-go`, which is a fork of `openai/openai-go` -- the official OpenAI Go SDK. The Charm fork retains the same streaming accumulator implementation as upstream.

The relevant code in `openai-go`'s `ChatCompletionAccumulator.accumulateDelta()` handles tool calls like this:

```go
for j := range delta.Delta.ToolCalls {
    deltaTool := &delta.Delta.ToolCalls[j]
    toolIndex := clampToZero(deltaTool.Index)

    choice.Message.ToolCalls = expandToFit(choice.Message.ToolCalls, toolIndex)
    tool := &choice.Message.ToolCalls[toolIndex]

    if deltaTool.ID != "" {
        tool.ID = deltaTool.ID   // <-- OVERWRITES with the latest non-empty ID
    }
    if deltaTool.Type != "" {
        tool.Type = deltaTool.Type
    }
    tool.Function.Name += deltaTool.Function.Name
    tool.Function.Arguments += deltaTool.Function.Arguments
}
```

This accumulator uses `toolIndex` (the `index` field) as the sole mechanism for identifying which tool call slot a delta belongs to. It addresses tool calls by their position in the `ToolCalls` array, not by ID. The `id` field is handled with a simple overwrite: `tool.ID = deltaTool.ID`. This means:

1. Chunk 1 arrives at `index:0` with `id:"9859..."`. The accumulator creates `ToolCalls[0]` and sets `ID = "9859..."`.
2. Chunk 2 arrives at `index:0` with `id:"0460..."`. The accumulator finds `ToolCalls[0]` by index and **overwrites** `ID = "0460..."`.
3. All subsequent chunks continue to overwrite with the same `"0460..."` ID, which is a no-op.

The result is a single `ToolCalls[0]` entry with the correct final ID (`0460...`), accumulated name, and accumulated arguments. No phantom or duplicate tool calls are created. The dual-ID behavior is silently absorbed.

**Important caveat:** We are not certain that `pool` uses the Charm accumulator directly for its streaming ingestion. The binary contains references to `charmbracelet`, `bubbletea`, and `openai-com`, which strongly suggests it does, but we cannot confirm the exact call path without source code. It is also possible that Poolside wrote custom streaming code that happens to handle this the same way (e.g., keying by index and overwriting IDs). We can only confirm that the Charm accumulator -- which appears to be linked into the binary -- would handle it correctly. This is a hypothesis, not a confirmed explanation.

## Suggested Improvements

Two improvements would make Poolside's streaming output easier for third-party consumers to handle:

1. **Stabilize the `id` field**: Ensure all streaming chunks for a given tool call carry the same `id`, or follow OpenAI's pattern of sending `id` only on the first chunk and omitting it on subsequent chunks. This would eliminate the ambiguity for any consumer that uses `id` as a lookup key.
2. **Avoid emitting `content` alongside `tool_calls` on the final chunk**: The `delta.content: "\n"` that appears alongside the final `tool_calls` delta can cause consumers to create an unwanted text block. OpenAI's API does not emit `content` on chunks that also carry `tool_calls` deltas.
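
Until such a change is made, a consumer can guard against the second point on its own side. The sketch below is one possible approach, not something taken from Pi or from Poolside: `textToAppend` is a hypothetical helper, and treating whitespace-only content next to tool call deltas as noise is an assumption that could be wrong for models that interleave meaningful text with tool calls.

```typescript
// Only the two delta fields relevant to this check.
interface ChunkDelta {
  content?: string | null;
  tool_calls?: unknown[] | null;
}

// Returns the text a consumer should append for this delta, or "" to skip it.
function textToAppend(delta: ChunkDelta): string {
  const text = delta.content ?? "";
  const hasToolCalls = (delta.tool_calls?.length ?? 0) > 0;
  // Assumption: whitespace-only content next to tool call deltas is noise.
  if (hasToolCalls && text.trim() === "") return "";
  return text;
}

// The final chunk from the observed stream: "\n" alongside a tool call delta.
const finalDelta: ChunkDelta = { content: "\n", tool_calls: [{ index: 0 }] };
console.log(textToAppend(finalDelta) === ""); // true: no text block is created

// Ordinary text without tool calls passes through unchanged.
console.log(textToAppend({ content: "Hello" })); // Hello
```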

## Reproduction

A minimal reproduction requires only a streaming request with a tool definition:

```bash
curl -s -N -X POST https://inference.poolside.ai/v1/chat/completions \
  -H "Authorization: Bearer $POOLSIDE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "poolside/laguna-m.1",
    "messages": [{"role": "user", "content": "Read the file package.json"}],
    "max_tokens": 500,
    "stream": true,
    "stream_options": {"include_usage": true},
    "tools": [{
      "type": "function",
      "function": {
        "name": "read",
        "description": "Read a file",
        "parameters": {
          "type": "object",
          "properties": {"path": {"type": "string"}},
          "required": ["path"]
        }
      }
    }]
  }'
```

Filter for tool call chunks to see the dual IDs:

```bash
... | grep -v '"tool_calls":null'
```

The output will show two distinct IDs at the same `index:0` for a single tool call.
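
For a more structured check than `grep`, the raw stream can be parsed. The following sketch assumes the response uses the usual `data: {...}` server-sent event framing with a `data: [DONE]` sentinel, as OpenAI-compatible endpoints commonly do; `toolCallIdsFromSse` is a hypothetical helper, and the sample lines are hand-written from the chunks shown earlier, not captured output.

```typescript
interface IdSighting {
  index: number;
  id: string;
}

// Extract every (index, id) pair from raw SSE text, in stream order.
function toolCallIdsFromSse(sseText: string): IdSighting[] {
  const sightings: IdSighting[] = [];
  for (const line of sseText.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue;
    const payload = trimmed.slice("data:".length).trim();
    if (payload === "" || payload === "[DONE]") continue;
    const chunk = JSON.parse(payload);
    // A usage-only chunk has an empty `choices` array and is skipped naturally.
    for (const choice of chunk.choices ?? []) {
      for (const toolCall of choice.delta?.tool_calls ?? []) {
        if (typeof toolCall.id === "string") {
          sightings.push({ index: toolCall.index, id: toolCall.id });
        }
      }
    }
  }
  return sightings;
}

// Hand-written sample mirroring the first two observed chunks.
const sampleSse = [
  'data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"chatcmpl-tool-9859572ef05bcba3","function":{"name":"read","arguments":""}}]}}]}',
  '',
  'data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"chatcmpl-tool-0460a8a39855429abf6421a88c33d18c","function":{"name":null,"arguments":"{\\"path\\": \\"package"}}]}}]}',
  '',
  'data: [DONE]',
].join("\n");

const sightings = toolCallIdsFromSse(sampleSse);
const uniqueIds = new Set(sightings.map((sighting) => sighting.id));
console.log(uniqueIds.size); // 2
```

Saving the `curl` output to a file and passing its contents to the function should list every ID in the order it was emitted.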


</p>
</details> 
