fix(google): Preserve thought_signature on tool calls and bump response-header timeout#464

Merged
edenreich merged 1 commit into main from fix/google-thought-signature-and-response-header-timeout
Apr 28, 2026

edenreich commented Apr 28, 2026

Summary

  • Pull in SDK v1.16.0 and merge extra_content.google.thought_signature into accumulated tool calls so it round-trips back to the gateway, fixing Google Gemini extended-thinking models rejecting follow-up turns with HTTP 400: Function call is missing a thought_signature in functionCall parts.
  • Propagate CLIENT_RESPONSE_HEADER_TIMEOUT from gateway.timeout in both Docker and local-binary spawn paths - the gateway's HTTP client otherwise defaults to 10s and cuts slow upstream responses off before CLIENT_TIMEOUT would fire.
  • Add a regression test case (preserves_google_thought_signature_across_chunks) for the streaming-chunk merge.

Background

Gemini-3-Pro and other extended-thinking Gemini models return a per-tool-call thought_signature on each function-call part (Google docs: https://ai.google.dev/gemini-api/docs/thought-signatures). Google requires this opaque value to be echoed back on every subsequent request — otherwise it returns 400 INVALID_ARGUMENT.

The signature lives at tool_calls[].extra_content.google.thought_signature in Google's OpenAI-compatible API. The schema/SDK update (released as inference-gateway/sdk@v1.16.0) introduces ChatCompletionMessageToolCall.ExtraContent so the typed unmarshal preserves the field. This PR is the CLI side: capture it from streaming chunks in accumulateToolCalls and let it ride through ConversationEntry.Message.ToolCalls on the next outbound request.
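
The chunk merge described above can be sketched roughly as follows. This is a minimal illustration, not the CLI's actual accumulateToolCalls: the types ToolCallChunk, ToolCall, ExtraContent and GoogleExtra and the helpers mergeChunk and accumulateFromJSON are hypothetical stand-ins for the SDK's types. It also assumes that the signature can arrive on any chunk of a given tool-call index and that a later non-empty value should win.

```go
package main

import "encoding/json"

// GoogleExtra and ExtraContent mirror the JSON path
// tool_calls[].extra_content.google.thought_signature.
// Hypothetical shapes; the real SDK types may differ.
type GoogleExtra struct {
	ThoughtSignature string `json:"thought_signature,omitempty"`
}

type ExtraContent struct {
	Google *GoogleExtra `json:"google,omitempty"`
}

type FunctionDelta struct {
	Name      string `json:"name,omitempty"`
	Arguments string `json:"arguments,omitempty"`
}

// ToolCallChunk is one streamed fragment of a tool call (hypothetical).
type ToolCallChunk struct {
	Index        int           `json:"index"`
	ID           string        `json:"id,omitempty"`
	Function     FunctionDelta `json:"function"`
	ExtraContent *ExtraContent `json:"extra_content,omitempty"`
}

// ToolCall is the accumulated result for one tool-call index (hypothetical).
type ToolCall struct {
	ID               string
	Name             string
	Arguments        string
	ThoughtSignature string
}

// mergeChunk folds one streamed fragment into the accumulated call
// stored under chunk.Index, creating the entry on first sight.
func mergeChunk(acc map[int]*ToolCall, chunk ToolCallChunk) {
	call, ok := acc[chunk.Index]
	if !ok {
		call = &ToolCall{}
		acc[chunk.Index] = call
	}
	if chunk.ID != "" {
		call.ID = chunk.ID
	}
	if chunk.Function.Name != "" {
		call.Name = chunk.Function.Name
	}
	// Arguments arrive as string fragments and are concatenated in order.
	call.Arguments += chunk.Function.Arguments
	// Keep the signature whenever a chunk carries one, so it is not
	// dropped when later chunks omit extra_content.
	if ec := chunk.ExtraContent; ec != nil && ec.Google != nil && ec.Google.ThoughtSignature != "" {
		call.ThoughtSignature = ec.Google.ThoughtSignature
	}
}

// accumulateFromJSON decodes raw chunk payloads and merges them in order.
func accumulateFromJSON(payloads []string) (map[int]*ToolCall, error) {
	acc := make(map[int]*ToolCall)
	for _, p := range payloads {
		var chunk ToolCallChunk
		if err := json.Unmarshal([]byte(p), &chunk); err != nil {
			return nil, err
		}
		mergeChunk(acc, chunk)
	}
	return acc, nil
}
```

The point of the sketch is the last branch of mergeChunk: once a signature has been seen for an index, chunks that carry only argument fragments leave it in place, so the accumulated call can echo it back on the next request.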

The timeout change is independent but was discovered during testing: image-generation flows took more than 10s on the upstream call, and the gateway's per-client ResponseHeaderTimeout (default 10s) was killing them well before CLIENT_TIMEOUT (already configured) could apply.

…se-header timeout

Gemini extended-thinking models return a per-tool-call `thought_signature`
that must be echoed back on the next request. The CLI was dropping it on
the streaming path, causing Google to reject follow-up turns with HTTP 400
"Function call is missing a thought_signature in functionCall parts".

Pulls in SDK v1.16.0 (which adds `extra_content.google.thought_signature`
on `ChatCompletionMessageToolCall` and the streaming chunk variant) and
merges it into the accumulated tool call so it round-trips through
ConversationEntry → outbound request.

Also propagate `CLIENT_RESPONSE_HEADER_TIMEOUT` from `gateway.timeout` —
the gateway's HTTP client otherwise defaults to 10s, which kills slow
upstream responses (e.g. image generation) long before `CLIENT_TIMEOUT`
would fire.
@edenreich edenreich changed the title fix(google): preserve thought_signature on tool calls and bump response-header timeout fix(google): Preserve thought_signature on tool calls and bump response-header timeout Apr 28, 2026
@edenreich edenreich merged commit 57c9e0a into main Apr 28, 2026
5 checks passed
@edenreich edenreich deleted the fix/google-thought-signature-and-response-header-timeout branch April 28, 2026 03:20
ig-semantic-release-bot Bot pushed a commit that referenced this pull request Apr 28, 2026
## [0.105.1](v0.105.0...v0.105.1) (2026-04-28)

### 🐛 Bug Fixes

* **google:** Preserve thought_signature on tool calls and bump response-header timeout ([#464](#464)) ([57c9e0a](57c9e0a))

### 📚 Documentation

* Document .infer/plans/ in directory structure ([daf9a75](daf9a75))

### 🧹 Maintenance

* **nix:** Update package to v0.105.0 ([#462](#462)) ([de53e59](de53e59))
@ig-semantic-release-bot

🎉 This PR is included in version 0.105.1 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
