Problem
The @cloudflare/tanstack-ai adapter (createWorkersAiChat) does not expose the reasoning_effort or chat_template_kwargs parameters. Workers AI reasoning models (GLM-4.7-flash, Gemma-4-26b, Kimi K2.5/K2.6) accept both as fields of the inputs object passed to binding.run().
This is the same class of issue as #501 (filed for workers-ai-provider), but affecting the TanStack AI adapter.
Affected parameters
From @cloudflare/workers-types (ChatCompletionsCommonOptions):
reasoning_effort — "low" | "medium" | "high" | null
chat_template_kwargs — { enable_thinking?: boolean, clear_thinking?: boolean }
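For readers without the type definitions at hand, the two fields listed above can be written out as a standalone TypeScript type. This is only a local sketch that mirrors the shapes given in this issue; the names ReasoningEffort and ReasoningOptions are hypothetical and are not exports of @cloudflare/workers-types.

```typescript
// Hypothetical local types mirroring the two fields listed above.
// They are NOT imported from @cloudflare/workers-types.
type ReasoningEffort = "low" | "medium" | "high" | null;

interface ReasoningOptions {
  reasoning_effort?: ReasoningEffort;
  chat_template_kwargs?: {
    enable_thinking?: boolean;
    clear_thinking?: boolean;
  };
}

// Example value: request low effort and turn thinking off.
const example: ReasoningOptions = {
  reasoning_effort: "low",
  chat_template_kwargs: { enable_thinking: false },
};
```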
Expected behavior
createWorkersAiChat should accept and forward these parameters to the underlying binding.run() inputs, allowing consumers to control reasoning behavior per request.
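A minimal sketch of the forwarding described above, assuming the adapter builds an inputs object before calling binding.run(). The names AiBindingLike, ReasoningInputs, buildInputs, and runChat are hypothetical and used only for illustration; they are not part of @cloudflare/tanstack-ai, and the adapter's real internals may be structured differently.

```typescript
// Minimal stand-in for the Workers AI binding: only the run() shape used here.
interface AiBindingLike {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

// Hypothetical option shape mirroring the two parameters from this issue.
interface ReasoningInputs {
  reasoning_effort?: "low" | "medium" | "high" | null;
  chat_template_kwargs?: { enable_thinking?: boolean; clear_thinking?: boolean };
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: copies the reasoning options into the inputs object,
// leaving out any field the caller did not set so model defaults still apply.
function buildInputs(
  messages: ChatMessage[],
  options: ReasoningInputs = {},
): Record<string, unknown> {
  const inputs: Record<string, unknown> = { messages };
  if (options.reasoning_effort !== undefined) {
    inputs.reasoning_effort = options.reasoning_effort;
  }
  if (options.chat_template_kwargs !== undefined) {
    inputs.chat_template_kwargs = options.chat_template_kwargs;
  }
  return inputs;
}

// Hypothetical call path: the options travel with each request.
async function runChat(
  binding: AiBindingLike,
  model: string,
  messages: ChatMessage[],
  options?: ReasoningInputs,
): Promise<unknown> {
  return binding.run(model, buildInputs(messages, options));
}
```

Passing the options per call, rather than fixing them when the adapter is created, is one way to match the "per request" control requested here. An explicit null for reasoning_effort is forwarded as-is, since null is one of the listed values.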
Context
As more reasoning models land on Workers AI (GLM-4.7-flash, Gemma-4, Kimi K2.6), the ability to control reasoning effort from any Workers AI adapter becomes essential. Without it, a model can spend its entire output token budget on chain-of-thought loops and return no content.