Summary
Azure GPT-5.x deployments routed through the built-in cloudflare-ai-gateway provider fail in OpenCode because requests use max_tokens instead of max_completion_tokens.
This is distinct from #25096 (custom @ai-sdk/openai-compatible providers) and #22623 (native Azure provider). Here the failure is in the built-in cloudflare-ai-gateway path backed by @earendil-works/pi-ai.
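Environment
OpenCode: 1.17.8
Provider: cloudflare-ai-gateway (connected via /connect, Azure keys in gateway BYOK)
Gateway base URL: https://gateway.ai.cloudflare.com/v1/{account}/{gateway}/compat
Affected models: azure-openai/gpt-5.4-nano, azure-openai/gpt-5.4, azure-openai/gpt-5.4-mini, azure-openai/gpt-5.5
Other models on the same gateway (grok-4-20-non-reasoning, kimi-k2.6) work fine
Reproduce
~/.config/opencode/opencode.jsonc:
{
  "provider": {
    "cloudflare-ai-gateway": {
      "models": {
        "azure-openai/gpt-5.4-nano": { "name": "GPT 5.4 Nano (Azure)" }
      }
    }
  }
}
opencode run -m cloudflare-ai-gateway/azure-openai/gpt-5.4-nano "Reply with exactly: OK"
Error:
Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.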
Root cause (pi-ai)
In the @earendil-works/pi-ai file openai-completions.js, detectCompat() treats all cloudflare-ai-gateway URLs as non-standard and sets:
maxTokensField: useMaxTokens ? "max_tokens" : "max_completion_tokens"
// where useMaxTokens includes isCloudflareAiGateway
So every gateway request sends max_tokens, which newer Azure GPT-5 deployments reject.
getCompat() can override via model.compat.maxTokensField, but that override is not reachable from normal OpenCode config for custom models added in opencode.jsonc.
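To make the failure mode concrete, here is a minimal sketch of the behaviour described above. It is a simplified model, not the actual pi-ai source: the helper names `isCloudflareAiGateway`, `detectMaxTokensField`, and `buildRequestBody`, the hostname check, and the placeholder account/gateway segments in the URL are all assumptions made for illustration.

```javascript
// Hypothetical, simplified model of the reported behaviour (NOT pi-ai's real code).
// Only the field names max_tokens / max_completion_tokens come from the report.

// Assumption: gateway traffic can be recognised by its hostname.
function isCloudflareAiGateway(baseUrl) {
  return new URL(baseUrl).hostname === "gateway.ai.cloudflare.com";
}

// Mirrors the reported ternary: any gateway URL selects the legacy field,
// regardless of which upstream model the request is routed to.
function detectMaxTokensField(baseUrl) {
  const useMaxTokens = isCloudflareAiGateway(baseUrl);
  return useMaxTokens ? "max_tokens" : "max_completion_tokens";
}

// Builds a minimal request body with the token limit under the detected field.
function buildRequestBody(baseUrl, model, limit) {
  return { model, [detectMaxTokensField(baseUrl)]: limit };
}

const body = buildRequestBody(
  "https://gateway.ai.cloudflare.com/v1/acct/gw/compat", // placeholder account/gateway
  "azure-openai/gpt-5.4-nano",
  64
);
console.log(JSON.stringify(body));
```

Under these assumptions the body carries `max_tokens` even though the target is an Azure GPT-5 deployment, which matches the "Unsupported parameter" error in the reproduction.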
Workarounds attempted (none worked)
Per-model options.compat.maxTokensField in opencode.jsonc
modelOverrides / full model entries in ~/.config/opencode/models.json
Full model entries in ~/.pi/agent/models.json (pi docs path)
Custom @ai-sdk/openai-compatible provider pointed at the same compat baseURL (still sends max_tokens; same class of bug as #25096, "openai-compatible adapter sends max_tokens to GPT-5/o-series reasoning models that require max_completion_tokens")
Expected behavior
At least one of these should work:
options.compat.maxTokensField: "max_completion_tokens" per model in opencode.jsonc
modelOverrides / models.json compat for config-defined cloudflare-ai-gateway models
Smarter defaults: azure-openai/gpt-5* routes use max_completion_tokens while other gateway models keep max_tokens
Suggested fix
OpenCode: Map provider.models.*.options.compat (or top-level model compat) into pi model.compat when registering config models.
pi-ai: Do not blanket-force max_tokens for all cloudflare-ai-gateway traffic; either per-model override or prefix-based detection for azure-openai/gpt-5*.
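The two suggestions above could be combined roughly as follows. This is a sketch under stated assumptions, not a patch against either codebase: `resolveMaxTokensField` and `toPiModel` are hypothetical names, the model object shape (`id`, `baseUrl`, `compat`) is assumed, and the non-gateway default is simplified to `max_completion_tokens`, whereas the real detection reportedly considers more conditions.

```javascript
// Hypothetical sketch of the suggested fix; not pi-ai's or OpenCode's actual code.
const GATEWAY_HOST = "gateway.ai.cloudflare.com";

// pi-ai side: decide which token-limit field to send for a given model.
function resolveMaxTokensField(model) {
  // 1. An explicit per-model override always wins.
  const override = model.compat?.maxTokensField;
  if (override) return override;

  const viaGateway = new URL(model.baseUrl).hostname === GATEWAY_HOST;

  // 2. Prefix-based detection for Azure GPT-5 routes behind the gateway.
  if (viaGateway && model.id.startsWith("azure-openai/gpt-5")) {
    return "max_completion_tokens";
  }

  // 3. Other gateway models keep the existing default.
  return viaGateway ? "max_tokens" : "max_completion_tokens";
}

// OpenCode side: carry options.compat from a config model entry into model.compat
// so that step 1 above becomes reachable from opencode.jsonc.
function toPiModel(id, baseUrl, configEntry) {
  return {
    id,
    baseUrl,
    name: configEntry.name ?? id,
    compat: { ...configEntry.options?.compat },
  };
}

const base = "https://gateway.ai.cloudflare.com/v1/acct/gw/compat"; // placeholder path
const azure = toPiModel("azure-openai/gpt-5.4-nano", base, { name: "GPT 5.4 Nano (Azure)" });
const other = toPiModel("kimi-k2.6", base, {});
const forced = toPiModel("kimi-k2.6", base, {
  options: { compat: { maxTokensField: "max_completion_tokens" } },
});

console.log(resolveMaxTokensField(azure));
console.log(resolveMaxTokensField(other));
console.log(resolveMaxTokensField(forced));
```

Checking the override first keeps the prefix rule a convenience rather than a hard-coded policy, so a deployment with an unusual name can still be corrected from config.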
Notes
Gateway + Azure deployment itself is healthy; this is a client parameter naming issue.
Switching to the native Azure provider without pointing baseURL at the gateway would bypass Cloudflare AI Gateway observability/BYOK routing.
Related: #22623, #25096