[Go SDK] max_tokens not rewritten to max_completion_tokens for newer OpenAI models #441

@santoshkumarradha

Description

Summary

The Go SDK AI client always emits max_tokens in request bodies, but newer OpenAI models (the o1 series and newer gpt-4.x models) accept only max_completion_tokens and silently ignore or reject max_tokens, so Go agents lose their output-length cap.

Context

sdk/go/ai/request.go serializes the user-configured token limit as max_tokens. OpenAI's o1 series and later gpt-4.x models dropped support for max_tokens in favor of max_completion_tokens. The Python SDK handles this via a per-model rewrite in litellm_adapters.py, but Go has no equivalent rewrite pass: agents pointed at o1-mini, o1-preview, or gpt-4o variants silently send a parameter the model ignores, so the generated response may be longer or shorter than intended and no error is returned. This is a silent correctness bug affecting any Go agent that uses newer OpenAI models.

Scope

In Scope

  • Add a provider/model-aware rewrite pass in the Go SDK AI client that substitutes max_completion_tokens for max_tokens when the model requires it.
  • The model list requiring the rewrite should be configurable or derived from a simple prefix/contains check (e.g. models starting with o1, o3, or matching OpenAI's documented list).
  • The rewrite should only apply to OpenAI-compatible endpoints (detected via base URL or an explicit provider hint).
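The detection half of this rewrite could be sketched as follows. The function name and prefix list are illustrative assumptions, not the SDK's actual API; OpenAI's documented model list (and the Python adapter) should be the source of truth:

```go
package main

import "strings"

// completionTokenPrefixes lists model-name prefixes that require the
// newer max_completion_tokens parameter. This list is an assumption for
// illustration; keep it in sync with OpenAI's documentation and the
// Python litellm_adapters.py rewrite.
var completionTokenPrefixes = []string{"o1", "o3", "gpt-4.1", "gpt-4o"}

// needsMaxCompletionTokens reports whether a model requires
// max_completion_tokens instead of the legacy max_tokens.
func needsMaxCompletionTokens(model string) bool {
	for _, p := range completionTokenPrefixes {
		if strings.HasPrefix(model, p) {
			return true
		}
	}
	return false
}
```

A slice of prefixes keeps the check extensible without pulling in a full model registry, as the Notes below suggest.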

Out of Scope

  • Rewriting other OpenAI-deprecated parameters — focus on max_tokens only.
  • Supporting Anthropic or Gemini model quirks in this issue — OpenAI only.
  • Changing the user-facing MaxTokens field name in RequestConfig — rewrite at serialization time, not at the API level.

Files

  • sdk/go/ai/request.go — add model-aware serialization: emit max_completion_tokens instead of max_tokens for matching OpenAI models
  • sdk/go/ai/client.go — apply the rewrite in the request-build path, using model name and base URL to detect when it's needed
  • sdk/go/ai/request_test.go — tests: o1-mini request serializes max_completion_tokens; gpt-3.5-turbo request serializes max_tokens; non-OpenAI base URL uses max_tokens regardless of model name

Acceptance Criteria

  • Requests to OpenAI o1-series and gpt-4.x models that require it include max_completion_tokens (not max_tokens) in the JSON body
  • Requests to models that still use max_tokens are unaffected
  • The model list / detection logic is documented in code comments and easy to extend
  • Tests pass (go test ./sdk/go/...)
  • Linting passes (make lint)

Notes for Contributors

Severity: MEDIUM

Reference sdk/python/agentfield/ai/litellm_adapters.py for the model list and rewrite logic used in Python. A simple strings.HasPrefix(model, "o1") or a small allow-list map[string]bool is sufficient — avoid pulling in a full model registry. If the OpenAI base URL detection is ambiguous, add an explicit Provider string field to ClientConfig so callers can declare intent.
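The explicit provider hint could be sketched like this. ClientConfig here is a hypothetical stand-in for the SDK's real config type, and the base-URL substring check is an assumption about how detection might work:

```go
package main

import "strings"

// ClientConfig is a hypothetical stand-in for the SDK's client config.
// An explicit Provider lets callers declare intent when the base URL is
// ambiguous (proxies, gateways); empty falls back to URL detection.
type ClientConfig struct {
	BaseURL  string
	Provider string // e.g. "openai"; empty means detect from BaseURL
}

// isOpenAICompatible reports whether the max_tokens rewrite should be
// considered for this client at all.
func isOpenAICompatible(cfg ClientConfig) bool {
	if cfg.Provider != "" {
		return cfg.Provider == "openai"
	}
	return strings.Contains(cfg.BaseURL, "api.openai.com")
}
```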


Labels

area:ai (AI/LLM integration) · enhancement (New feature or request) · sdk:go (Go SDK related)
