feat(openai): expose max_output_tokens on Responses API LLM #5449
Open
piyush-gambhir wants to merge 1 commit into livekit:main
Summary

Adds `max_output_tokens` as a first-class constructor argument on the OpenAI, Azure, and xAI Responses API LLM wrappers, mirroring how `service_tier` is handled today (#5346).

The underlying `/v1/responses` endpoint accepts `max_output_tokens` natively (see `openai.types.responses.response_create_params`), but the plugins previously required callers to reach for `extra_kwargs` on every `chat()` call to set it. This PR makes it a plain constructor argument so a bound can be set once at LLM construction time.
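For reference, this is what the endpoint-level parameter looks like against the raw OpenAI Python SDK, independent of this PR (the model name and prompt here are illustrative):

```python
# The Responses API already accepts max_output_tokens natively; the plugins
# previously only reached it via extra_kwargs.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.responses.create(
    model="gpt-4.1",
    input="Summarize the plot of Moby-Dick.",
    max_output_tokens=64,  # hard cap on generated tokens
)
print(resp.output_text)
```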
Changes

- `livekit-plugins-openai/.../responses/llm.py`: add `max_output_tokens: NotGivenOr[int]` to `_LLMOptions` and the `LLM.__init__` signature; forward it into `extra` so it reaches both the HTTP (`client.responses.create`) and WebSocket (`response.create`) paths.
- `livekit-plugins-azure/.../responses/llm.py`: thread the new argument through to `openai.responses.LLM.__init__`.
- `livekit-plugins-xai/.../responses/llm.py`: thread the new argument through to `openai.responses.LLM.__init__`.

Usage
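A minimal sketch of the new argument; the constructor call matches the test plan below, and the value 64 is illustrative:

```python
# Set the cap once at construction time instead of passing extra_kwargs
# on every chat() call.
from livekit.plugins import openai

llm = openai.responses.LLM(
    model="gpt-4.1",
    max_output_tokens=64,  # upper bound on generated tokens per response
)
```

The Azure and xAI wrappers accept the same keyword and forward it to `openai.responses.LLM.__init__`.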
Motivation
All sibling chat-completions LLMs (`openai.LLM`, `anthropic.LLM`, `google.LLM`, `aws.LLM`, `mistralai.LLM`, `groq.LLM`) already expose a first-class `max_completion_tokens` / `max_tokens` / `max_output_tokens` argument. The Responses variants were the only remaining path that forced callers into `extra_kwargs`; this brings them in line.

Test plan
- `ast.parse` passes on all three modified files
- `make check` (format + lint + type-check)
- Smoke test (live OpenAI API): construct `openai.responses.LLM(model="gpt-4.1", max_output_tokens=64)` and verify the response honors the cap in both `use_websocket=True` and `use_websocket=False` modes (a sketch follows this list)
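A sketch of that smoke test, assuming the livekit-agents 1.x streaming interface; `ChatContext.add_message` and the `chunk.delta.content` field are assumptions about the installed version, not part of this PR:

```python
# Smoke-test sketch: drive both transport paths and check that replies
# are truncated by the cap. Requires OPENAI_API_KEY in the environment.
import asyncio

from livekit.agents.llm import ChatContext
from livekit.plugins import openai


async def run(use_websocket: bool) -> int:
    llm = openai.responses.LLM(
        model="gpt-4.1",
        max_output_tokens=64,
        use_websocket=use_websocket,
    )
    chat_ctx = ChatContext()
    chat_ctx.add_message(role="user", content="Tell me a very long story.")

    # Collect the streamed text; with the cap applied it should stop
    # well short of a full story on both paths.
    text = ""
    async with llm.chat(chat_ctx=chat_ctx) as stream:
        async for chunk in stream:
            if chunk.delta and chunk.delta.content:
                text += chunk.delta.content
    return len(text)


async def main() -> None:
    for ws in (False, True):
        n = await run(ws)
        print(f"use_websocket={ws}: received {n} chars")


asyncio.run(main())
```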