fix: report finish_reason "length" when output hits the token limit by Defilan · Pull Request #10 · defilantech/mlx-server

Defilan · 2026-05-18T07:19:27Z

What

Chat completions truncated at max_tokens now report finish_reason: "length" instead of "stop".

Why

Both the streaming and non-streaming paths in ChatCompletion.swift hardcoded finish_reason to "tool_calls" or "stop", so a response cut off at the token limit was reported as a natural "stop". OpenAI clients use finish_reason to decide whether to continue a truncated response, so the wrong value misled them. Found while dogfooding the LLMKube metal-agent mlx-server runtime: three runs each generated exactly max_tokens (512) tokens, and all three reported "stop".

How

A finishReason helper returns "tool_calls" when the model emitted tool calls, "length" when the generated token count reached the requested limit, otherwise "stop". Truncation is inferred by comparing generationTokenCount against parameters.maxTokens, since the generator does not surface a stop reason directly. Both completion paths use it. Unit-tested.
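The decision order described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function signature, the parameter names (hasToolCalls, generatedTokens, maxTokens), and the choice to treat the limit as optional are all assumptions made for the example. Only the three return values and the comparison against the requested limit come from the description.

```swift
// Hypothetical sketch of the helper described in this PR.
// The signature and parameter names are assumptions for illustration only.
func finishReason(hasToolCalls: Bool, generatedTokens: Int, maxTokens: Int?) -> String {
    // Tool calls are checked first, following the order given in the description.
    if hasToolCalls {
        return "tool_calls"
    }
    // The generator does not surface a stop reason, so truncation is inferred:
    // reaching the requested limit is treated as being cut off.
    if let limit = maxTokens, generatedTokens >= limit {
        return "length"
    }
    // Otherwise the model ended the response on its own.
    return "stop"
}

// The case from the bug report: 512 tokens generated with max_tokens = 512.
print(finishReason(hasToolCalls: false, generatedTokens: 512, maxTokens: 512)) // length
```

Keeping the decision in one pure function like this is what would let both the streaming and non-streaming paths share it and makes it straightforward to unit-test without running a model.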

Fixes #9

Both the streaming and non-streaming chat-completion paths hardcoded finish_reason to "tool_calls" or "stop", so a response truncated at max_tokens was reported as a natural "stop". OpenAI clients rely on finish_reason to decide whether to continue a cut-off response.

Add a finishReason helper: "length" when the generated token count reaches the requested limit, "tool_calls" when the model emitted tool calls, otherwise "stop". Used by both completion paths.

Fixes defilantech#9

Signed-off-by: Christopher Maher <chris@mahercode.io>

Defilan merged commit 6f0061f into defilantech:main May 18, 2026
1 check passed

Defilan deleted the fix/finish-reason-length branch May 18, 2026 07:31
