Correctly return finish reason length when finished #42157
Merged
LysandreJik merged 4 commits into main on Nov 27, 2025
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
5b805cc to 8a1cd61
4db18ba to 7a372f8
b263e27 to 38a7f37
7a372f8 to a6cda99
Wauplin approved these changes Nov 27, 2025
    chat.append({"role": "assistant", "content": model_output})
    if finish_reason == "length":
Contributor
Don't we want to write something for the other reasons (at least print the reason in the terminal)? Or is "length" the only finish reason implemented so far?
Never mind, I saw below that it's only "length" and "stop", so it's probably fine not to write it.
Member
Author
Yes, a bit of effort is still required here :) Thanks for the review!
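The check discussed in this thread can be sketched roughly as follows. This is only an illustration, not the code in src/transformers/cli/chat.py: the helper name `should_continue` and its `ask` parameter are hypothetical, and the only finish reasons assumed are the two mentioned above ("length" and "stop"). Any other value is simply printed, as the reviewer suggested.

```python
from typing import Callable, Optional


def should_continue(finish_reason: Optional[str], ask: Callable[[str], str] = input) -> bool:
    """Hypothetical helper: decide whether the chat should request more tokens.

    Returns True only when generation was cut off by the token limit
    and the user agrees to continue.
    """
    if finish_reason == "length":
        # The reply hit the max new token limit, so offer to continue it.
        answer = ask("Generation stopped at the token limit. Continue? [y/N] ")
        return answer.strip().lower() in {"y", "yes"}
    if finish_reason not in (None, "stop"):
        # Not expected today, but surface unknown reasons in the terminal.
        print(f"Generation ended with finish reason: {finish_reason}")
    return False
```

Passing `ask` as a parameter keeps the prompt testable without a real terminal.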
Co-authored-by: Lucain <lucainp@gmail.com>
sarathc-cerebras pushed a commit to sarathc-cerebras/transformers that referenced this pull request Dec 7, 2025
* Correctly return finish reason length when finished
* Typos + fixup
* Fix a few tests
* Update src/transformers/cli/chat.py
  Co-authored-by: Lucain <lucainp@gmail.com>
---------
Co-authored-by: Lucain <lucainp@gmail.com>
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026
* Correctly return finish reason length when finished
* Typos + fixup
* Fix a few tests
* Update src/transformers/cli/chat.py
  Co-authored-by: Lucain <lucainp@gmail.com>
---------
Co-authored-by: Lucain <lucainp@gmail.com>
This PR updates transformers serve so that it accurately returns the finish reason: instead of only returning "stop" (stop token), it now also returns "length", indicating that the max new token limit has been reached.
Transformers chat is updated to check this reason and to offer to continue:
This results in chats like the following:

Implemented for both continuous batching and generate.
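
As a rough illustration of the distinction described above, the server-side decision could look like the sketch below. It is not the actual implementation in transformers serve: the function name `infer_finish_reason` and its parameters are hypothetical, and it assumes the server knows the generated token ids, the set of end-of-sequence token ids, and the max new token limit.

```python
from typing import Collection, Sequence


def infer_finish_reason(
    generated_ids: Sequence[int],
    eos_token_ids: Collection[int],
    max_new_tokens: int,
) -> str:
    """Hypothetical helper: map the end of a generation to a finish reason.

    "stop"   -> the model emitted a stop (EOS) token.
    "length" -> the max new token limit was reached without a stop token.
    """
    if generated_ids and generated_ids[-1] in eos_token_ids:
        # Ending on a stop token wins, even if it lands exactly on the limit.
        return "stop"
    if len(generated_ids) >= max_new_tokens:
        return "length"
    # Any other termination is reported as "stop" in this sketch (an assumption).
    return "stop"
```

A client can then treat "length" as a signal that the reply was truncated and may be continued.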