Correctly return finish reason length when finished #42157

Merged
LysandreJik merged 4 commits into main from continue-on-sequence-end on Nov 27, 2025

@LysandreJik (Member) commented Nov 12, 2025

This PR updates transformers serve so that it accurately returns the finish reason: instead of only ever returning "stop" (a stop token was generated), it now also returns "length", indicating that the max new token limit has been reached.
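As a rough illustration of the distinction between the two reasons, here is a minimal sketch. The helper `infer_finish_reason` and its parameters are hypothetical and are not taken from this PR's diff; the sketch only assumes that generation ends either on a stop token or when the new-token budget runs out.

```python
from typing import Optional, Sequence


def infer_finish_reason(
    generated_ids: Sequence[int],
    eos_token_ids: Sequence[int],
    max_new_tokens: int,
) -> Optional[str]:
    """Return "stop", "length", or None if generation is still in progress.

    Hypothetical helper for illustration; not the function used in transformers.
    """
    # A stop token ends generation regardless of how many tokens were produced.
    if generated_ids and generated_ids[-1] in eos_token_ids:
        return "stop"
    # No stop token, but the budget is exhausted: the output was cut off.
    if len(generated_ids) >= max_new_tokens:
        return "length"
    return None
```

Checking the stop token first means a sequence that ends on a stop token exactly at the limit is still reported as "stop" in this sketch.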

Transformers chat is updated to check on this reason, and to offer to continue:

Generation stopped after reaching the token limit.

Continue generating? (y/N):
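The prompt above could be wired into a chat loop roughly as follows. This is a sketch under assumptions, not the code in src/transformers/cli/chat.py: `respond_with_continuation`, the `generate` callable and its `(text, finish_reason)` return shape, and the `ask` parameter are all hypothetical names, and resuming by calling the backend again with the partial assistant message is one possible approach rather than the PR's confirmed behavior.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]
# Hypothetical backend signature: takes the chat so far, returns (text, finish_reason).
GenerateFn = Callable[[List[Message]], Tuple[str, str]]


def respond_with_continuation(
    chat: List[Message],
    generate: GenerateFn,
    ask: Callable[[str], str] = input,
) -> List[Message]:
    """Append assistant turns to `chat`, offering to continue after a "length" finish."""
    while True:
        model_output, finish_reason = generate(chat)
        chat.append({"role": "assistant", "content": model_output})

        if finish_reason != "length":
            break  # e.g. "stop": the model finished on its own

        print("Generation stopped after reaching the token limit.")
        answer = ask("Continue generating? (y/N): ").strip().lower()
        if answer not in ("y", "yes"):
            break  # anything but an explicit yes is a no, matching the (y/N) default
    return chat
```

Passing `ask` as a parameter keeps the loop testable without a terminal; an interactive caller can leave it at the default `input`.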

This results in chats like the following:
[image]

Implemented for both the continuous batching and the generate code paths.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@LysandreJik LysandreJik force-pushed the model-list-from-cache branch from 5b805cc to 8a1cd61 Compare November 24, 2025 14:49
@LysandreJik LysandreJik force-pushed the continue-on-sequence-end branch from 4db18ba to 7a372f8 Compare November 24, 2025 14:49
@LysandreJik LysandreJik force-pushed the model-list-from-cache branch 2 times, most recently from b263e27 to 38a7f37 Compare November 27, 2025 11:33
@LysandreJik LysandreJik force-pushed the continue-on-sequence-end branch from 7a372f8 to a6cda99 Compare November 27, 2025 13:17
@LysandreJik LysandreJik marked this pull request as ready for review November 27, 2025 13:17
@LysandreJik LysandreJik requested a review from Wauplin November 27, 2025 13:31
Contributor

@Wauplin Wauplin left a comment

Nice addition!


chat.append({"role": "assistant", "content": model_output})

if finish_reason == "length":
Contributor

@Wauplin Wauplin Nov 27, 2025

Don't we want to write something for other reasons (at least print the reason in the terminal)? Or is "length" the only finish reason implemented so far?

Never mind, saw below that it's only "length" and "stop", so it's probably fine not to write it.

Member Author

yes a bit of effort is still required here :) thanks for the review

Base automatically changed from model-list-from-cache to main November 27, 2025 14:34
Co-authored-by: Lucain <lucainp@gmail.com>
@LysandreJik LysandreJik merged commit 52c5c65 into main Nov 27, 2025
16 checks passed
@LysandreJik LysandreJik deleted the continue-on-sequence-end branch November 27, 2025 14:55
sarathc-cerebras pushed a commit to sarathc-cerebras/transformers that referenced this pull request Dec 7, 2025
* Correctly return finish reason length when finished

* Typos + fixup

* Fix a few tests

* Update src/transformers/cli/chat.py

Co-authored-by: Lucain <lucainp@gmail.com>

---------

Co-authored-by: Lucain <lucainp@gmail.com>
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026