Conversation

@gesanqiu gesanqiu commented Sep 19, 2023

When the eos_token_id is also a special token, it is skipped by default and generation won't stop correctly, so I think vLLM should support stop_token_ids like FastChat does. Unlike FastChat, though, stop_str should have higher priority than stop_token_ids, because this is the only failing case I have found so far; in all other cases, stop_str works well.

Also check #792

Or are there better solutions?
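As an illustrative sketch (not vLLM's actual code; the helper and its signature are hypothetical), the priority proposed here can be expressed as: check stop strings first, then fall back to stop token ids, which also catch special tokens such as EOS that are skipped when decoding text.

```python
# Hypothetical helper mirroring the stop-priority proposed in this PR:
# stop_str is checked before stop_token_ids.

def check_stop(output_text: str, last_token_id: int,
               stop_strs: list[str], stop_token_ids: list[int]) -> bool:
    """Return True if generation should stop."""
    # stop_str takes priority: a matching stop string always ends generation.
    for s in stop_strs:
        if s and s in output_text:
            return True
    # Fall back to stop_token_ids, which catches special tokens (e.g. EOS)
    # that are skipped when the output text is decoded.
    return last_token_id in stop_token_ids

print(check_stop("Hello<|end|>", 42, ["<|end|>"], [2]))  # True (stop string)
print(check_stop("Hello", 2, [], [2]))                   # True (stop token id)
print(check_stop("Hello", 42, [], [2]))                  # False (keep going)
```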

@zhuohan123 zhuohan123 left a comment


LGTM! Thanks for your contribution!

@zhuohan123 zhuohan123 merged commit f98b745 into vllm-project:main Sep 21, 2023
@gesanqiu gesanqiu deleted the stop_token_ids branch October 27, 2023 11:37
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
amy-why-3459 pushed a commit to amy-why-3459/vllm that referenced this pull request Sep 15, 2025
### What this PR does / why we need it?
When profiling, it is often necessary to disable call-stack collection to reduce profiling overhead, and to set the profiler_level to level1 to obtain more detailed operator and communication information.

Therefore, this PR changes the default profiling configuration accordingly.
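As a hedged sketch of what such defaults might look like (names taken from the torch_npu profiler API as I understand it; this is illustrative only, not the PR's actual change, and it requires Ascend NPU hardware to run):

```python
import torch_npu

# Sketch only: disable call-stack collection to cut profiling overhead,
# and use Level1 to collect more detailed operator/communication info.
experimental_config = torch_npu.profiler._ExperimentalConfig(
    profiler_level=torch_npu.profiler.ProfilerLevel.Level1,
)
prof = torch_npu.profiler.profile(
    with_stack=False,                      # call stack disabled by default
    experimental_config=experimental_config,
)
```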

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
No

Signed-off-by: ApsarasX <apsarax@outlook.com>