-
Notifications
You must be signed in to change notification settings - Fork 13.2k
fix: add generic fallback to detect trailing <think> tags in Jinja templates and handle forced-open reasoning blocks #16426
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
(llama-server --jinja)
Works also with "stream": true together with streaming-aware parser #16394 |
The CI seems to fail? |
0f63b5b
to
0869085
Compare
Reproduced it's me, I check...
|
89ae131
to
e1f526c
Compare
I'm getting "Errors while running CTest" on the CI, and I need to check if there's a regression somewhere else. test-chat / test-chat-template are OK vs. master :
|
Hi @ServeurpersoCom, I can run an automated high-severity-only LLM review on this PR and post a single focused inline comment. Reply with "approve" or add a comment saying "@tommarques56 approve" to proceed. |
@tommarques56 approve Hey tommarques56, I really like what you’re doing with these automated LLM reviews! That’s a great idea! |
@ServeurpersoCom I just blocked this user for spamming. This is not a good way to run such experiments because it introduces a lot of noise into the discussions. |
Ah, that explains the false XSS detection from the bot earlier! perfect, thanks for clarifying! |
…mplates and handle forced-open reasoning blocks - Detect trailing <think> tags in generic chat templates, trim whitespace, and either append the closing tag or mark the reasoning block as forced-open based on enable_thinking - Added a regression test covering a fallback template that opens the reasoning block in the prompt and verifies prompt differences, forced-open behaviour, and reasoning parsing - Now compatible with models using the default Jinja chat template, such as https://huggingface.co/unsloth/GLM-Z1-32B-0414-GGUF
…t through common_chat_params for consistent <think> handling - Added a supports_enable_thinking field to common_chat_params, populate it during template rendering, and reuse it when deciding whether the generic <think> fallback should run - Updated common_chat_templates_support_enable_thinking to consult the tracked capability and expanded the chat template tests to assert the flag for templates that do and do not react to enable_thinking - Updated chat template tests to assert the guarded fallback behaviour and to cover templates that conditionally open <think> blocks.
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
756e6ec
to
6041e25
Compare
Add generic fallback to detect trailing tags in Jinja templates and handle forced-open reasoning blocks :
Make sure to read the contributing guidelines before submitting a PR