fix: DSML token leakage in DeepSeek-V4 and 3.2 #1647
Signed-off-by: AlpinDale <alpindale@gmail.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e113d544f4
    if self.tool_call_start_token not in current_text:
        overlap = partial_tag_overlap(current_text, self.tool_call_start_token)
        sendable_idx = len(current_text) - overlap
Emit held-back suffix when no tool-call block is found
When `current_text` has no full start marker, this code withholds any suffix that overlaps `<|DSML|function_calls>` (including a trailing `<`). In streams that never enter a tool-call block, that buffered suffix is never flushed at EOS, because the only EOS fallback is gated on `self.prev_tool_call_arr`. As a result, plain-text responses can lose their final characters.
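A minimal sketch of the hold-back and flush behavior this comment asks for. It is not the PR's code: `PlainTextStreamer`, `feed`, and `flush` are hypothetical names, and this `partial_tag_overlap` is an assumed implementation based only on how the helper is called in the diff (longest suffix of the text that is a proper prefix of the tag).

```python
START = "<|DSML|function_calls>"


def partial_tag_overlap(text: str, tag: str) -> int:
    """Assumed behavior: length of the longest suffix of `text`
    that is a proper prefix of `tag` (0 if there is none)."""
    max_len = min(len(text), len(tag) - 1)
    for size in range(max_len, 0, -1):
        if text.endswith(tag[:size]):
            return size
    return 0


class PlainTextStreamer:
    """Hypothetical stand-in for the content path when no start marker is present."""

    def __init__(self) -> None:
        self.sent_idx = 0  # number of characters already emitted

    def feed(self, current_text: str) -> str:
        # Hold back any suffix that could still grow into the start marker.
        overlap = partial_tag_overlap(current_text, START)
        sendable_idx = len(current_text) - overlap
        delta = current_text[self.sent_idx:sendable_idx]
        self.sent_idx = max(self.sent_idx, sendable_idx)
        return delta

    def flush(self, current_text: str) -> str:
        # At end of stream the held-back suffix can no longer become a tag,
        # so emit it regardless of whether any tool call was ever parsed.
        delta = current_text[self.sent_idx:]
        self.sent_idx = len(current_text)
        return delta
```

With this shape, a response ending in `<` keeps that character buffered during streaming and releases it in `flush`, instead of dropping it when the stream ends without a tool-call block.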
    # Inside tool-call region: emit any newly completed invokes.
    content = self._extract_content(current_text)
    delta_tool_calls = self._extract_delta_tool_calls(current_text, request)
Gate invoke parsing on the function_calls start marker
This now parses `<|DSML|invoke ...</|DSML|invoke>` blocks unconditionally on every chunk, even before `<|DSML|function_calls>` has been seen. If the model outputs DSML invoke syntax as normal text (for example, when showing an example call), it will be emitted as real `tool_calls`, which can trigger unintended tool handling. The previous logic avoided this by waiting until the start marker was detected.
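One way to express the gating described above, as a sketch only. `extract_invoke_names` is a hypothetical helper, and `INVOKE_RE` is a simplified pattern that assumes an invoke block carries a `name="..."` attribute; the real DSML grammar and the PR's `_extract_delta_tool_calls` may differ.

```python
import re

START = "<|DSML|function_calls>"

# Assumed, simplified invoke syntax for illustration only.
INVOKE_RE = re.compile(
    r'<\|DSML\|invoke name="([^"]+)">(.*?)</\|DSML\|invoke>',
    re.DOTALL,
)


def extract_invoke_names(current_text: str) -> list[str]:
    """Return names of completed invokes, but only those that appear
    after the function_calls start marker."""
    start = current_text.find(START)
    if start == -1:
        # No start marker yet: treat any invoke-like text as plain content.
        return []
    region = current_text[start + len(START):]
    return [match.group(1) for match in INVOKE_RE.finditer(region)]
```

Searching only the region after the marker means an invoke block quoted in ordinary prose stays in the content stream and is not reported as a tool call.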