fix(litellm): parse DeepSeek-V3 proprietary inline tool-call tokens#5654

Open
fuchun1010 wants to merge 5 commits into
google:mainfrom
fuchun1010:fix/deepseek-tool-call-parsing

@fuchun1010

Closes #5024

Problem

DeepSeek-V3 emits tool calls using proprietary special tokens embedded in the content field:

<|tool▁calls▁begin|><|tool▁call▁begin|>function<|tool▁sep|>analysis_input
```json
{"work_dir_name":"..."}
```<|tool▁call▁end|><|tool▁calls▁end|>

When LiteLLM does not translate these into structured tool_calls (intermittent), ADK's fallback JSON parser finds the JSON object but rejects it because the function name (analysis_input) is embedded in the tokens (<|tool▁sep|>analysis_input) rather than as a name key inside the JSON payload.

Result: tool call is silently dropped and the raw tokens appear as text content.
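
A minimal sketch of the failure mode, for illustration only. `naive_inline_json_tool_call` is a hypothetical stand-in for a generic inline-JSON fallback, not ADK's actual parser; it assumes only that such a fallback requires a `name` key inside the JSON object, as described above.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker, built indirectly

# Raw content in the shape shown in the example above.
content = (
    "<|tool▁calls▁begin|><|tool▁call▁begin|>function<|tool▁sep|>analysis_input\n"
    f"{FENCE}json\n"
    '{"work_dir_name":"..."}\n'
    f"{FENCE}<|tool▁call▁end|><|tool▁calls▁end|>"
)


def naive_inline_json_tool_call(text):
    """Hypothetical generic fallback: find a JSON object and require a 'name' key."""
    start = text.find("{")
    if start == -1:
        return None
    payload, _ = json.JSONDecoder().raw_decode(text, start)
    # The JSON object is found, but the function name lives in the tokens,
    # not in the payload, so a name-based check rejects it.
    if not isinstance(payload, dict) or "name" not in payload:
        return None
    return payload


result = naive_inline_json_tool_call(content)
```

Here `result` is `None` even though the arguments object parses cleanly, which matches the silent drop described above.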

Solution

  • Added _parse_deepseek_tool_calls_from_text — detects the proprietary token format, extracts function name + arguments, and emits standard ChatCompletionMessageToolCall objects
  • Added _extract_json_from_deepseek_args helper — handles optional Markdown code fences (```json ```) around the arguments payload
  • Integrated into the existing _parse_tool_calls_from_text as the first-pass parser, with fallback to generic inline JSON parsing
  • Supports: single tool calls, multi-tool calls, code-fenced JSON, bare JSON, surrounding text, mixed formats
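
For readers who want a feel for the approach without opening the diff, here is a simplified, dependency-free sketch. It is not the code in this PR: the names `parse_deepseek_tool_calls` and `extract_json_args`, the regex, the generated call IDs, and the plain-dict return shape are all illustrative assumptions (the PR itself emits ChatCompletionMessageToolCall objects). The token strings are copied from the example above.

```python
import json
import re
import uuid

CALLS_BEGIN = "<|tool▁calls▁begin|>"
CALLS_END = "<|tool▁calls▁end|>"
CALL_BEGIN = "<|tool▁call▁begin|>"
CALL_END = "<|tool▁call▁end|>"
SEP = "<|tool▁sep|>"

# One call: begin token, literal "function", separator, name, then the args.
_CALL_RE = re.compile(
    re.escape(CALL_BEGIN) + "function" + re.escape(SEP)
    + r"(?P<name>[^\s`{<]+)\s*(?P<args>.*?)" + re.escape(CALL_END),
    re.DOTALL,
)
_DECODER = json.JSONDecoder()


def extract_json_args(args_text):
    """Return the first JSON object in args_text, skipping any code fence."""
    start = args_text.find("{")
    if start == -1:
        return None
    try:
        obj, _ = _DECODER.raw_decode(args_text, start)
    except json.JSONDecodeError:
        return None
    return obj if isinstance(obj, dict) else None


def parse_deepseek_tool_calls(text):
    """Return (tool_calls, remainder); tool_calls are OpenAI-style dicts."""
    if CALL_BEGIN not in text:  # quick guard: skip the regex on normal text
        return [], text
    calls = []

    def _collect(match):
        args = extract_json_args(match.group("args"))
        if args is None:
            return match.group(0)  # leave unparseable calls in the text
        calls.append({
            "id": f"call_{uuid.uuid4().hex}",  # hypothetical ID scheme
            "type": "function",
            "function": {
                "name": match.group("name"),
                "arguments": json.dumps(args, ensure_ascii=False),
            },
        })
        return ""

    remainder = _CALL_RE.sub(_collect, text)
    if not calls:
        return [], text  # nothing parsed (e.g. truncated output): keep text as-is
    remainder = remainder.replace(CALLS_BEGIN, "").replace(CALLS_END, "")
    return calls, remainder.strip()
```

In this sketch a call that is cut off before its end token simply fails to match, so the whole text comes back untouched as the remainder and the generic inline-JSON parser can still take a second pass at it.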

Testing Plan

Unit Tests: Added 8 new tests covering:

  • Single tool call with code-fenced JSON args
  • Multiple tool calls in a single wrapped block
  • Bare JSON args (no code fences)
  • Tool call embedded in surrounding text
  • Text without DeepSeek tokens (no false positives)
  • Empty/whitespace-only text
  • Integration test via _parse_tool_calls_from_text
  • Mixed formats (DeepSeek tokens + standard inline JSON)

Regression: Full test_litellm.py: 264 passed, 0 failed

Files Changed

| File | Changes |
| --- | --- |
| src/google/adk/models/lite_llm.py | +147 lines (2 new functions + integration) |
| tests/unittests/models/test_litellm.py | +124 lines (8 new test functions) |

@google-cla

google-cla Bot commented May 10, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@fuchun1010 fuchun1010 force-pushed the fix/deepseek-tool-call-parsing branch from c319bae to e91b1f6 Compare May 10, 2026 15:15
DeepSeek-V3 emits tool calls using proprietary special tokens
(<|tool▁calls▁begin|>…<|tool▁call▁begin|>function<|tool▁sep|>NAME)
embedded in the content field.  When LiteLLM does not translate these
into structured tool_calls (intermittent), the existing fallback JSON
parser rejects the payload because the function name is stored inside
the tokens rather than as a 'name' key in the JSON object.

Add _parse_deepseek_tool_calls_from_text that detects the proprietary
token format, extracts the function name and arguments, and emits
standard ChatCompletionMessageToolCall objects.  Integrate it into the
existing _parse_tool_calls_from_text pipeline.

Also add _extract_json_from_deepseek_args helper to handle optional
Markdown code fences (```json … ```) that DeepSeek wraps around
the arguments payload.

Closes google#5024
@fuchun1010 fuchun1010 force-pushed the fix/deepseek-tool-call-parsing branch from e91b1f6 to 08e864e Compare May 10, 2026 15:34
@rohityan rohityan self-assigned this May 12, 2026
@rohityan rohityan added labels models ([Component] Issues related to model support) and needs review ([Status] The PR/issue is awaiting review from the maintainer) May 12, 2026
@rohityan rohityan requested a review from xuanyang15 May 12, 2026 03:48
@rohityan
Collaborator

Hi @fuchun1010, thank you for your contribution! We appreciate you taking the time to submit this pull request. Your PR has been received by the team and is currently under review. We will provide feedback as soon as we have an update to share.

@rohityan
Collaborator

Hi @xuanyang15, can you please review this?

@xuanyang15
Collaborator

@GWeale Could you please help review?

@fuchun1010
Author

Hi @GWeale — gentle ping for review on this PR when you have a moment.
This fixes #5024 (DeepSeek-V3 tool calls silently dropped when LiteLLM doesn't translate proprietary inline tokens). 271 additions across 2 files: parser + 8 unit tests. CI is green.
Happy to address any feedback. Thanks!

Author

@fuchun1010 fuchun1010 left a comment

Review Summary

Thanks for this PR! The DeepSeek inline tool-call format is a real pain point when LiteLLM's translation is inconsistent, and this parser is a clean solution.

What Works Well ✅

  • Well-documented: Clear references to DeepSeek API docs and inline comments explaining the token format
  • Comprehensive test coverage: 8 test cases covering single/multi calls, plain JSON args (no code fences), surrounding text, mixed formats (DeepSeek + standard inline JSON), empty/whitespace-only input, and integration with the generic parser
  • Clean remainder handling: Surrounding text is correctly preserved and returned, matching the existing _parse_tool_calls_from_text contract
  • Recursive mixed-format support: When both DeepSeek tokens and standard inline JSON appear in the same text, the fallback recursion in _parse_tool_calls_from_text handles both correctly — nice touch
  • Quick guard optimization: The _DS_TCALLS_BEGIN not in text_block and _DS_TCALL_BEGIN not in text_block check avoids regex overhead on normal responses

Suggestions / Questions

  1. _extract_json_from_deepseek_args round-trip: The function parses the payload with raw_decode(...) and then re-serializes it with json.dumps(candidate, ensure_ascii=False). This is functionally correct, but the round-trip rewrites the payload: the original whitespace is lost and duplicate keys collapse (key order itself is preserved on Python 3.7+). Is there a reason not to return the raw substring from raw_decode? Something like:

    candidate, end = _JSON_DECODER.raw_decode(args_text, open_brace)
    return args_text[open_brace:end]

    This preserves the original formatting and avoids the serialize/deserialize cycle.

  2. Edge case — truncated tokens: What happens when the model output is cut off mid-token (e.g., partial <|tool▁call▁begin| due to max_tokens)? The current code appends the unparsed text to remainder_parts via the end_idx == -1 branches, which seems correct — the partial token becomes remainder text. Worth adding a test for this scenario?

  3. Thread safety of _JSON_DECODER: The module-level _JSON_DECODER is used in _extract_json_from_deepseek_args. json.JSONDecoder instances are generally thread-safe for read-only operations (raw_decode doesn't mutate state AFAIK), but worth double-checking since lite_llm.py may be used in async/threaded contexts.

  4. Minor: test helper deduplication: _DS_BEGIN_CALLS etc. are redefined as module-level constants in the test file with the same values as in lite_llm.py. Consider importing them from the source module to avoid drift — though I understand this may be intentional to keep tests independent of implementation details.
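
To make suggestion 1 concrete, here is a small standalone comparison of the two approaches using only the standard library. The sample `args_text` is made up for illustration, and `_JSON_DECODER` is defined locally here to mirror the module-level name mentioned above; this is a sketch, not the PR's code.

```python
import json

_JSON_DECODER = json.JSONDecoder()
FENCE = "`" * 3  # Markdown code fence marker, built indirectly

# Hypothetical arguments payload wrapped in a code fence, with uneven spacing.
args_text = f'{FENCE}json\n{{"b": 1,  "a": "é"}}\n{FENCE}'
open_brace = args_text.find("{")

# raw_decode returns the parsed object and the index where the JSON ends.
candidate, end = _JSON_DECODER.raw_decode(args_text, open_brace)

# Option A: re-serialize the parsed object (normalizes whitespace).
round_tripped = json.dumps(candidate, ensure_ascii=False)

# Option B: slice the original text, keeping the model's exact formatting.
raw_slice = args_text[open_brace:end]

print(round_tripped)
print(raw_slice)
```

Both strings decode to the same object; they differ only in formatting, so the choice matters mainly if downstream code compares or logs the argument string verbatim.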

Verdict

LGTM overall. The suggestions above are non-blocking — the core logic is solid and the test coverage is thorough. Happy to approve once the questions above are addressed (or dismissed).

Labels

models ([Component] Issues related to model support), needs review ([Status] The PR/issue is awaiting review from the maintainer)

Development

Successfully merging this pull request may close these issues.

LiteLLM + DeepSeek-V3 multi-tool calling fails: tool call parsing error

4 participants