[Feature] Add a switch for logprobs/prompt_logprobs token decoding. #5463

Merged
Jiang-Jia-Jun merged 1 commit into PaddlePaddle:develop from qwes5s5:add_detoken_switch on Dec 10, 2025

Conversation

qwes5s5 (Collaborator) commented Dec 9, 2025

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

During current project usage, we observed that when logprobs or prompt_logprobs is set to a particularly large value, converting token IDs back to tokens consumes a significant amount of time, and in some scenarios the decoded tokens are never used. This PR therefore adds a per-request switch that controls whether token-ID-to-token decoding is performed for logprobs and prompt_logprobs.

Modifications

Add the control parameter include_logprobs_decode_token to the online serving request; it is enabled by default. Its effect on each endpoint is listed below, followed by a sketch of the gating logic.

  • For the v1/chat/completions interface:
    • When include_logprobs_decode_token is enabled, the results for logprobs and prompt_logprobs are output in the normal format.
    • When include_logprobs_decode_token is disabled, the token and bytes fields in the logprobs result are empty, and the decoded_tokens field in prompt_logprobs is empty.
  • For the v1/completions interface:
    • When include_logprobs_decode_token is enabled, the results for logprobs and prompt_logprobs are output in the normal format.
    • When include_logprobs_decode_token is disabled, the logprobs result is output in the normal format, and the decoded_tokens field in prompt_logprobs is empty.
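Conceptually, the switch gates the tokenizer decode step when logprob entries are assembled. The sketch below is illustrative only: the LogprobEntry dataclass and build_logprobs helper are hypothetical names, not FastDeploy's actual API; only the include_logprobs_decode_token flag comes from this PR.

# Hypothetical sketch of the gating logic. LogprobEntry and build_logprobs
# are assumed names; only include_logprobs_decode_token comes from this PR.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class LogprobEntry:
    token_id: int
    logprob: float
    token: Optional[str] = None          # stays None when decoding is skipped
    bytes: Optional[List[int]] = None    # stays None when decoding is skipped

def build_logprobs(token_ids, logprobs, tokenizer,
                   include_logprobs_decode_token: bool = True) -> List[LogprobEntry]:
    """Turn raw (token_id, logprob) pairs into response entries,
    decoding token IDs to text only when the switch is on."""
    entries = []
    for tid, lp in zip(token_ids, logprobs):
        entry = LogprobEntry(token_id=tid, logprob=lp)
        if include_logprobs_decode_token:
            text = tokenizer.decode([tid])          # the expensive step
            entry.token = text
            entry.bytes = list(text.encode("utf-8"))
        entries.append(entry)
    return entries

With logprobs/prompt_logprobs set to a large k, disabling the switch skips one decode call per candidate token per position, which is where the reported time goes.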

Usage or Command

v1/completions:

curl --location 'http://0.0.0.0:8180/v1/completions' \
--header 'Content-Type: application/json' \
--data '{
  "prompt": "Please write a short essay about ideals, no more than 100 characters",
  "logprobs": 3,
  "prompt_logprobs": 3,
  "include_logprobs_decode_token": false
}'

v1/chat/completions:

curl --location 'http://0.0.0.0:8180/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "messages": [
        {
            "role": "system",
            "content": "I'\''m a helpful AI assistant."
        },
        {
            "role": "user",
            "content": "give me three letters randomly, just tell me the letters without anything else"
        }
    ],
    "logprobs": true,
    "top_logprobs": 3,
    "prompt_logprobs": 3,
    "include_logprobs_decode_token": false
}'
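For reference, the same chat request can be issued through the OpenAI Python client. Since prompt_logprobs and include_logprobs_decode_token are not standard OpenAI fields, they are passed via extra_body; the base URL and model name below are assumptions matching the curl examples above.

# Same chat request via the OpenAI Python client. extra_body forwards
# non-standard fields (prompt_logprobs, include_logprobs_decode_token)
# to the server. Base URL and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://0.0.0.0:8180/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="default",  # adjust to your deployment's model name
    messages=[
        {"role": "system", "content": "I'm a helpful AI assistant."},
        {"role": "user", "content": "give me three letters randomly, just the letters"},
    ],
    logprobs=True,
    top_logprobs=3,
    extra_body={"prompt_logprobs": 3, "include_logprobs_decode_token": False},
)
print(resp.choices[0].logprobs)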

Accuracy Tests

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests, or explain in this PR why none are included.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot bot commented Dec 9, 2025

Thanks for your contribution!

@qwes5s5 qwes5s5 requested review from Jiang-Jia-Jun and sunlei1024 and removed request for Jiang-Jia-Jun December 9, 2025 11:48
codecov-commenter commented Dec 9, 2025

Codecov Report

❌ Patch coverage is 59.25926% with 11 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@3066a0c). Learn more about missing BASE report.

Files with missing lines                                 Patch %   Lines
fastdeploy/entrypoints/openai/serving_chat.py            57.14%    6 Missing and 3 partials ⚠️
...astdeploy/entrypoints/openai/serving_completion.py    50.00%    1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #5463   +/-   ##
==========================================
  Coverage           ?   60.31%           
==========================================
  Files              ?      329           
  Lines              ?    41067           
  Branches           ?     6247           
==========================================
  Hits               ?    24771           
  Misses             ?    14410           
  Partials           ?     1886           
Flag   Coverage Δ
GPU    60.31% <59.25%> (?)

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@qwes5s5 qwes5s5 force-pushed the add_detoken_switch branch from b2ceff7 to d601b17 on December 9, 2025 13:33
@Jiang-Jia-Jun Jiang-Jia-Jun merged commit d79438b into PaddlePaddle:develop Dec 10, 2025
13 of 17 checks passed
qwes5s5 added a commit to qwes5s5/FastDeploy that referenced this pull request Dec 15, 2025
Jiang-Jia-Jun pushed a commit that referenced this pull request Dec 17, 2025