[Feature][Executor] GPU Model Runner Supports prompt_logprobs and max_logprobs #4769
Conversation
Thanks for your contribution!
```python
if request.sampling_params.prompt_logprobs is not None:
    self.prompt_logprobs_reqs[request.request_id] = request
```
Have you considered memory growth after very long stress tests?
Also, for use in RL scenarios, some of the model_runner's objects need to be cleared in the `clear_requests` function, including this one. While you're at it, please check whether there are any other objects that need clearing.
Longer prompts and higher concurrency both increase GPU memory usage, but there is no memory leak; this is as expected.
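For context, a minimal sketch of the cleanup requested for the RL path above. The dict names mirror the snippets quoted in this thread; the class and method shape are hypothetical, not FastDeploy's actual code:

```python
# Hypothetical sketch only: the attribute names follow the quoted diff,
# everything else is assumed for illustration.
class GPUModelRunnerSketch:
    def __init__(self):
        # request_id -> request objects still needing prompt logprobs
        self.prompt_logprobs_reqs = {}
        # request_id -> partially accumulated prompt logprobs
        self.in_progress_prompt_logprobs = {}

    def clear_requests(self):
        """Drop all per-request state, e.g. between RL rollout batches."""
        self.prompt_logprobs_reqs.clear()
        self.in_progress_prompt_logprobs.clear()
```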
```python
self.prompt_logprobs_reqs.pop(request.request_id, None)
self.in_progress_prompt_logprobs.pop(request.request_id, None)
```
Don't we need `del self.prompt_logprobs_reqs[req.request_id]` on preemption?
There is already logic above that clears `prompt_logprobs_reqs` on preemption.
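The `pop(key, None)` form also matters here: it is safe to call even if the entry was already removed by the preemption path, whereas `del` on a missing key raises. A standalone illustration:

```python
reqs = {"req-1": object()}

reqs.pop("req-1", None)   # removes the entry
reqs.pop("req-1", None)   # no-op: key already gone, no exception

try:
    del reqs["req-1"]     # del on a missing key raises
except KeyError:
    pass
```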
```python
if isinstance(prompt_token_ids, np.ndarray):
    prompt_token_ids = prompt_token_ids.tolist()
prompt_token_ids_tensor = paddle.to_tensor(prompt_token_ids, dtype="int64")
```
Doesn't paddle support converting an ndarray to a tensor directly?
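To the reviewer's point: `paddle.to_tensor` does accept a NumPy ndarray directly, so the `.tolist()` round-trip should be avoidable. An illustrative check (not taken from this PR):

```python
import numpy as np
import paddle

prompt_token_ids = np.array([101, 2023, 2003], dtype=np.int64)

# paddle.to_tensor accepts an ndarray directly; no .tolist() needed.
direct = paddle.to_tensor(prompt_token_ids, dtype="int64")
via_list = paddle.to_tensor(prompt_token_ids.tolist(), dtype="int64")

assert bool((direct == via_list).all())
```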
gongshaotian left a comment:
LGTM
Motivation
`GPUModelRunner` supports `max_logprobs=-1` and `prompt_logprobs`.
Modifications
Usage or Command

```bash
export FD_USE_GET_SAVE_OUTPUT_V1=1
python -m fastdeploy.entrypoints.openai.api_server \
    --model ./ERNIE-4.5-0.3B-PT \
    --max-model-len 32768 \
    --max-num-seqs 128 \
    --tensor-parallel-size 1 \
    --enable-logprob \
    --max-logprobs -1 \
    --no-enable-prefix-caching
```
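For a quick smoke test against the server started above, a hypothetical client call using the `openai` SDK; the port (8000) and the exact logprobs parameter handling at the server layer are assumptions, not confirmed by this PR (see the TODO below):

```python
# Hypothetical smoke test: assumes the server listens on port 8000 and
# exposes the usual OpenAI-compatible chat completions endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="./ERNIE-4.5-0.3B-PT",
    messages=[{"role": "user", "content": "Hello"}],
    logprobs=True,
    top_logprobs=5,  # -1 (full vocab) is what this PR enables in the runner
)
print(resp.choices[0].logprobs)
```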
Accuracy Tests

TODO: The server layer should support `top_logprobs=-1` and `prompt_logprobs`.

Checklist
- Add at least one tag in the PR title. Tag list: `[FDConfig]`, `[APIServer]`, `[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`
- Run `pre-commit` before commit.
- If the PR targets the `release` branch, make sure it has been submitted to the `develop` branch first, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.