@liyonghua0910 (Collaborator) commented Nov 10, 2025

Motivation

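  1. When clear weight forcibly interrupts requests, residual requests remain in the engine worker queue, so the v1 scheduler stops scheduling and no longer updates metrics, which leaves the monitoring metrics in an abnormal state.
  2. When clear data is called after clear weight, the finish_requests method runs for each request in turn and calls free_blocks to release its cache blocks. Because the prefix cache has already been reset at that point, free_blocks raises an error that is not caught, so the request is only partially cleaned up.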

Modifications

Proactively update metrics once after clear_data, so that the running/waiting metrics are cleared while FD is asleep. Also fix the exception handling in clear_data.
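For reviewers, the intended behavior can be illustrated with a minimal, self-contained sketch. This is not the code in this PR: `ToyPrefixCache`, `ToyScheduler`, `finish_request`, `update_metrics`, and the `metrics` dict are hypothetical stand-ins, and only the names `clear_data` and `free_blocks` come from the description above. The sketch assumes that freeing a block the cache no longer tracks raises an exception, which roughly mirrors the situation after the prefix cache has been reset.

```python
import logging

logger = logging.getLogger("toy_scheduler")


class ToyPrefixCache:
    """Hypothetical cache that tracks allocated block ids."""

    def __init__(self):
        self.allocated = set()

    def allocate(self, block_ids):
        self.allocated.update(block_ids)

    def reset(self):
        # Assumed to mimic a cache reset: all bookkeeping is dropped.
        self.allocated.clear()

    def free_blocks(self, block_ids):
        for block_id in block_ids:
            if block_id not in self.allocated:
                raise KeyError(f"block {block_id} is not tracked by the cache")
            self.allocated.remove(block_id)


class ToyScheduler:
    """Hypothetical scheduler holding running/waiting requests and metrics."""

    def __init__(self, cache):
        self.cache = cache
        self.running = {}  # request_id -> list of block ids
        self.waiting = []  # queued request ids
        self.metrics = {"running": 0, "waiting": 0}

    def add_running(self, request_id, block_ids):
        self.cache.allocate(block_ids)
        self.running[request_id] = list(block_ids)
        self.update_metrics()

    def update_metrics(self):
        self.metrics["running"] = len(self.running)
        self.metrics["waiting"] = len(self.waiting)

    def finish_request(self, request_id):
        block_ids = self.running.get(request_id, [])
        try:
            self.cache.free_blocks(block_ids)
        except Exception as exc:
            # The cache may already be reset; log and keep cleaning up.
            logger.warning("free_blocks failed for %s: %s", request_id, exc)
        finally:
            # Always drop the request so cleanup is never left half done.
            self.running.pop(request_id, None)

    def clear_data(self):
        for request_id in list(self.running):
            self.finish_request(request_id)
        self.waiting.clear()
        # Refresh metrics once, since no scheduling step will do it later.
        self.update_metrics()
```

In this sketch, calling `cache.reset()` followed by `scheduler.clear_data()` still empties `running` and leaves both metrics at zero, because a failure in `free_blocks` is logged rather than propagated and the metrics refresh runs unconditionally at the end.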

Usage or Command

Accuracy Tests

Checklist

  • Add at least a tag in the PR title.
    • Tag list: [[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot (bot) commented Nov 10, 2025

Thanks for your contribution!

@Jiang-Jia-Jun merged commit 6c5ab72 into PaddlePaddle:develop on Nov 13, 2025
13 of 15 checks passed