[BugFix] Add safety checks in recycle_gpu_blocks to prevent block allocation errors by kevincheng2 · Pull Request #6531 · PaddlePaddle/FastDeploy

kevincheng2 · 2026-02-27T05:30:11Z

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

- Check the prefix tree status before recycling GPU blocks, and skip recycling while the tree is being cleared.
- Validate that the gpu_block_ids input is a list.
- Add an overflow check so that the free block count cannot exceed the total number of GPU blocks, avoiding potential memory allocation errors.
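
The three checks above can be sketched as follows. This is a minimal illustration and not the code from the PR: `BlockPool`, `PrefixTreeStatus`, and the attribute names are hypothetical stand-ins, and the choices to wrap a non-list input in a list and to return `False` on overflow are assumptions (the actual change may log or raise instead).

```python
import heapq
from enum import IntEnum


class PrefixTreeStatus(IntEnum):
    # Hypothetical status values; the real constants live in FastDeploy.
    NORMAL = 0
    CLEARING = 1


class BlockPool:
    """Toy stand-in for the cache manager's GPU free list (not the real class)."""

    def __init__(self, num_gpu_blocks):
        self.num_gpu_blocks = num_gpu_blocks
        self.gpu_free_block_list = []  # min-heap of free block ids
        self.status = PrefixTreeStatus.NORMAL

    def recycle_gpu_blocks(self, gpu_block_ids):
        # 1. Skip recycling while the prefix tree is being cleared.
        if self.status == PrefixTreeStatus.CLEARING:
            return False
        # 2. Make sure we are working with a list (assumption: wrap a single id).
        if not isinstance(gpu_block_ids, list):
            gpu_block_ids = [gpu_block_ids]
        # 3. Refuse a recycle that would leave more free blocks than exist.
        if len(self.gpu_free_block_list) + len(gpu_block_ids) > self.num_gpu_blocks:
            return False
        for block_id in gpu_block_ids:
            heapq.heappush(self.gpu_free_block_list, block_id)
        return True
```

Returning a boolean keeps the sketch easy to test. In a real cache manager the overflow branch is a natural place for an error log, since it usually indicates that the same block was recycled twice.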

Modifications

Usage or Command

Accuracy Tests

Checklist

- Add at least one tag in the PR title.
  - Tag list: [[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
  - You can add new tags based on the PR content, but the semantics must be clear.
- Format your code and run pre-commit before committing.
- Add unit tests. If there are none, state the reason in this PR.
- Provide accuracy results.
- If the current PR targets the release branch, make sure it has first been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

…ocation errors

- Check prefix tree status before recycling GPU blocks
- Validate gpu_block_ids is a list
- Add overflow check to prevent free block count exceeding total blocks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

paddle-bot · 2026-02-27T05:30:18Z

Thanks for your contribution!

codecov-commenter · 2026-02-27T07:19:36Z

Codecov Report

❌ Patch coverage is 66.66667% with 3 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@8e67fb4). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
fastdeploy/cache_manager/prefix_cache_manager.py	66.66%	2 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #6531   +/-   ##
==========================================
  Coverage           ?   70.40%           
==========================================
  Files              ?      394           
  Lines              ?    53869           
  Branches           ?     8466           
==========================================
  Hits               ?    37927           
  Misses             ?    13210           
  Partials           ?     2732

Flag	Coverage Δ
GPU	`70.40% <66.66%> (?)`


…atus_signal not initialized

- Add hasattr check before accessing prefix_tree_status_signal
- The signal is only initialized in launch_cache_messager, not in __init__
- Fixes CI test failure in test_prefix_cache_manager.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
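
The failure mode described in this commit, and the guard that avoids it, can be reproduced with a small sketch. Only the names `prefix_tree_status_signal`, `launch_cache_messager`, and `recycle_gpu_blocks` come from the commit message; the `Signal` class, the `CLEARING` constant, the `value[0]` layout, and the `recycled` list are assumptions made for illustration.

```python
CLEARING = 1  # hypothetical status code


class Signal:
    """Hypothetical stand-in for a shared status signal holding one value."""

    def __init__(self, value):
        self.value = [value]


class Manager:
    def __init__(self):
        # prefix_tree_status_signal is deliberately NOT created here,
        # mirroring the situation described in the commit message.
        self.recycled = []

    def launch_cache_messager(self):
        # The signal only exists after this method has run.
        self.prefix_tree_status_signal = Signal(0)

    def recycle_gpu_blocks(self, gpu_block_ids):
        # hasattr guards against AttributeError when the signal was never set up,
        # e.g. in unit tests that construct the manager without launching it.
        if (
            hasattr(self, "prefix_tree_status_signal")
            and self.prefix_tree_status_signal.value[0] == CLEARING
        ):
            return False
        self.recycled.extend(gpu_block_ids)
        return True
```

Without the `hasattr` check, calling `recycle_gpu_blocks` on a freshly constructed `Manager` would raise `AttributeError`, which matches the CI failure the commit mentions.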

- Call self.reset() before setting status to NORMAL in UPDATING state
- Ensure cache consistency when model weights change
- Consistent with CLEARING state handling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
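
A rough sketch of the ordering this commit describes: reset first, then return to NORMAL. The `Status` enum, the `PrefixCache` class, and `handle_status` are hypothetical; only `reset()` and the UPDATING/NORMAL state names come from the commit message. CLEARING handling is left out because the commit message does not say which state follows it.

```python
from enum import IntEnum


class Status(IntEnum):
    # Hypothetical numeric values for the states named in the commit message.
    NORMAL = 0
    UPDATING = 2


class PrefixCache:
    """Toy cache keyed by prefix hash; not the real PrefixCacheManager."""

    def __init__(self):
        self.status = Status.NORMAL
        self.cached_blocks = {}

    def reset(self):
        # Drop every cached prefix; entries computed with old weights are stale.
        self.cached_blocks.clear()

    def handle_status(self):
        if self.status == Status.UPDATING:
            # Reset BEFORE flipping back to NORMAL so that no request can
            # match a prefix that was cached under the previous weights.
            self.reset()
            self.status = Status.NORMAL
```

The order matters: if the status were set to NORMAL first, a request arriving in between could still hit entries produced by the old weights.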

…refix_tree_status_signal not initialized(#6531) (#6559)

* fix mtp acceptance rate decline
* [BugFix] Fix AttributeError in recycle_gpu_blocks when prefix_tree_status_signal not initialized
  - Add hasattr check before accessing prefix_tree_status_signal
  - The signal is only initialized in launch_cache_messager, not in __init__
  - Fixes CI test failure in test_prefix_cache_manager.py

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [BugFix] Reset prefix cache when model weights are updating
  - Call self.reset() before setting status to NORMAL in UPDATING state
  - Ensure cache consistency when model weights change
  - Consistent with CLEARING state handling

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

…ent block allocation errors(#6531) (#6530)

* fix mtp acceptance rate decline cp
* [BugFix] Add safety checks in recycle_gpu_blocks to prevent block allocation errors
  - Check prefix tree status before recycling GPU blocks
  - Validate gpu_block_ids is a list
  - Add overflow check to prevent free block count exceeding total blocks

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [BugFix] Fix AttributeError in recycle_gpu_blocks when prefix_tree_status_signal not initialized
  - Add hasattr check before accessing prefix_tree_status_signal
  - The signal is only initialized in launch_cache_messager, not in __init__
  - Fixes CI test failure in test_prefix_cache_manager.py

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [BugFix] Reset prefix cache when model weights are updating
  - Call self.reset() before setting status to NORMAL in UPDATING state
  - Ensure cache consistency when model weights change
  - Consistent with CLEARING state handling

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

kevincheng2 temporarily deployed to Metax_ci February 27, 2026 05:30 — with GitHub Actions Inactive

kevincheng2 temporarily deployed to Metax_ci February 27, 2026 13:01 — with GitHub Actions Inactive

Merge branch 'develop' into fix_reset_cache_bug_dev

65ea2de

kevincheng2 temporarily deployed to Metax_ci February 28, 2026 02:23 — with GitHub Actions Inactive

kevincheng2 mentioned this pull request Feb 28, 2026

[Cherry-Pick][BugFix] Fix AttributeError in recycle_gpu_blocks when prefix_tree_status_signal not initialized(#6531) #6559

Merged

5 tasks

[BugFix] Reset prefix cache when model weights are updating

79dfe4c

- Call self.reset() before setting status to NORMAL in UPDATING state
- Ensure cache consistency when model weights change
- Consistent with CLEARING state handling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

kevincheng2 temporarily deployed to Metax_ci February 28, 2026 06:43 — with GitHub Actions Inactive

Jiang-Jia-Jun approved these changes Mar 2, 2026

Jiang-Jia-Jun merged commit ecfd088 into PaddlePaddle:develop Mar 2, 2026
20 of 24 checks passed

Jiang-Jia-Jun merged 4 commits into PaddlePaddle:develop from kevincheng2:fix_reset_cache_bug_dev
