[RL] Reset buffer size of slot_mapping #7868
Thanks for your contribution!
PaddlePaddle-bot left a comment
🤖 Paddle-CI-Agent | pr_review |
2026-05-20 18:37:39
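📋 Review Summary
PR overview: Corrects the initialization size of slot_mapping_buffer and position_ids_buffer so that it matches the other buffers of the same kind, and adds a detailed error message to an existing assert.
Change scope: fastdeploy/worker/input_batch.py, fastdeploy/worker/gpu_model_runner.py
Impact tags: [RL] [Executor]
Issues
No blocking issues found.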
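📝 PR Convention Check
The PR targets release/2.6 (not develop), but the title lacks the [Cherry-Pick] prefix, so it does not follow the required Cherry-Pick format: [Cherry-Pick][Tag] title description (#original PR number).
Suggested title (can be copied directly):
[Cherry-Pick][RL] Reset buffer size of slot_mapping (#<original develop-branch PR number>)
Suggested PR description (can be copied directly; it must reproduce the full structure of the checklist §D2 template):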
## Motivation
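`slot_mapping_buffer` and `position_ids_buffer` were previously initialized with `scheduler_config.max_num_batched_tokens`, which is inconsistent with the `max_chunk_tokens` used by the other buffers in the same file (`ids_remove_padding`, `batch_id_per_token`, etc.) and may make these buffers larger than needed in Chunked Prefill or multimodal scenarios. Use `max_chunk_tokens` everywhere for consistency.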
## Modifications
- `fastdeploy/worker/input_batch.py`: change the initialization size of `position_ids_buffer` and `slot_mapping_buffer` from `scheduler_config.max_num_batched_tokens` to `self.max_chunk_tokens`, aligning them with the other buffers in the same file.
- `fastdeploy/worker/gpu_model_runner.py`: add a detailed error message to `assert batch_id_per_token.shape == block_idx.shape` to make shape mismatches easier to diagnose.
## Usage or Command
N/A
## Accuracy Tests
N/A
## Checklist
- [x] Add at least a tag in the PR title.
- Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`, `[Scheduler]`, `[PD Disaggregation]`, `[Executor]`, `[Graph Optimization]`, `[Speculative Decoding]`, `[RL]`, `[Models]`, `[Quantization]`, `[Loader]`, `[OP]`, `[KVCache]`, `[DataProcessor]`, `[BugFix]`, `[Docs]`, `[CI]`, `[Optimization]`, `[Feature]`, `[Benchmark]`, `[Others]`, `[XPU]`, `[HPU]`, `[GCU]`, `[DCU]`, `[Iluvatar]`, `[Metax]`]
- You can add new tags based on the PR content, but the semantics must be clear.
- [x] Format your code, run `pre-commit` before commit.
- [x] Add unit tests. Please write the reason in this PR if no unit tests.
- [x] Provide accuracy results.
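- [x] If the current PR is submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.
Overall Assessment
The code change is correct: slot_mapping_buffer and position_ids_buffer now use max_chunk_tokens, consistent with the other buffers of the same kind, and the logic is clear. The only issue is that the PR title lacks the [Cherry-Pick] prefix; adding it before merging is recommended.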
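The review mentions adding a detailed error message to the existing `assert batch_id_per_token.shape == block_idx.shape`. The actual message used in the PR is not shown on this page, so the following is only a minimal sketch of the idea. The helper names `assert_same_shape` and `shape_mismatch_message` are hypothetical, and shapes are passed as plain sequences so the example runs without Paddle.

```python
def assert_same_shape(batch_id_per_token_shape, block_idx_shape):
    """Hypothetical helper: compare two shapes and report both on failure."""
    lhs = tuple(batch_id_per_token_shape)
    rhs = tuple(block_idx_shape)
    # The message names both operands so a failure shows which side is off.
    assert lhs == rhs, (
        "batch_id_per_token and block_idx must have the same shape, "
        f"got batch_id_per_token.shape={lhs} and block_idx.shape={rhs}"
    )


def shape_mismatch_message(batch_id_per_token_shape, block_idx_shape):
    """Return the assertion message, or None when the shapes agree."""
    try:
        assert_same_shape(batch_id_per_token_shape, block_idx_shape)
    except AssertionError as exc:
        return str(exc)
    return None
```

Including both shapes in the message means a failing run reports the mismatch directly, without needing a debugger to inspect the tensors.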
_max_batched_tokens = self.scheduler_config.max_num_batched_tokens
self.position_ids_buffer = paddle.zeros([_max_batched_tokens], dtype=paddle.int32)
self.slot_mapping_buffer = paddle.zeros([_max_batched_tokens], dtype=paddle.int64)
self.position_ids_buffer = paddle.zeros([self.max_chunk_tokens], dtype=paddle.int32)
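🟡 Suggestion: The change here is correct and consistent with the other buffers in the file that use max_chunk_tokens (e.g. ids_remove_padding at L183 and batch_id_per_token at L187).
Consider adding a comment that explains the semantic difference between max_chunk_tokens and the former max_num_batched_tokens, for example: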
# Initialize addressing buffers
# Use max_chunk_tokens (instead of max_num_batched_tokens) to be consistent
# with other per-token buffers (ids_remove_padding, batch_id_per_token, etc.)
self.position_ids_buffer = paddle.zeros([self.max_chunk_tokens], dtype=paddle.int32)
self.slot_mapping_buffer = paddle.zeros([self.max_chunk_tokens], dtype=paddle.int64)
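To illustrate why sizing all per-token buffers from one capacity matters, here is a minimal sketch, not the FastDeploy implementation. It uses plain Python lists in place of `paddle.zeros` so that it runs without Paddle. The class `PerTokenBuffers` and its `write_step` method are hypothetical names introduced for this example only.

```python
from dataclasses import dataclass, field


@dataclass
class PerTokenBuffers:
    """Hypothetical stand-in for per-token buffers sharing one capacity."""

    max_chunk_tokens: int
    position_ids: list = field(init=False)
    slot_mapping: list = field(init=False)
    batch_id_per_token: list = field(init=False)

    def __post_init__(self):
        # Every per-token buffer is preallocated with the same length,
        # so slices taken for one step always line up across buffers.
        n = self.max_chunk_tokens
        self.position_ids = [0] * n
        self.slot_mapping = [0] * n
        self.batch_id_per_token = [0] * n

    def write_step(self, position_ids, slot_mapping):
        """Copy one step's values into the front of the buffers."""
        num_tokens = len(position_ids)
        if len(slot_mapping) != num_tokens:
            raise ValueError("position_ids and slot_mapping lengths differ")
        if num_tokens > self.max_chunk_tokens:
            raise ValueError(
                f"{num_tokens} tokens exceed capacity {self.max_chunk_tokens}"
            )
        self.position_ids[:num_tokens] = position_ids
        self.slot_mapping[:num_tokens] = slot_mapping
        return num_tokens
```

With a single capacity, an overflow check against one number covers every buffer; with mixed capacities, a step could fit in one buffer but not in another.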
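The CI report is generated from the following code (updated every 30 minutes):
1 Task overview
Currently 9/10 required tasks have passed; 1 required task is still failing (Approval). Manual approval must be completed before continuing to monitor CI.
2 Task status summary
Note on the log column: failed tasks use the log link pre-generated by the tool; running tasks use the Job link.
2.1 Required tasks: 9/10 passed
2.2 Optional tasks: 23/28 passed
3 Failure details (required only)
Approval: manual approval required (confidence: high)
This Job requires manual approval; CI continues only after the approval is completed.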
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## release/2.6 #7868 +/- ##
==============================================
Coverage ? 72.37%
==============================================
Files ? 381
Lines ? 54216
Branches ? 8473
==============================================
Hits ? 39237
Misses ? 12218
Partials ? 2761
Motivation
Reset buffer size of slot_mapping and position_ids.
Modifications
Shape of slot_mapping and position_ids.
Usage or Command
None
Accuracy Tests
None
Checklist
Tag list: [[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
Format your code, run pre-commit before commit.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.