Release of the pegaflow workspace / pegaflow-llm 0.22.10 — 12 commits since v0.22.9, centered on MLA KV-cache storage efficiency, model-aware transfer-backend selection, and cross-node redundancy observability.
English
✨ Features
- MLA KV page-first storage (#360) — store MLA KV cache page-first so per-block metadata collapses, cutting metadata overhead for MLA models.
- Per-layer MLA TP save distribution (#359) — spread MLA tensor-parallel save work across ranks by layer to balance save load.
- Model-aware KV transfer backend (#357) — the connector auto-selects the KV transfer backend per model; the server no longer needs a static backend setting.
- Metaserver block-redundancy metrics (#361) — new `pegaflow_metaserver_block_redundancy{owners="1|2|3|>=4"}` distribution plus `pegaflow_metaserver_block_redundancy_avg` gauge, surfacing the cross-node KV replication factor (how much effective cache capacity shrinks).
- P/D handshake wire schema (#345) — seal the prefill/decode handshake wire schema in `pegaflow-pd-wire`.
- Transfer benchmarks (#349, eb69309) — p2p RDMA fetch example plus native D2H/H2D transfer-path measurement.
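
To make the block-redundancy metrics concrete, here is a minimal sketch of how a per-block owner count could be folded into the `owners` buckets and an average. The function names, the input shape (one owner count per block), and the choice to ignore blocks with zero owners are assumptions for illustration, not the metaserver's actual code.

```rust
use std::collections::HashMap;

/// Bucket labels mirroring the `owners` label values named in the release note.
const BUCKETS: [&str; 4] = ["1", "2", "3", ">=4"];

/// Maps an owner count to its bucket label; `None` for a block with no owner.
/// (Skipping zero-owner blocks is an assumption of this sketch.)
fn bucket_for(owners: usize) -> Option<&'static str> {
    match owners {
        0 => None,
        1 => Some(BUCKETS[0]),
        2 => Some(BUCKETS[1]),
        3 => Some(BUCKETS[2]),
        _ => Some(BUCKETS[3]),
    }
}

/// Hypothetical helper: returns per-bucket block counts and the mean owner count.
fn redundancy_summary(owner_counts: &[usize]) -> (HashMap<&'static str, u64>, f64) {
    // Start every bucket at zero so empty buckets are still reported.
    let mut dist: HashMap<&'static str, u64> = BUCKETS.iter().map(|b| (*b, 0)).collect();
    let mut total_owners = 0usize;
    let mut blocks = 0usize;
    for &n in owner_counts {
        if let Some(label) = bucket_for(n) {
            *dist.entry(label).or_insert(0) += 1;
            total_owners += n;
            blocks += 1;
        }
    }
    let avg = if blocks == 0 {
        0.0
    } else {
        total_owners as f64 / blocks as f64
    };
    (dist, avg)
}
```

Under this reading, an average of 2.0 would mean each distinct block is held by two nodes on average, so the cluster caches roughly half as many distinct blocks as its raw capacity suggests.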
🐛 Fixes
- Drop late duplicate saves (#358) — skip late duplicate saves of already-resident blocks, avoiding redundant work.
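
The idea behind dropping late duplicate saves can be sketched as a residency check before any save work is done. `ResidentIndex`, the `u64` block hash key, and the `admit` method are hypothetical names chosen for this example; the real change may track residency differently.

```rust
use std::collections::HashSet;

/// Hypothetical index of blocks that are already resident, keyed by block hash.
#[derive(Default)]
struct ResidentIndex {
    resident: HashSet<u64>,
}

impl ResidentIndex {
    /// Returns the hashes from `incoming` that still need saving and marks them
    /// resident. Hashes already resident (late duplicates) are skipped.
    fn admit(&mut self, incoming: &[u64]) -> Vec<u64> {
        incoming
            .iter()
            .copied()
            // `HashSet::insert` returns false when the value was already present.
            .filter(|h| self.resident.insert(*h))
            .collect()
    }
}
```

Because the check and the insert are one operation, a duplicate that arrives after the first save has been admitted is filtered out without redoing the work.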
♻️ Refactors
- Restructure `pd_connector` for maintainability (#355).
- `SealedBlock` owns its `RawBlock` slots (#352).
- Use `usize` for `block_ids`, validated at the RPC boundary (#351).
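
Validating `block_ids` at the RPC boundary could look like the sketch below: wire integers are converted to `usize` once, and anything negative or out of range is rejected before it reaches internal code. The `i64` wire type, the `num_blocks` bound, and the function name are assumptions; the release note does not state them.

```rust
use std::convert::TryFrom;

/// Converts wire-level block ids to `usize`, rejecting negative or out-of-range values.
/// `num_blocks` is an assumed upper bound, e.g. the size of the block pool.
fn validate_block_ids(wire_ids: &[i64], num_blocks: usize) -> Result<Vec<usize>, String> {
    wire_ids
        .iter()
        .map(|&raw| {
            // Fails for negative values or values that do not fit in usize.
            let id = usize::try_from(raw)
                .map_err(|_| format!("invalid block id {}", raw))?;
            if id < num_blocks {
                Ok(id)
            } else {
                Err(format!("block id {} out of range (limit {})", id, num_blocks))
            }
        })
        // Collecting into Result stops at the first invalid id.
        .collect()
}
```

Doing the conversion once at the boundary lets internal code index with `usize` directly, without repeated casts or bounds checks.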
🔧 Chore
- Bump version `0.22.9` → `0.22.10` (#362).
⚠️ Strict version handshake: client and server must match on `CARGO_PKG_VERSION` at registration — upgrade both sides together.
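
A strict handshake of this kind amounts to an exact string comparison at registration. The sketch below is an illustration only: `check_version` is a hypothetical name, and in a Cargo build the server side would typically pass `env!("CARGO_PKG_VERSION")` as `server_version`.

```rust
/// Returns Ok when both versions are identical strings, otherwise a descriptive error.
/// No semver compatibility rules are applied: "0.22.9" and "0.22.10" do not match.
fn check_version(client_version: &str, server_version: &str) -> Result<(), String> {
    if client_version == server_version {
        Ok(())
    } else {
        Err(format!(
            "version mismatch: client {} != server {}",
            client_version, server_version
        ))
    }
}
```

With an exact match required, a 0.22.9 client would be rejected by a 0.22.10 server, which is why both sides need to be upgraded together.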
Full Changelog: v0.22.9...v0.22.10