
stabilize dGPU shmem hotplug management node#4005

Merged
MatheMatrix merged 1 commit into feature-5.5.22-aios from sync/xinhao.huang/feature/dgpu-hotplug-shmem@@2
May 19, 2026

Conversation

@zstack-robot-1
Collaborator

Rebased onto upstream feature-5.5.22-aios. Source: xinhao.huang/zstack:feature/dgpu-hotplug-shmem@@2

sync from gitlab !9898

@coderabbitai

coderabbitai Bot commented May 18, 2026


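Overview

This PR adds shared-memory (shmem) device hot-plug support to the KVM virtualization management platform and adds a GPU attach API method to the test library. It covers the agent commands, message types, constant definitions, and KVMHost backend handling logic, plus the implementation of the test helper method.

Changes

KVM shared-memory hot-plug implementation

| Layer / File(s) | Summary |
|---|---|
| Agent commands and data structures (plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java) | Adds the VmShmemDevice data structure with device name, path, and size fields; adds the HotPlugVmShmemCmd/HotPlugVmShmemRsp and HotUnplugVmShmemCmd/HotUnplugVmShmemRsp command/response pairs for the hot-plug and hot-unplug operations. |
| Constants and message types (plugin/kvm/src/main/java/org/zstack/kvm/KVMConstant.java, plugin/kvm/src/main/java/org/zstack/kvm/KVMHotPlugVmShmemMsg.java, plugin/kvm/src/main/java/org/zstack/kvm/KVMHotUnplugVmShmemMsg.java) | Adds the KVM_VM_SHMEM_HOTPLUG_PATH and KVM_VM_SHMEM_HOTUNPLUG_PATH route constants; defines the KVMHotPlugVmShmemMsg and KVMHotUnplugVmShmemMsg message classes, which implement the HostMessage interface and carry the host UUID, VM UUID, and device information. |
| KVMHost message handling and sending (plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java) | Adds shmem message dispatch to handleLocalMessage; adds private handler methods for the hot-plug and hot-unplug messages; implements a generic sendVmShmemCommand method that wraps the HTTP call, routes it to the target host's agent, parses the response, and replies to the original message. |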
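
The flow summarized in the last row (build a command, send it to a path, parse a typed response) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the PR: the field names, the constructor shapes, the two path strings, and the in-memory `AGENT` map standing in for the host agent are all assumptions, and the real method replies to a bus message rather than returning a value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class ShmemCommandSketch {
    // Hypothetical route values; the real ones are defined in KVMConstant and are not shown in the summary.
    static final String HOTPLUG_PATH = "/vm/shmem/hotplug";
    static final String HOTUNPLUG_PATH = "/vm/shmem/hotunplug";

    // Device description with the three fields the summary mentions (names assumed).
    static class VmShmemDevice {
        final String name;
        final String path;
        final long size;

        VmShmemDevice(String name, String path, long size) {
            this.name = name;
            this.path = path;
            this.size = size;
        }
    }

    // Assumed command shape: the target VM plus the device to plug.
    static class HotPlugVmShmemCmd {
        final String vmUuid;
        final VmShmemDevice device;

        HotPlugVmShmemCmd(String vmUuid, VmShmemDevice device) {
            this.vmUuid = vmUuid;
            this.device = device;
        }
    }

    // Simplified stand-ins for the agent response types.
    static class AgentResponse {
        boolean success = true;
        String error;
    }

    static class HotPlugVmShmemRsp extends AgentResponse {}

    static class HotUnplugVmShmemRsp extends AgentResponse {}

    // In-memory stand-in for the host agent: each handler returns an error message, or null on success.
    static final Map<String, Function<Object, String>> AGENT = new HashMap<>();

    static {
        AGENT.put(HOTPLUG_PATH,
                cmd -> ((HotPlugVmShmemCmd) cmd).device.size > 0 ? null : "size must be positive");
        AGENT.put(HOTUNPLUG_PATH, cmd -> null);
    }

    // One generic helper serves both operations; the caller picks the response type via responseClass.
    static <T extends AgentResponse> T sendVmShmemCommand(Object cmd, String path, Class<T> responseClass) {
        T rsp;
        try {
            rsp = responseClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + responseClass, e);
        }
        Function<Object, String> handler = AGENT.get(path);
        String error = handler == null ? "no agent handler for " + path : handler.apply(cmd);
        if (error != null) {
            rsp.success = false;
            rsp.error = error;
        }
        return rsp;
    }

    public static void main(String[] args) {
        VmShmemDevice dev = new VmShmemDevice("shmem0", "/dev/shm/demo", 1024L);
        HotPlugVmShmemRsp rsp = sendVmShmemCommand(
                new HotPlugVmShmemCmd("vm-1", dev), HOTPLUG_PATH, HotPlugVmShmemRsp.class);
        System.out.println("hot-plug success: " + rsp.success);
    }
}
```

Passing the response class as a parameter is what lets a single helper cover both the hot-plug and hot-unplug paths without duplicating the send-and-parse logic.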

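GPU attach API test method

| Layer / File(s) | Summary |
|---|---|
| attachDGpuToVm implementation (testlib/src/main/java/org/zstack/testlib/ApiHelper.groovy) | Adds the attachDGpuToVm method, which creates an AttachDGpuToVmAction, sets the session ID, and delegates the Closure with the OWNER_FIRST strategy; when System.getProperty("apipath") is set, it fills in the API ID and tracks the call path through ApiPathTracker, and finally returns the error-wrapped execution result. |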

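Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

A cold sky in free brushstrokes, shared memory dancing with the light,
Plug and unplug speak softly, a helper method quietly stands guard,
KVM grows new wings, the test library breaks into a smile,
Hot-plug, hot-unplug, the GPU shines brighter still, ✨
Code flows on, where function and dream meet.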


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❌ Error | The title does not follow the required format; it is missing the mandatory [scope]: form (for example 'feat[kvm]: ...' or 'fix[shmem]: ...'). | Change the title to a conforming form, such as 'feat[kvm]: stabilize dGPU shmem hotplug' or a similar standard format. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | The description is related to the changeset and provides the change source and sync information; although brief, it includes meaningful context. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ast-grep (0.42.2)
plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java

Comment @coderabbitai help to get the list of available commands and usage tips.

@MatheMatrix force-pushed the sync/xinhao.huang/feature/dgpu-hotplug-shmem@@2 branch 4 times, most recently from 22261fc to 3122dff, on May 18, 2026 10:50
Add VM shmem KVM agent command binding and SDK updates for dGPU attach API.

APIImpact

Related: ZSTAC-84067

Change-Id: I84557417caafe61a240b325e950cbc0f12f59bc1
@MatheMatrix force-pushed the sync/xinhao.huang/feature/dgpu-hotplug-shmem@@2 branch from 3122dff to fa124b1 on May 19, 2026 02:07

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java (1)

2734-2743: ⚡ Quick win

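Consider adding a compile-time type constraint on the origin parameter to avoid the risk of a failed cast.

sendVmShmemCommand currently casts NeedReplyMessage to HostMessage (lines 2739 and 2741). Consider using an intersection type bound instead, so that the message type passed in satisfies both the NeedReplyMessage and HostMessage interfaces and the cast can no longer fail at runtime:

Suggested change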
-    private <T extends AgentResponse> void sendVmShmemCommand(NeedReplyMessage origin,
+    private <M extends NeedReplyMessage & HostMessage, T extends AgentResponse> void sendVmShmemCommand(M origin,
                                                              Object cmd,
                                                              String path,
                                                              Class<T> responseClass) {
         KVMHostAsyncHttpCallMsg kmsg = new KVMHostAsyncHttpCallMsg();
         kmsg.setCommand(cmd);
-        kmsg.setHostUuid(((HostMessage) origin).getHostUuid());
+        kmsg.setHostUuid(origin.getHostUuid());
         kmsg.setPath(path);
-        bus.makeTargetServiceIdByResourceUuid(kmsg, HostConstant.SERVICE_ID, ((HostMessage) origin).getHostUuid());
+        bus.makeTargetServiceIdByResourceUuid(kmsg, HostConstant.SERVICE_ID, origin.getHostUuid());
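
The effect of the suggested intersection bound can be tried in isolation with the sketch below. It is only an illustration under stated assumptions: `NeedReplyMessage` and `HostMessage` are reduced to minimal stand-ins (the former modeled as an abstract class, which is an assumption about the real type), `HotPlugMsg` and `route` are hypothetical names, and the path string is made up. One language rule does apply regardless: when a bound mixes a class with interfaces, the class must be listed first.

```java
public class IntersectionBoundSketch {
    // Minimal stand-ins for the two types named in the review; the real ones carry more members.
    static abstract class NeedReplyMessage {
        private final String id;

        NeedReplyMessage(String id) {
            this.id = id;
        }

        String getId() {
            return id;
        }
    }

    interface HostMessage {
        String getHostUuid();
    }

    // Hypothetical message that satisfies both bounds.
    static final class HotPlugMsg extends NeedReplyMessage implements HostMessage {
        private final String hostUuid;

        HotPlugMsg(String id, String hostUuid) {
            super(id);
            this.hostUuid = hostUuid;
        }

        @Override
        public String getHostUuid() {
            return hostUuid;
        }
    }

    // M must be both a NeedReplyMessage and a HostMessage, so getHostUuid() needs no cast.
    // Passing a message that lacks either type is rejected at compile time.
    static <M extends NeedReplyMessage & HostMessage> String route(M origin, String path) {
        return origin.getHostUuid() + ":" + path;
    }

    public static void main(String[] args) {
        HotPlugMsg msg = new HotPlugMsg("msg-1", "host-42");
        System.out.println(route(msg, "/vm/shmem/hotplug"));
    }
}
```

With the original signature, a caller could pass any NeedReplyMessage and the mistake would only surface as a ClassCastException at runtime; with the bound, the same mistake does not compile.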
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java` around lines 2734 -
2743, Change sendVmShmemCommand to require a message type that is both a
NeedReplyMessage and HostMessage (e.g. use a generic type parameter like M
extends NeedReplyMessage & HostMessage) so you can remove the unsafe casts
((HostMessage) origin).getHostUuid(). Update usages inside the method (calls to
kmsg.setHostUuid(...) and bus.makeTargetServiceIdByResourceUuid(...,
((HostMessage) origin).getHostUuid())) to use origin.getHostUuid() directly, and
update any call sites to pass a message type satisfying both interfaces; keep
the rest of the method (KVMHostAsyncHttpCallMsg creation, setCommand, setPath,
bus.send(..., new CloudBusCallBack(origin))) unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

Run ID: 309e5b37-60f4-432d-bd88-4274a517b1b6

📥 Commits

Reviewing files that changed from the base of the PR and between 3122dff and fa124b1.

⛔ Files ignored due to path filters (4)
  • conf/springConfigXml/Kvm.xml is excluded by !**/*.xml
  • sdk/src/main/java/org/zstack/sdk/AttachDGpuToVmAction.java is excluded by !sdk/**
  • sdk/src/main/java/org/zstack/sdk/AttachDGpuToVmResult.java is excluded by !sdk/**
  • test/src/test/resources/springConfigXml/Kvm.xml is excluded by !**/*.xml
📒 Files selected for processing (6)
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMConstant.java
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMHotPlugVmShmemMsg.java
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMHotUnplugVmShmemMsg.java
  • testlib/src/main/java/org/zstack/testlib/ApiHelper.groovy
🚧 Files skipped from review as they are similar to previous changes (3)
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMConstant.java
  • plugin/kvm/src/main/java/org/zstack/kvm/KVMAgentCommands.java
  • testlib/src/main/java/org/zstack/testlib/ApiHelper.groovy

@MatheMatrix MatheMatrix merged commit f1020bb into feature-5.5.22-aios May 19, 2026
1 check passed