UCT/IB/ZE: Enable GPUDirect RDMA for Intel Xe devices #11187

Merged
yosefe merged 1 commit into openucx:master from intel-staging:topic/ib-ze-gdr on Feb 17, 2026

@yafshar yafshar commented Feb 16, 2026

What?

Revives #11005 with minor cleanup. Adds GPUDirect RDMA (GDR) support for Intel Xe devices by:

  1. Detecting Intel Xe driver presence (/sys/module/xe/srcversion) in IB MD to enable GDR capability for UCS_MEMORY_TYPE_ZE_DEVICE
  2. Extending ZE copy MD access_mem_types to include UCS_MEMORY_TYPE_HOST to allow IB transports to access host memory through ZE MD registration
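The first item can be illustrated with a minimal sketch, assuming the probe is a plain file-existence check. This is not the UCX implementation: `sketch_check_driver`, `SKETCH_BIT`, and the `sketch_mem_type` enum are hypothetical stand-ins, and only the sysfs path comes from this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-ins for UCX memory types (not the real UCX enum). */
enum sketch_mem_type {
    SKETCH_MEM_HOST      = 0,
    SKETCH_MEM_ZE_DEVICE = 1
};

#define SKETCH_BIT(i) (1ul << (i))

/* Path named in this PR; it exists only while the Xe kernel module is loaded. */
#define SKETCH_XE_SYSFS_PATH "/sys/module/xe/srcversion"

/*
 * Hypothetical helper: if the driver's sysfs file exists, set the bit for
 * mem_type in *reg_mem_types and return 1. Otherwise leave the mask
 * untouched and return 0.
 */
static int sketch_check_driver(unsigned long *reg_mem_types,
                               const char *sysfs_path,
                               enum sketch_mem_type mem_type)
{
    assert(reg_mem_types != NULL);
    assert(sysfs_path != NULL);

    if (access(sysfs_path, F_OK) != 0) {
        return 0; /* driver not loaded, or sysfs not visible */
    }

    *reg_mem_types |= SKETCH_BIT(mem_type);
    return 1;
}
```

Calling `sketch_check_driver(&mask, SKETCH_XE_SYSFS_PATH, SKETCH_MEM_ZE_DEVICE)` would leave `mask` unchanged on a machine without the Xe module, so the capability stays off by default.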

Why?

GDR eliminates the extra memcpy operations between host staging buffers and GPU memory, which improves bandwidth and reduces latency for RDMA transfers on Intel GPU hardware. Currently, the IB layer does not recognize Intel Xe devices for GPUDirect, and the ZE copy MD does not expose the HOST memory access required for GDR interoperability.

How?

  • IB MD (src/uct/ib/base/ib_md.c): Add a uct_ib_check_gpudirect_driver() call that probes /sys/module/xe/srcversion, alongside the existing CUDA and ROCm checks. When the Xe kernel module is loaded, this sets the UCS_MEMORY_TYPE_ZE_DEVICE bit in md->reg_mem_types.
  • ZE Copy MD (src/uct/ze/copy/ze_copy_md.c): Add UCS_BIT(UCS_MEMORY_TYPE_HOST) to md_attr->access_mem_types alongside the existing ZE memory types. This allows IB transports to register host memory through the ZE MD for GDR operations.
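
The effect of the access_mem_types change can be sketched as a bitmask check. This is an illustration only, under the assumption that the mask is a plain bit set indexed by memory type. The `sketch_*` names and enum values are hypothetical and do not match UCX's real definitions, and the exact set of pre-existing ZE types is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical memory-type indices; the real UCX enum differs. */
enum sketch_mem_type {
    SKETCH_MEM_HOST = 0,
    SKETCH_MEM_ZE_HOST,
    SKETCH_MEM_ZE_DEVICE,
    SKETCH_MEM_ZE_MANAGED,
    SKETCH_MEM_LAST
};

#define SKETCH_BIT(i) (UINT64_C(1) << (i))

/*
 * Build an access mask for a ZE-like memory domain. With include_host == 0
 * the mask models the behavior before this PR (ZE types only, an
 * assumption); with include_host != 0 it also advertises host memory.
 */
static uint64_t sketch_ze_access_mem_types(int include_host)
{
    uint64_t mask = SKETCH_BIT(SKETCH_MEM_ZE_HOST) |
                    SKETCH_BIT(SKETCH_MEM_ZE_DEVICE) |
                    SKETCH_BIT(SKETCH_MEM_ZE_MANAGED);

    if (include_host) {
        mask |= SKETCH_BIT(SKETCH_MEM_HOST);
    }
    return mask;
}

/* Return 1 if the domain advertises access to mem_type, 0 otherwise. */
static int sketch_can_access(uint64_t mask, enum sketch_mem_type mem_type)
{
    assert(mem_type < SKETCH_MEM_LAST);
    return (mask & SKETCH_BIT(mem_type)) != 0;
}
```

In this model, a caller that filters memory domains by `sketch_can_access(mask, SKETCH_MEM_HOST)` would skip the ZE domain before the change and accept it afterwards, while access to the ZE types is unaffected.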

Related: #11180 (UCT/ZE: Add device topology registration) - Complementary ZE infrastructure changes. These PRs are independent but both advance ZE/IB integration.

Signed-off-by: Yihua Xu <yihua.xu@intel.com>
Signed-off-by: Yaser Afshar <yaser.afshar@intel.com>

Add Intel Xe driver detection in IB MD for GDR capability.
Extend ZE copy MD access_mem_types to include HOST memory
to support GDR memory registration paths.

Signed-off-by: Yihua Xu <yihua.xu@intel.com>
Signed-off-by: Yaser Afshar <yaser.afshar@intel.com>
@yafshar yafshar marked this pull request as ready for review February 16, 2026 05:15
@yosefe yosefe merged commit e5d9887 into openucx:master Feb 17, 2026
150 checks passed
@yafshar yafshar deleted the topic/ib-ze-gdr branch February 17, 2026 15:46