
AIESW-29257: Fix xrt-smi device ID reporting #9716

Merged
chvamshi-xilinx merged 7 commits into Xilinx:master from sandilya-xilinx:29257
Apr 10, 2026

@sandilya-xilinx
Contributor

[why]
xrt-smi JSON report showed "id": "00000000-0000-0000-0000-000000000000" for all Ryzen/NPU devices (Phoenix, Strix, Strix2, StrixH, Krackan, Medusa, Soundwave). The root cause: rom_time_since_epoch returns a hardcoded 0 for Ryzen devices instead of throwing, so the existing Alveo 1RP path silently set id="0". No Ryzen-specific path existed, leaving every NPU device with an identical, non-unique identifier in the JSON output.

[how]
Extract device ID generation into a dedicated get_device_id() function in XBUtilities.cpp. For Ryzen/NPU devices, derive the UUID from PCIe BDF (domain, bus, device, function) using the same memcpy byte-packing as hip_device_get_uuid(), ensuring the xrt-smi id is consistent with the HIP UUID API. Alveo 2RP (xclbin logic UUID) and 1RP (ROM timestamp) paths are retained as fallbacks for non-Ryzen devices.

AIESW-29257
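
As a rough illustration of the BDF-based approach described above, the sketch below packs the four BDF fields into the front of a zero-filled 16-byte buffer with memcpy and prints the buffer in the usual 8-4-4-4-12 layout. The names bdf_fields, pack_bdf and to_uuid_string are hypothetical, and the 16-bit field widths and field order are assumptions; the actual packing used by hip_device_get_uuid() may differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <initializer_list>
#include <string>

// Hypothetical stand-in for a PCIe BDF: domain, bus, device, function.
struct bdf_fields {
  std::uint16_t domain;
  std::uint16_t bus;
  std::uint16_t device;
  std::uint16_t function;
};

using uuid_bytes = std::array<unsigned char, 16>;

// Copy each BDF field, in order, into the front of a zero-filled buffer.
// Byte order inside each field follows the host CPU.
inline uuid_bytes
pack_bdf(const bdf_fields& bdf)
{
  uuid_bytes out{};
  std::size_t offset = 0;
  for (std::uint16_t field : {bdf.domain, bdf.bus, bdf.device, bdf.function}) {
    assert(offset + sizeof(field) <= out.size());
    std::memcpy(out.data() + offset, &field, sizeof(field));
    offset += sizeof(field);
  }
  return out;
}

// Render 16 bytes as lowercase hex grouped 8-4-4-4-12.
inline std::string
to_uuid_string(const uuid_bytes& bytes)
{
  static constexpr char hex[] = "0123456789abcdef";
  std::string text;
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    if (i == 4 || i == 6 || i == 8 || i == 10)
      text += '-';
    text += hex[bytes[i] >> 4];
    text += hex[bytes[i] & 0x0f];
  }
  return text;
}
```

Two devices that differ in any BDF field produce different strings, which is the property the all-zero id lacked.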


@xrt-pr-bot

xrt-pr-bot Bot commented Apr 6, 2026

⚠️ Authorization Failed

@sandilya-xilinx is not a repository collaborator.

To proceed:

  • XRT Admins: Add the build label to authorize this PR build
  • OR Add @sandilya-xilinx as a repository collaborator

Contributor

@github-actions github-actions Bot left a comment

clang-tidy made some suggestions

Comment thread src/runtime_src/core/tools/common/XBUtilities.cpp Outdated
Replace C-style byte array and magic number indices with xuid_t +
std::apply fold, eliminating all cppcoreguidelines-avoid-magic-numbers
and cppcoreguidelines-avoid-c-arrays warnings.

AIESW-29257
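
The std::apply fold mentioned in this commit message could look roughly like the sketch below. It is not the merged code: bdf_tuple and uuid_buffer are hypothetical stand-ins (the commit uses xuid_t), and the four 16-bit fields are an assumption. The point is that a comma fold over the tuple elements replaces hand-written indices such as 0, 2, 4 and 6.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <tuple>

// Hypothetical types; the real code works on the XRT query result and xuid_t.
using bdf_tuple = std::tuple<std::uint16_t, std::uint16_t, std::uint16_t, std::uint16_t>;
using uuid_buffer = std::array<unsigned char, 16>;

inline uuid_buffer
pack_with_fold(const bdf_tuple& bdf)
{
  static_assert(sizeof(std::uint16_t) * std::tuple_size_v<bdf_tuple> <= sizeof(uuid_buffer),
                "BDF fields must fit in the UUID buffer");

  uuid_buffer out{};
  std::size_t offset = 0;
  std::apply([&](const auto&... field) {
    // Comma fold: for each field, left to right, copy it and advance the offset.
    ((std::memcpy(out.data() + offset, &field, sizeof(field)), offset += sizeof(field)), ...);
  }, bdf);

  // Every tuple element has been consumed exactly once.
  assert(offset == sizeof(std::uint16_t) * std::tuple_size_v<bdf_tuple>);
  return out;
}
```

Because the offset is derived from sizeof(field), no literal byte positions appear in the function, which is what the magic-number check asks for.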
Contributor

@github-actions github-actions Bot left a comment

clang-tidy made some suggestions

Comment thread src/runtime_src/core/tools/common/XBUtilities.cpp Outdated
@github-actions
Contributor

github-actions Bot commented Apr 6, 2026

clang-tidy review says "All clean, LGTM! 👍"

I, Sandilya Bhagi <sandilya.bhagi@amd.com>, hereby add my Signed-off-by to this commit: 498ab2f
I, Sandilya Bhagi <sandilya.bhagi@amd.com>, hereby add my Signed-off-by to this commit: 8ecf10b
I, Sandilya Bhagi <sandilya.bhagi@amd.com>, hereby add my Signed-off-by to this commit: fc560b2

Signed-off-by: Sandilya Bhagi <sandilya.bhagi@amd.com>

Signed-off-by: Sandilya Bhagi <sandilya.bhagi@amd.com>
@aktondak aktondak self-requested a review April 6, 2026 16:35
@aktondak aktondak requested a review from stsoe April 7, 2026 19:59

// Alveo 2RP: UUID from loaded xclbin logic region
try {
  auto logic_uuids = xrt_core::device_query<xrt_core::query::logic_uuids>(device);
Collaborator

Do we need to capture the logic UUID from the xclbin?
IMO, let's restrict this to the Ryzen device case only.

Contributor Author

The Alveo code is pre-existing. The new code adds an exclusive if/else, so Ryzen devices take the BDF path and only Alveo devices reach the xclbin logic UUID path.
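
To make the branching in this reply concrete, here is a simplified sketch of an exclusive Ryzen/Alveo selection. device_info, its members and select_device_id are hypothetical and are not XRT types; the real code queries the device and handles exceptions, which this sketch replaces with plain fields.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical, flattened view of a device; none of these names come from XRT.
struct device_info {
  bool is_ryzen = false;
  std::string bdf_uuid;                   // UUID text already derived from the PCIe BDF
  std::optional<std::string> logic_uuid;  // set when an Alveo 2RP logic UUID is available
  std::uint64_t rom_timestamp = 0;        // Alveo 1RP ROM time since epoch
};

inline std::string
select_device_id(const device_info& dev)
{
  if (dev.is_ryzen) {
    // Ryzen/NPU: always use the BDF-derived id and never fall into the Alveo paths.
    assert(!dev.bdf_uuid.empty());
    return dev.bdf_uuid;
  }

  if (dev.logic_uuid)
    return *dev.logic_uuid;                  // Alveo 2RP

  return std::to_string(dev.rom_timestamp);  // Alveo 1RP fallback
}
```

With this shape, a Ryzen device whose ROM timestamp reads as 0 still gets its BDF-derived id, because the 1RP branch is unreachable for it.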

@chvamshi-xilinx chvamshi-xilinx requested a review from sonals April 8, 2026 14:14
}

static std::string
get_device_id(const std::shared_ptr<xrt_core::device>& device,
Member

Can we reuse the same function as used by HIP? See hip_device_get_uuid(hipDevice_t device) defined in hip_device.cpp. Can that code be moved to some common place from where both HIP and xrt-smi can call it?

Contributor Author

@sandilya-xilinx sandilya-xilinx Apr 8, 2026

Hi @sonals, I have centralized the BDF-to-UUID logic in pcie_bdf::to_uuid() [src/runtime_src/core/common/query_requests.h]. Please review.

[why]
UUID construction from PCIe BDF was duplicated between xrt-smi
(XBUtilities.cpp) and HIP (hip_device.cpp). If the packing logic
changes, both files would need updating independently.

[how]
Add pcie_bdf::to_uuid() static method in query_requests.h as the
single source of truth for BDF-to-UUID conversion. Update
hip_device_get_uuid() in hip_device.cpp to call it. Inline the
device ID logic back into get_available_devices() in XBUtilities.cpp
(removing get_device_id()) using pcie_bdf::to_uuid() for the
Ryzen/NPU path, restoring the original 1RP/2RP inline structure
for Alveo.

AIESW-29257
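
The "single source of truth" arrangement described in this commit message can be sketched as one static conversion with two thin callers. Everything in namespace sketch is hypothetical: the struct only mirrors the name pcie_bdf, the tuple of four 16-bit fields is an assumption, and hip_style_uuid and smi_style_id stand in for the HIP and xrt-smi call sites rather than reproducing them.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <tuple>

namespace sketch {

using uuid_bytes = std::array<unsigned char, 16>;

// Stand-in for the query type; only its shape matters for this example.
struct pcie_bdf {
  using result_type = std::tuple<std::uint16_t, std::uint16_t, std::uint16_t, std::uint16_t>;

  // The only place that defines how a BDF becomes 16 UUID bytes.
  static uuid_bytes
  to_uuid(const result_type& bdf)
  {
    uuid_bytes out{};
    std::size_t offset = 0;
    std::apply([&](const auto&... field) {
      ((std::memcpy(out.data() + offset, &field, sizeof(field)), offset += sizeof(field)), ...);
    }, bdf);
    assert(offset <= out.size());
    return out;
  }
};

// Caller 1: a HIP-style API that returns the raw bytes.
inline uuid_bytes
hip_style_uuid(const pcie_bdf::result_type& bdf)
{
  return pcie_bdf::to_uuid(bdf);
}

// Caller 2: an xrt-smi-style report that needs the same bytes as hex text.
inline std::string
smi_style_id(const pcie_bdf::result_type& bdf)
{
  static constexpr char hex[] = "0123456789abcdef";
  std::string text;
  for (unsigned char byte : pcie_bdf::to_uuid(bdf)) {
    text += hex[byte >> 4];
    text += hex[byte & 0x0f];
  }
  return text;
}

} // namespace sketch
```

Since both callers go through to_uuid(), a change to the packing shows up in both outputs at once, which is the duplication concern raised in the [why] section.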
I, Sandilya Bhagi <sandilya.bhagi@amd.com>, hereby add my Signed-off-by to this commit: 89e0497

Signed-off-by: Sandilya Bhagi <sandilya.bhagi@amd.com>
Contributor

@github-actions github-actions Bot left a comment

clang-tidy made some suggestions

Comment thread src/runtime_src/core/common/query_requests.h Outdated
…f::to_uuid()

Use braced initializer list instead of repeating the return type.

AIESW-29257

Signed-off-by: Sandilya Bhagi <sandilya.bhagi@amd.com>
@github-actions
Contributor

github-actions Bot commented Apr 8, 2026

clang-tidy review says "All clean, LGTM! 👍"

@chvamshi-xilinx chvamshi-xilinx merged commit 5ee0470 into Xilinx:master Apr 10, 2026
21 checks passed
@sandilya-xilinx sandilya-xilinx deleted the 29257 branch April 10, 2026 13:33
4 participants