[SYCL] Link native bfloat16 device library for intel_gpu_cri AOT target #21675
Agent-Logs-Url: https://github.com/intel/llvm/sessions/1cc53685-41c8-4580-8eb5-55e42c5c3247 Co-authored-by: jinge90 <43599496+jinge90@users.noreply.github.com>
Pull request overview
This PR fixes SYCL AOT bfloat16 device library selection for the intel_gpu_cri target by ensuring the driver recognizes cri as a native-bfloat16-capable Intel GPU in both -fsycl-targets=... and -device ... argument paths.
Changes:
- Add intel_gpu_cri to the set of Intel GPU AOT targets that should link the native bfloat16 device library.
- Teach the -device ... parsing path to treat cri devices as native-bfloat16-capable.
- Extend driver tests to assert the native bfloat16 library is selected for intel_gpu_cri / -device cri.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| clang/lib/Driver/ToolChains/SYCL.cpp | Updates native-bfloat16 target detection for intel_gpu_cri and cri devices. |
| clang/test/Driver/sycl-device-lib-bfloat16.cpp | Adds coverage ensuring cri AOT paths select BFLOAT16-NATIVE. |
    static llvm::SmallSet<StringRef, 8> GPUArchsWithNBF16{
        "intel_gpu_pvc", "intel_gpu_acm_g10", "intel_gpu_acm_g11",
        "intel_gpu_acm_g12", "intel_gpu_dg2_g10", "intel_gpu_dg2_g11",
        "intel_dg2_g12", "intel_gpu_bmg_g21", "intel_gpu_lnl_m",
GPUArchsWithNBF16 includes "intel_dg2_g12", but elsewhere the target is consistently named intel_gpu_dg2_g12 (e.g., resolveGenDevice() cases). As written, -fsycl-targets=intel_gpu_dg2_g12 won’t be recognized as supporting native bfloat16 here and will incorrectly select the fallback library. Update the entry to the correct target name.
Suggested change:

    -   "intel_dg2_g12", "intel_gpu_bmg_g21", "intel_gpu_lnl_m",
    +   "intel_gpu_dg2_g12", "intel_gpu_bmg_g21", "intel_gpu_lnl_m",
This issue is not cri-specific; I will create a separate PR to fix it.
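To make the effect of the misspelled entry concrete, here is a minimal sketch of the lookup. It is not the driver's code: `std::set<std::string_view>` stands in for `llvm::SmallSet<StringRef, 8>` so the snippet builds without LLVM headers, and `archsAsQuoted` and `hasNativeBF16` are hypothetical names introduced only for this illustration. The entries are copied from the pre-change snippet quoted above.

```cpp
#include <set>
#include <string_view>

// Stand-in for GPUArchsWithNBF16 as quoted in the review (pre-change state).
// std::set replaces llvm::SmallSet purely to keep the sketch self-contained.
// The misspelled "intel_dg2_g12" entry is reproduced on purpose.
inline const std::set<std::string_view> &archsAsQuoted() {
  static const std::set<std::string_view> Archs{
      "intel_gpu_pvc",     "intel_gpu_acm_g10", "intel_gpu_acm_g11",
      "intel_gpu_acm_g12", "intel_gpu_dg2_g10", "intel_gpu_dg2_g11",
      "intel_dg2_g12",     "intel_gpu_bmg_g21", "intel_gpu_lnl_m",
      "intel_gpu_ptl_h",   "intel_gpu_ptl_u",   "intel_gpu_wcl"};
  return Archs;
}

// Hypothetical helper: exact-match membership test, which is what a set
// lookup on the -fsycl-targets value amounts to.
inline bool hasNativeBF16(const std::set<std::string_view> &Archs,
                          std::string_view Target) {
  return Archs.count(Target) != 0;
}
```

With this set, `hasNativeBF16(archsAsQuoted(), "intel_gpu_dg2_g12")` is false because the lookup is an exact string match, so the correctly spelled target would take the fallback path; only the misspelled `intel_dg2_g12` matches.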
    static llvm::SmallSet<StringRef, 8> GPUArchsWithNBF16{
        "intel_gpu_pvc", "intel_gpu_acm_g10", "intel_gpu_acm_g11",
        "intel_gpu_acm_g12", "intel_gpu_dg2_g10", "intel_gpu_dg2_g11",
        "intel_dg2_g12", "intel_gpu_bmg_g21", "intel_gpu_lnl_m",
    -   "intel_gpu_ptl_h", "intel_gpu_ptl_u", "intel_gpu_wcl"};
    +   "intel_gpu_ptl_h", "intel_gpu_ptl_u", "intel_gpu_wcl",
    +   "intel_gpu_cri"};
resolveGenDevice() treats intel_gpu_cri and intel_gpu_35_11_0 as the same device, but GPUArchsWithNBF16 only lists intel_gpu_cri. If users pass -fsycl-targets=intel_gpu_35_11_0, this logic will currently fall back to the software bfloat16 library. Consider adding the numeric alias (and/or using the resolved gen device name) so both forms select the native bfloat16 library.
This issue is not cri-specific; I will create a separate PR to fix it.
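One possible shape for the "use the resolved device name" option from the comment above is to canonicalize the target before the set lookup. The sketch below is an assumption, not the driver's implementation: `canonicalTarget` and `selectsNativeBF16` are hypothetical helpers, the alias table holds only the single pair named in the review comment, and the native set is abbreviated to one entry.

```cpp
#include <map>
#include <set>
#include <string_view>

// Hypothetical alias table. The only pair taken from the review comment is
// intel_gpu_35_11_0 -> intel_gpu_cri; a real table would be larger.
inline std::string_view canonicalTarget(std::string_view Target) {
  static const std::map<std::string_view, std::string_view> Aliases{
      {"intel_gpu_35_11_0", "intel_gpu_cri"}};
  auto It = Aliases.find(Target);
  // Unknown names pass through unchanged.
  return It == Aliases.end() ? Target : It->second;
}

// Hypothetical check: normalize first, then do the exact-match lookup, so
// the numeric alias and the named target give the same answer.
inline bool selectsNativeBF16(std::string_view Target) {
  // Abbreviated stand-in for GPUArchsWithNBF16 (one entry for illustration).
  static const std::set<std::string_view> Native{"intel_gpu_cri"};
  return Native.count(canonicalTarget(Target)) != 0;
}
```

Normalizing once keeps the set free of duplicate spellings; the alternative mentioned in the comment, listing the numeric alias directly in the set, avoids the extra step but has to be repeated for every aliased target.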
Hi, @intel/dpcpp-clang-driver-reviewers

@intel/llvm-gatekeepers please consider merging
intel_gpu_cri supports native bfloat16 but was missing from the AOT compilation path that selects the native bfloat16 device library, causing it to fall back to the software emulation library.

Changes

clang/lib/Driver/ToolChains/SYCL.cpp
- Add "intel_gpu_cri" to GPUArchsWithNBF16: covers -fsycl-targets=intel_gpu_cri usage
- Add cri prefix to checkBF lambda: covers -Xsycl-target-backend=spir64_gen "-device cri" usage

clang/test/Driver/sycl-device-lib-bfloat16.cpp
- Add checks for -fsycl-targets=intel_gpu_cri and -device cri via spir64_gen, asserting BFLOAT16-NATIVE is selected
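The -device path relies on a prefix match and not on set membership. The real checkBF lambda is not shown in this PR page, so the following is only a rough sketch of what a prefix-based check could look like: `deviceArg` and `deviceHasNativeBF16` are hypothetical names, the prefix list contains only `cri` (the one prefix this PR confirms), and the option parsing is deliberately simplistic.

```cpp
#include <array>
#include <string_view>

// Hypothetical: pull the value that follows "-device " out of a backend
// option string such as "-device cri". Returns an empty view if absent.
inline std::string_view deviceArg(std::string_view Opts) {
  constexpr std::string_view Flag = "-device ";
  const auto Pos = Opts.find(Flag);
  if (Pos == std::string_view::npos)
    return {};
  const std::string_view Rest = Opts.substr(Pos + Flag.size());
  // The value ends at the next space, or at the end of the string.
  return Rest.substr(0, Rest.find(' '));
}

// Hypothetical stand-in for a prefix-based native-bfloat16 check. Only the
// "cri" prefix comes from this PR; other prefixes are omitted on purpose.
inline bool deviceHasNativeBF16(std::string_view Device) {
  constexpr std::array<std::string_view, 1> Prefixes{"cri"};
  for (std::string_view P : Prefixes)
    if (Device.substr(0, P.size()) == P)
      return true;
  return false;
}
```

Under these assumptions, `deviceHasNativeBF16(deviceArg("-device cri"))` is true, while an option string without `-device` yields an empty device name and therefore false.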