[UR] Stop querying adapter fp16/fp64 support via extension. #15811

Open: wants to merge 23 commits into base: sycl

aarongreig (Contributor) commented Oct 22, 2024

We're trying to move the UR adapters away from returning hard-coded OpenCL extension strings to report device capabilities. This is the first change in that direction.

Closes oneapi-src/unified-runtime#1374

bso-intel previously approved these changes Nov 7, 2024
@sarnex sarnex requested a review from a team as a code owner May 16, 2025 21:00
@aelovikov-intel aelovikov-intel dismissed bso-intel’s stale review August 1, 2025 16:00

Not a member of llvm-reviewers-runtime anymore; someone else needs to re-approve.

Comment on lines +2254 to +2259
// Check if the device supports double precision floating point.
bool isFp64Supported() const;

// Check if the device supports half precision floating point.
bool isFp16Supported() const;


Please don't add extra methods. The existing interfaces should be enough; just change their implementation to make the proper calls to the underlying UR queries.
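To illustrate the request above, here is a minimal sketch of keeping one existing entry point and changing only its body. Every name in it (`FakeDevice`, `Aspect`, `queryFpConfig`, `hasAspect`) is a hypothetical stand-in and not the real SYCL or UR API. It also assumes that a floating-point capability query reports an empty bitfield when the precision is unsupported, which the PR itself does not state.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; none of these names come from the real runtime.
enum class Aspect { fp16, fp64 };

struct FakeDevice {
  // Capability bitfields a device-info query might report.
  // Assumption: 0 means the precision is not supported.
  std::uint32_t HalfFpConfig = 0;
  std::uint32_t DoubleFpConfig = 0;
};

// Hypothetical wrapper standing in for a dedicated capability query.
inline std::uint32_t queryFpConfig(const FakeDevice &Dev, Aspect A) {
  return A == Aspect::fp16 ? Dev.HalfFpConfig : Dev.DoubleFpConfig;
}

// The existing entry point keeps its signature; only the body moves from an
// extension-string search to the capability query. No new methods are added.
inline bool hasAspect(const FakeDevice &Dev, Aspect A) {
  switch (A) {
  case Aspect::fp16:
  case Aspect::fp64:
    return queryFpConfig(Dev, A) != 0;
  }
  return false;
}

// Usage example: a device that reports double support but no half support.
inline void example() {
  FakeDevice Dev;
  Dev.DoubleFpConfig = 0x1;
  assert(hasAspect(Dev, Aspect::fp64));
  assert(!hasAspect(Dev, Aspect::fp16));
}
```

Callers continue to go through the same aspect check, so nothing outside the implementation needs to change.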

Comment on lines -1239 to -1240
CASE(fp16) { return has_extension("cl_khr_fp16"); }
CASE(fp64) { return has_extension("cl_khr_fp64"); }

This relied on the internal caching of the extensions string; we need to ensure that the new underlying query is cached too. See:

mutable JointCache<
    UREagerCache<UR_DEVICE_INFO_TYPE, UR_DEVICE_INFO_USE_NATIVE_ASSERT,
                 UR_DEVICE_INFO_EXTENSIONS>, //
    URCallOnceCache<UR_DEVICE_INFO_NAME,
                    // USM:
                    UR_DEVICE_INFO_USM_DEVICE_SUPPORT,
                    UR_DEVICE_INFO_USM_HOST_SUPPORT,
                    UR_DEVICE_INFO_USM_SINGLE_SHARED_SUPPORT,
                    UR_DEVICE_INFO_USM_CROSS_SHARED_SUPPORT,
                    UR_DEVICE_INFO_USM_SYSTEM_SHARED_SUPPORT,
                    //
                    UR_DEVICE_INFO_ATOMIC_64>, //
    EagerCache<InfoInitializer>, //
    CallOnceCache<InfoInitializer,
                  ext::oneapi::experimental::info::device::architecture>, //
    AspectCache<EagerCache, aspect::fp16, aspect::fp64,
                aspect::int64_base_atomics, aspect::int64_extended_atomics,
                aspect::ext_oneapi_atomic16>,
    AspectCache<
        CallOnceCache,
        // Slow, >100ns (for baseline cached ~30..40ns):
        aspect::ext_intel_pci_address, aspect::ext_intel_gpu_eu_count,
        aspect::ext_intel_free_memory, aspect::ext_intel_fan_speed,
        aspect::ext_intel_power_limits,
        // medium-slow, 60-90ns (for baseline cached ~30..40ns):
        aspect::ext_intel_gpu_eu_simd_width, aspect::ext_intel_gpu_slices,
        aspect::ext_intel_gpu_subslices_per_slice,
        aspect::ext_intel_gpu_eu_count_per_subslice,
        aspect::ext_intel_device_info_uuid,
        aspect::ext_intel_gpu_hw_threads_per_eu,
        aspect::ext_intel_memory_clock_rate,
        aspect::ext_intel_memory_bus_width,
        aspect::ext_oneapi_bindless_images,
        aspect::ext_oneapi_bindless_images_1d_usm,
        aspect::ext_oneapi_bindless_images_2d_usm,
        aspect::ext_oneapi_is_composite, aspect::ext_oneapi_is_component>>
    MCache;
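
The cache declaration above distinguishes values fetched eagerly from values fetched on first use. The sketch below shows that difference in isolation. `EagerValue` and `LazyValue` are hypothetical names, not the real `EagerCache` or `CallOnceCache` templates, and the lazy variant uses a mutex-guarded flag as a simple stand-in for call-once semantics rather than whatever mechanism the runtime actually uses.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical eager cache: runs the query once, at construction time.
template <typename T> class EagerValue {
public:
  explicit EagerValue(const std::function<T()> &Q) : Value(Q()) {}
  const T &get() const { return Value; }

private:
  T Value;
};

// Hypothetical lazy cache: defers the query until the first get(), then
// reuses the stored result. A mutex keeps the first call thread-safe.
template <typename T> class LazyValue {
public:
  explicit LazyValue(std::function<T()> Q) : Query(std::move(Q)) {}

  const T &get() const {
    std::lock_guard<std::mutex> Lock(Mutex);
    if (!Ready) {
      Value = Query();
      Ready = true;
    }
    return Value;
  }

private:
  std::function<T()> Query;
  mutable std::mutex Mutex;
  mutable bool Ready = false;
  mutable T Value{};
};
```

Either form ensures that the underlying query runs at most once per cached value, which is the property the review comment asks to preserve when the fp16 and fp64 checks stop reading the cached extensions string.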

Successfully merging this pull request may close these issues.

Tighten spec for extension queries
5 participants