Copyright © 2022-2023 Intel Corporation. All rights reserved.
Khronos® is a registered trademark and SYCL™ and SPIR™ are trademarks of The Khronos Group Inc. OpenCL™ is a trademark of Apple Inc. used by permission by Khronos.
To report problems with this extension, please open a new issue at:
This extension is written against the SYCL 2020 revision 5 specification. All references below to the "core SYCL specification" or to section numbers in the SYCL specification refer to that revision.
This is an experimental extension specification, intended to provide early access to features and gather community feedback. Interfaces defined in this specification are implemented in DPC++, but they are not finalized and may change incompatibly in future versions of DPC++ without prior notice. Shipping software products should not rely on APIs defined in this specification.
There are important limitations with the DPC++ implementation of this experimental extension. In particular, some parts of this extension may only be used when the application is compiled in AOT mode. See the section below titled "Limitations with the experimental version" for a full description of the limitations.
This extension provides a way for device code to query the device architecture on which it is running. This is similar to the sycl_ext_oneapi_device_if extension except the comparison is for the device’s architecture not the device’s aspects. In some cases, low-level application code can use special features or do specific optimizations depending on the device architecture, and this extension enables such applications.
This extension provides a feature-test macro as described in the core SYCL
specification. An implementation supporting this extension must predefine the
macro SYCL_EXT_ONEAPI_DEVICE_ARCHITECTURE
to one of the values defined in the
table below. Applications can test for the existence of this macro to
determine if the implementation supports this feature, or applications can test
the macro’s value to determine which of the extension’s features the
implementation supports.
Value | Description |
---|---|
1 |
The APIs of this experimental extension are not versioned, so the feature-test macro always has this value. |
This extension adds a new enumeration of the architectures that can be tested.
namespace sycl::ext::oneapi::experimental {
enum class architecture : /* unspecified */ {
x86_64,
intel_gpu_bdw,
intel_gpu_skl,
intel_gpu_kbl,
intel_gpu_cfl,
intel_gpu_apl,
intel_gpu_bxt = intel_gpu_apl,
intel_gpu_glk,
intel_gpu_whl,
intel_gpu_aml,
intel_gpu_cml,
intel_gpu_icllp,
intel_gpu_ehl,
intel_gpu_jsl = intel_gpu_ehl,
intel_gpu_tgllp,
intel_gpu_rkl,
intel_gpu_adl_s,
intel_gpu_rpl_s = intel_gpu_adl_s,
intel_gpu_adl_p,
intel_gpu_adl_n,
intel_gpu_dg1,
intel_gpu_acm_g10,
intel_gpu_dg2_g10 = intel_gpu_acm_g10,
intel_gpu_acm_g11,
intel_gpu_dg2_g11 = intel_gpu_acm_g11,
intel_gpu_acm_g12,
intel_gpu_dg2_g12 = intel_gpu_acm_g12,
intel_gpu_pvc,
nvidia_gpu_sm_50,
nvidia_gpu_sm_52,
nvidia_gpu_sm_53,
nvidia_gpu_sm_60,
nvidia_gpu_sm_61,
nvidia_gpu_sm_62,
nvidia_gpu_sm_70,
nvidia_gpu_sm_72,
nvidia_gpu_sm_75,
nvidia_gpu_sm_80,
nvidia_gpu_sm_86,
nvidia_gpu_sm_87,
nvidia_gpu_sm_89,
nvidia_gpu_sm_90,
amd_gpu_gfx700,
amd_gpu_gfx701,
amd_gpu_gfx702,
amd_gpu_gfx801,
amd_gpu_gfx802,
amd_gpu_gfx803,
amd_gpu_gfx805,
amd_gpu_gfx810,
amd_gpu_gfx900,
amd_gpu_gfx902,
amd_gpu_gfx904,
amd_gpu_gfx906,
amd_gpu_gfx908,
amd_gpu_gfx90a,
amd_gpu_gfx1010,
amd_gpu_gfx1011,
amd_gpu_gfx1012,
amd_gpu_gfx1013,
amd_gpu_gfx1030,
amd_gpu_gfx1031,
amd_gpu_gfx1032,
amd_gpu_gfx1034,
intel_gpu_8_0_0 = intel_gpu_bdw,
intel_gpu_9_0_9 = intel_gpu_skl,
intel_gpu_9_1_9 = intel_gpu_kbl
intel_gpu_9_2_9 = intel_gpu_cfl,
intel_gpu_9_3_0 = intel_gpu_apl,
intel_gpu_9_4_0 = intel_gpu_glk,
intel_gpu_9_5_0 = intel_gpu_whl,
intel_gpu_9_6_0 = intel_gpu_aml,
intel_gpu_9_7_0 = intel_gpu_cml,
intel_gpu_11_0_0 = intel_gpu_icllp,
intel_gpu_12_0_0 = intel_gpu_tgllp,
intel_gpu_12_10_0 = intel_gpu_dg1
};
} // namespace sycl::ext::oneapi::experimental
The following table tells which version of this extension first included each of these enumerators, and it provides a brief description of their meanings.
Enumerator name | Added in version | Description |
---|---|---|
|
- |
Any CPU device with the x86_64 instruction set. |
|
- |
Broadwell Intel graphics architecture. |
|
- |
Broadwell Intel graphics architecture. |
|
- |
Kaby Lake Intel graphics architecture. |
|
- |
Coffee Lake Intel graphics architecture. |
|
- |
Apollo Lake Intel graphics architecture. |
|
- |
Gemini Lake Intel graphics architecture. |
|
- |
Whiskey Lake Intel graphics architecture. |
|
- |
Amber Lake Intel graphics architecture. |
|
- |
Comet Lake Intel graphics architecture. |
|
- |
Ice Lake Intel graphics architecture. |
|
- |
Tiger Lake Intel graphics architecture. |
|
- |
Rocket Lake Intel graphics architecture. |
|
- |
Alder Lake S Intel graphics architecture or Raptor Lake Intel graphics architecture. |
|
- |
Alder Lake P Intel graphics architecture. |
|
- |
Alder Lake N Intel graphics architecture. |
|
- |
DG1 Intel graphics architecture. |
|
- |
Alchemist G10 Intel graphics architecture. |
|
- |
Alchemist G11 Intel graphics architecture. |
|
- |
Alchemist G12 Intel graphics architecture. |
|
- |
Ponte Vecchio Intel graphics architecture. |
|
- |
Alias for |
|
- |
Alias for |
|
- |
Alias for |
|
- |
Alias for |
|
- |
Alias for |
|
- |
Alias for |
|
- |
Alias for |
|
- |
Alias for |
|
- |
Alias for |
|
- |
Alias for |
|
- |
Alias for |
|
- |
Alias for |
|
- |
NVIDIA Maxwell architecture (compute capability 5.0). |
|
- |
NVIDIA Maxwell architecture (compute capability 5.2). |
|
- |
NVIDIA Maxwell architecture (compute capability 5.3). |
|
- |
NVIDIA Pascal architecture (compute capability 6.0). |
|
- |
NVIDIA Pascal architecture (compute capability 6.1). |
|
- |
NVIDIA Pascal architecture (compute capability 6.2). |
|
- |
NVIDIA Volta architecture (compute capability 7.0). |
|
- |
NVIDIA Volta architecture (compute capability 7.2). |
|
- |
NVIDIA Turing architecture (compute capability 7.5). |
|
- |
NVIDIA Ampere architecture (compute capability 8.0). |
|
- |
NVIDIA Ampere architecture (compute capability 8.6). |
|
- |
Jetson/Drive AGX Orin architecture. |
|
- |
NVIDIA Ada Lovelace architecture. |
|
- |
NVIDIA Hopper architecture. |
|
- |
AMD GCN GFX7 (Sea Islands (CI)) architecture. |
|
- |
AMD GCN GFX7 (Sea Islands (CI)) architecture. |
|
- |
AMD GCN GFX7 (Sea Islands (CI)) architecture. |
|
- |
AMD GCN GFX8 (Volcanic Islands (VI)) architecture. |
|
- |
AMD GCN GFX8 (Volcanic Islands (VI)) architecture. |
|
- |
AMD GCN GFX8 (Volcanic Islands (VI)) architecture. |
|
- |
AMD GCN GFX8 (Volcanic Islands (VI)) architecture. |
|
- |
AMD GCN GFX8 (Volcanic Islands (VI)) architecture. |
|
- |
AMD GCN GFX9 (Vega) architecture. |
|
- |
AMD GCN GFX9 (Vega) architecture. |
|
- |
AMD GCN GFX9 (Vega) architecture. |
|
- |
AMD GCN GFX9 (Vega) architecture. |
|
- |
AMD GCN GFX9 (Vega) architecture. |
|
- |
AMD GCN GFX9 (Vega) architecture. |
|
- |
AMD GCN GFX10.1 (RDNA 1) architecture. |
|
- |
AMD GCN GFX10.1 (RDNA 1) architecture. |
|
- |
AMD GCN GFX10.1 (RDNA 1) architecture. |
|
- |
AMD GCN GFX10.1 (RDNA 1) architecture. |
|
- |
AMD GCN GFX10.3 (RDNA 2) architecture. |
|
- |
GCN GFX10.3 (RDNA 2) architecture. |
|
- |
GCN GFX10.3 (RDNA 2) architecture. |
|
- |
GCN GFX10.3 (RDNA 2) architecture. |
Note
|
|
This extension adds one new free function which may be called from device code. This function is not available in host code.
namespace sycl::ext::oneapi::experimental {
template<architecture ...Archs, typename T>
/* unspecified */ if_architecture_is(T fn);
} // namespace sycl::ext::oneapi::experimental
This function operates exactly like if_device_has
from the
sycl_ext_oneapi_device_if
extension except that the condition gating execution of the callable function
fn
is determined by the Archs
parameter pack. This condition is true
if
the device which executes if_architecture_is
matches any of the
architectures listed in this pack.
The value returned by if_architecture_is
is an object F of an unspecified
type, which provides the following member functions:
class /* unspecified */ {
public:
template<architecture ...Archs, typename T>
/* unspecified */ else_if_architecture_is(T fn);
template<typename T>
void otherwise(T fn);
};
The otherwise
function behaves exactly like the otherwise
function from the
sycl_ext_oneapi_device_if
extension. The else_if_architecture_is
function behaves exactly like
else_if_device_has
from that extension except that the condition gating
execution of the callable object fn
is determined by the Archs
parameter
pack. This condition is true
only if the object F comes from a previous
call to if_architecture_is
or else_if_architecture_is
whose condition is
false
and if the device calling else_if_architecture_is
has one of the
architectures in the Archs
parameter pack.
This extension adds the following new member function to the device
class,
which returns a Boolean telling whether the device has the specified
architecture.
namespace sycl {
class device {
bool ext_oneapi_architecture_is(
ext::oneapi::experimental::architecture arch);
};
// namespace sycl
Device descriptor | Return type | Description |
---|---|---|
|
|
Returns the architecture of the device |
This device descriptor allows host code such as:
namespace syclex = sycl::ext::oneapi::experimental;
syclex::architecture arch = dev.get_info<syclex::info::device::architecture>();
switch (arch) {
case syclex::architecture::x86_64:
/* ... */
break;
case syclex::architecture::intel_gpu_bdw:
/* ... */
break;
/* etc. */
}
The DPC++ implementation of this extension currently has some important
limitations with the if_architecture_is
free function. In order to use this
feature, the application must be compiled in ahead-of-time (AOT) mode using
-fsycl-targets=<special-target>
where <special-target>
is one of the
"special target values" listed in the users manual
description of the -fsycl-targets
option. These are the target names of the
form "intel_gpu_*", "nvidia_gpu_*", or "amd_gpu_*".
The two APIs device::ext_oneapi_architecture_is
and the
ext::oneapi::experimental::info::device::architecture
device descriptor are
currently supported only for Intel devices (both GPU and CPU). There is no
support yet for Nvidia or AMD devices.
This experimental extension is still evolving. We expect that future versions will include the following:
-
A compile-time constant property that can be used to decorate kernels and non-kernel device functions:
namespace sycl::ext::oneapi::experimental { struct device_architecture_is_key { template <architecture... Archs> using value_t = property_value<device_architecture_is_key, std::integral_constant<architecture, Archs>...>; }; template <architecture... Archs> struct property_value<device_architecture_is_key, std::integral_constant<architecture, Archs>...> { static constexpr std::array<architecture, sizeof...(Archs)> value; }; template <architecture... Archs> inline constexpr device_architecture_is_key::value_t<Archs...> device_architecture_is; } // namespace sycl::ext::oneapi::experimental
This property indicates that a kernel or non-kernel device function uses features that are available on devices with the given architecture list but may not be available on devices with other architectures.
-
Additional enumerators in the
architecture
enumeration. This could include entries for different x86_64 architectures.