=== v1.21.0 === (18 August 2023)
ISPC release with support for function template specializations; changed rules for signed integer overflow, which now match C/C++ behavior and allow more aggressive optimizations; an enhanced ISPC Runtime; multiple stability and performance fixes; and more. The release is based on a patched LLVM 15.0.7.
Language changes:
- Added support for function template specializations with explicit template arguments.
    // Primary template
    template <typename T, typename C> noinline int goo(T argGooOne, C argGooTwo);
    // Specialization with explicit template arguments
    template <> noinline int goo<int, float>(int argGooOne, float argGooTwo);
    // Not supported yet: specialization with implicit template arguments (requires template arguments type deduction)
    template <> noinline int goo(int argGooOne, float argGooTwo);
- Modified behavior for signed integer overflow. ispc now treats signed integer overflow as undefined behavior, as in C and C++. This change may cause compatibility issues. You can manage this behavior using the --[no-]wrap-signed-int compiler switch. The previous default behavior (before version 1.21.0) can be preserved by using --wrap-signed-int, which maintains defined wraparound behavior for signed integers, though it may limit some compiler optimizations.
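Code that relied on signed wraparound can either be built with --wrap-signed-int or rewritten so that no signed overflow occurs. The following is a minimal C++ sketch of the second approach, given here because the new rule matches C/C++ semantics. The helper names wrap_add and add_overflows are hypothetical and not part of ISPC; the same idea should carry over to ISPC's unsigned integer types, but that is an assumption and is not verified here.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper (not part of ISPC): adds two 32-bit signed integers with
// two's-complement wraparound without ever overflowing a signed type.
inline std::int32_t wrap_add(std::int32_t a, std::int32_t b) {
    // Signed-to-unsigned conversion is well defined (modulo 2^32), and
    // unsigned addition wraps modulo 2^32 by definition.
    const std::uint32_t sum =
        static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
    // Copy the bit pattern back; int32_t is guaranteed to be two's complement.
    std::int32_t result;
    std::memcpy(&result, &sum, sizeof result);
    return result;
}

// Hypothetical helper: reports whether a + b would overflow int32_t,
// using only comparisons that cannot overflow themselves.
inline bool add_overflows(std::int32_t a, std::int32_t b) {
    constexpr std::int32_t kMax = std::numeric_limits<std::int32_t>::max();
    constexpr std::int32_t kMin = std::numeric_limits<std::int32_t>::min();
    if (b > 0) return a > kMax - b;  // kMax - b cannot overflow when b > 0
    if (b < 0) return a < kMin - b;  // kMin - b cannot overflow when b < 0
    return false;                    // adding zero never overflows
}
```

Use wrap_add where wraparound is the intended result (hashes, counters), and add_overflows where overflow is an error that should be detected before the addition is performed.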
New hardware support:
Added support for Intel Meteor Lake Xe-LPG graphics:
- added two new ISPC targets: xelpg-x16 and xelpg-x8
- added two new device names: mtl-m and mtl-p
Infrastructure changes:
- ISPC now uses LLVM's new pass manager. The optimization pipeline was modified by introducing an early LoopFullUnrollPass, which in many cases makes loops unrolled by ISPC match manually unrolled loops.
- Introduced ISPC superbuild, which facilitates building ISPC with Xe dependencies (LLVM, L0, vc-intrinsics, SPIRV-Translator). It can generate an archive with dependencies or consume a pre-built archive to build ISPC only. It also enables generating LTO or LTO+PGO enabled builds of LLVM and ISPC.
- Added support for building ISPC with LLVM 16.
New compiler switches:
- --mcmodel switch, which accepts small and large values. The definition is similar to gcc/clang. When the large model is used, it enables programs larger than 2 GB.
- --opt=disable-gathers and --opt=disable-scatters options, which disable generation of gather and scatter instructions on platforms that support them (for performance experiments).
- --[no-]wrap-signed-int switches, which enable or disable preservation of wrap-around behavior on signed integer overflow.
ISPC Runtime improvements:
- Added ispcrtSetTaskingCallbacks to the ISPCRT API, allowing the override of default implementations of ISPCLaunch, ISPCAlloc, and ISPCSync.
- Removed compile-time Level Zero dependency from ISPCRT, no longer necessary after the ISPCRT split into CPU and GPU parts.
Recommended versions of Runtime Dependencies when targeting GPU:
Linux:
- Intel(R) Graphics Compute Runtime https://github.com/intel/compute-runtime/releases/tag/23.22.26516.18
- Level Zero Loader https://github.com/oneapi-src/level-zero/releases/tag/v1.13.5
- Threading Building Blocks (TBB)
Alternatively, you can use a validated gfx driver stack supporting Intel® Arc™ available at https://dgpu-docs.intel.com/driver/installation.html
Windows:
- Intel(R) Graphics Windows(R) DCH Drivers 31.0.101.4644 https://www.intel.com/content/www/us/en/download/726609/intel-arc-iris-xe-graphics-whql-windows.html
- Level Zero Loader https://github.com/oneapi-src/level-zero/releases/tag/v1.13.5
- OpenCL™ Offline Compiler (OCLOC) https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html (this is needed for AoT compilation on Windows only)
Supported GPU platforms: Intel(R) Arc Graphics, 11th-13th Gen Intel(R) Core processor graphics
Component revisions used in GPU-enabled build:
- KhronosGroup/SPIRV-LLVM-Translator@e82ecc2
- intel/vc-intrinsics@910db48
- oneapi-src/level-zero@e1f09b4 (v1.13.5)
- llvm/llvm-project@8dfdcc7 (llvmorg-15.0.7) + patches from llvm_patches folder