
Releases: ispc/ispc

=== v1.16.1 === (15 July 2021)

16 Jul 19:46

A minor ISPC update, which has a bug fix for issue #2111 and is based on a patched version of LLVM 12.0.1.

The bug fix affects x86 targets only and shows up as incorrect code generation for the sequence of shuffle() and reduce_add() stdlib functions.

If you are building ispc from source, note that the fix is implemented as a patch to the LLVM backend, and LLVM must be built with this patch applied for the fix to take effect. A stock build of LLVM 12.0.1 will not contain this bug fix.

=== v1.16.0 === (11 June 2021)

12 Jun 01:15

An ISPC release with language extensions for performance fine tuning, CPU definitions for AlderLake and SapphireRapids targets, support for macOS ARM targets, and a massive update of Intel GPU support. Windows and Linux binaries in this release support both CPU and GPU targets, while the macOS binary supports only CPU. This release is based on patched LLVM 12.0.0.

The language changes include the following:

  • The ability to directly call LLVM intrinsics from ISPC source. This should be handy for performance fine tuning and reaching the hardware instructions not yet covered by the standard library. Note that it is an experimental feature and is enabled only with --enable-llvm-intrinsics switch. Please refer to LLVM Intrinsic Functions section of the user manual for more details.
  • assume() optimization hint, which can be used for communicating assumptions to the optimizer. Unlike assert() calls, it does not produce a runtime check. This is intended for optimizations like removing null pointer checks, removing loop remainders, communicating alignment information to the optimizer, etc. Please refer to Compiler Optimization Hints section of the user manual for more details.
  • Support for stack memory allocations through alloca() calls.
  • trunc() standard library functions.
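
The contrast between assume() and assert() drawn in the list above can be illustrated with a C++ analogue. This is only a sketch: ISPC's assume() is a language built-in, whereas the HINT_ASSUME macro and both function names below are hypothetical and rely on compiler-specific built-ins (__builtin_assume, __builtin_unreachable, __assume).

```cpp
#include <cassert>
#include <cstddef>

// Portable stand-in for an optimizer hint. This macro is an illustrative
// C++ analogue, not part of ISPC or of any standard library.
#if defined(__clang__)
#define HINT_ASSUME(cond) __builtin_assume(cond)
#elif defined(__GNUC__)
#define HINT_ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#elif defined(_MSC_VER)
#define HINT_ASSUME(cond) __assume(cond)
#else
#define HINT_ASSUME(cond) ((void)0)
#endif

// Sums n floats. The hint promises that n is a multiple of 8, so the
// optimizer is free to drop the scalar remainder loop after vectorizing.
// No check is emitted: breaking the promise is undefined behavior.
float sum_multiple_of_8(const float *data, std::size_t n) {
    HINT_ASSUME(n % 8 == 0);
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

// Same loop guarded by assert(): the condition is evaluated at run time
// (unless NDEBUG is defined) and the program aborts when it fails.
float sum_checked(const float *data, std::size_t n) {
    assert(n % 8 == 0);
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}
```

Both functions return the same result for valid input; they differ only in what happens when the condition does not hold.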

Changes for CPU targets:

  • CPU definitions for AlderLake and SapphireRapids were added: alderlake and sapphirerapids respectively.
  • CPU definitions for Apple ARM chips were added: apple-a7, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14.
  • Support for macOS ARM targets was added.

Using GPU-enabled binaries you can build ISPC programs and run them on Intel(R) Core(tm) Processors with Gen9 graphics (formerly Skylake, Kaby Lake, Coffee Lake) and Gen12 graphics (TigerLake mobile CPU) using --target options (genx-x8 and genx-x16) and --cpu option for specifying particular platform (e.g. --cpu=TGLLP).

The main GPU feature of the current release is Windows support. There are also a bunch of stability and performance improvements. Here are some of them:

  • ISPC Runtime gained support for unified shared memory and multiple GPUs. Also, there is a new TaskQueue::submit() method, which starts execution without waiting for completion.
  • Thread private memory was mapped to SVM in VC backend. It greatly improves stability of the current release. It may affect performance on Gen9 graphics but we do not expect any significant changes on Gen12.
  • L0 binary generation was reworked through libocloc. Supported on Linux only.

More details about the current state of GPU support are available here: https://ispc.github.io/ispc_for_gen.html

For build instructions check our docker recipe: https://github.com/ispc/ispc/blob/main/docker/ubuntu/xpu_ispc_build/Dockerfile

GPU support is still in Beta stage so you may experience some issues but we strongly encourage you to try it out and give us feedback! You can reach us through Github discussions and issues, or on Twitter (@ispc_updates).

Runtime Dependencies when targeting GPU:

Linux:

Windows:

Components revisions used in GPU-enabled build:

KhronosGroup/SPIRV-LLVM-Translator@0592c4f
intel/vc-intrinsics@2d0795c
oneapi-src/level-zero@0d30b1f (v1.2.3)
llvm/llvm-project@d28af7c (llvmorg-12.0.0) + patches from llvm_patches folder

=== v1.15.0 === (18 December 2020)

19 Dec 06:51

An ISPC release with several improvements for CPU and Beta support of Intel
graphics hardware architectures. The binaries in this release include CPU versions
for Windows, Linux, and macOS, and a GPU-enabled Linux binary, which supports
both CPU and GPU.
CPU binaries are based on patched LLVM 11.0.0, GPU binary is based on patched
LLVM 10.0.1.

CPU changes include:

  • New loop unroll pragmas: #pragma unroll and #pragma nounroll directives
    provide loop unrolling optimization hints to the compiler. This pragma may be used
    immediately before a loop statement. Currently, this functionality is limited to
    uniform for and do-while.
  • More efficient packed_[load|store]_active() stdlib functions implementation
    (up to 2.5x faster), which now supports 64 bit types.
  • New cpus: icelake-server, tigerlake, alderlake, sapphirerapids.
  • Several stability fixes related to SOA types, bool varying type initialization,
    broken alignment information, type scoping.
  • Compile time improvements.
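
To make the packed_[load|store]_active() item above more concrete, here is a scalar C++ emulation of the compaction that packed_store_active() performs. It is a reference sketch only: the function name, the mask vector, and the lane vector are hypothetical stand-ins for ISPC's execution mask and varying value, and the real stdlib function is vectorized.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar emulation of packed_store_active() semantics: values from lanes
// whose mask entry is set are written contiguously to `out`, and the
// number of values written is returned.
// `lanes`, `mask`, and this function name are illustrative, not ISPC API.
std::size_t packed_store_active_ref(const std::vector<std::int64_t> &lanes,
                                    const std::vector<bool> &mask,
                                    std::int64_t *out) {
    std::size_t written = 0;
    for (std::size_t i = 0; i < lanes.size(); ++i) {
        if (mask[i])
            out[written++] = lanes[i]; // inactive lanes are skipped
    }
    return written;
}
```

The example uses 64-bit elements because that is the case newly supported in this release; `out` must have room for at least as many elements as there are active lanes.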

ISPC support was added to CMake 3.19 so now you can use the standard CMake approach
to find ISPC on the system and use it in your build.
https://cmake.org/cmake/help/latest/release/3.19.html#languages

Using GPU-enabled Linux binary you can build ISPC programs and run them on Intel(R)
Core(tm) Processors with Gen9 graphics (formerly Skylake, Kaby Lake, Coffee Lake) and
Gen12 graphics (TigerLake mobile CPU) using --target options (genx-x8 and
genx-x16) and --cpu option for specifying particular platform (e.g. --cpu=TGLLP).

Stability and performance were significantly improved in this release. Here is the list
of new features:

  • Initial support of ahead of time compilation to oneAPI Level Zero binary format using
    --emit-zebin switch. You can use this binary from ISPC Runtime by setting
    ISPCRT_USE_ZEBIN env variable to 1. Please note that SPIR-V format is still a recommended and default way.
  • Initial function pointers implementation.
  • Global atomics support.
  • Double math functions support.
  • Memory functions support.
  • Reworked masking approach. We disabled genx hardware mask by default and use
    a software mask by default.
  • Improved address spaces differentiation.
  • Initial debug support.
  • TGLLP (TigerLake mobile CPU) support (--cpu=TGLLP).

We also added examples to demonstrate interoperability with oneAPI DPC++ Compiler.
More details about current state of GPU support are available here: https://ispc.github.io/ispc_for_gen.html

For build instructions check our docker recipe:
https://github.com/ispc/ispc/blob/main/docker/ubuntu/gen/Dockerfile

GPU support is in Beta stage so you may experience some issues but we
strongly encourage you to try it out and give us feedback! You can reach us through
Github discussions and issues, ISPC mailing list (ispc-users@googlegroups.com), or on Twitter
(@ispc_updates).

Runtime Dependencies:

Intel(R) Graphics Compute Runtime https://github.com/intel/compute-runtime/releases/tag/20.50.18716
Level Zero Loader https://github.com/oneapi-src/level-zero/releases/tag/v1.0.22
OpenMP Runtime. Consult your Linux distribution documentation for OpenMP runtime installation
instructions. No specific version is required.

Components revisions used in GPU-enabled build:

KhronosGroup/SPIRV-LLVM-Translator@ab5e12a
intel/vc-intrinsics@2de2dd4
oneapi-src/level-zero@c6fa2cd (v1.0.22)
llvm/llvm-project@ef32c61 (llvmorg-10.0.1) + patches from llvm_patches folder

UPDATE: macOS build was updated on 21 Dec 2020.

=== v1.14.1 === (28 August 2020)

29 Aug 00:10

A minor ISPC update with a bug fix for an AVX512 detection problem on macOS
(for more details see issue #1854) and an update of the GPU version to use Level0 v1.0.
CPU binaries are based on patched LLVM 10.0.1.

Runtime Dependencies for GPU-enabled build:

Components revisions used in GPU-enabled build:
KhronosGroup/SPIRV-LLVM-Translator@1a5c52f
intel/vc-intrinsics@f39ff1e
oneapi-src/level-zero@fcc7b7a (v1.0)
llvm/llvm-project@ef32c61 (llvmorg-10.0.1) + patches from llvm_patches folder

=== v1.14.0 === (30 July 2020)

31 Jul 21:39

An ISPC release with several improvements for CPU and initial support of Intel
graphics hardware architectures. The binaries in this release include CPU versions
for Windows, Linux, and macOS, as previous releases, plus a GPU-enabled Linux binary,
which supports both CPU and GPU. CPU binaries are based on patched LLVM 10.0.1.

CPU changes include:

  • new avx2-i8x32, avx2-i16x16, avx512skx-i8x64, avx512skx-i16x32 targets.
  • "generic" targets were removed.
  • several stability fixes, including bugs discovered during fuzzing ISPC by YARPGen.
  • integer division performance improvements.
  • support for __vectorcall calling convention on Windows x64 (enabled by
    '--vectorcall')

Using GPU-enabled Linux binary you can build ISPC programs and run them on Intel(R)
Core(tm) Processors with Gen9 graphics (formerly Skylake, Kaby Lake, Coffee Lake)
using new '--target' options: 'genx-x8' and 'genx-x16'. For code generation ISPC
uses the Vector Compute backend, which is part of 'Intel(R) Graphics Compute Runtime',
through SPIR-V interface. This release also includes ISPC Runtime based on
'oneAPI Level Zero' for GPU and 'OpenMP Runtime' for CPU, which creates unified
abstraction for executing ISPC code on CPU and GPU.

More details are available here: https://ispc.github.io/ispc_for_gen.html

For build instructions check our docker recipe:
https://github.com/ispc/ispc/blob/main/docker/ubuntu/gen/Dockerfile

The stability and performance of the GPU part of this release are not mature yet but we
strongly encourage you to try it out and give us feedback! You can reach us through
Github issues, ISPC mailing list (ispc-users@googlegroups.com), or on Twitter
(@ispc_updates).

Runtime Dependencies

Components revisions used in this build:
KhronosGroup/SPIRV-LLVM-Translator@1e661b2
intel/vc-intrinsics@a0b66f2
oneapi-src/level-zero@317bc0d (v0.91.21)
llvm/llvm-project@d32170d (llvmorg-10.0.0)

=== v1.13.0 === (23 April 2020)

24 Apr 06:27

An ISPC update, which graduates cross-compilation support to production and
has multiple code generation improvements and bug fixes. AVX512 targets may
get the biggest performance boost due to changed internal representation of
masks (we observed up to 5% speedups), and new switch --opt=disable-zmm,
which disables using zmm registers in favour of ymm for avx512skx-i32x16 target.
All targets will definitely benefit from LLVM 10.0 backend used in this release.

Here is the list of other changes:

  • new switch --support-matrix was added to display information about supported
    cross-compilation targets, which are managed by --target-os=<os>,
    --target=<ispc-target>, and --arch=<arch> switches.
  • representation of 'bool' type in storage was changed to match C/C++ (i.e. one
    bool occupies one byte) for better interoperability.
  • type aliases for unsigned types were added: uint8, uint16, uint32,
    uint64, and uint. To detect if these types are supported you can check if
    ISPC_UINT_IS_DEFINED macro is defined.
  • extract()/insert() for boolean arguments, and abs() for all integer and
    FP types were added to standard library.
  • FreeBSD was added to the list of supported target OSes, but it's not well
    tested.

Supported platforms in this release are below. Rows are hosts, columns are
targets. x86 and arm are both 32 and 64 bits, where appropriate.

Windows Linux macOS Android iOS PS4 FreeBSD
Windows x86 x86, arm x86 x86, arm x86 x86, arm
Linux x86, arm x86 x86, arm x86, arm
macOS x86, arm x86 x86, arm arm x86, arm

=== v1.12.0 === (15 August 2019)

03 Apr 04:27

This ISPC update includes experimental cross OS compilation support, ARM and AARCH64
support and a bunch of language features and stability fixes.

Here are the details:

  • ISPC is now a cross-OS compiler: you can build ISPC programs for Windows, Linux,
    macOS, iOS, Android and PS4 targets from Windows, Linux and macOS hosts.
  • ARM and AARCH64 support has been enabled for ISPC. ARM support currently exists for
    neon-i32x4, neon-i8x16 and neon-i16x8 targets. AARCH64 is supported for neon-i32x4
    as well as for a new "double-pumped" 8-wide target: neon-i32x8.
  • A new 128-bit AVX2 target (avx2-i32x4) was added.
  • Added a CPU definition for Ice Lake client CPUs (--cpu=icl). Note that there is no
    special target for new instructions in Ice Lake flavor of AVX512 yet. For now, you
    can use SKX targets (avx512skx-i32x8 and avx512skx-i32x16) with --cpu=icl.
  • Removed the generic targets for KNC and KNL, so ISPC does not have KNC support anymore.
    KNL is still supported through native target (avx512knl-i32x16).
  • Removed AVX1.1 (IvyBridge) targets (use AVX1 targets instead).
  • Introduced new language features:
    • noinline function qualifier.
    • rsqrt_fast() and rcp_fast() functions.
    • Static initialization for varying.
  • A new command line option --emit-llvm-text was added to dump LLVM IR in text format.

An ISPC top-of-trunk build is now available in the Compiler Explorer

The release is based on a patched LLVM 8.0.0 backend.

=== v1.11.0 === (19 April 2019)

03 Apr 04:27

An ISPC update with a bunch of new features and stability bug fixes based on a
patched LLVM 8.0.0 backend.

Notable new features are:

  • A new 256-bit AVX512 target (avx512skx-i32x8).
  • Modified -O1 switch to optimize for size.
  • #pragma once in auto-generated headers.
  • Better debugging support with -O0.

Also we resumed support for PS4 build.

To efficiently write ISPC programs you can now use the ISPC plug-in for VSCode.

=== v1.10.0 === (18 January 2019)

03 Apr 04:26
e8200d0

An ISPC update, which brings several new features, has a bunch of stability and
performance bug fixes, and infrastructure improvements for those who are
interested in participating in hacking on the ISPC trunk. We are also
deprecating KNC support and the KNL-generic target (in favor of the native KNL
target, i.e. avx512knl-i32x16).

We've added:

  • a streaming store and load implementation (see "Streaming Load and Store
    Operations" section in documentation)
  • support for 64 bit wide types in aos_to_soa/soa_to_aos intrinsics
  • an option to specify assembler style (see --x86-asm-syntax switch
    documentation in the help message)
  • a pragma to disable warnings locally (search for #pragma ignore in
    documentation)

Our examples include a new SGEMM example which demonstrates different versions
of matrix multiply with various levels of optimality. It is useful for learning
how to start from a naive implementation and then add various optimizations
afterwards. Also, our build system is now based on CMake, as are the examples.
So you can use it as a reference for integrating ISPC into your CMake-based
project.

For those who are interested in hacking ISPC or trying a bleeding edge
development version, we have CI on Linux (Travis-CI) and Windows (Appveyor),
including automatic package builds on Windows. We also have Dockerfiles, which
demonstrate bringing up your environment for ISPC development.

The release is based on a patched LLVM 5.0.2 backend.

=== v1.9.2 === (10 November 2017)

03 Apr 04:26
417b33e

An ISPC update, which brings out-of-the-box debug support on Windows,
better performance of most of the targets and a bunch of stability
and performance bug fixes.

The release is based on a patched LLVM 5.0 backend.

The Windows build now supports only VS2015 and newer. If you are using earlier
versions, the only known problem that you may encounter is a problem with
print ISPC library function.

AVX512 targets are the main beneficiaries of a newer LLVM backend and
demonstrate the biggest performance improvements. SVML support is also
now available on these targets (requires linking by ICC compiler).