



# SYCL 2020 and the future

### Michael Wong

Acknowledgements: SYCL WG, Rod Burns

SYCL and the SYCL logo are trademarks of the Khronos Group Inc.

CC BY-SA 4.0 licensed presentation




### SYCL Academy



# SYCL 2020 is here!

### Open Standard for Single Source C++ Parallel Heterogeneous Programming

- SYCL 2020 is released after 3 years of intense work
- Significant adoption in Embedded, Desktop and HPC markets
- Improved programmability, smaller code size, faster performance
- Based on C++17, backwards compatible with SYCL 1.2.1
- Simplify porting of standard C++ applications to SYCL
- Closer alignment and integration with ISO C++
- Multiple Backend acceleration and API independent

> SYCL 2020 increases expressiveness and simplicity for modern C++ heterogeneous programming




### SYCL 2020 Industry Momentum







- Unified Shared Memory (USM)
  - Code with pointers can work naturally without buffers or accessors
  - Simplifies porting from most code (e.g. CUDA, C++)
- Parallel Reductions
  - Added built-in reduction operation to avoid boilerplate code and achieve maximum performance on hardware with built-in reduction operation acceleration
- Work group and subgroup algorithms
  - Efficient parallel operations between work items
- Class template argument deduction (CTAD) and template deduction guides
  - Simplified class template instantiation
- Simplified use of Accessors with a built-in reduction operation
  - Reduces boilerplate code and streamlines the use of C++ software design patterns
- Expanded interoperability
  - Efficient acceleration by diverse backend acceleration APIs
- SYCL atomic operations are now more closely aligned to standard C++ atomics
  - Enhances parallel programming freedom
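The CTAD item refers to a C++17 language feature, so it can be illustrated without a SYCL toolchain. The following is a minimal standard C++ sketch, not SYCL code: `DataBlock` and `count_elements` are hypothetical names invented for this illustration, and the deduction guide only mirrors the general pattern that a library class built from an iterator range would need.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical container used only for illustration; it is not a SYCL class.
template <typename T>
class DataBlock {
public:
    // Construct from an iterator range; T cannot be deduced from Iter alone.
    template <typename Iter>
    DataBlock(Iter first, Iter last) : data_(first, last) {}

    std::size_t size() const { return data_.size(); }
    const T& operator[](std::size_t i) const { return data_[i]; }

private:
    std::vector<T> data_;
};

// Deduction guide: tells the compiler how to pick T from the iterator type.
template <typename Iter>
DataBlock(Iter, Iter)
    -> DataBlock<typename std::iterator_traits<Iter>::value_type>;

// Helper showing the simplified instantiation: no <float> is written out.
inline std::size_t count_elements(const std::vector<float>& host) {
    DataBlock block(host.begin(), host.end());
    static_assert(std::is_same_v<decltype(block), DataBlock<float>>,
                  "the guide deduces the element type from the iterators");
    return block.size();
}
```

Without the guide, the call site would have to spell out `DataBlock<float>`; with it, the element type follows from the arguments, which is the kind of boilerplate reduction the bullet describes.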



# Parallel Industry Initiatives



[Figure: timeline of parallel industry initiatives with markers at 2011, 2015, 2017, 2020 and 202X. It shows OpenCL 1.2 (OpenCL C kernel language), OpenCL 2.2, OpenCL 3.0 and SPIR alongside SYCL 2020 (C++17 single-source programming, many backend options) and SYCL 202X (C++20 single-source programming, many backend options), with C++23 also marked.]







### SYCL Ecosystem, Research and Benchmarks

















# oneAPI and SYCL





- SYCL sits at the heart of oneAPI
- Provides an open standard interface for developers
- Defined by the industry

## Nvidia and AMD Support in oneAPI

- Extending DPC++ to target Nvidia and AMD GPUs
- Supporting Perlmutter, Polaris and Frontier supercomputers
- Open source and available to everyone
- Different targets using a simple compiler flag
  - clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda
  - clang++ -fsycl -fsycl-targets=amdgcn-amd-amdhsa











### SYCL Future Evolution



#### SYCL 2020 compared with SYCL 1.2.1

- Easier to integrate with C++17 (CTAD, Deduction Guides...)
- Less verbose, smaller code size, simplify patterns
- Backend independent
- Multiple object archives aka modules simplify interoperability
- Ease porting C++ applications to SYCL
- Enable capabilities to improve programmability
- Backwards compatible, with minor API breaks based on user feedback







# A Demo with C++ Parallel STL







Intel Core i7 7th generation

#### What can I do with a Parallel For Each?

size\_t nElems = 1000u; std::vector<float> nums(nElems);

std::fill\_n(std::begin(nums), nElems, 1);

std::for\_each(std::begin(nums), std::end(nums),
 [=](float& f) { f = f \* f + f; });

Traditional for each uses only one core; the rest of the die is unutilized!














#### What can I do with a Parallel For Each?

size\_t nElems = 1000u;
std::vector<float> nums(nElems);

std::for\_each(std::execution::par,
 std::begin(nums), std::end(nums),
 [=](float& f) { f = f \* f + f; });

#### Workload is distributed across cores!

(mileage may vary, implementation-specific behaviour)
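As a compilable version of the slide's snippet, here is a minimal sketch in standard C++17. The function name `square_plus_self` is made up for this example. How the work is actually scheduled across cores is left to the standard library implementation; with GCC's libstdc++, for instance, the parallel policies are typically backed by TBB and the program may need to be linked with `-ltbb`.

```cpp
#include <algorithm>
#include <cstddef>
#include <execution>
#include <vector>

// Fill a vector with ones, then apply f = f * f + f to every element
// using the C++17 parallel execution policy.
inline std::vector<float> square_plus_self(std::size_t nElems) {
    std::vector<float> nums(nElems);
    std::fill_n(std::begin(nums), nElems, 1.0f);

    // Elements are taken by reference so the result is written back.
    // Each invocation touches only its own element, so there is no data race.
    std::for_each(std::execution::par, std::begin(nums), std::end(nums),
                  [](float& f) { f = f * f + f; });
    return nums;
}
```

Calling `square_plus_self(1000)` returns a vector in which every element has been updated from 1 to 1 * 1 + 1. Swapping `std::execution::par` for `std::execution::seq` gives the single-core behaviour of the traditional for each.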






#### Demo Results - Running std::sort

(Running on Intel i7 6600 CPU & Intel HD Graphics 520)

| size                  | 2^16      | 2^17      | 2^18      | 2^19      |
|-----------------------|-----------|-----------|-----------|-----------|
| std::seq              | 0.27031s  | 0.620068s | 0.669628s | 1.48918s  |
| std::par              | 0.259486s | 0.478032s | 0.444422s | 1.83599s  |
| std::unseq            | 0.24258s  | 0.413909s | 0.456224s | 1.01958s  |
| sycl_execution_policy | 0.273724s | 0.269804s | 0.277747s | 0.399634s |
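To read the table as relative speedups rather than raw times, the ratios can be computed directly. This is a small sketch that copies the `std::seq` and `sycl_execution_policy` rows from the table above; the names `kSeqSeconds`, `kSyclSeconds`, `speedup` and `sycl_speedup_over_seq` are hypothetical helpers, not part of the demo.

```cpp
#include <array>
#include <cstddef>

// Timings in seconds copied from the table above, for sizes 2^16 .. 2^19.
constexpr std::array<double, 4> kSeqSeconds  = {0.27031, 0.620068, 0.669628, 1.48918};
constexpr std::array<double, 4> kSyclSeconds = {0.273724, 0.269804, 0.277747, 0.399634};

// Speedup of a candidate over a baseline: values above 1 mean the
// candidate is faster than the baseline.
constexpr double speedup(double baseline_seconds, double candidate_seconds) {
    return baseline_seconds / candidate_seconds;
}

// Speedup of sycl_execution_policy over std::seq for the i-th size column.
constexpr double sycl_speedup_over_seq(std::size_t i) {
    return speedup(kSeqSeconds[i], kSyclSeconds[i]);
}
```

On these numbers the ratio is just under 1 at 2^16, where the SYCL policy is marginally slower than the sequential run, and it grows to about 2.3x, 2.4x and 3.7x at 2^17, 2^18 and 2^19. These are single measurements from one machine, so they indicate a trend rather than a general result.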


### Enabling Industry Engagement



- SYCL working group values industry feedback
  - https://community.khronos.org/c/sycl
  - https://sycl.tech
- SYCL FAQ
  - https://www.khronos.org/blog/sycl-2020-what-do-you-need-to-know
- What features would you like in future SYCL versions?

Open to all!

- www.khr.io/slack
- https://app.slack.com/client/TDMDFS87M/CE9UX4CHG
- https://community.khronos.org/c/sycl/
- https://stackoverflow.com/questions/tagged/sycl
- https://www.reddit.com/r/sycl
- https://github.com/codeplaysoftware/syclacademy


