
Heterogeneous parallel computing for the web #92

Open
anssiko opened this issue Sep 1, 2020 · 8 comments
Labels: Discussion topic (topic discussed at the workshop), Web Platform Foundations (Web Platform Foundations for Machine Learning)

Comments

@anssiko
Member

anssiko commented Sep 1, 2020

The Heterogeneous parallel programming with open standards using oneAPI and Data Parallel C++ talk by @jeffhammond explains the motivation for new heterogeneous parallel computing abstractions:

The motivation for what we're doing here is that we have an ever-increasing diversity and complexity in computer architecture.

This has been going on for 20 years or so with the introduction of multicore and SIMD units, and obviously GPUs and other forms of accelerators.

And I think, as this audience knows, AI accelerators and special-purpose processors have just exploded over the last few years, and there's really no indication that there's going to be any convergence or simplification in architecture.

We're going to be dealing with this problem for a while. Furthermore, even within families of architectures like GPUs or FPGAs, there are different vendors, different programming models, and different execution models.

The web platform defines a number of Web APIs, as outlined in the Web Platform: a 30,000 feet view / Web Platform and JS environment constraints talk by @dontcallmedom, but it has limited parallel computing capabilities and no APIs for heterogeneous parallel programming.

ML libraries used on the web, such as TensorFlow.js, interface with the WebGL API to let user code make use of parallelized floating-point computation. A way to exploit CPU parallelism has been introduced with WebAssembly SIMD. The next-generation graphics API for the web, WebGPU, is also evolving, adding new features such as the subgroups discussed in #66. For a comparison of how the currently web-exposed parallel computing APIs perform in the ML context, see the TensorFlow.js talk by @annxingyuan.

Specifically, there is no explicit API for accessing dedicated AI accelerators or other domain-specific accelerators. An effort is underway to specify a Web API for hardware-accelerated neural network inference that abstracts away the underlying hardware and as such could make use of XPUs. But this API is domain-specific, not generic.

This brings up a question:

@jeffhammond does the most recent version of SYCL improve on OpenCL's security characteristics in a way that would make it suitable for exposure to the web? WebCL JS bindings did not ship in browsers due to security concerns (OOB memory access, DoS, ...), but that work happened in 2013–14, so I'm sure there have been many advancements in this space in recent years.

@jeffhammond
Collaborator

Unfortunately, I don't know. I do not have any experience with browser security. I will forward your question to colleagues to see if anybody else knows better.

@rolandschulz

My view is: SYCL doesn't try to define its own custom backends. To solve the problem you describe I would suggest: Ask the compiler vendor of your choice to provide a SYCL backend which compiles to WASM SIMD and/or WebGPU/WGSL. That would allow you to write SYCL code which can be executed efficiently for the web.

@anssiko
Member Author

anssiko commented Sep 2, 2020

Thanks @jeffhammond and @rolandschulz for your swift responses.

Ask the compiler vendor of your choice to provide a SYCL backend which compiles to WASM SIMD and/or WebGPU/WGSL. That would allow you to write SYCL code which can be executed efficiently for the web.

As for the SYCL backend, let us get reactions from @lukewagner for the WebAssembly perspective and @Kangz for the WebGPU/WGSL perspective.

I'll also nudge @cynthia who might remember better what happened with WebCL back in 2013-14 and what were the specific security concerns at that time.

@anssiko added the Web Platform Foundations label Sep 3, 2020
@Kangz

Kangz commented Sep 4, 2020

+1 to what @rolandschulz said. The problem with SYCL for the GPU (and with WebCL) is its "physical" addressing mode, where pointers are just numbers that you can manipulate any way you want. This makes it incredibly hard and inefficient to sandbox them on the GPU, because there we can't do the page-table tricks we use for WASM sandboxing.

WebGPU has a "logical" addressing mode where pointers are well-typed and can only be used to obtain pointers to valid internal objects. There are transforms that can convert OpenCL C to the "logical" addressing mode: https://github.com/google/clspv

@jeffhammond
Collaborator

I am confused about the difficulty with sandboxing. SYCL buffers and accessors are designed to be opaque and to support devices that do not support pointers, so I am surprised to hear them described as raw and unsafe.

@Kangz

Kangz commented Sep 4, 2020

I must be mistaken then. I thought SYCL was a kind of C++ subset and had assumed it used a physical addressing model, but if things are opaque then it should be even easier to convert to WGSL for WebGPU.

@jeffhammond
Collaborator

jeffhammond commented Sep 5, 2020

SYCL is based on C++ and it can be implemented using pure C++17 (e.g. https://github.com/triSYCL/triSYCL) but can use OpenCL/SPIR-V or other device back-ends.

I have no personal experience, but Codeplay supports the automotive market (https://www.codeplay.com/solutions/automotive/), which has safety requirements of its own, although perhaps different ones than web browsers have.

@keryell Do you have anything to add here?

@keryell

keryell commented Sep 7, 2020

What graphics and semiconductor people often do not grasp about SYCL is that it is single-source, like OpenMP or CUDA, in the sense that the host code of the application is also part of the SYCL code. It is plain C++ with an offloading DSL defined with pure C++ classes, without any language extension (by contrast with CUDA). This makes it difficult to compare a full application in SYCL to an application made of kernels written in isolation in some programming language (OpenCL C, GLSL, HLSL, ...) plus a host API (OpenCL, OpenGL, Vulkan, DX, ...), with a lot of unsafe type erasure at the boundary between host code and kernel code.
That said, if a SYCL programmer uses a modern C++ coding style instead of an unsafe old C/C++ style (spaghetti code with pointers and casts everywhere), the outlined kernel code should be pretty safe. Good news: the compiler or tooling environment can provide a mode that enforces this safe coding style.
Even clspv cannot compile arbitrary unsafe OpenCL C code to safe typed-buffer SPIR-V; there are some restrictions there too.
