
Add support for NVIDIA and AMD SYCL targets with the intel compiler #1046

Open · rafbiels opened this issue Jun 30, 2023 · 8 comments

I would love to be able to select:

  • language: C++
  • compiler: icx (recent versions)
  • compiler flags: -fsycl -fsycl-targets=nvptx64-nvidia-cuda

and successfully compile and browse PTX device code. It would also be great if the amdgcn-amd-amdhsa target could be supported in a similar way.
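For illustration, a minimal sketch of the requested invocations (sample.cpp is a placeholder file name; the AMD offload-arch flag is an assumption based on Codeplay's getting-started docs, not part of the request):

icpx -fsycl -fsycl-targets=nvptx64-nvidia-cuda sample.cpp        # NVIDIA PTX target
icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa \
     -Xsycl-target-backend --offload-arch=gfx90a sample.cpp      # AMD target (arch value assumed)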

Support for NVIDIA and AMD device code generation from SYCL is part of the open-source Intel LLVM fork, but it is not enabled by default and thus ships neither in the oneAPI toolkit releases nor in the nightly builds on GitHub. Codeplay provides binaries to patch (plug in) the support on top of the oneAPI releases (currently 2023.0 and 2023.1) for a limited set of platforms at https://developer.codeplay.com/. Scripted downloads are currently possible after creating a free user account and generating an API token.

The plug-in installation simply requires downloading and executing a shell script. IIUC we could add it quite easily here:

intel-cpp:

and also add the required CUDA and ROCm device library locations in the compiler environment.
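As a rough sketch of that install step (the installer file name follows Codeplay's release naming and is an assumption here):

sh oneapi-for-nvidia-gpus-2023.1.0-linux.sh   # downloaded from developer.codeplay.com with an API token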

The only question would be library version compatibility. Currently, our plug-in binary release matrix is:

oneAPI     CUDA  ROCm          OS
2023.1.0   12.0  4.5.2, 5.4.3  Ubuntu 22.04
2023.0.0   11.7  4.5.2         Ubuntu 20.04

Some level of compatibility is expected with other CUDA/ROCm versions and OSes, but is not guaranteed. We could discuss providing binaries in another configuration through other channels if these turn out not to work for Compiler Explorer.

Happy to help with configuring / testing this in the near future.

partouf commented Jun 30, 2023

Hi, thanks for filing the issue

We can only run Ubuntu 20.04 because of sandboxing limitations. We have all the CUDA toolkit versions installed, but the only place a driver is installed is on our instances with an NVIDIA GPU, and we have been keeping that up to date with the latest, so that's v12 currently.

What kind of installation does this require: just the files, or the actual driver installed?

And will we need to install the plugin inside the ICX installation directories, or can we place it elsewhere and have something point to it?

partouf commented Jul 1, 2023

A lot seems to be explained here https://developer.codeplay.com/products/oneapi/nvidia/2023.1.0/guides/get-started-guide-nvidia

Haven't looked at the script yet, but

For compiling:

  • requires the PATH and LD_LIBRARY_PATH environment variables to include the CUDA paths; we can do that (sketched below)
  • execution on a GPU instance requires SYCL_DEVICE_FILTER=cuda
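A minimal sketch of that environment setup, assuming a standard CUDA install location (the /usr/local/cuda path is an assumption):

export PATH=/usr/local/cuda/bin:$PATH                           # CUDA tools for the compiler
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH   # CUDA libraries
export SYCL_DEVICE_FILTER=cuda                                  # only needed at run time on a GPU instance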

partouf commented Jul 1, 2023

OK, it does look like it installs itself inside the Intel oneAPI directory, which is a little complicated but not impossible.

It also needs the driver, so we'll need to transfer the compilers themselves to the GPU instances.

And I think it should still work if it's compiled with CUDA 11 libraries and executed on a system with a CUDA 12 driver.

I do think we're probably going to need a testing instance for this; we'll need to do a bunch of things to make that happen first. Cc @mattgodbolt

But I can probably mock up some things locally next week and see if my assumptions are right.

@mattgodbolt

We can make a gpu-beta or gpu-test environment or whatever but we need to ensure we shut that one down fairly eagerly :-)

partouf commented Jul 1, 2023

Ping @rscohn2, just FYI that we'll be working on this.

rscohn2 commented Jul 1, 2023

I did not know there was a plugin to make the product compilers target CUDA. That will be useful.

In #1039 I was proposing publishing the open-source nightly builds. @rafbiels says those do not support CUDA. We were discussing building from source instead of downloading binaries. It would be nice if that could take advantage of whatever you are doing to support the product compilers targeting CUDA. @rafbiels, will the plugin support nightly builds?

For my own work, I would be more likely to use the Intel product compiler than an open-source build to target CUDA, so I hope this happens.

Intel has recently made free Intel GPUs available in the cloud. I hooked up my CI to test on an Intel GPU for every commit and it works well: https://github.com/oneapi-src/distributed-ranges/blob/6b2cb84a84d0d86f9b5e0cfb8dd7ebfa366472f2/.github/workflows/ci.yml#L96
For this work you need AMD/NVIDIA hardware for testing, so it may not be that useful, but I am mentioning it anyway.

rafbiels commented Jul 1, 2023

Many thanks for the prompt follow-up! Some replies / extra info:

  • The CUDA driver is not needed for plugin installation or for compilation, just the CUDA libs. AFAIK the driver is only needed for execution.
  • CUDA drivers are backward compatible, so to my knowledge it's fine to compile with CUDA 11 libs and execute with a CUDA 12 driver.
  • Indeed, the plugin installs itself in the oneAPI directory. There is an option --extract-only which unpacks the lib but skips the installation; see --help. Potentially, hooking its path into the environment is sufficient, so you don't need to touch the oneAPI installation (see the sketch just below).
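A hedged sketch of that approach (the installer file name is illustrative, following Codeplay's release naming):

sh oneapi-for-nvidia-gpus-2023.1.0-linux.sh --extract-only   # unpack without modifying the oneAPI install
# then point the environment (e.g. LD_LIBRARY_PATH) at the extracted lib directory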
  • In fact, compilation for the nvptx64 target seems possible even without the plugin and CUDA libs using the -nocudalib option, but I don't think PTX can be extracted then: https://godbolt.org/z/nMzY8jf9G. Execution will certainly not work without the plugin + CUDA libs + CUDA driver + NVIDIA GPU.
  • To extract PTX from the icpx output binary (assuming a.out), one can do:
clang-offload-extract a.out --output=device-code   # split out the embedded device image(s)
cuobjdump --dump-ptx device-code.0                 # dump PTX from the first extracted image

The clang-offload-extract tool ships with the oneAPI installation, in the bin-llvm directory, e.g. oneapi/2023.1.0.46401/compiler/2023.1.0/linux/bin-llvm/clang-offload-extract.

  • The optional SYCL_DEVICE_SELECTOR/ONEAPI_DEVICE_SELECTOR env variable influences the runtime behaviour of the default device selector in SYCL. User code may explicitly select other devices (which may or may not succeed, depending on whether they're available). If the binary is compiled only for nvptx64 and the NVIDIA GPU, driver, and libs are available, it will select the NVIDIA GPU. I think the env variable shouldn't be needed (an illustrative setting is sketched below).
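For completeness, an illustrative setting of that variable (the cuda:* value follows the ONEAPI_DEVICE_SELECTOR backend:device syntax; treat it as an assumption and consult the oneAPI docs):

export ONEAPI_DEVICE_SELECTOR=cuda:*   # restrict the default selector to the CUDA backend at run time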

@rscohn2 the nightly build binaries also don't include the NVIDIA/AMD support. If building from source, one needs to pass the --cuda / --hip options to the configuration script (see the sketch below):
https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md
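A minimal sketch of that configuration, following the intel/llvm GetStartedGuide (run from a checkout of the sycl branch; script paths as in the guide):

python ./buildbot/configure.py --cuda --hip   # enable the NVIDIA and AMD backends
python ./buildbot/compile.py                  # build the toolchain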

Let me know if I can help with testing in any way.

partouf commented Jul 1, 2023

All sounds good, thanks!
