
Add support for NVIDIA and AMD SYCL targets with the intel compiler #1046

Open · rafbiels opened this issue Jun 30, 2023 · 8 comments

I would love to be able to select:

  • language: C++
  • compiler: icx (recent versions)
  • compiler flags: -fsycl -fsycl-targets=nvptx64-nvidia-cuda

and successfully compile and browse PTX device code. It would also be great if the amdgcn-amd-amdhsa target could be supported in a similar way.
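For illustration, a minimal sketch of the requested invocations (sample.cpp is a placeholder file name; the AMD offload-arch flag is an assumption based on Codeplay's getting-started docs, not part of the request):

icpx -fsycl -fsycl-targets=nvptx64-nvidia-cuda sample.cpp        # NVIDIA PTX target
icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa \
     -Xsycl-target-backend --offload-arch=gfx90a sample.cpp      # AMD target (arch value assumed)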

Support for NVIDIA and AMD device code generation from SYCL is part of the open-source Intel LLVM fork, but it is not enabled by default and thus ships neither in the oneAPI toolkit releases nor in the nightly builds on GitHub. Codeplay provides binaries to patch (plug in) the support on top of the oneAPI releases (currently 2023.0 and 2023.1) for a limited set of platforms at https://developer.codeplay.com/. Scripted downloads are currently possible after creating a free user account and generating an API token.

The plug-in installation simply requires downloading and executing a shell script. IIUC we could add it quite easily here:

intel-cpp:

and also add the required CUDA and ROCm device library locations in the compiler environment.
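As a rough sketch of that install step (the installer file name follows Codeplay's release naming and is an assumption here):

sh oneapi-for-nvidia-gpus-2023.1.0-linux.sh   # downloaded from developer.codeplay.com with an API token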

The only question would be library version compatibility. Currently, our plug-in binary release matrix is:

oneAPI     CUDA  ROCm          OS
2023.1.0   12.0  4.5.2, 5.4.3  Ubuntu 22.04
2023.0.0   11.7  4.5.2         Ubuntu 20.04

Some level of compatibility is expected with other CUDA/ROCm versions and OSes, but is not guaranteed. We could discuss providing binaries in another configuration through other channels if these turn out not to work for Compiler Explorer.

Happy to help with configuring / testing this in the near future.

partouf commented Jun 30, 2023

Hi, thanks for filing the issue

We can only run Ubuntu 20.04 because of sandboxing limitations. We have all the CUDA toolkit versions installed, but the only place a driver is installed is on our instances with an NVIDIA GPU, and we have been keeping that up to date with the latest, so that's v12 currently.

What kind of installation does this require: just the files, or the actual driver installed?

And will we need to install the plugin inside the ICX installation directories, or can we place it elsewhere and have something point to it?

partouf commented Jul 1, 2023

A lot seems to be explained here https://developer.codeplay.com/products/oneapi/nvidia/2023.1.0/guides/get-started-guide-nvidia

Haven't looked at the script yet, but

For compiling:

  • requires the PATH and LD_LIBRARY_PATH environment variables to include the CUDA paths; we can do that (sketched below)
  • execution on a GPU instance requires SYCL_DEVICE_FILTER=cuda
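A minimal sketch of that environment setup, assuming a standard CUDA install location (the /usr/local/cuda path is an assumption):

export PATH=/usr/local/cuda/bin:$PATH                           # CUDA tools for the compiler
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH   # CUDA libraries
export SYCL_DEVICE_FILTER=cuda                                  # only needed at run time on a GPU instance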

partouf commented Jul 1, 2023

OK, it does look like it installs itself inside the Intel oneAPI directory, which is a little complicated but not impossible.

It also needs the driver, so we'll need to transfer the compilers themselves to the GPU instances.

And I think it should still work if it's compiled with CUDA 11 libraries and executed on a system with a CUDA 12 driver.

I do think we're probably going to need a testing instance for this; we'll need to do a bunch of things to make that happen first. Cc @mattgodbolt

But I can probably mock up some things locally next week and see if my assumptions are right.

@mattgodbolt

We can make a gpu-beta or gpu-test environment or whatever but we need to ensure we shut that one down fairly eagerly :-)

partouf commented Jul 1, 2023

Ping @rscohn2, just FYI that we'll be working on this.

rscohn2 commented Jul 1, 2023

I did not know there was a plugin to make the product compilers target CUDA. That will be useful.

In #1039 I was proposing publishing the open-source nightly builds. @rafbiels says those do not support CUDA. We were discussing building from source instead of downloading binaries. It would be nice if that could take advantage of whatever you are doing to support the product compilers targeting CUDA. @rafbiels, will the plugin support nightly builds?

For my own work, I would be more likely to use the Intel product compiler than an open-source build to target CUDA, so I hope this happens.

Intel has recently made free Intel GPUs available in the cloud. I hooked up my CI to test on an Intel GPU for every commit and it works well: https://github.com/oneapi-src/distributed-ranges/blob/6b2cb84a84d0d86f9b5e0cfb8dd7ebfa366472f2/.github/workflows/ci.yml#L96
For this work you need AMD/NVIDIA hardware for testing, so it may not be that useful, but I am mentioning it anyway.

rafbiels commented Jul 1, 2023

Many thanks for the prompt follow-up! Some replies / extra info:

  • The CUDA driver is not needed for plugin installation or for compilation, just the CUDA libs. AFAIK the driver is only needed for execution.
  • CUDA drivers are backward compatible, so to my knowledge it's fine to compile with CUDA 11 libs and execute with a CUDA 12 driver.
  • Indeed, the plugin installs itself in the oneAPI directory. There is an option --extract-only which unpacks the lib but skips the installation; see --help. Potentially, hooking its path into the environment is sufficient, so you don't need to touch the oneAPI installation (see the sketch just below).
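A hedged sketch of that approach (the installer file name is illustrative, following Codeplay's release naming):

sh oneapi-for-nvidia-gpus-2023.1.0-linux.sh --extract-only   # unpack without modifying the oneAPI install
# then point the environment (e.g. LD_LIBRARY_PATH) at the extracted lib directory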
  • In fact, compilation for the nvptx64 target seems possible even without the plugin and CUDA libs using the -nocudalib option, but I don't think PTX can be extracted then: https://godbolt.org/z/nMzY8jf9G. Execution will certainly not work without the plugin + CUDA libs + CUDA driver + NVIDIA GPU.
  • To extract PTX from the icpx output binary (assuming a.out), one can do:
clang-offload-extract a.out --output=device-code   # split out the embedded device image(s)
cuobjdump --dump-ptx device-code.0                 # dump PTX from the first extracted image

The clang-offload-extract tool ships with the oneAPI installation, in the bin-llvm directory, e.g. oneapi/2023.1.0.46401/compiler/2023.1.0/linux/bin-llvm/clang-offload-extract.

  • The optional SYCL_DEVICE_SELECTOR/ONEAPI_DEVICE_SELECTOR env variable influences the runtime behaviour of the default device selector in SYCL. User code may explicitly select other devices (which may or may not succeed, depending on whether they're available). If the binary is compiled only for nvptx64 and the NVIDIA GPU, driver, and libs are available, it will select the NVIDIA GPU. I think the env variable shouldn't be needed (an illustrative setting is sketched below).
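For completeness, an illustrative setting of that variable (the cuda:* value follows the ONEAPI_DEVICE_SELECTOR backend:device syntax; treat it as an assumption and consult the oneAPI docs):

export ONEAPI_DEVICE_SELECTOR=cuda:*   # restrict the default selector to the CUDA backend at run time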

@rscohn2 the nightly build binaries also don't include the NVIDIA/AMD support. If building from source, one needs to pass the --cuda / --hip options to the configuration script (see the sketch below):
https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md
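A minimal sketch of that configuration, following the intel/llvm GetStartedGuide (run from a checkout of the sycl branch; script paths as in the guide):

python ./buildbot/configure.py --cuda --hip   # enable the NVIDIA and AMD backends
python ./buildbot/compile.py                  # build the toolchain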

Let me know if I can help with testing in any way.

partouf commented Jul 1, 2023

All sounds good, thanks!
