Skip to content
OpenCL/SPIR-V implementation of HIP
C++ C LLVM Objective-C CMake
Branch: master
Clone or download
Latest commit 8d9d398 Aug 1, 2019
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
doc Initial commit Jul 26, 2019
lib Fix LLVM_LINK in CMake Jul 30, 2019
samples Initial commit Jul 26, 2019
spdlog Initial commit Jul 26, 2019
CMakeLists.txt Fix LLVM_LINK in CMake Jul 30, 2019
LICENSE Initial commit Jul 26, 2019
README.md Minor fixes to README.md Jul 31, 2019

README.md

HIPCL library


What is HIP?

Heterogeneous-compute Interface for Portability, or HIP, is a C++ runtime API and kernel language that allows developers to write code that runs on both AMD and NVidia GPUs. CUDA applications can be converted to HIP in a largely automated fashion.

What is HIPCL

HIPCL is a library that allows applications using the HIP API to be run on devices which support OpenCL and SPIR-V, thus providing a portability path from CUDA to OpenCL. HIPCL development is led by Customized Parallel Computing group of Tampere University, Finland.

Building HIPCL


There are a few extra install/usage options documented in 'doc' directory.

HIPCL has some prerequisites to build:

  • LLVM + patched Clang
  • LLVM-SPIRV translator tool from Khronos
  • An OpenCL implementation with (at least partial) 2.x support; HIPCL requires Shared Virtual Memory and clCreateProgramWithIL() support

Clang + LLVM

You'll need to build a patched Clang that can compile HIP source code to ELF+SPIR-V fat binaries.

Download LLVM + Clang:

git clone https://github.com/llvm-mirror/llvm.git
cd llvm
git checkout -b release_80 origin/release_80
cd tools
git clone https://github.com/cpc/hipcl-clang.git clang
cd clang
git checkout -b release_80 origin/release_80

Build+install LLVM/Clang:

cmake -DCMAKE_INSTALL_PREFIX=<llvm_destination_dir> [other cmake flags] llvm-git-directory
make
sudo make install

LLVM-SPIRV Translator

download, build+install the LLVM-SPIRV translator:

git clone https://github.com/KhronosGroup/SPIRV-LLVM-Translator.git
cd SPIRV-LLVM-Translator
git checkout -b release_80 origin/llvm_release_80
mkdir build; cd build
cmake -DLLVM_DIR=<llvm_destination_dir>/lib/cmake/llvm ..
make llvm-spirv
sudo cp tools/llvm-spirv/llvm-spirv <llvm_destination_dir>/bin/

Known Supported OpenCL Implementations

At least Intel's "NEO" OpenCL implementation supports 2.x and SPIR-V on Intel GPUs.

It's also possible to use a sufficiently recent (2019/07+) POCL, but it must be built with LLVM-SPIRV support:

git clone https://github.com/pocl/pocl.git
cd pocl
mkdir build; cd build
cmake -DCMAKE_INSTALL_PREFIX=/usr \
      -DWITH_LLVM_CONFIG=<llvm_destination_dir>/bin/llvm-config \
      -DLLVM_SPIRV=<llvm_destination_dir>/bin/llvm-spirv \
      ..
make
sudo make install

The last step (sudo make install) is optional - it's possible to use Pocl from build directory (by exporting some env variables: POCL_BULDING=1 and OCL_ICD_VENDORS=<pocl-build-dir>/ocl-vendors). Note that -DCMAKE_INSTALL_PREFIX=/usr implies system-wide installation. See https://github.com/pocl/pocl/blob/master/doc/sphinx/source/install.rst for details.

Whatever you end up using, make sure that clinfo lists your chosen OpenCL implementation.

Build HIPCL Library

build+install the HIPCL library:

git clone https://github.com/cpc/hipcl.git
cd hipcl
mkdir build ; cd build;
cmake -DCMAKE_INSTALL_PREFIX=<hipcl_install_dir> \
      -DCMAKE_CXX_COMPILER=<llvm_destination_dir>/bin/clang++ \
      -DCMAKE_C_COMPILER=<llvm_destination_dir>/bin/clang \
      ..
make

The samples directory contains some examples; these can be run from build directory, individually or via ctest.

make install will create <hipcl_install_dir>/{lib/libhipcl.so, share/kernellib.bc, include/*.hh} and copy the examples to <hipcl_install_dir>/bin directory.

Note that CMake removes RPATH at make install time, which means that the samples installed into <hipcl_install_dir>/bin will look for libhipcl.so in the default system library paths (/usr/lib and such).

Example

To convert a CUDA source to HIP source, use the hipify-clang tool from AMD's HIP repository: https://github.com/ROCm-Developer-Tools/HIP/tree/master/hipify-clang

Usage:

hipify-clang [hipify args] -- [clang cuda args]

E.g.

./hipify-clang -inplace -print-stats example.cu -- -x cuda --cuda-path=/usr/local/cuda-8.0 -I /usr/local/cuda-8.0/samples/common/inc

This should produce a source with CUDA API translated to HIP API calls. To build a HIPCL executable from this source:

Compile:

export CXX_FLAGS="-x hip --target=x86_64-linux-gnu --hip-device-lib-path=<hipcl_install_dir>/share --hip-device-lib=mathlib.bc -pthread -std=c++11"
export CXX_INCLUDES="-I<hipcl_install_dir>/include"
<llvm_destination_dir>/bin/clang++ $CXX_INCLUDES $CXX_FLAGS -o example.o -c example_hip.cc

Link:

<llvm_destination_dir>/bin/clang++ -o example example.o -L<hipcl_install_dir>/lib -lhipcl -pthread -lOpenCL

Known Issues


Some of these are simply not yet implemented, some are missing because they would require an OpenCL extension.

Device Side / Math Library

OpenCL Extension required:

  • __fsqrt_rd and various intrinsics for add/sub/div/mul with predefined rounding mode (currently these are mapped to OpenCL variants with default rounding mode)
  • __shfl and various special functions. Some may be possible to implement via subgroups in opencl 2.0

Not yet implemented features:

  • Statically sized shared memory works, but dynamic shared memory is not yet implemented

Host runtime API

Implemented with caveats:

  • hipEventElapsedTime() can return imprecise values

  • hipModuleLaunchKernel accepts the "extra" argument, but the size of pointed to memory (HIP_LAUNCH_PARAM_BUFFER_SIZE) must be exactly the sum of sizes of individual arguments - no padding is allowed. Otherwise it's impossible to figure out how to set the OpenCL kernel arguments.

  • A certain amount of Device properties are impossible to get via OpenCL API. Values reported by HIPCL are completely made up.

Not implemented and/or require extension to OpenCL:

  • hipError_t hipModuleLoadData(hipModule_t* module, const void* image);

  • hipError_t hipModuleLoadDataEx(hipModule_t *module, const void *image,...);

    This API is not possible to implement with SPIR-V binaries, because there is no size parameter (only a void* pointer), and SPIR-V binaries don't have their size embedded. It might be possible to implement with disassembled text format of SPIR-V.

  • hipSetDeviceFlags(unsigned flags)

    The flags change how the runtime waits for results (yield thread to OS or busy waiting / spinning)

  • hipError_t hipStreamCreateWithPriority(hipStream_t* stream, unsigned int flags, int priority);

  • hipError_t hipStreamCreateWithFlags(hipStream_t* stream, unsigned int flags);

  • hipError_t hipEventCreateWithFlags(hipEvent_t* event, unsigned flags);

  • hipError_t hipDeviceSetCacheConfig(hipFuncCache_t cacheConfig);

  • hipError_t hipDeviceGetCacheConfig(hipFuncCache_t* cacheConfig);

  • hipError_t hipDeviceGetSharedMemConfig(hipSharedMemConfig* pConfig);

  • hipError_t hipDeviceSetSharedMemConfig(hipSharedMemConfig config);

  • hipError_t hipFuncSetCacheConfig(const void* func, hipFuncCache_t config);

  • hipError_t hipCtxSetCacheConfig(hipFuncCache_t cacheConfig);

  • hipError_t hipCtxSetSharedMemConfig(hipSharedMemConfig config);

  • hipError_t hipPointerGetAttributes(hipPointerAttribute_t* attributes, const void* ptr);

  • hipError_t hipExtMallocWithFlags(void** ptr, size_t sizeBytes, unsigned int flags);

  • hipError_t hipGetDeviceProperties(hipDeviceProp_t *prop, int deviceId);

  • hipError_t hipStreamQuery(hipStream_t stream)

  • hipError_t hipModuleGetGlobal(hipDeviceptr_t* dptr, size_t* bytes, hipModule_t hmod, const char* name);

  • hipError_t hipHostGetFlags(unsigned int* flagsPtr, void* hostPtr);

  • hipError_t hipFuncGetAttributes(hipFuncAttributes* attr, const void* func);

  • hipError_t hipDeviceGetPCIBusId(char* pciBusId, int len, int device);

  • hipError_t hipDeviceGetByPCIBusId(int* device, const char* pciBusId);

  • hipError_t hipSetDeviceFlags(unsigned flags);

Symbol API Not Implemented
  • hipError_t hipMemcpyToSymbolAsync(void*, const void*, size_t, size_t, hipMemcpyKind, hipStream_t, const char*);
  • hipError_t hipMemcpyFromSymbol(void*, const void*, size_t, size_t, hipMemcpyKind, const char*);
  • hipError_t hipMemcpyFromSymbolAsync(void*, const void*, size_t, size_t, hipMemcpyKind, hipStream_t, const char*);
  • hipError_t hipGetSymbolAddress(void** devPtr, const void* symbolName);
  • hipError_t hipGetSymbolSize(size_t* size, const void* symbolName);
  • hipError_t hipMemcpyToSymbol(void*, const void*, size_t, size_t, hipMemcpyKind, const char*);
  • hipError_t hipMemcpyToSymbol(const void* symbolName, const void* src, size_t sizeBytes, size_t offset __dparm(0), hipMemcpyKind kind __dparm(hipMemcpyHostToDevice));
Peer2Peer Functions Are Not Implemented Yet
  • hipError_t hipDeviceCanAccessPeer(int* canAccessPeer, int deviceId, int peerDeviceId);
  • hipError_t hipDeviceEnablePeerAccess(int peerDeviceId, unsigned int flags);
  • hipError_t hipDeviceDisablePeerAccess(int peerDeviceId);
  • hipError_t hipMemcpyPeer(void* dst, int dstDeviceId, const void* src, int srcDeviceId, size_t sizeBytes);
  • hipError_t hipMemcpyPeerAsync(void* dst, int dstDeviceId, const void* src, int srcDevice, size_t sizeBytes, hipStream_t stream __dparm(0));
PROFILER Not implemented
  • hipError_t hipProfilerStart();
  • hipError_t hipProfilerStop();
API CALLBACKs Not implemented
  • hipError_t hipRegisterApiCallback(uint32_t id, void* fun, void* arg);
  • hipError_t hipRemoveApiCallback(uint32_t id);
  • hipError_t hipRegisterActivityCallback(uint32_t id, void* fun, void* arg);
  • hipError_t hipRemoveActivityCallback(uint32_t id);
TEXTURES not implemented
You can’t perform that action at this time.