Autotuning example CUDA programs

Grigori Fursin edited this page Jun 5, 2018 · 6 revisions

Please check the Getting Started Guides first to understand CK concepts!

Example of compiling and running a simple CUDA application via CK

Some of our colleagues use CUDA for their HPC applications and asked us to support it in CK.

We have shared all the necessary functionality (Linux and Windows) as well as various benchmarks, which you can obtain by pulling the following repository:

 $ ck pull repo:ctuning-programs

We expect that you have the CUDA drivers and toolkit (with the nvcc compiler) installed on your system.

You just need to register the nvcc compiler in CK via

 $ ck setup soft:compiler.cuda

CK will ask you for the version number and the path to the CUDA toolkit (on Linux, enter /usr if your nvcc compiler is in /usr/bin).

If registration succeeds, you can see the registered CUDA environment in CK via
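The path that CK asks for is the directory above the bin directory that holds nvcc. The following is a minimal sketch, assuming a POSIX shell and that nvcc is reachable through PATH; the helper name toolkit_root is hypothetical and is not part of CK.

```shell
# Hypothetical helper (not part of CK): given the full path to a compiler
# binary, print the directory two levels up, e.g. /usr/bin/nvcc -> /usr.
toolkit_root() {
    dirname "$(dirname "$1")"
}

# Look up nvcc in PATH; the result is empty if it is not installed.
nvcc_path="$(command -v nvcc || true)"

if [ -n "$nvcc_path" ]; then
    echo "nvcc found at: $nvcc_path"
    echo "Toolkit path to enter in CK: $(toolkit_root "$nvcc_path")"
else
    echo "nvcc not found in PATH; install the CUDA toolkit first"
fi
```

If nvcc in PATH is a symbolic link, the printed directory may differ from the real toolkit location, so treat the result as a hint rather than a definitive answer.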

 $ ck show env --tags=cuda

Now you can try our demo CUDA program, which should print your CUDA device and its compute capabilities. First compile it:

 $ ck compile program:tool-print-cuda-devices

If compilation succeeds, you can run this code as follows:

 $ ck run program:tool-print-cuda-devices

Normally, you should see brief information about your device.

Sometimes you may need sudo to run such code. In that case, add the --sudo flag:

 $ ck run program:tool-print-cuda-devices --sudo
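The two run variants above can be combined into one step. This is only a sketch: the helper name run_with_sudo_fallback and the CK variable are hypothetical, and it assumes that ck exits with a non-zero status when the run fails.

```shell
# Command used to invoke CK. Overriding it (e.g. CK=echo) gives a dry run.
# This variable is an illustrative assumption, not a CK feature.
CK="${CK:-ck}"

# Hypothetical helper: run a CK program and, if that fails,
# retry the same run with the --sudo flag.
run_with_sudo_fallback() {
    program="$1"
    if $CK run "program:$program"; then
        return 0
    fi
    echo "Plain run failed; retrying with --sudo" >&2
    $CK run "program:$program" --sudo
}
```

For example, run_with_sudo_fallback tool-print-cuda-devices tries the plain run first and falls back to --sudo only when needed.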

If you can see information about your device, you can use all the shared CUDA benchmarks for compilation, execution, autotuning and run-time adaptation, just like any other standard CK benchmark described in the previous "Getting Started Guide" sections.

You can list the shared CUDA benchmarks via

 $ ck list program:*cuda*

Note that these benchmarks use our OpenME plugin library to expose internal application parameters and characteristics to the outside world. However, OpenME cannot be compiled with nvcc, so you need either to install it using a standard compiler or to compile and run a non-CUDA benchmark first (the library will then be installed automatically).

For example, first compile the cbench-automotive-susan benchmark via

 $ ck compile program:cbench-automotive-susan

If compilation succeeds, you can now compile and run CUDA benchmarks using our universal program compilation and execution workflow (pipeline):

 $ ck run pipeline:program program_uoa=polybench-cuda-2mm
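The ordering described above (build a non-CUDA benchmark so that OpenME gets installed, then run the CUDA benchmark through the pipeline) can be sketched as a small shell helper. The helper name run_cuda_benchmark and the CK variable are hypothetical; only the ck commands themselves come from this page.

```shell
# Command used to invoke CK. Overriding it (e.g. CK=echo) gives a dry run.
# This variable is an illustrative assumption, not a CK feature.
CK="${CK:-ck}"

# Hypothetical helper: compile a non-CUDA benchmark first (this installs
# the OpenME library), then run the given CUDA benchmark via the pipeline.
run_cuda_benchmark() {
    cuda_program="$1"
    # Stop early if the non-CUDA build fails.
    $CK compile program:cbench-automotive-susan || return 1
    $CK run pipeline:program "program_uoa=$cuda_program"
}
```

For example, run_cuda_benchmark polybench-cuda-2mm performs both steps in order and stops if the first one fails.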

Similarly, you can use the OpenCL library provided with CUDA by registering it in CK via

 $ ck setup soft:lib.opencl.cuda

In that case, you should be able to compile and run various shared OpenCL benchmarks such as 2mm via

 $ ck run pipeline:program program_uoa=polybench-opencl-2mm

Questions and comments

You are welcome to get in touch with the CK community if you have questions or comments!
