Command line interface to the NVIDIA OpenCL compiler
C CMake
Switch branches/tags
Nothing to show
Clone or download
ljbade Merge pull request #4 from MichaelE1000/master
Handle read-only input files. (Wrong flags in previous commit)
Latest commit 8dfa8d3 Apr 28, 2017


clcc - an NVIDIA OpenCL command line compiler

I was having problems with the NVIDIA compiler crashing on my code. As I was poking around trying to figure out what was causing the crash, I figured out the shared library interface to the NVIDIA OpenCL clang/llvm compiler.

I created a little program that uses this compiler (bundled with the drivers) to do command line compiling of code.

I hope this will be useful to people who want to integrate a OpenCL C to PTX compiler into their build chain. You might be able to run ptxas on the resulting PTX file to get cubin, but I have not yet investigated if clCreateProgramWithBinary will accept cubin.

The other useful feature is that you can control the compiler options to select a different target CUDA Compute Level (sm_XX) than the card currently installed. You can even run the compiler without needing a GPU installed at all - just extract needed nvcompiler.dll/nvcompiler32.dll/ file from the latest driver installer. I imagine this will be useful for automated/dedicated build machines that likely lack a GPU.

The format of the command is this:
clcc ["build-options"] output.ptx
"build-options" - a quoted string containing the build options to pass to the NVIDIA compiler (equivalent to clBuildProgram's options string).
It accepts all the OpenCL 1.0/1.1 clBuildProgram options (, the NVIDIA extended ones from cl_nv_compiler_options (, and these hidden extras:
-cl-nv-arch sm_XX - selects the target CUDA Compute Level architecture to compile for (sm_10 for 1.0, sm_11 for 1.1, sm_12 for 1.2, sm_13 for 1.3, sm_20 for 2.0, sm_21 for 2.1, sm_30 for 3.0 and sm_35 for 3.5)
-cl-nv-cstd=CLX.X - selects the target OpenCL C language version (CL1.0 or CL1.1)
(there may be others) - the source OpenCL C file
output.ptx - the destination PTX file

Currently I only support one input per PTX file, but the API supports multiple inputs so I may add that feature.

Try it out, test it, use it etc. and give me feedback, bugs, suggestions etc via GitHub (or on the NVIDIA forum thread).

This version worked on Windows when I wrote it (back in December 2010) with the then current NVIDIA drivers. If there are any problems with newer drivers or fixes needed please log an issue on GitHub.

I have not been able to test the Linux version for a while but other users have reported it works fine. If there are any problems on Linux or fixes needed please log an issue on GitHub.

The files include both Visual Studio 2008 for Windows build, and CMake script for Linux build.

Feel free to discuss on the NVIDIA forums: