CUDA and OpenCL wrapper library for use in GPU clusters


Welcome to libfairydust!

libfairydust is a wrapper around NVIDIA's CUDA and OpenCL implementations.
AMD's `stream' (APP) OpenCL implementation is also supported.

Report bugs and problems to <>

~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ 
FAQ Index:

1.0 Why should I use libfairydust?
1.1 Great! How can I compile libfairydust?

2.0 How to use libfairydust
2.1 Playing with the environment (local testing)
2.2 Using fairyd (cluster installation)

~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ 

1.0 Why should I use libfairydust?
libfairydust was written to be used in GPU-Clusters where multiple GPUs
are attached to one physical host and each GPU can be used by a different

Most people do stuff like this:

 nvret = cudaSetDevice(0);           /* always asks for the first GPU */
 if(nvret != cudaSuccess) {
    die_in_some_fatal_way("oh noes!");
 }

This works great on your local workstation with one GPU but fails in 
spectacular ways on a GPU cluster. In the best case your code will simply crash.
In the worst case *all* your processes will just run on GPU#0 and you won't
even know about it.

Libfairydust solves this problem by simulating `fake-devices': If an application
requests GPU#0, libfairydust can re-route all requests to some other GPU!
This requires *NO* changes to existing applications. The application won't even
know about it: libfairydust works 100% transparently.
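
The re-routing idea can be illustrated with a small, self-contained sketch.
This is NOT the actual libfairydust code: the table, `real_set_device' and
`wrapped_set_device' are hypothetical names, and the vendor call is replaced
by a stub so the example compiles without CUDA. It only shows the principle
of translating a virtual device number before passing it on.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical mapping table: index = virtual device, value = hardware device. */
#define MAX_VIRTUAL 8
static int hw_for_virtual[MAX_VIRTUAL] = { 1 };  /* virtual 0 -> hardware 1 */
static int n_virtual = 1;                        /* the app sees one GPU    */

/* Stand-in for the vendor library call (assumption: the real wrapper would
 * forward to the vendor's function here). It records the requested device. */
static int last_hw_device = -1;
static int real_set_device(int hw)
{
    last_hw_device = hw;
    return 0;
}

/* What the application ends up calling: translate, then forward. */
int wrapped_set_device(int virt)
{
    if (virt < 0 || virt >= n_virtual)
        return -1;               /* no such device from the app's view */
    return real_set_device(hw_for_virtual[virt]);
}
```

With this table, a call to wrapped_set_device(0) reaches the stub with
hardware device 1, while a request for virtual device 1 is rejected because
the application is only shown a single GPU.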

1.1 Great! How can I compile libfairydust?
Libfairydust has been tested with CUDA 3.2 and gcc 4.1.2 + 4.4.4 on x86_64 Linux. Support for 32-bit installations is currently not implemented.

Compiling it is as easy as doing:

 $ cd libfairydust/src
 $ make

 This should produce a file called ./obj/

(Note: The makefile expects to find all CUDA headers at
 /usr/local/cuda/include and/or at $(CUDA_INSTALL_PATH)/include)

To compile an AMD-compatible version, run:
 $ cd libfairydust/src
 $ make atidust

 This should produce a file called ./obj/
 (Note: The rest of this documentation will talk about
  `' - the ati version works in exactly
  the same way - just replace the library name when needed)

2.0 How to use libfairydust
The library needs to be preloaded by ld-linux.
To do this, run something like this in your shell:

 $ export LD_PRELOAD=/path/to/obj/

That's it: Libfairydust will now hijack all CUDA/OpenCL calls!

2.1 Playing with the environment (local testing)
Running any CUDA/OpenCL application with libfairydust preloaded
should now show an error message like this:

$ ./deviceQuery
 ./deviceQuery Starting...
  CUDA Device Query (Runtime API) version (CUDART static linking)
 fairydust.c starting up - (C) 2010-2011 Adrian Ulrich <>
 fairydust.c release git-108, compiled with gcc-4.4.4 and cuda-3020 at 1295351458
 connection to fairyd failed: aborting

Booh! What happened? libfairydust asked 'fairyd' for some new GPUs but fairyd isn't
running yet.

Don't worry: We can use libfairydust without a running fairyd instance:

 $ export FDUST_FORCE_DEBUG=1   # get some debug info
 $ FDUST_ALLOCATE=0 ./deviceQuery

`deviceQuery' will now show exactly ONE GPU (even if you have 10 GPUs).

If you have multiple GPUs you can try to run

 $ FDUST_ALLOCATE=1 ./deviceQuery

It will still show only one GPU, but this time libfairydust tricked libcuda into
believing that GPU#1 is actually GPU#0.

You might also have noticed that libfairydust changed the device names into something
like this:

   Device = GeForce 9500 GT - fdust{v:h}={0:1}

In this case libfairydust simulated a VIRTUAL (v) device 0.
The REAL (h, hardware) device behind this virtual device is device 1.
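
If a tool wants to recover this mapping from the device name, something like
the following sketch could work. It assumes the suffix always looks exactly
like the single example above (`fdust{v:h}={V:H}' with decimal numbers); the
function name `parse_fdust_suffix' is made up for this illustration and is
not part of libfairydust.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Extract the virtual and hardware ids from a device name such as
 * "GeForce 9500 GT - fdust{v:h}={0:1}".
 * Returns 1 on success, 0 if the suffix is missing or malformed.
 * Assumption: the suffix format matches the example in this README. */
int parse_fdust_suffix(const char *name, int *virt, int *hw)
{
    const char *tag = strstr(name, "fdust{v:h}={");
    if (tag == NULL)
        return 0;                /* name was not rewritten */
    return sscanf(tag, "fdust{v:h}={%d:%d}", virt, hw) == 2;
}
```

For the name shown above this yields virt=0 and hw=1; a plain name without
the suffix makes the function return 0.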

`FDUST_ALLOCATE' also supports multiple GPUs: Setting it to
 "0 3 2"
 Will map:

Setting FDUST_ALLOCATE="@" will cause libfairydust to return all GPUs it finds.
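
A value like "0 3 2" could be turned into a mapping table roughly as sketched
below. This is an illustration only, not the real parser: `parse_allocation'
is a hypothetical name, and the sketch ASSUMES that the position in the list
is the virtual device number and the value is the hardware device number. The
"@" wildcard is not handled here and is simply rejected as malformed.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a whitespace-separated list such as "0 3 2" into hw[].
 * ASSUMPTION: hw[i] is the hardware device for virtual device i.
 * Returns the number of entries, or -1 on malformed input or overflow. */
int parse_allocation(const char *spec, int hw[], int max)
{
    int n = 0;
    const char *p = spec;

    while (*p != '\0') {
        char *end;
        long v = strtol(p, &end, 10);   /* skips leading whitespace itself */

        if (end == p) {
            /* nothing converted: trailing whitespace is fine, junk is not */
            while (*p == ' ' || *p == '\t')
                p++;
            if (*p == '\0')
                break;
            return -1;
        }
        if (v < 0 || v > INT_MAX || n >= max)
            return -1;
        hw[n++] = (int)v;
        p = end;
    }
    return n;
}
```

Under the stated assumption, "0 3 2" gives three entries: hw[0]=0, hw[1]=3
and hw[2]=2.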

2.2 Using fairyd (cluster installation)
In a 'real-world' installation you will have to adapt to your batch system.
The included version of works with LSF 7.x and was written for use on the
Brutus-Cluster of the ETH Zurich.

Blabla: Fixme: Write more text :-)

 - add startup workflow (connect to -> reply)
 - protocol description
 - testing implementation (use xterm as launcher?)
