CUDA Random Forests for Image Labeling (CURFIL)
This project is an open source implementation with NVIDIA CUDA that accelerates random forest training and prediction for image labeling by using the massively parallel computing power offered by GPUs.
CURFIL is the result of Benedikt Waldvogel’s master's thesis "Accelerating Random Forests on CPUs and GPUs for Object-Class Image Segmentation" at the University of Bonn, Autonomous Intelligent Systems.
Implemented Visual Features
We currently focus on image labeling tasks such as image segmentation or classification. We implement two types of image features, which are described in more detail in the documentation of visual features.
To build the C++ library and the binaries, you will need:
- cmake (and cmake-curses-gui for easy configuration)
- ndarray (included as git submodule)
- GCC 4.4 or higher
- Boost 1.46 or higher
- NVIDIA CUDA™ 5.0 or higher
- Thrust - included in CUDA since 4.0
- Vigra Impex
- Intel TBB
- MDBQ (optional, required for hyperopt parameter search)
- A CUDA capable GPU with compute capability 2.0 or higher
    git clone --recursive https://github.com/deeplearningais/curfil.git  # --recursive will also init the submodules
    cd curfil
    mkdir -p build
    cd build
    cmake -DCMAKE_BUILD_TYPE=Release ..  # change to 'Debug' to build the debugging version
    ccmake .  # adjust paths to your system (cuda, thrust, ...)!
    make -j
    ctest  # run tests to see if it went well
    sudo make install
Refer to your local Unix expert if you do not know what to do with these instructions.
Pay special attention to the CUDA_ARCHITECTURE variable in ccmake. If you have a very recent GPU (e.g. TITAN), you probably want to enable support for device capability 5.2 by adding/replacing with
Training and prediction require loading a set of images from a dataset. We
currently only support datasets that contain RGB-D images, as for example
captured by the Microsoft Kinect or the Asus Xtion PRO LIVE. RGB-D images have
three channels that encode the color information and one channel for the depth
of each pixel. Depth is the distance of the object to the camera. Note that
stereo cameras such as the Kinect do not guarantee to deliver a valid depth
measure for every pixel in the image. Distance cannot be measured if the object
is occluded for one of the two cameras. Missing or invalid distance is either
encoded with a zero value or by using the special floating point value
To load images from disk, we use a format similar to that of the RGB-D object dataset of Kevin Lai et al.
We expect to find the color image, depth information and the ground truth in three files in the same folder.
All images must have the same size. Datasets with varying image sizes must be padded manually.
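Padding can be done with any image tool. As a minimal sketch, assuming an image held in memory as a list of rows of RGB tuples, the following pads an image on its right and bottom edges up to a common target size. The function name and the padding color are hypothetical; choose a color that does not occur in the dataset.

```python
def pad_image(pixels, target_width, target_height, pad_color=(255, 0, 255)):
    """Pad a row-major image (list of rows of RGB tuples) to the target size.

    pad_color is a hypothetical choice for illustration; it should be a
    color that does not otherwise occur in the dataset.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    if width > target_width or height > target_height:
        raise ValueError("image is larger than the target size")
    # Extend every existing row to the right, then append full padding rows.
    padded = [list(row) + [pad_color] * (target_width - width) for row in pixels]
    padded += [[pad_color] * target_width for _ in range(target_height - height)]
    return padded


small = [[(1, 2, 3), (4, 5, 6)]]  # an image 2 pixels wide and 1 pixel high
padded = pad_image(small, 3, 2)   # pad to 3 pixels wide and 2 pixels high
```

The same padding would have to be applied to the color image, the depth image and the ground truth of each sample so that all three keep identical dimensions.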
You can specify to skip the padding color when sampling the dataset by using the
The filename schema and format is
uint8 RGB image where pixels take on values between 0 and 255
uint16 depth image. Each pixel gives the depth in millimeters, with 0 denoting missing depth. The depth image can be read using MATLAB with the standard function (imread), and in OpenCV by loading it into an image of type
uint8 RGB image where pixels take on values between 0 and 255. Each color represents a different class label. Black indicates "void" or "background".
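To illustrate how these encodings can be interpreted once the files are loaded, here is a small sketch that works on plain Python lists. It converts raw millimeter depth values to meters and maps ground truth colors to integer class ids. The use of NaN for missing depth, the `VOID` id and the color palette are assumptions made for this example, not values taken from CURFIL.

```python
import math


def depth_mm_to_meters(depth_row):
    """Convert uint16 depth values in millimeters to meters.

    A raw value of 0 denotes missing depth; this sketch maps it to NaN,
    which is an assumption about the in-memory representation.
    """
    return [d / 1000.0 if d != 0 else math.nan for d in depth_row]


def colors_to_labels(label_row, color_to_class):
    """Map ground truth RGB colors to integer class ids.

    color_to_class is a hypothetical lookup table supplied by the caller.
    """
    return [color_to_class[color] for color in label_row]


VOID = -1  # hypothetical id for "void" or "background"
palette = {(0, 0, 0): VOID, (255, 0, 0): 0, (0, 255, 0): 1}  # made-up colors

depths = depth_mm_to_meters([0, 1500, 820])
labels = colors_to_labels([(0, 0, 0), (255, 0, 0), (0, 255, 0)], palette)
```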
Use the binary
The training process produces a random forest consisting of multiple decision trees that are serialized to compressed JSON files, one file per tree.
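The one-file-per-tree layout can be pictured with the following sketch. It assumes gzip compression, a made-up file naming scheme and a toy node layout; the schema that CURFIL actually writes may differ.

```python
import gzip
import json
import os
import tempfile


def save_forest(trees, folder):
    """Write each tree to its own gzip-compressed JSON file."""
    paths = []
    for index, tree in enumerate(trees):
        # Hypothetical naming scheme, chosen for this example only.
        path = os.path.join(folder, "tree{}.json.gz".format(index))
        with gzip.open(path, "wt", encoding="utf-8") as handle:
            json.dump(tree, handle)
        paths.append(path)
    return paths


def load_forest(paths):
    """Read the trees back from their compressed JSON files."""
    trees = []
    for path in paths:
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            trees.append(json.load(handle))
    return trees


# Toy trees with a made-up node layout.
forest = [
    {"feature": "depth", "threshold": 0.5, "left": {"label": 0}, "right": {"label": 1}},
    {"label": 1},
]
with tempfile.TemporaryDirectory() as folder:
    paths = save_forest(forest, folder)
    file_names = [os.path.basename(path) for path in paths]
    restored = load_forest(paths)
```

Keeping one file per tree means a forest can be assembled from any subset of trees without rewriting the others.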
See the documentation of training parameters.
Use the binary
The program reads the trees from the compressed JSON files and performs a dense pixel-wise classification of the specified input images. Prediction is accelerated on the GPU and runs at real-time speed even on mobile GPUs such as the NVIDIA GeForce GTX 675M.
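As a CPU-only illustration of what dense pixel-wise classification with a forest means, the sketch below sends every pixel down each tree, averages the class histograms stored in the leaves it reaches, and picks the most probable class. The tree layout and the precomputed per-pixel feature tuples are assumptions for this example; CURFIL evaluates its own visual features on the GPU.

```python
def classify_pixel(tree, features):
    """Walk one toy decision tree down to a leaf and return its class histogram."""
    node = tree
    while "histogram" not in node:
        go_left = features[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["histogram"]


def predict_image(forest, feature_image):
    """Label every pixel with the class that has the highest averaged probability."""
    labels = []
    for row in feature_image:
        label_row = []
        for features in row:
            histograms = [classify_pixel(tree, features) for tree in forest]
            # Average the per-class probabilities over all trees.
            mean = [sum(values) / len(forest) for values in zip(*histograms)]
            label_row.append(mean.index(max(mean)))
        labels.append(label_row)
    return labels


# Two toy trees over two hypothetical features and two classes.
toy_forest = [
    {"feature": 0, "threshold": 0.5,
     "left": {"histogram": [0.9, 0.1]}, "right": {"histogram": [0.2, 0.8]}},
    {"feature": 1, "threshold": 1.0,
     "left": {"histogram": [0.6, 0.4]}, "right": {"histogram": [0.3, 0.7]}},
]
toy_image = [[(0.1, 0.5), (0.9, 2.0)]]  # one row with two pixels
result = predict_image(toy_forest, toy_image)
```

Every pixel is classified independently of its neighbors, which is what makes this step a natural fit for GPU parallelism.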
Also see documentation of prediction parameters.
Hyperopt Parameter Search
Use the binary
This Hyperopt client is only built if MDBQ is installed. The client fetches hyperopt trials (jobs) from a MongoDB database and performs 5-fold cross-validation to evaluate the loss. You can run the hyperopt client in parallel on as many machines as desired.
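The cross-validation step can be sketched as follows. This is a generic k-fold split in plain Python, not the client's actual code; the image ids and the placeholder loss function are hypothetical stand-ins for training a forest on the training split and measuring its error on the held-out split.

```python
def k_fold_splits(items, k=5):
    """Yield (train, test) pairs; each item is in the test split exactly once."""
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        yield train, folds[i]


def cross_validated_loss(items, evaluate, k=5):
    """Average the loss returned by evaluate(train, test) over k folds."""
    losses = [evaluate(train, test) for train, test in k_fold_splits(items, k)]
    return sum(losses) / len(losses)


images = ["img{:02d}".format(i) for i in range(10)]  # hypothetical image ids
splits = list(k_fold_splits(images))

# Placeholder loss: the fraction of images that were held out in this fold.
mean_loss = cross_validated_loss(images, lambda train, test: len(test) / len(images))
```

Averaging over the folds gives one loss value per trial, which is the quantity a parameter search would try to minimize.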
The trials need to be inserted into the database in advance. We include sample Python scripts in scripts/. Note that there is only one new trial in the database at any given point in time. Thus, the Python script needs to be running during the entire parameter search.
- Make sure the MongoDB database is up and running.
- Run the python script that inserts new trials. Example:
- Run curfil_hyperopt on as many machines as desired.