Grasp Pose Detection (GPD)

Grasp Pose Detection (GPD) is a package to detect 6-DOF grasp poses (3-DOF position and 3-DOF orientation) for a 2-finger robot hand (e.g., a parallel jaw gripper) in 3D point clouds. GPD takes a point cloud as input and produces pose estimates of viable grasps as output. The main strengths of GPD are:

  • works for novel objects (no CAD models required for detection),
  • works in dense clutter, and
  • outputs 6-DOF grasp poses (enabling more than just top-down grasps).

UR5 demo

GPD consists of two main steps: sampling a large number of grasp candidates, and classifying these candidates as viable grasps or not.

Example Input and Output

The reference for this package is: Grasp Pose Detection in Point Clouds.

Table of Contents

  1. Requirements
  2. Installation
  3. Generate Grasps for a Point Cloud File
  4. Parameters
  5. Views
  6. Input Channels for Neural Network
  7. CNN Frameworks
  8. Network Training
  9. Grasp Image/Descriptor
  10. References
  11. Troubleshooting Tips

1) Requirements

  1. PCL 1.9 or newer
  2. Eigen 3.0 or newer
  3. OpenCV 3.4 or newer

2) Installation

The following instructions have been tested on Ubuntu 16.04. Similar instructions should work for other Linux distributions.

  1. Install PCL and Eigen. If you have ROS Indigo or Kinetic installed, you should be good to go.

  2. Install OpenCV 3.4 (tutorial).

  3. Clone the repository into some folder:

    git clone
  4. Build the package:

    cd gpd
    mkdir build && cd build
    cmake ..
    make -j

You can optionally install GPD with sudo make install so that it can be used by other projects as a shared library.

If the build fails, try modifying the compiler flags, CMAKE_CXX_FLAGS, in the file CMakeLists.txt.

3) Generate Grasps for a Point Cloud File

Run GPD on a point cloud file (PCD or PLY):

./detect_grasps ../cfg/eigen_params.cfg ../tutorials/krylon.pcd

The output should look similar to the screenshot shown below. The window is the PCL viewer. You can press [q] to close the window and [h] to see a list of other commands.

Below is a visualization of the convention that GPD uses for the grasp pose (position and orientation) of a grasp. The grasp position is indicated by the orange cross and the orientation by the colored arrows.

4) Parameters

Brief explanations of parameters are given in cfg/eigen_params.cfg.

The two parameters that you will typically adjust to increase the number of grasps found are workspace and num_samples. The first defines the volume of space in which to search for grasps as a cuboid of dimensions [minX, maxX, minY, maxY, minZ, maxZ], centered at the origin of the point cloud frame. The second is the number of samples that are drawn from the point cloud to detect grasps. Set the workspace as small as possible while still containing the objects you want to grasp, and set the number of samples as large as your runtime budget allows.
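
As a rough illustration of what the workspace parameter does, the sketch below keeps only the points that fall inside the cuboid [minX, maxX, minY, maxY, minZ, maxZ]. It is plain Python and is not GPD's actual implementation; the function name filter_workspace and all sample values are hypothetical.

```python
def filter_workspace(points, workspace):
    """Keep only the points that lie inside the workspace cuboid.

    points:    iterable of (x, y, z) tuples in the point cloud frame
    workspace: [minX, maxX, minY, maxY, minZ, maxZ], same order as in the text
    """
    min_x, max_x, min_y, max_y, min_z, max_z = workspace
    return [
        (x, y, z)
        for (x, y, z) in points
        if min_x <= x <= max_x and min_y <= y <= max_y and min_z <= z <= max_z
    ]


# Hypothetical example values, not taken from any GPD config file.
cloud = [(0.0, 0.0, 0.5), (0.9, 0.0, 0.5), (0.1, -0.2, 0.3), (0.1, 0.2, 1.5)]
workspace = [-0.5, 0.5, -0.5, 0.5, 0.0, 1.0]
inside = filter_workspace(cloud, workspace)
print(len(inside), "of", len(cloud), "points are inside the workspace")
```

A tighter workspace leaves fewer points to draw samples from, so a given num_samples should cover the region of interest more densely.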

Most of the code is parallelized. To improve runtime, set num_threads to the number of (physical) CPU cores that your computer has available.

5) Views

rviz screenshot

You can use this package with one or two depth sensors. The package comes with Caffe model files for both. You can find these files in models/caffe/15channels. For a single sensor, use single_view_15_channels.caffemodel; for two depth sensors, use two_views_15_channels_[angle]. The [angle] is the angle between the two sensor views, as illustrated in the picture below. In the two-view setting, register the two point clouds together before sending them to GPD.

Specifying the camera position in the configuration file (*.cfg) is important because it lets PCL orient the estimated surface normals correctly (they should point toward the camera). Alternatively, when using the ROS wrapper, multiple camera positions can be provided.
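
A common way to resolve the sign of an estimated normal is to flip it whenever it points away from the viewpoint. The sketch below shows that check in plain Python. It only illustrates the idea and is not GPD's or PCL's code; the function name orient_normal_toward_camera and the sample values are hypothetical.

```python
def orient_normal_toward_camera(point, normal, camera_position):
    """Return the normal, flipped if necessary so that it faces the camera."""
    # Vector from the surface point to the camera.
    to_camera = [c - p for c, p in zip(camera_position, point)]
    # A negative dot product means the normal points away from the camera.
    dot = sum(n * v for n, v in zip(normal, to_camera))
    if dot < 0:
        return [-n for n in normal]
    return list(normal)


# Hypothetical values: camera at the origin, surface point one metre in front of it.
camera = [0.0, 0.0, 0.0]
point = [0.0, 0.0, 1.0]
print(orient_normal_toward_camera(point, [0.0, 0.0, 1.0], camera))   # flipped
print(orient_normal_toward_camera(point, [0.0, 0.0, -1.0], camera))  # unchanged
```

If the configured camera position is wrong, the dot product changes sign for some points and their normals end up reversed, which is why this setting matters.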

rviz screenshot

To switch between one and two sensor views, change the parameter weight_file in your config file.

6) Input Channels for Neural Network

The package comes with weight files for two different input representations for the neural network that decides whether a grasp is viable: 3 or 15 channels. The default is 15 channels. However, you can use 3 channels to achieve better runtime at the cost of some grasp quality. For more details, please see the references below.

7) CNN Frameworks

GPD comes with several classifier frameworks that exploit different hardware and have different dependencies. Switching between the frameworks requires running CMake with additional arguments. For example, to use the OpenVINO framework:


You can use ccmake to check out all possible CMake options.

GPD supports the following three frameworks:

  1. OpenVINO: installation instructions for open source version (CPUs, GPUs, FPGAs from Intel)
  2. Caffe (GPUs from Nvidia or CPUs)
  3. Custom LeNet implementation using the Eigen library (CPU)

Additional classifiers can be added by sub-classing the classifier interface.


OpenVINO is recommended for speed. To use OpenVINO, you need to run the following command before compiling GPD.

export InferenceEngine_DIR=/path/to/dldt/inference-engine/build/

8) Network Training

To create training data with the C++ code, you need to install the OpenCV 3.4 contrib modules. Next, you need to compile GPD with the CMake flag -DBUILD_DATA_GENERATION like this:

cd gpd
mkdir build && cd build
cmake .. -DBUILD_DATA_GENERATION=ON
make -j

There are five steps to train a network to predict grasp poses. First, we need to create grasp images.

./generate_data ../cfg/generate_data.cfg

You should modify generate_data.cfg according to your needs.

Next, you need to resize the created databases to train_offset and test_offset (see the terminal output of generate_data). For example, to resize the training set, use the following commands with size set to the value of train_offset.

cd pytorch
python pathToTrainingSet.h5 out.h5 size

The third step is to train a neural network. The easiest way to train the network is with the existing code, which requires the PyTorch framework. To train a network, use the following commands.

cd pytorch
python pathToTrainingSet.h5 pathToTestSet.h5 num_channels

The fourth step is to convert the model to the ONNX format.

python pathToPytorchModel.pwf pathToONNXModel.onnx num_channels

The last step is to convert the ONNX file to an OpenVINO compatible format: tutorial. This gives two files that can be loaded with GPD by modifying the weight_file and model_file parameters in a CFG file.

9) Grasp Image/Descriptor

Generate some grasp poses and their corresponding images/descriptors:

./test_grasp_image ../tutorials/krylon.pcd 3456 1 ../models/lenet/15channels/params/

For details on how the grasp image is created, check out our journal paper.

10) References

If you like this package and use it in your own work, please cite our journal paper [1]. If you're interested in the (shorter) conference version, check out [2].

[1] Andreas ten Pas, Marcus Gualtieri, Kate Saenko, and Robert Platt. Grasp Pose Detection in Point Clouds. The International Journal of Robotics Research, Vol 36, Issue 13-14, pp. 1455-1473. October 2017.

[2] Marcus Gualtieri, Andreas ten Pas, Kate Saenko, and Robert Platt. High precision grasp pose detection in dense clutter. IROS 2016, pp. 598-605.

11) Troubleshooting Tips

  1. Remove the cmake cache: rm CMakeCache.txt
  2. make clean
  3. Remove the build folder and rebuild.
  4. Update gcc and g++ to a version > 5.