Efficient Heatmap-Guided 6-Dof Grasp Detection in Cluttered Scenes
RA-L 2023

Official code for the paper "Efficient Heatmap-Guided 6-Dof Grasp Detection in Cluttered Scenes".

Framework

Requirements

Python >= 3.8
PyTorch >= 1.10
pytorch3d
numpy==1.23.5
pandas
cupoch
numba
grasp_nms
matplotlib
open3d
opencv-python
scikit-image
tensorboardX
torchsummary
tqdm
transforms3d
trimesh
autolab_core
cvxopt

Installation

This code has been tested on Ubuntu 20.04 with CUDA 11.1/11.3/11.6, Python 3.8/3.9, and PyTorch 1.11.0/1.12.0.

Get the code.

git clone https://github.com/THU-VCLab/HGGD.git

Create a new Conda environment and activate it.

conda create -n hggd python=3.8
conda activate hggd
cd HGGD

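Install PyTorch and PyTorch3D manually. The commands below target Python 3.8, CUDA 11.3, and PyTorch 1.11.0; adjust the version tags if your setup differs.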

# pytorch-1.11.0
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
# pytorch3d
pip install fvcore
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu113_pyt1110/download.html

Install the remaining packages via pip.

pip install -r requirements.txt
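After installation, a quick check that the main dependencies can be located may save a failed run later. The snippet below is a minimal sketch and not part of the repository: the helper name `find_missing` is hypothetical, and the list covers only a subset of the packages under Requirements, using their import names (which differ from the pip names for opencv-python and scikit-image).

```python
import importlib.util

# Import names for some of the packages listed under Requirements.
# Some import names differ from the pip names
# (opencv-python -> cv2, scikit-image -> skimage).
REQUIRED_MODULES = [
    "torch", "pytorch3d", "numpy", "pandas", "cupoch", "numba",
    "open3d", "cv2", "skimage", "tensorboardX", "tqdm",
    "transforms3d", "trimesh",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

This only checks that the modules can be found; it does not verify versions or that PyTorch can see the GPU.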

Usage

Checkpoint

Checkpoints (realsense/kinect) can be downloaded from Tsinghua Cloud.

Preprocessed Dataset

Preprocessed datasets (realsense.7z/kinect.7z) can be downloaded from Tsinghua Cloud. They contain the converted and refined grasp poses for each image in the GraspNet dataset.

Train

Training code has been released; please refer to the training script (train_graspnet.sh).

Typical hyperparameters:

batch-size # batch size, default: 4
step-cnt # step number for gradient accumulation, actual_batch_size = batch_size * step_cnt, default: 2
lr # learning rate, default: 1e-2
anchor-num # spatial rotation anchor number, default: 7
anchor-k # in-plane rotation anchor number, default: 6
anchor-w # grasp width anchor size, default: 50
anchor-z # grasp depth anchor size, default: 20
all-points-num # point cloud downsample number, default: 25600
group-num # number of points sampled per local region by farthest point sampling (FPS), default: 512
center-num # sampled local center/region number, default: 128
noise # point cloud noise scale, default: 0
ratio # grasp attributes prediction downsample ratio, default: 8
grid-size # grid size for our grid-based center sampling, default: 8
scene-l & scene-r # scene range, train: 0~100, seen: 100~130, similar: 130~160, novel: 160~190
input-w & input-h # downsampled input image size, should be 640x360
loc-a & reg-b & cls-c & offset-d # loss multipliers, default: 1, 5, 1, 1
epochs # training epoch number, default: 15
num-workers # dataloader worker number, default: 4
save-freq # checkpoint saving frequency, default: 1
optim # optimizer, default: 'adamw'
dataset-path # our preprocessed dataset path (read grasp poses)
scene-path  # original graspnet dataset path (read images)
joint-trainning # whether to jointly train the two parts of the network (the flag name keeps the misspelling "trainning"; pass it as written)
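The relation between batch-size and step-cnt can be made concrete with a small sketch. The parser below is hypothetical: it copies a few flag names and defaults from the list above, but the actual training script may define them differently. The helper `effective_batch_size` is likewise an illustrative name.

```python
import argparse


def build_parser():
    # Hypothetical parser: flag names and defaults are copied from the
    # hyperparameter list, not from the repository's training script.
    parser = argparse.ArgumentParser(description="HGGD training options (sketch)")
    parser.add_argument("--batch-size", type=int, default=4)
    parser.add_argument("--step-cnt", type=int, default=2)
    parser.add_argument("--lr", type=float, default=1e-2)
    parser.add_argument("--epochs", type=int, default=15)
    return parser


def effective_batch_size(args):
    # Gradients are accumulated over step_cnt mini-batches before each
    # optimizer step, so one update covers batch_size * step_cnt samples.
    return args.batch_size * args.step_cnt


# Parse an empty argument list so that only the defaults are used.
args = build_parser().parse_args([])
print(effective_batch_size(args))  # 8
```

With the defaults, each optimizer step covers 8 samples. If GPU memory forces a smaller batch-size, raising step-cnt by the same factor keeps the effective batch size unchanged.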

Test

Download and unzip our preprocessed datasets (provided for convenience). Alternatively, you can remove the unnecessary parts of our test code and read images directly through the original GraspNet dataset API.

Run the test code (reads RGB and depth images from the GraspNet dataset and evaluates the detected grasps).

bash test_graspnet.sh

Attention: if you want to change the camera, remember to change the camera setting in config.py as well.

Typical hyperparameters:

center-num # sampled local center/region number; a higher number yields more regions and grasps but slower inference, default: 48
grid-size # grid size for our grid-based center sampling, higher number means sparser centers, default: 8
ratio # grasp attributes prediction downsample ratio, default: 8
anchor-k # classification anchor number for grasp in-plane rotation, default: 6
anchor-w # regression anchor size for grasp width, default: 50
anchor-z # regression anchor size for grasp depth, default: 20
all-points-num # downsampled point cloud point number, default: 25600
group-num # local region point cloud point number, default: 512
local-k # grasp detection number in each local region, default: 10
scene-l & scene-r # scene range, train: 0~100, seen: 100~130, similar: 130~160, novel: 160~190
input-h & input-w # downsampled input image size, should be 640x360
local-thres & heatmap-thres # heatmap and grasp score filter threshold, set to 0.01 in our settings
dataset-path # our preprocessed dataset path (read grasp poses)
scene-path # original graspnet dataset path (read images)
num-workers # eval worker number
dump-dir # detected grasp poses dumped path (used in later evaluation)
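The scene ranges listed for scene-l and scene-r can be kept in one lookup table so that a split name maps directly to the two flag values. This is a minimal sketch with hypothetical names (`SCENE_SPLITS`, `scene_range`, `split_of`); treating each range as half-open, so that scene 100 belongs to "seen" and not to "train", is an assumption.

```python
# Scene ranges as documented in the hyperparameter list. Treating each
# range as half-open [start, end) is an assumption made for this sketch.
SCENE_SPLITS = {
    "train": (0, 100),
    "seen": (100, 130),
    "similar": (130, 160),
    "novel": (160, 190),
}


def scene_range(split):
    """Return (scene_l, scene_r) for a named split."""
    return SCENE_SPLITS[split]


def split_of(scene_id):
    """Return the split name containing scene_id, or None if out of range."""
    for name, (start, end) in SCENE_SPLITS.items():
        if start <= scene_id < end:
            return name
    return None


scene_l, scene_r = scene_range("similar")
print(f"--scene-l {scene_l} --scene-r {scene_r}")  # --scene-l 130 --scene-r 160
```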

Demo

Run the demo code (reads an RGB image and a depth image from file and outputs grasps).

bash demo.sh

Typical hyperparameters:

center-num # sampled local center/region number; a higher number yields more regions and grasps but slower inference, default: 48
grid-size # grid size for our grid-based center sampling, higher number means sparser centers, default: 8
all-points-num # downsampled point cloud point number, default: 25600
group-num # local region point cloud point number, default: 512
local-k # grasp detection number in each local region, default: 10
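To give an intuition for how grid-size and center-num interact, the sketch below keeps at most one heatmap peak per grid cell and then retains the highest-scoring center-num peaks. It is an illustrative reading of the flag descriptions, not the repository's implementation: the function name `sample_centers`, the per-cell maximum rule, and applying the heatmap-thres value from the Test section at this stage are all assumptions.

```python
def sample_centers(heatmap, grid_size=8, center_num=48, heatmap_thres=0.01):
    """Pick at most one peak per grid cell, then keep the best center_num.

    heatmap is a list of rows (H x W) of scores. Returns (row, col, score)
    tuples sorted by descending score.
    """
    height, width = len(heatmap), len(heatmap[0])
    candidates = []
    for top in range(0, height, grid_size):
        for left in range(0, width, grid_size):
            # Find the highest-scoring pixel inside this cell.
            best = None
            for r in range(top, min(top + grid_size, height)):
                for c in range(left, min(left + grid_size, width)):
                    if best is None or heatmap[r][c] > best[2]:
                        best = (r, c, heatmap[r][c])
            # Discard cells whose peak does not exceed the threshold.
            if best[2] > heatmap_thres:
                candidates.append(best)
    candidates.sort(key=lambda item: item[2], reverse=True)
    return candidates[:center_num]


toy_heatmap = [
    [0.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.7],
]
print(sample_centers(toy_heatmap, grid_size=2))  # [(0, 1, 0.9), (3, 3, 0.7)]
print(sample_centers(toy_heatmap, grid_size=4))  # [(0, 1, 0.9)]
```

On this toy input, doubling the grid size halves the number of returned centers, which matches the note that a higher grid-size gives sparser centers.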

Results

Attention: HGGD detects grasps only from heatmap guidance, without any workspace mask (adopted in Graspness) or object/foreground segmentation method (adopted in Scale-balanced Grasp). It may be useful to add some of this prior information to get better results.

Evaluation results on RealSense camera:

Source	Seen	Similar	Novel
In paper	59.36	51.20	22.17
In repo	64.45	53.59	24.59

Evaluation results on Kinect camera:

Source	Seen	Similar	Novel
In paper	60.26	48.59	18.43
In repo	61.17	47.02	19.37

Citation

Please cite our paper in your publications if it helps your research:

@article{chen2023efficient,
  title={Efficient Heatmap-Guided 6-Dof Grasp Detection in Cluttered Scenes},
  author={Chen, Siang and Tang, Wei and Xie, Pengwei and Yang, Wenming and Wang, Guijin},
  journal={IEEE Robotics and Automation Letters},
  year={2023},
  publisher={IEEE}
}
