
SuperQuadric Pushing Dynamics Network (SQPD-Net)

The official repository for <SE(2)-Equivariant Pushing Dynamics Models for Tabletop Object Manipulations> (Seungyeon Kim, Byeongdo Lim, Yonghyeon Lee, and Frank C. Park, CoRL 2022).

This paper proposes an SE(2)-equivariant pushing dynamics model for tabletop object manipulations. The dynamics model is then used for various downstream pushing manipulation tasks such as object moving, singulation, and grasping.
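Informally (the notation below is ours, not necessarily the paper's), SE(2)-equivariance of the learned dynamics $f$ means that transforming the scene state $x$ and the pushing action $u$ by any planar rigid motion $g \in SE(2)$ transforms the predicted object motions in the same way:

$f(g \cdot x, g \cdot u) = g \cdot f(x, u), \quad \forall g \in SE(2)$

so the model generalizes over planar translations and rotations of the tabletop scene by construction.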

Preview

Pushing Manipulation using SQPD-Net

Figure 1: Real-world manipulation results using SQPD-Net for moving, singulation, and grasping tasks (in the fourth row, the target object is the cylinder surrounded by the three cubes). The red arrow at each recognition step indicates the optimal pushing action.

Requirements

Environment

The project is developed under a standard PyTorch environment.

  • python 3.9
  • pybullet 3.2.3
  • pytorch
  • tensorboardx
  • tqdm
  • h5py
  • Open3D
  • scipy
  • scikit-learn
  • opencv-python
  • imageio
  • matplotlib
  • scikit-image
  • dominate
  • numba
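
A possible way to set up such an environment (the conda/pip commands and any unpinned package versions below are assumptions, not repository requirements) is:

conda create -n sqpdnet python=3.9
conda activate sqpdnet
pip install pybullet==3.2.3 torch tensorboardx tqdm h5py open3d scipy scikit-learn opencv-python imageio matplotlib scikit-image dominate numba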

Datasets

Datasets should be stored in the datasets/ directory. The datasets can be downloaded through the Google Drive link. After setup, the datasets/ directory should be organized as follows.

datasets
├── pushing_object_num_1
|   ├── training
|   ├── validation
|   └── test
├── pushing_object_num_2
|   ├── training
|   ├── validation
|   └── test
├── pushing_object_num_3
|   ├── training
|   ├── validation
|   └── test
└── pushing_object_num_4
    ├── training
    ├── validation
    └── test
  • If you want to generate your own custom dataset, run the following script:
python data_generation.py --enable_gui                # PyBullet UI on/off
                          --folder_name test          # folder name of the generated dataset
                          --object_types box cylinder # used object types for data generation
                          --num_objects 4             # can be 1~4; currently the maximum number of objects is 4
                          --push_num 20               # maximum number of pushes per sequence
                          --training_num 150          # number of training sequences; the total number of training samples is (training_num * push_num)
                          --validation_num 15         # number of validation sequences; the total number of validation samples is (validation_num * push_num)
                          --test_num 15               # number of test sequences; the total number of test samples is (test_num * push_num)
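
For example, a box-and-cylinder dataset with up to four objects per scene could be generated with (the folder name and counts here are illustrative):

python data_generation.py --folder_name my_pushing_dataset --object_types box cylinder --num_objects 4 --push_num 20 --training_num 150 --validation_num 15 --test_num 15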

Pretrained model

Pre-trained models should be stored in pretrained/. The pre-trained models are already provided in this repository. After setup, the pretrained/ directory should be organized as follows.

pretrained
├── segmentation_config
│   └── pretrained
│       ├── segmentation_config.yml
│       └── model_best.pkl
├── sqpdnet_2d_motion_only_config
│   └── pretrained
│       ├── sqpdnet_2d_motion_only_config.yml
│       └── model_best.pkl
└── recognition_config
    └── pretrained
        ├── recognition_config.yml
        └── model_best.pkl

Running

Training

The training script is train.py.

  • --config specifies a path to a configuration yml file.
  • --logdir specifies a directory where the results will be saved.
  • --run specifies a name for an experiment.
  • --device specifies a GPU number to use.

The training command for the recognition and segmentation networks is as follows:

python train.py --config configs/sqpdnet/{X}_config.yml
  • X is either segmentation or recognition.
  • If you want to see the intermediate training results in TensorBoard, run:
    tensorboard --logdir train_results/{X}_config --host {ip address}
    

The training command for the motion prediction network is as follows (a complete example is given after the list below):

python train.py --config configs/sqpdnet/sqpdnet_{X}_motion_only_config.yml
  • X is either 2d or 3d.
  • If you want to see the intermediate training results in TensorBoard, run:
    tensorboard --logdir train_results/sqpdnet_{X}_motion_only_config --host {ip address}
    
  • If you want to visualize the overall pushing dynamics dataset in TensorBoard, run:
    python dataset_visualizer.py --config configs/data_visualization/data_visualization.yml
    tensorboard --logdir train_results/data_visualization --host {ip address}
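
For example, to train the 2D motion prediction network on GPU 0 under a custom run name (the run name here is illustrative; the flags are those documented above) and monitor it:

python train.py --config configs/sqpdnet/sqpdnet_2d_motion_only_config.yml --device 0 --run my_experiment
tensorboard --logdir train_results/sqpdnet_2d_motion_only_config --host {ip address}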
    

Control in Simulation Environment

The control scripts for the PyBullet simulator are run as follows (an example invocation is given after the task list):

python control.py --config configs/control_sim/control_sim_{X}_config.yml
  • X is either moving, singulation, grasping_clutter, grasping_large, or moving_interactive.
    • moving is a task to move objects to their desired poses.
    • singulation is a task to separate objects by more than a certain distance $\tau$.
    • grasping_clutter is a task to make a target object graspable in a cluttered environment by pushing manipulation.
    • grasping_large is a task to make a large and flat target object graspable by pushing manipulation.
    • moving_interactive is a task to move an object to its desired pose, but the robot should not push the target object.
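
For example, to run the singulation task in the simulator:

python control.py --config configs/control_sim/control_sim_singulation_config.yml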

Control in Real-world Robot Environment

The control scripts for the real-world environment are run as follows:

python control.py --config configs/control_real/control_real_{X}_config.yml --ip {Y} --port {Z}
  • The real-world control code is based on Python socket communication between the server computer (Python 3) and the robot computer (Python 2).
  • A simple Python example of communicating with the server from the robot computer is as follows (a sketch of preparing the point_cloud input appears after this list).
    from function.communicator_client import Talker
    
    # Connect to server
    client = Talker({Y}, {Z})
    client.conn_server()
    
    # Send vision data
    client.send_vision(point_cloud)           # send point cloud (n x 3)
    
    # Receive data from server
    data = client.recv_grasp(dict_type=True)  # receive pushing or grasping action
  • Y is the IP address of the server computer and Z is the port number.
  • X is either moving, singulation, grasping_clutter, grasping_large, or moving_interactive.
    • The task descriptions are the same as above.
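
The point_cloud passed to send_vision above is an n x 3 array. A minimal sketch of preparing one from a captured point cloud file with Open3D (the file name, voxel size, server IP, and port below are illustrative assumptions, not values from the repository):

import numpy as np
import open3d as o3d

from function.communicator_client import Talker

# Load a captured point cloud and lightly downsample it (file name is illustrative)
pcd = o3d.io.read_point_cloud("scene.pcd")
pcd = pcd.voxel_down_sample(voxel_size=0.005)
point_cloud = np.asarray(pcd.points)         # (n, 3) array

# Connect to the server, send the point cloud, and receive the planned action
client = Talker("192.168.0.2", 9999)         # server IP and port are illustrative
client.conn_server()
client.send_vision(point_cloud)
data = client.recv_grasp(dict_type=True)     # pushing or grasping action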

(Optional) Training Baselines

The training command for the pushing dynamics model baselines is as follows:

python train.py --config configs/baseline/{X}_config.yml
  • X is either 2dflow, se3-nets, se3_pose_nets, 3dflow, or dsr_net_single.

Citation

If you found this repository useful in your research, please consider citing:

@inproceedings{kim2023se,
  title={SE(2)-Equivariant Pushing Dynamics Models for Tabletop Object Manipulations},
  author={Kim, Seungyeon and Lim, Byeongdo and Lee, Yonghyeon and Park, Frank C},
  booktitle={Conference on Robot Learning},
  pages={427--436},
  year={2023},
  organization={PMLR}
}

Acknowledgement

We thank the authors of the following repositories for releasing their code.

  • The pushing data generation code is modified from dsr.
  • The baseline models are from se3posenets-pytorch for 2DFlow, SE3-Net, and SE3Pose-Net and from dsr for 3DFlow and DSR-Net.
  • The segmentation and recognition code is modified from DSQNet-public.
