Tested on ROS Noetic.

ROS Install
# Make catkin workspace
mkdir -p ~/catkin_ws/src
cd ~/catkin_ws/src
# Clone git repo
git clone https://github.com/Hongyoungjin/stable-pushnet-network.git
# Build catkin workspace
cd ..
catkin_make
cd src/stable-pushnet-network
pip install -r requirements.txt
pip3 install torch torchvision torchaudio
After synthetic data generation in stable-pushnet-datagen, a train dataset folder 'data' will be created, containing the 'data_stats' and 'tensors' folders.
Set the path to the 'data' folder in the first line of config.yaml.
data_path: /path/to/train/dataset/folder/data
cd ~/catkin_ws/src/stable-pushnet-network/scripts
python3 train.py
Since the train dataset includes both masked depth images and pure depth images, you can choose the type of network input by setting planner - image_type in config.yaml ('masked' or 'pure').
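For instance, the corresponding entry in config.yaml might look like the following (the nesting of image_type under a planner key is an assumption based on the "planner - image_type" description above):

```yaml
planner:
  image_type: masked   # or 'pure'
```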
The following tools analyze the performance of the trained model.
The classifier model can be analyzed with a confusion matrix (accuracy, precision, and recall).
For a given network input image, this tool plots the confusion output over the input velocity half-sphere.
Each velocity data point corresponds to one of True Positive, False Positive, True Negative, or False Negative.
cd ~/catkin_ws/src/stable-pushnet-network/scripts
python3 plot_confusion.py
Set the configuration in the "confusion" section of config.yaml.
(Figures: input depth image | confusion plot)
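As a reminder of how the three metrics relate to the four confusion categories, here is a small self-contained sketch; the helper name and the example counts are illustrative, not taken from plot_confusion.py:

```python
# Illustrative helper: derive accuracy, precision, and recall
# from confusion-matrix counts.

def confusion_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall) from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Example counts: 40 TP, 10 FP, 45 TN, 5 FN
acc, prec, rec = confusion_metrics(40, 10, 45, 5)
print(acc, prec, rec)  # 0.85 0.8 0.888...
```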
Visualizes the model's latent space (3D and 2D) through feature dimension reduction.
PCA reduces the 128-dimensional features to 50 dimensions, and t-SNE further reduces them to 3 and 2.
Each feature data point corresponds to one of True Positive, False Positive, True Negative, or False Negative.
cd ~/catkin_ws/src/stable-pushnet-network/scripts
python3 feature_map.py
Set the configuration in the "feature" section of config.yaml.
(Figures: feature plot (3D) | feature plot (2D))
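A NumPy-only sketch of the PCA stage of that pipeline: project 128-dimensional latent features down to 50 dimensions via SVD. The t-SNE step (e.g. scikit-learn's sklearn.manifold.TSNE with n_components=3 or 2) is omitted here, and the feature values are random placeholders, not real network features:

```python
import numpy as np

def pca_reduce(features, n_components=50):
    """Project features (N x D) onto their top n_components principal axes."""
    centered = features - features.mean(axis=0)
    # SVD of the centered data: principal axes are the rows of vt
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 128))   # placeholder for 128-D network features
reduced = pca_reduce(latent, 50)
print(reduced.shape)  # (200, 50)
```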
For a given network input image, plots the network output (0 to 1) over the input velocity half-sphere.
cd ~/catkin_ws/src/stable-pushnet-network/scripts
python3 plot_network_output.py
Set the configuration in the "network_output" section of config.yaml.
(Figures: input depth image | network output)
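A minimal sketch of the idea: sample unit push-velocity directions over the upper half-sphere and keep those whose network output clears a threshold. The sampling scheme and fake_network (a stand-in for the trained classifier) are illustrative assumptions, not taken from plot_network_output.py:

```python
import math

def half_sphere_samples(n_theta=8, n_phi=4):
    """Yield unit velocity directions covering the upper half-sphere."""
    for i in range(n_theta):
        theta = 2 * math.pi * i / n_theta          # azimuth angle
        for j in range(1, n_phi + 1):
            phi = (math.pi / 2) * j / (n_phi + 1)  # elevation in (0, pi/2)
            yield (math.cos(theta) * math.cos(phi),
                   math.sin(theta) * math.cos(phi),
                   math.sin(phi))

def fake_network(v):
    # Placeholder score in [0, 1]: favors pushes with a low vertical component
    return max(0.0, 1.0 - v[2])

threshold = 0.5  # plays the role of network_threshold in config.yaml
stable = [v for v in half_sphere_samples() if fake_network(v) >= threshold]
print(len(stable))  # 8
```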
In config.yaml,
- To use the network-based model, set "planner / learning_base" to True.
- To change the network threshold, change the "network / network_threshold" value.
Detailed description of each configuration element is as follows:
The overall configuration of the pushing environment.
- gripper_width:
  - Width of the parallel-jaw gripper (in meters, default: 0.08)
  - To apply a different width, you have to retrain the network or use the depth-based model.
- num_push_directions:
  - Number of initial push directions (default: 4)
- learning_base:
  - Whether to use the pre-trained network model (default: True)
- visualize:
  - Whether to visualize the push contact (default: False)
- height:
  - Height of the resultant path (in meters, default: 0.015)
  - Since the height of the path may vary depending on the table height, this value may not affect the actual push path.
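Collected into a hypothetical config.yaml fragment; the grouping of all keys under a single planner key is an assumption (only learning_base is explicitly documented as "planner / learning_base" above):

```yaml
planner:
  gripper_width: 0.08       # meters
  num_push_directions: 4
  learning_base: True
  visualize: False
  height: 0.015             # meters
```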
Configuration for the depth-based (analytical & non-trained) model.
- friction_coefficient:
- Friction coefficient between the pusher (robot gripper) and the slider (dish)
Configuration for the network model.
- model_name:
- Name of the network model. The model file is stored in network_data.
- network_threshold:
- The network threshold that determines whether a given push is considered successful.
Configuration for the Hybrid-A* Algorithm.
- grid_size: Grid size of each node (in meters)
- dtheta: Unit angle difference to make the child nodes (in radians)
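To illustrate how grid_size and dtheta shape the search, here is a hedged sketch of child-node expansion in a Hybrid-A*-style planner; the expansion rule and all names are illustrative assumptions, not taken from the package source:

```python
import math

def expand(x, y, theta, grid_size=0.01, dtheta=math.radians(10)):
    """Generate child poses by steering -dtheta, 0, +dtheta and
    advancing one grid_size step along the new heading."""
    children = []
    for steer in (-dtheta, 0.0, dtheta):
        new_theta = theta + steer
        children.append((x + grid_size * math.cos(new_theta),
                         y + grid_size * math.sin(new_theta),
                         new_theta))
    return children

children = expand(0.0, 0.0, 0.0)
print(len(children))  # 3
```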
roslaunch stable_pushnet_ros server.launch
The module returns a push path with a push success metric through a ROS service.
- /stable_push_planner/get_stable_push_path (stable_push_planner/GetStablePushPath)
- Request: push targets produced by the push-target module
- dish_segmentation (vision_msgs/Detection2DArray)
- Dish segmentation result (KIST vision module output)
- table_detection (vision_msgs/BoundingBox3D)
- Table detection result (KIST vision module output)
- depth_image (sensor_msgs/Image)
- Depth image of the entire scene
- cam_info (sensor_msgs/CameraInfo)
- Camera info of the depth camera
- camera_pose (geometry_msgs/PoseStamped)
- Depth camera pose
- push_targets (PushTargetArray)
- Array of push targets
- path (nav_msgs/Path)
- Path of the end-effector (center of the fingertips) to push the target
- plan_successful (bool)
- True if the plan was successful
- False if the plan failed. In this case, the module returns a dummy path.
1. PushTargetArray (PushTargetArray)
Array of push targets
2. PushTarget (PushTarget)
- priority (int32)
- Priority of the target; a smaller value means higher priority
- push_target_id (int32)
- ID of the target
- goal_pose (geometry_msgs/Pose2D)
- Goal pose of the target object
- start_pose_min_theta (geometry_msgs/Pose2D)
- Initial contact pose of the target, with the smallest push direction
- start_pose_max_theta (geometry_msgs/Pose2D)
- Initial contact pose of the target, with the largest push direction
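Since a smaller priority value means higher priority, a planner consuming a PushTargetArray would process targets in ascending priority order. A sketch of that ordering, using a plain Python dataclass that mirrors two fields of the message (the class itself is illustrative, not generated from the .msg file):

```python
from dataclasses import dataclass

@dataclass
class PushTarget:
    priority: int        # smaller value = higher priority
    push_target_id: int

targets = [PushTarget(2, 11), PushTarget(0, 7), PushTarget(1, 3)]
# Process highest-priority (smallest value) targets first
ordered = sorted(targets, key=lambda t: t.priority)
print([t.push_target_id for t in ordered])  # [7, 3, 11]
```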
- Launch the push-path planning server node
roslaunch stable_pushnet_ros server.launch
- Request the push-path
rosrun stable_pushnet_ros example.py
- Observe the resultant push-path
- The result can be seen in the RViz visualization.
roslaunch stable_pushnet_ros example.launch