# [CVPR 2026] StaCOM: Stability-Driven Motion Generation for Object-Guided Human-Human Co-Manipulation

Official codebase for the CVPR 2026 paper "Stability-Driven Motion Generation for Object-Guided Human-Human Co-Manipulation".

Jiahao Xu, Xiaohan Yuan, Xingchen Wu, Chongyang Xu, Kun Li, Buzhen Huang

Tianjin University, National University of Singapore, Sichuan University
## Installation

The code is tested on Ubuntu with a single RTX 4090 GPU (24 GB).

Create the conda environment:

```bash
conda create -n stacom python=3.10
conda activate stacom
```

Install PyTorch with CUDA 11.8:

```bash
pip install torch==2.1.2+cu118 torchvision==0.16.2+cu118 torchaudio==2.1.2+cu118 --index-url https://download.pytorch.org/whl/cu118
```

Install the remaining dependencies:

```bash
pip install -r requirements.txt
```

Download the official SMPL-X model from the SMPL-X website and place it in `data/smplx/`.
Demo assets (checkpoints + SMPL-X neutral model) are provided here:
- Google Drive: https://drive.google.com/drive/folders/17oLiCvTHiHTnGxfUHlmu687rCeGwu4Rk?usp=drive_link
Download and place the files as follows:
- `contact_epoch200.pkl` -> `output/contact_epoch200.pkl`
- `hoi_epoch200.pkl` -> `output/hoi_epoch200.pkl`
- `SMPLX_NEUTRAL.pkl` -> `data/SMPLX_NEUTRAL.pkl`
Recommended folder structure:
```
StaCOM/
├── data/
│   └── SMPLX_NEUTRAL.pkl
└── output/
    ├── contact_epoch200.pkl
    └── hoi_epoch200.pkl
```
## Demo

Three files are required:

- `object.obj`: object mesh in its local coordinate frame
- `trajectory.npy`: object 6D pose sequence, shape `(T, 4, 4)`
- `affordance.npz`: per-point affordance scores; must contain the key `sampled_scores` of shape `(N,)`
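As a quick sanity check on these formats, the snippet below builds synthetic inputs with the expected shapes and verifies them after a save/load round trip. This is an illustrative sketch (the sequence content and sizes are made up), not part of the codebase:

```python
import numpy as np

# Synthetic stand-ins for the demo inputs (illustrative values only).
T, N = 10, 512

# trajectory.npy: (T, 4, 4) homogeneous object poses.
traj = np.tile(np.eye(4), (T, 1, 1))
traj[:, 0, 3] = np.linspace(0.0, 1.0, T)  # translate along x over time
np.save("trajectory.npy", traj)

# affordance.npz: must contain key "sampled_scores" of shape (N,).
np.savez("affordance.npz", sampled_scores=np.random.rand(N))

# Verify the shapes the demo expects.
traj_loaded = np.load("trajectory.npy")
aff = np.load("affordance.npz")
assert traj_loaded.shape == (T, 4, 4)
assert "sampled_scores" in aff.files
assert aff["sampled_scores"].shape == (N,)
print("inputs OK:", traj_loaded.shape, aff["sampled_scores"].shape)
```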
Run the motion generation demo with a trained checkpoint:
```bash
xvfb-run -a -s "-screen 0 1024x768x24" python demo.py \
    --obj-mesh data/test/01/box001.obj \
    --obj-traj data/test/01/trajectory.npy \
    --affordance data/test/01/affordance.npz \
    --contact-ckpt output/contact_epoch200.pkl \
    --motion-ckpt output/hoi_epoch200.pkl \
    --body-model data/SMPLX_NEUTRAL.pkl \
    --output-dir output/
```

`xvfb-run` starts a virtual X display for headless/offscreen rendering on servers without a desktop session (common for remote Linux machines). Install it with:

```bash
# Ubuntu / Debian
sudo apt-get update && sudo apt-get install -y xvfb
```

The output video is saved under `output/` with a timestamped name, e.g. `output/res_20260325_143022.mp4`.
Optional arguments:
| Argument | Default | Description |
|---|---|---|
| `--gpu-index` | `0` | CUDA device index |
| `--physics` | off | Enable stability-driven physics simulation (CMA-ES). |
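The `--physics` option relies on CMA-ES. As a toy illustration of the sample-rank-recombine loop behind that family of optimizers, here is a simplified isotropic evolution strategy (not the full covariance-adapting CMA-ES, and unrelated to the repo's actual implementation; all names are hypothetical):

```python
import numpy as np

def toy_es(objective, x0, sigma=0.5, lam=16, mu=4, iters=60, seed=0):
    """Simplified (mu/mu, lambda) evolution strategy with an isotropic
    Gaussian. Full CMA-ES additionally adapts a covariance matrix and
    the step size from the search history."""
    rng = np.random.default_rng(seed)
    m = np.asarray(x0, dtype=float)
    for _ in range(iters):
        # Sample lam candidates around the current mean.
        cands = m + sigma * rng.standard_normal((lam, m.size))
        # Rank by objective (lower is better) and keep the best mu.
        order = np.argsort([objective(c) for c in cands])
        elite = cands[order[:mu]]
        m = elite.mean(axis=0)   # recombine the elite samples
        sigma *= 0.95            # crude fixed step-size decay
    return m

# Toy "stability" objective: squared distance of a pose offset from zero.
best = toy_es(lambda x: np.sum(x ** 2), x0=[2.0, -1.5])
print(best)  # converges close to the origin
```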
Run the contact-point visualization demo:

```bash
python vis_contact.py
```

The demo expects uploaded inputs such as:

- mesh (`.obj`)
- object trajectory (`trajectory.npy`)
- affordance (`affordance.npz`)
- GT contact (`gt_contact.npz`)
## Training

Generate the necessary condition data with:

```bash
python utils/data_collection.py --config=cfg_files/config.yaml
```

The SDF loss is required for penetration evaluation.
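The SDF penetration term itself is not specified in this README; a common formulation penalizes query points whose signed distance to the object is negative (i.e., inside the mesh). A minimal sketch of that idea, with hypothetical names:

```python
import numpy as np

def penetration_loss(sdf_values):
    """Hypothetical SDF penetration penalty: a negative signed distance
    means the query point lies inside the object, so penalize the
    penetration depth (clamped at zero for points outside)."""
    return np.maximum(-np.asarray(sdf_values, dtype=float), 0.0).mean()

# Points outside the surface (positive SDF) incur no penalty.
outside = np.array([0.02, 0.10, 0.05])
# Points inside (negative SDF) are penalized by their depth.
inside = np.array([-0.03, 0.04, -0.01])
print(penetration_loss(outside))  # 0.0
print(penetration_loss(inside))   # (0.03 + 0.0 + 0.01) / 3 ~= 0.0133
```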
Download the dataset from:
(To be released)
Place the dataset under `data/` as specified by `--data_folder`, then run the following to train the motion generation model:
```bash
python main.py \
    --mode train \
    --data_folder data \
    --trainset "CORE4D_real CORE4D_syn" \
    --testset CORE4D_S1 \
    --model interhuman_flow_BPS_prior \
    --epoch 2000 \
    --batchsize 4 \
    --lr 0.0001 \
    --worker 6 \
    --output output
```

## Citation

```bibtex
@inproceedings{xu2026stability,
  title={Stability-Driven Motion Generation for Object-Guided Human-Human Co-Manipulation},
  author={Xu, Jiahao and Yuan, Xiaohan and Wu, Xingchen and Xu, Chongyang and Li, Kun and Huang, Buzhen},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}
```




