REACT3D: Recovering Articulations for Interactive Physical 3D Scenes

Interactive 3D scenes are increasingly vital for embodied intelligence, yet existing datasets remain limited due to the labor-intensive process of annotating part segmentation, kinematic types, and motion trajectories. We present REACT3D, a scalable zero-shot framework that converts static 3D scenes into simulation-ready interactive replicas with consistent geometry, enabling direct use in diverse downstream tasks.

Setup

Installation

Clone the repository and install necessary dependencies:

git clone https://github.com/troyehuang/REACT3D.git --recursive
cd REACT3D

conda create -n react3d python==3.10
conda activate react3d

export TORCH_CUDA_ARCH_LIST="7.5 8.0 8.6 8.9"

pip install ninja
pip install torch==2.5.1+cu121 torchvision==0.20.1+cu121 torchaudio==2.5.1+cu121 --extra-index-url https://download.pytorch.org/whl/cu121

# groundingDINO
export BUILD_WITH_CUDA=True
export CUDA_HOME=<your_path>
export AM_I_DOCKER=False

cd grounded_sam/GroundingDINO
python setup.py build
python setup.py install

# scene2part
cd ../..
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch --no-build-isolation
pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable" --no-build-isolation
pip install kaolin -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.5.1_cu121.html
pip install git+https://github.com/NVlabs/nvdiffrast.git --no-build-isolation

pip install -r requirements.txt

# part2interactive
pip install 'git+https://github.com/facebookresearch/detectron2.git' --no-build-isolation

cd opdformer/mask2former/modeling/pixel_decoder/ops
python setup.py build install
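
After the installation steps above, it can help to confirm which packages actually landed in the react3d environment. The following is a minimal sketch using only the Python standard library. The distribution names in EXPECTED are assumptions inferred from the pip commands above (packages built from source may register under a different name), and installed_versions is a hypothetical helper, not part of REACT3D.

```python
from importlib import metadata

# Assumed distribution names, inferred from the pip commands above.
# Packages built from source may register under a different name.
EXPECTED = [
    "torch",
    "torchvision",
    "torchaudio",
    "ninja",
    "pytorch3d",
    "kaolin",
    "nvdiffrast",
    "detectron2",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print one line per package so missing ones are easy to spot.
    for name, version in installed_versions(EXPECTED).items():
        print(f"{name:12s} {version or 'MISSING'}")
```

A package reported as MISSING here is not necessarily broken. It may simply be registered under another distribution name, so treat the output as a hint rather than a verdict.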

Install LLaVA:

conda create -n llava python==3.10
conda activate llava

git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA
pip install --upgrade pip
pip install -e .

Install checkpoints:

cd REACT3D

# ram++ checkpoint
cd ram++
wget --no-check-certificate https://huggingface.co/xinyu1205/recognize-anything-plus-model/resolve/main/ram_plus_swin_large_14m.pth

# grounded sam checkpoint
cd ../grounded_sam
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth

# opdm checkpoint
cd ../part2interactive
wget --no-check-certificate https://huggingface.co/3dlg-hcvc/opdmulti-motion-state-rgb-model/resolve/main/pytorch_model.pth -O {REACT3D_dir}/part2interactive/opdm_rgb.pth
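
Interrupted downloads are easy to miss, so a quick check that all four checkpoint files are where the commands above place them can save a failed run later. This is a small sketch under the assumption that the files keep their default wget names and that the OPDM checkpoint was saved as part2interactive/opdm_rgb.pth, as in the command above. missing_checkpoints is a hypothetical helper, not part of REACT3D.

```python
from pathlib import Path

# Paths relative to the REACT3D root, taken from the download commands above.
CHECKPOINTS = [
    "ram++/ram_plus_swin_large_14m.pth",
    "grounded_sam/sam_vit_h_4b8939.pth",
    "grounded_sam/groundingdino_swint_ogc.pth",
    "part2interactive/opdm_rgb.pth",
]


def missing_checkpoints(repo_root):
    """Return the relative paths of checkpoints not found under repo_root."""
    root = Path(repo_root)
    return [rel for rel in CHECKPOINTS if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the REACT3D root directory.
    missing = missing_checkpoints(".")
    if missing:
        print("Missing checkpoints:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All checkpoints found.")
```

The check only tests for file presence. It does not verify file size or integrity.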

Data Preparation

We evaluate our work on ScanNet++ augmented by Articulated3D and MultiScan datasets. For convenience, a small subset of the generated interactive scenes and the corresponding static input scenes is provided here.

To use custom data, process your own scenes so that they follow the format in the "data" folder. Make sure your data format matches the example data.

The data folder is organized as follows:

data
|---scene_name
|   |---images_2
|   |---mesh_aligned_0.05.ply
|   |---pose_intrinsic_imu.json
|   |---depth
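
When preparing custom scenes, the layout above can be checked before any stage runs. The sketch below assumes that images_2 and depth are directories and that the other two entries are files, which is inferred from their names rather than stated in the listing. scene_problems is a hypothetical helper, not part of REACT3D.

```python
from pathlib import Path

# Entry names are taken from the folder listing above.
# Assumption: entries without a file extension are directories.
REQUIRED_DIRS = ["images_2", "depth"]
REQUIRED_FILES = ["mesh_aligned_0.05.ply", "pose_intrinsic_imu.json"]


def scene_problems(scene_dir):
    """Return a list of human-readable problems; an empty list means the layout matches."""
    scene = Path(scene_dir)
    problems = []
    for name in REQUIRED_DIRS:
        if not (scene / name).is_dir():
            problems.append(f"missing directory: {name}")
    for name in REQUIRED_FILES:
        if not (scene / name).is_file():
            problems.append(f"missing file: {name}")
    return problems


if __name__ == "__main__":
    # Assumes the script is run from the REACT3D root; checks every scene under data/.
    for scene in sorted(Path("data").glob("*")):
        if scene.is_dir():
            issues = scene_problems(scene)
            print(f"{scene.name}: {'ok' if not issues else '; '.join(issues)}")
```

This only verifies that the expected entries exist. It does not inspect image naming, depth encoding, or the contents of the pose file, so comparing against the example data is still necessary.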

Quick Start

To run the pipeline on a specific scene, run the four stages in order:

cd REACT3D

cd scene2part
bash scene2part.sh # remember to change the paths

cd ../part2interactive
bash part2interactive.sh # remember to change the paths

cd ../texture
bash generate_texture.sh

cd ../simulation_ready
bash simulation_ready.sh
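
The four stages above can also be chained from one driver that stops at the first failure. The sketch below is one possible wrapper, not something shipped with REACT3D. It assumes the folder and script names listed above, that each script is meant to be launched from its own folder, and that the paths inside the scripts have already been edited. stage_commands and run_pipeline are hypothetical names.

```python
import subprocess
from pathlib import Path

# (folder, script) pairs in the order given above.
STAGES = [
    ("scene2part", "scene2part.sh"),
    ("part2interactive", "part2interactive.sh"),
    ("texture", "generate_texture.sh"),
    ("simulation_ready", "simulation_ready.sh"),
]


def stage_commands(repo_root):
    """Return (working_directory, command) pairs for each stage, in order."""
    root = Path(repo_root)
    return [(root / folder, ["bash", script]) for folder, script in STAGES]


def run_pipeline(repo_root):
    """Run every stage in order; raises CalledProcessError on the first failure."""
    for cwd, cmd in stage_commands(repo_root):
        print(f"[{cwd.name}] {' '.join(cmd)}")
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling run_pipeline with the path to the REACT3D checkout runs the stages sequentially. Because check=True is set, a non-zero exit code from any script aborts the remaining stages instead of letting them run on incomplete output.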

Acknowledgements

Our work is based on OPDMulti and DRAWER. We thank the authors for their great work and for open-sourcing their code.

Citation

@ARTICLE{11434845,
  author={Huang, Zhao and Sun, Boyang and Delitzas, Alexandros and Chen, Jiaqi and Pollefeys, Marc},
  journal={IEEE Robotics and Automation Letters}, 
  title={REACT3D: Recovering Articulations for Interactive Physical 3D Scenes}, 
  year={2026},
  volume={},
  number={},
  pages={1-8},
  keywords={Three-dimensional displays;Joints;Geometry;Image reconstruction;Estimation;Solid modeling;Point cloud compression;Foundation models;Biological system modeling;Accuracy;Semantic scene understanding;object detection;segmentation and categorization;RGB-D perception},
  doi={10.1109/LRA.2026.3674028}
}
