
[AAAI 2026] Code for ASSIST-3D: Adapted Scene Synthesis for Class-Agnostic 3D Instance Segmentation


CVMI-Lab/ASSIST-3D


ASSIST-3D: Adapted Scene Synthesis for Class-Agnostic 3D Instance Segmentation

arXiv: https://arxiv.org/abs/2512.09364

The University of Hong Kong

We introduce ASSIST-3D, which addresses the class-agnostic 3D instance segmentation task through synthesizing 3D data suitable for this task. Specifically, ASSIST-3D features three key innovations, including 1) Heterogeneous Object Selection from extensive 3D CAD asset collections, incorporating randomness in object sampling to maximize geometric and contextual diversity; 2) Scene Layout Generation through LLM-guided spatial reasoning combined with depth-first-search for reasonable object placements; and 3) Realistic Point Cloud Construction via multi-view RGB-D image rendering and fusion from the synthetic scenes, closely mimicking real-world sensor data acquisition. In this way, our synthetic data simultaneously satisfy geometry diversity, context complexity, and layout reasonability, which proves to be helpful for class-agnostic 3D instance segmentation training. Experiments on ScanNetV2, ScanNet++, and S3DIS benchmarks demonstrate that models trained with ASSIST-3D-generated data significantly outperform existing methods. Further comparisons underscore the superiority of our purpose-built pipeline over existing 3D scene synthesis approaches.

Table of Contents

  1. Installation
  2. Data Generation
  3. Visualization
  4. Citation
  5. Acknowledgement

Installation

Clone the repository and install the required packages:

git clone https://github.com/CVMI-Lab/ASSIST-3D
conda create -n datagen python=3.10
conda activate datagen
pip install -r requirements.txt
pip install --extra-index-url https://ai2thor-pypi.allenai.org ai2thor==0+8524eadda94df0ab2dbb2ef5a577e4d37c712897
git clone --recursive "https://github.com/NVIDIA/MinkowskiEngine"
cd MinkowskiEngine
git checkout 02fc608bea4c0549b0a7b00ca1bf15dee4a0b228
python setup.py install --force_cuda --blas=openblas

Download the assets already processed by Holodeck that will be used to generate 3D scenes:

python -m objathor.dataset.download_holodeck_base_data --version 2023_09_23
python -m objathor.dataset.download_assets --version 2023_09_23
python -m objathor.dataset.download_annotations --version 2023_09_23
python -m objathor.dataset.download_features --version 2023_09_23

By default these will be saved to ~/.objathor-assets/...; you can change this directory by specifying the --path argument. If you change --path, you'll need to set OBJAVERSE_ASSETS_DIR in ai2holodeck/constant.py to the same path.
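For instance, if you downloaded the assets with --path /data/objathor-assets (a hypothetical location), the corresponding constant in ai2holodeck/constant.py would be set roughly as follows; the surrounding constants in that file may differ:

# ai2holodeck/constant.py (sketch; only the asset directory constant is shown)
# Point this at the same directory passed via --path when downloading the assets.
OBJAVERSE_ASSETS_DIR = "/data/objathor-assets"  # hypothetical custom download path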

Data Generation

1. Filter Out Assets Belonging to ScanNet++ (Optional)

Since we adopt ScanNet++ as an evaluation dataset, and to show that a model trained on our synthetic data generalizes to objects unseen during training, we filter out assets whose categories appear in ScanNet++:

python filter_holodeck.py --path_to_asset YOUR_ASSET_DOWNLOAD_PATH
python filter_thor.py --path_to_asset YOUR_ASSET_DOWNLOAD_PATH
python filter_obj.py --path_to_asset YOUR_ASSET_DOWNLOAD_PATH

where --path_to_asset is the root path to your previously downloaded assets. This step can be skipped if you do not need to evaluate on ScanNet++ to demonstrate generalization ability.
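Conceptually, the filter scripts drop any asset whose annotated category matches a ScanNet++ class. A hypothetical sketch of that idea (the real scripts read the downloaded annotation files and use their own, complete category lists):

# Hypothetical category-based filtering sketch; not the actual filter_*.py logic.
scannetpp_categories = {"chair", "table", "sofa"}   # illustrative subset only

def keep_asset(annotation):
    # Keep an asset only if its category is not a ScanNet++ class.
    return annotation.get("category", "").lower() not in scannetpp_categories

assets = {"uid_a": {"category": "chair"}, "uid_b": {"category": "toy rocket"}}
filtered = {uid: ann for uid, ann in assets.items() if keep_asset(ann)}
print(list(filtered))   # ['uid_b']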

2. Generate Scene JSON Description Files

First, generate prompts describing the scenes:

python ai2holodeck/generate_random_query.py --num_scenes 1000

where --num_scenes is the number of scenes you want to generate. The results are saved in ai2holodeck/scene_description.txt. Although every description is simply "a room", this does not affect the final results, since the objects and layouts in the generated scenes are completely random and independent of these descriptions.
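Because every description is simply "a room", the prompt file amounts to one such line per scene. A minimal sketch that produces an equivalent file, assuming one description per line:

# Sketch: write a prompt file equivalent to the generate_random_query.py output,
# assuming one "a room" description per line.
num_scenes = 1000
with open("ai2holodeck/scene_description.txt", "w") as f:
    for _ in range(num_scenes):
        f.write("a room\n")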

Second, generate the JSON files describing the objects and layout of each scene:

python ai2holodeck/main.py --mode generate_multi_scenes --query_file ai2holodeck/scene_description.txt --openai_api_key YOUR_OPENAI_API_KEY --save_dir ./data/scenes

where --openai_api_key is your OpenAI API key used to call GPT-4, and --save_dir is the directory for saving the generation results.

Third, gather the paths to the generated scenes for further processing:

python scene_gather.py --scene_root ./data/scenes

The results will be saved in scenes.txt.
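scene_gather.py essentially collects the generated scene paths; a rough equivalent, assuming each scene folder under --scene_root holds one JSON description file (the exact layout and filenames may differ):

# Sketch: gather generated scene paths into scenes.txt.
# Assumes one JSON description per scene directory; adjust the glob pattern if needed.
import glob

scene_paths = sorted(glob.glob("./data/scenes/*/*.json"))
with open("scenes.txt", "w") as f:
    f.write("\n".join(scene_paths) + "\n")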

3. Render Point Clouds for Each Scene

First, generate the reference point clouds, sampled uniformly from the object meshes; these will later be used to obtain the instance label of each rendered point:

python pointgen.py --scene_file ./scenes.txt --path_to_asset YOUR_ASSET_DOWNLOAD_PATH --output_dir ./uniform_points

where --scene_file is the path to the txt file containing the paths to the previously generated scenes.
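Uniform surface sampling of this kind can be illustrated with trimesh; a minimal sketch of the core operation (pointgen.py additionally applies each object's scene pose and records per-object instance IDs):

# Sketch: uniformly sample points on an object mesh with trimesh.
# The asset path below is hypothetical; pointgen.py also handles poses and instance labels.
import trimesh

mesh = trimesh.load("some_asset.obj", force="mesh")
points, face_indices = trimesh.sample.sample_surface(mesh, count=10000)
print(points.shape)   # (10000, 3) points distributed uniformly over the surface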

Then, render RGB-D images for each scene:

python render_rgbd.py --scene_file ./scenes.txt --path_to_uniform_points ./uniform_points --path_to_asset YOUR_ASSET_DOWNLOAD_PATH --output_dir ./images

Finally, project the RGB-D images of each scene back into 3D space to build the point clouds and obtain the instance label of each rendered point:

python render_points.py --scene_file ./scenes.txt --path_to_uniform_points ./uniform_points --path_to_asset YOUR_ASSET_DOWNLOAD_PATH --path_to_image ./images  --output_dir ./final_points

In this way, the .npy point cloud file of each generated scene will be saved in ./final_points in the format [point_coordinates, point_colors, point_instance_label]. These synthetic data can be combined with ScanNet to train Mask3D.
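The projection step follows the standard pinhole camera model; a simplified sketch of back-projecting a single depth map, assuming intrinsics fx, fy, cx, cy and a camera-to-world matrix are available from the renderer (render_points.py additionally fuses multiple views and assigns instance labels via the reference point clouds):

# Sketch: back-project one depth map into world-space 3D points (pinhole model).
# Intrinsics and pose are assumed inputs; the actual script fuses many views.
import numpy as np

def backproject(depth, fx, fy, cx, cy, cam_to_world):
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    pts_cam = np.stack([x, y, depth, np.ones_like(depth)], axis=-1).reshape(-1, 4)
    pts_world = (cam_to_world @ pts_cam.T).T[:, :3]
    return pts_world[depth.reshape(-1) > 0]   # drop pixels with no valid depth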

Visualization

To visualize the generated .npy point cloud of a scene, run the following command:

python visualize.py --scene_points PATH_TO_NPY_FILE

where --scene_points is the path to your specified .npy point cloud file. The generated .ply file can be visualized with tools such as MeshLab.
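Alternatively, the saved .npy can be converted to .ply directly with Open3D; a sketch assuming the [coordinates, colors, instance_label] layout described above, with colors stored in the 0-255 range:

# Sketch: convert a generated .npy point cloud to .ply for viewing in MeshLab.
# Assumes columns are [x, y, z, r, g, b, instance_label]; the color range is an assumption.
import numpy as np
import open3d as o3d

data = np.load("scene.npy")                  # path to one generated scene (hypothetical name)
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(data[:, :3])
pcd.colors = o3d.utility.Vector3dVector(data[:, 3:6] / 255.0)
o3d.io.write_point_cloud("scene.ply", pcd)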

Citation

If you find our work useful, please consider citing:

@misc{zhou2025assist,
      title={ASSIST-3D: Adapted Scene Synthesis for Class-Agnostic 3D Instance Segmentation},
      author={Shengchao Zhou and Jiehong Lin and Jiahui Liu and Shizhen Zhao and Chirui Chang and Xiaojuan Qi},
      year={2025},
      eprint={2512.09364},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2512.09364},
}

Acknowledgement

  • Holodeck: the codebase we built upon. Thanks for their outstanding work.
