We introduce ASSIST-3D, which addresses class-agnostic 3D instance segmentation by synthesizing 3D data tailored to this task. Specifically, ASSIST-3D features three key innovations: 1) Heterogeneous Object Selection from extensive 3D CAD asset collections, incorporating randomness in object sampling to maximize geometric and contextual diversity; 2) Scene Layout Generation through LLM-guided spatial reasoning combined with depth-first search for reasonable object placements; and 3) Realistic Point Cloud Construction via multi-view RGB-D image rendering and fusion from the synthetic scenes, closely mimicking real-world sensor data acquisition. In this way, our synthetic data simultaneously achieve geometric diversity, contextual complexity, and layout plausibility, which proves helpful for class-agnostic 3D instance segmentation training. Experiments on the ScanNetV2, ScanNet++, and S3DIS benchmarks demonstrate that models trained with ASSIST-3D-generated data significantly outperform existing methods. Further comparisons underscore the superiority of our purpose-built pipeline over existing 3D scene synthesis approaches.
Clone the repository and install the required packages:
git clone https://github.com/CVMI-Lab/ASSIST-3D
conda create -n datagen python=3.10
conda activate datagen
pip install -r requirements.txt
pip install --extra-index-url https://ai2thor-pypi.allenai.org ai2thor==0+8524eadda94df0ab2dbb2ef5a577e4d37c712897
git clone --recursive "https://github.com/NVIDIA/MinkowskiEngine"
cd MinkowskiEngine
git checkout 02fc608bea4c0549b0a7b00ca1bf15dee4a0b228
python setup.py install --force_cuda --blas=openblas

Download the assets already processed by Holodeck that will be used to generate 3D scenes:
python -m objathor.dataset.download_holodeck_base_data --version 2023_09_23
python -m objathor.dataset.download_assets --version 2023_09_23
python -m objathor.dataset.download_annotations --version 2023_09_23
python -m objathor.dataset.download_features --version 2023_09_23

By default, these will save to ~/.objathor-assets/...; you can change this directory by specifying the --path argument. If you change --path, you will also need to set OBJAVERSE_ASSETS_DIR in ai2holodeck/constant.py to the same path.
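For example, if you pass --path /data/objathor-assets (an illustrative location, not a required one) to the download commands above, point Holodeck at the same directory; a minimal sketch, assuming the constant is a plain string path:

# ai2holodeck/constant.py (sketch; the path below is illustrative)
OBJAVERSE_ASSETS_DIR = "/data/objathor-assets"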
Since we adopt ScanNet++ as an evaluation dataset to show that a model trained on our synthetic data generalizes better to objects unseen during training, we filter out assets whose categories appear in ScanNet++:
python filter_holodeck.py --path_to_asset YOUR_ASSET_DOWNLOAD_PATH
python filter_thor.py --path_to_asset YOUR_ASSET_DOWNLOAD_PATH
python filter_obj.py --path_to_asset YOUR_ASSET_DOWNLOAD_PATH

where --path_to_asset is the root path to your previously downloaded assets. Note that this step can be skipped if you do not need to evaluate on ScanNet++ to demonstrate generalization ability.
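For reference, the filtering performed by these scripts conceptually looks like the sketch below; the file names and the "category" key are illustrative assumptions, not the scripts' actual interface:

import json

# Illustrative sketch only: drop assets whose category appears in ScanNet++.
with open("annotations.json") as f:
    annotations = json.load(f)          # assumed: asset_id -> metadata dict

with open("scannetpp_classes.txt") as f:
    scannetpp_classes = {line.strip().lower() for line in f if line.strip()}

kept = {
    asset_id: meta
    for asset_id, meta in annotations.items()
    if meta.get("category", "").lower() not in scannetpp_classes
}

with open("annotations_filtered.json", "w") as f:
    json.dump(kept, f)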
First, generate prompts describing the scenes:
python ai2holodeck/generate_random_query.py --num_scenes 1000

where --num_scenes is the number of scenes you want to generate. The generated prompts are saved in ai2holodeck/scene_description.txt. Although all descriptions are simply "a room", this does not affect the final results, since the objects and layouts of the generated scenes are completely random and independent of these descriptions.
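For reference, since every prompt is simply "a room", the query file amounts to num_scenes identical prompts; the sketch below is a hypothetical reimplementation, assuming one prompt per line, and is not the actual script:

# Illustrative sketch: write num_scenes "a room" prompts, one per line.
num_scenes = 1000
with open("ai2holodeck/scene_description.txt", "w") as f:
    for _ in range(num_scenes):
        f.write("a room\n")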
Second, generate the JSON files describing the objects and layout of each scene:
python ai2holodeck/main.py --mode generate_multi_scenes --query_file ai2holodeck/scene_description.txt --openai_api_key YOUR_OPENAI_API_KEY --save_dir ./data/scenes

where --openai_api_key is your OpenAI API key, used to call GPT-4, and --save_dir is the directory for saving the generation results.
Third, gather the paths to the generated scenes for further processing:
python scene_gather.py --scene_root ./data/scenes

The results will be saved in scenes.txt.
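For reference, this gathering step amounts to something like the following sketch; it assumes each generated scene is stored as a JSON file under --scene_root, and the actual scene_gather.py and directory layout may differ:

import glob
import os

# Illustrative sketch: collect scene JSON paths and write them to scenes.txt.
scene_root = "./data/scenes"
scene_files = sorted(glob.glob(os.path.join(scene_root, "**", "*.json"), recursive=True))
with open("scenes.txt", "w") as f:
    f.write("\n".join(scene_files) + "\n")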
First, generate the reference point clouds, sampled uniformly from the object meshes, which will later be used to obtain the instance label of each rendered point:
python pointgen.py --scene_file ./scenes.txt --path_to_asset YOUR_ASSET_DOWNLOAD_PATH --output_dir ./uniform_points

where --scene_file is the path to the txt file containing the paths to the scenes generated above.
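For reference, uniform surface sampling of a single object mesh can be done with a library such as trimesh; the sketch below only illustrates the idea and is not the actual pointgen.py implementation (the mesh path and sample count are placeholders):

import numpy as np
import trimesh

# Illustrative sketch: sample points uniformly (by surface area) from one mesh.
mesh = trimesh.load("asset.obj", force="mesh")            # placeholder path
points, face_ids = trimesh.sample.sample_surface(mesh, count=10000)
np.save("asset_uniform_points.npy", np.asarray(points))   # (10000, 3) xyz samples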
Then, render the RGB-D images of each scene:
python render_rgbd.py --scene_file ./scenes.txt --path_to_uniform_points ./uniform_points --path_to_asset YOUR_ASSET_DOWNLOAD_PATH --output_dir ./images

Finally, project the RGB-D images of each scene back into 3D space to obtain the point clouds, together with the instance label of each rendered point:
python render_points.py --scene_file ./scenes.txt --path_to_uniform_points ./uniform_points --path_to_asset YOUR_ASSET_DOWNLOAD_PATH --path_to_image ./images --output_dir ./final_points

In this way, the .npy point cloud file of each generated scene will be saved in ./final_points in the format [point_coordinates, point_colors, point_instance_label]. These synthetic data can be combined with ScanNet to train Mask3D.
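For reference, projecting a depth image back to 3D follows standard pinhole back-projection; below is a minimal sketch for a single view, assuming a depth map in meters, a 3x3 intrinsic matrix, and a 4x4 camera-to-world pose. It is not the actual render_points.py implementation (which may use different conventions), and the instance labels are then obtained with the help of the reference point clouds generated above.

import numpy as np

def backproject_depth(depth, K, cam_to_world):
    # Illustrative sketch: lift a depth map (H, W) in meters to world-space points (N, 3).
    # K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]; cam_to_world is a 4x4 pose matrix.
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.reshape(-1)
    valid = z > 0                                  # keep pixels with valid depth
    x = (u.reshape(-1) - K[0, 2]) * z / K[0, 0]
    y = (v.reshape(-1) - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)[valid]   # homogeneous camera coords
    return (cam_to_world @ pts_cam.T).T[:, :3]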
If you want to visualize the generated .npy point cloud of a scene, run the following command:
python visualize.py --scene_points PATH_TO_NPY_FILE

where --scene_points is the path to your specified .npy point cloud file. The generated .ply file can be visualized with tools such as MeshLab.
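For reference, a minimal sketch of the kind of conversion visualize.py performs, assuming the [point_coordinates, point_colors, point_instance_label] column layout described above and using Open3D (an extra dependency you may need to install); the actual script may differ:

import numpy as np
import open3d as o3d

# Illustrative sketch: convert a generated .npy scene into a .ply point cloud.
data = np.load("scene.npy")                    # placeholder path
xyz, rgb = data[:, :3], data[:, 3:6]
if rgb.max() > 1.0:                            # colors may be stored in [0, 255]
    rgb = rgb / 255.0

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(xyz)
pcd.colors = o3d.utility.Vector3dVector(rgb)
o3d.io.write_point_cloud("scene.ply", pcd)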
If you find our work useful, please consider citing:
@misc{zhou2025assist,
title={ASSIST-3D: Adapted Scene Synthesis for Class-Agnostic 3D Instance Segmentation},
author={Shengchao Zhou and Jiehong Lin and Jiahui Liu and Shizhen Zhao and Chirui Chang and Xiaojuan Qi},
year={2025},
eprint={2512.09364},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2512.09364},
}

- Holodeck: the codebase we built upon. Thanks to its authors for their outstanding work.
