
OmniSeg3D: Omniversal 3D Segmentation via Hierarchical Contrastive Learning (CVPR2024)

Haiyang Ying1, Yixuan Yin1, Jinzhi Zhang1, Fan Wang2, Tao Yu1, Ruqi Huang1, Lu Fang1
1Tsinghua University   2Alibaba Group

Towards segmenting everything in 3D all at once, we propose an omniversal 3D segmentation method (a), which takes as input multi-view, inconsistent, class-agnostic 2D segmentations and outputs a consistent 3D feature field via a hierarchical contrastive learning framework. This method supports hierarchical segmentation (b), multi-object selection (c), and holistic discretization (d) in an interactive manner.

Performance on Replica Room_0

replica_github2.1.mp4

For more demos, please visit our project page: OmniSeg3D.

Update

  • 2024/01/14: We release the original version of OmniSeg3D. Try and play with it now!
  • 2024/03/26: We release OmniSeg3D-GS as an adaptation of 3D Gaussian Splatting; check it out now!

Installation

NOTE: Our project is implemented on top of the ngp_pl project, and the requirements are the same as ngp_pl except for SAM and a customized CUDA extension.

Hardware

  • OS: Ubuntu 20.04
  • NVIDIA GPU with compute capability >= 7.5 and more than 8 GB of memory (tested with a single RTX 2080 Ti and an RTX 3090), CUDA 11.3 (may work with older versions)

Software

  1. Clone this repo by:

    git clone https://github.com/THU-luvision/OmniSeg3D.git
  2. Create and activate a conda environment with Python >= 3.8 (installation via Anaconda is recommended):

    conda create -n omniseg3d python=3.8
    conda activate omniseg3d
  3. Install PyTorch, pytorch-lightning 1.9.3, and pytorch-scatter:

    conda install pytorch==1.11.0 torchvision==0.12.0 cudatoolkit=11.3 -c pytorch
    conda install pytorch-lightning=1.9.3
    conda install pytorch-scatter -c pyg
  4. tinycudann: follow the official instructions (PyTorch extension). NOTE: If you install it on a server with a locally installed CUDA toolkit, specify the CUDA compiler path, e.g. cmake . -B build -DCMAKE_CUDA_COMPILER=/usr/local/cuda-11.3/bin/nvcc, instead of plain cmake . -B build.

    git clone --recursive https://github.com/nvlabs/tiny-cuda-nn
    cd tiny-cuda-nn/bindings/torch
    python setup.py install
  5. apex: follow the official instructions. NOTE: Errors may occur with recent official commits; try git checkout 2386a912164b0c5cfcd8be7a2b890fbac5607c82 to resolve them.

    git clone https://github.com/NVIDIA/apex
    cd apex
    # if pip >= 23.1 (ref: https://pip.pypa.io/en/stable/news/#v23-1) which supports multiple `--config-settings` with the same key... 
    pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./
    # otherwise
    pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./
  6. SAM for segmentation:

    git clone https://github.com/facebookresearch/segment-anything.git
    cd segment-anything
    pip install -e .
    mkdir sam_ckpt; cd sam_ckpt
    wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
  7. Other Python requirements:

    pip install -r requirements.txt
  8. CUDA extension: upgrade pip to >= 22.1 and run:

    pip install models/csrc/
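After these steps, you can optionally sanity-check the environment before moving on. The script below is not part of the repository; it is a minimal sketch that imports the main dependencies installed above and confirms that PyTorch can see a suitable GPU.

    # check_env.py -- illustrative sanity check, not part of this repository
    import torch

    print("PyTorch:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        major, minor = torch.cuda.get_device_capability(0)
        print(f"Compute capability: {major}.{minor}")  # should be >= 7.5

    # These imports fail loudly if the corresponding installation step was skipped.
    import pytorch_lightning   # step 3
    import torch_scatter       # step 3
    import tinycudann          # step 4
    import apex                # step 5
    import segment_anything    # step 6
    print("All core dependencies imported successfully.")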

Data Preparation

Hierarchical Representation Generation

Run the SAM model to generate the hierarchical representation files:

python run_sam.py --ckpt_path {SAM_CKPT_PATH} --file_path {IMAGE_FOLDER} --gpu_id {GPU_ID}

After running, you will get three folders: sam, masks, and patches.

  • sam: stores the hierarchical representation as ".npz" files
  • masks and patches: used for visualization or mask quality evaluation; not needed during training.

Ideally, the masks folder contains object-level masks and the patches folder contains part-level masks. We use the default SAM parameter settings, but you can tune them for customized datasets.
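To verify the generated files, a short script along the following lines can be used. It is only a sketch: the path is a placeholder, and the array names stored inside each ".npz" file are whatever run_sam.py writes, so it simply enumerates them.

    # inspect_sam_output.py -- illustrative sketch for browsing the SAM output
    import glob
    import numpy as np

    npz_files = sorted(glob.glob("path/to/scene/sam/*.npz"))  # placeholder path
    print(f"Found {len(npz_files)} hierarchical representation files")

    # Print every array stored in the first file, whatever its name.
    data = np.load(npz_files[0])
    for key in data.files:
        arr = data[key]
        print(f"{key}: shape={arr.shape}, dtype={arr.dtype}")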

Data Sample

We provide some data samples (replica_room_0, 360_counter, llff_flower); you can download them for model training.

Data structure:

NOTE: The "sam", "masks", and "patches" folders should be generated with run_sam.py.

    data
    ├── 360_v2             	# Link: https://jonbarron.info/mipnerf360/
    │   └── [bicycle|bonsai|counter|garden|kitchen|room|stump]
    │       ├── [sparse/0] (colmap results)
    │       └── [images|images_2|images_4|images_8|sam|masks|patches]
    │
    ├── nerf_llff_data     	# Link: https://drive.google.com/drive/folders/14boI-o5hGO9srnWaaogTU5_ji7wkX2S7
    │   └── [fern|flower|fortress|horns|leaves|orchids|room|trex]
    │       ├── [sparse/0] (colmap results)
    │       └── [images|images_2|images_4|images_8]
    │
    └── replica_data		# Link: https://github.com/ToniRV/NeRF-SLAM/blob/master/scripts/download_replica.bash
        └── [office_0|room_0|...]
            ├── transforms_train.json
            └── [rgb|depth(optional)|sam|masks|patches]
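Before training, you may want to confirm that a scene folder matches the layout above. The snippet below is an illustrative check for the Replica layout only; the folder names are taken from the tree above, with masks and patches treated as optional since they are not needed during training.

    # check_replica_scene.py -- illustrative check against the Replica layout above
    import os
    import sys

    scene_dir = sys.argv[1]  # e.g. data/replica_data/room_0
    required = ["transforms_train.json", "rgb", "sam"]
    optional = ["depth", "masks", "patches"]

    missing = [n for n in required if not os.path.exists(os.path.join(scene_dir, n))]
    if missing:
        print("Missing required entries:", ", ".join(missing))
    else:
        print("Scene folder looks complete:", scene_dir)
    for n in optional:
        if not os.path.exists(os.path.join(scene_dir, n)):
            print(f"Optional entry not found: {n}")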

Training

We recommend a two-stage training strategy for stable convergence: first train the color and density fields, then the semantic field.

  • Before running: specify the scene information in the config file (e.g. scripts/run_replica.sh). More options can be found in opt.py and can be adjusted in the config file.
# --- Edit the config file scripts/run_replica.sh
root_dir=/path/to/data/folder/of/the/scene
exp_name=experiment_name
dataset_name=dataset_type  # "colmap", "replica", and you can easily specify new dataset type
  • Stage 1: color and density field optimization
CUDA_VISIBLE_DEVICES=0 opt=train_rgb bash scripts/run_replica.sh
  • Stage 2: semantic field optimization
CUDA_VISIBLE_DEVICES=0 opt=train_sem bash scripts/run_replica.sh
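If you prefer to launch both stages from Python rather than typing the two shell commands, a thin wrapper like the one below (a minimal sketch that simply reproduces the commands above, assumed to run from the repository root) chains them together:

    # train_two_stage.py -- minimal sketch that mirrors the two commands above
    import os
    import subprocess

    env = dict(os.environ, CUDA_VISIBLE_DEVICES="0")
    for stage in ("train_rgb", "train_sem"):
        # Stage 1 optimizes the color/density field, stage 2 the semantic field.
        env["opt"] = stage
        subprocess.run(["bash", "scripts/run_replica.sh"], env=env, check=True)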

Inference

We provide a GUI (based on DearPyGui) for interactive segmentation.

  • Stage 1: color and density field visualization
CUDA_VISIBLE_DEVICES=0 opt=show_rgb bash scripts/run_replica.sh
  • Stage 2: semantic field visualization and segmentation
CUDA_VISIBLE_DEVICES=0 opt=show_sem bash scripts/run_replica.sh

Here are some functional instructions for interactive segmentation in GUI:

  • The viewpoint can be changed by dragging the mouse on the screen.
  • Left-click the clickmode button to enter segmentation mode:
    • Single-click mode: right-click a region of interest; the object or part will be highlighted, and the score map shows the similarity between the selected pixel and the other rendered pixels.
    • Multi-click mode: choose the multi-clickmode button, then select multiple pixels on the screen by right-clicking them.
    • Similarity threshold: drag the ScoreThres slider; unselected regions will be darkened.
    • Binarization: left-click the binary threshold button; a binary mask will be applied to the RGB image using the chosen similarity threshold.

Trained Models

We provide a trained model for Replica room_0; you can use it for GUI visualization and interactive segmentation. This sample also shows the output organization. We recommend putting the unzipped "results" folder under the root directory of OmniSeg3D to minimize code modification.

Performance on MipNeRF360 Counter

360_github2.1.mp4

Comparison with SA3D

360_counter_comp.1.mp4

TODO List

  • Release mesh-based implementation;

Acknowledgements

This project builds on ngp_pl and Segment Anything (SAM); thanks to these projects for their valuable contributions.

Citation

If you find this project helpful for your research, please consider citing the paper and giving a ⭐.

@article{ying2023omniseg3d,
  title={OmniSeg3D: Omniversal 3D Segmentation via Hierarchical Contrastive Learning},
  author={Ying, Haiyang and Yin, Yixuan and Zhang, Jinzhi and Wang, Fan and Yu, Tao and Huang, Ruqi and Fang, Lu},
  journal={arXiv preprint arXiv:2311.11666},
  year={2023}
}
