# 3D Point Cloud Semantic Segmentation via Gaussian Splatting

PointGS is a pipeline for indoor point cloud semantic segmentation that leverages 3D Gaussian Splatting (3DGS) and the Segment Anything Model (SAM). It projects point clouds to multi-view images, reconstructs 3D Gaussians, segments them using SegAnyGAussians (SAGA), and transfers the semantic labels back to the original point cloud.
```text
S3DIS Point Cloud (.pth)
        |
        v
[Step 1] Data Preparation
        | pth -> txt -> Z-split -> render views -> COLMAP SfM
        v
[SAGA] Gaussian Reconstruction & Segmentation (external)
        | 3DGS training -> SAM masks -> contrastive features -> segmentation
        v
[Step 2] Post-processing
        | denoise -> scale -> ICP registration -> label transfer -> label mapping
        v
[Step 3] Evaluation
        mIoU, AP, mAcc, oAcc
```
## Installation

Install the Python dependencies:

```bash
pip install -r requirements.txt
```

The pipeline also depends on the following external tools:

| Tool | Purpose | Installation |
|---|---|---|
| COLMAP | Structure-from-Motion reconstruction | Install guide |
| CloudCompare | ICP point cloud registration | Download |
| SegAnyGAussians | 3DGS training & segmentation | See their README |
## Configuration

Edit `configs/default.yaml` to set your data paths and tool locations:

```yaml
colmap_exe: "/path/to/colmap"
cloudcompare_exe: "/path/to/CloudCompare"

paths:
  s3dis_pth_dir: "data/s3dis/Area_5"
  s3dis_txt_dir: "data/s3dis/Area_5_txt"
  cut_output_dir: "data/s3dis/Area_5_cut"
  saga_output_dir: "output/saga_gaussians"
  reference_dir: "data/reference"
  final_output_dir: "output/results"
```

## Step 1: Data Preparation

Convert S3DIS data, split point clouds, render multi-view images, and run COLMAP:
```bash
# Run the full data preparation pipeline
python step1_data_preparation.py --config configs/default.yaml

# Or run individual sub-steps
python step1_data_preparation.py --config configs/default.yaml --step pth2txt
python step1_data_preparation.py --config configs/default.yaml --step split
python step1_data_preparation.py --config configs/default.yaml --step render
python step1_data_preparation.py --config configs/default.yaml --step colmap
```

Sub-steps:
- `pth2txt` - Convert S3DIS `.pth` files to `.txt` format (x y z r g b label)
- `split` - Split each point cloud into upper/lower halves at the Z-axis median
- `render` - Render 80 perspective views per scene using matplotlib
- `colmap` - Run COLMAP feature extraction, matching, and sparse reconstruction
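The `split` sub-step can be sketched as follows. This is an illustrative snippet, not the project's actual implementation; it assumes each scene is an `(N, 7)` NumPy array of `x y z r g b label` rows, matching the `.txt` format above:

```python
import numpy as np

def split_by_z_median(points: np.ndarray):
    """Split an (N, 7) array of [x y z r g b label] rows into
    lower and upper halves at the median z coordinate."""
    z_median = np.median(points[:, 2])
    lower = points[points[:, 2] <= z_median]
    upper = points[points[:, 2] > z_median]
    return lower, upper
```

Splitting at the median keeps the two halves roughly equal in size, which balances the rendering and reconstruction workload between them.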
## SAGA: Gaussian Reconstruction & Segmentation

Clone and set up the SAGA repository following their instructions:

```bash
git clone https://github.com/Jumpat/SegAnyGAussians.git
cd SegAnyGAussians
# Follow their installation guide
```

Use the provided batch script to process multiple scenes:

```bash
python saga_batch.py --saga_dir /path/to/SegAnyGAussians --data_dir /path/to/scenes
```

This runs: 3DGS training -> SAM mask extraction -> scale computation -> contrastive feature training -> SAGA segmentation.
## Step 2: Post-processing

After SAGA produces labeled Gaussian point clouds, run the post-processing pipeline:

```bash
# Run the full post-processing pipeline
python step2_postprocessing.py --config configs/default.yaml \
    --saga_output_dir output/saga_gaussians \
    --reference_dir data/reference

# Or run individual sub-steps
python step2_postprocessing.py --config configs/default.yaml --step denoise
python step2_postprocessing.py --config configs/default.yaml --step scale
python step2_postprocessing.py --config configs/default.yaml --step icp
python step2_postprocessing.py --config configs/default.yaml --step filter
python step2_postprocessing.py --config configs/default.yaml --step transfer
python step2_postprocessing.py --config configs/default.yaml --step match
```

Sub-steps:
- `denoise` - Voxel grid denoising, removal of label-0 points, and KDTree radius denoising
- `scale` - FPS-based scale normalization to match the reference point cloud diameter
- `icp` - ICP registration using CloudCompare (initial alignment + 24 rotations + best selection)
- `filter` - Label consistency filtering with KNN and connected component analysis
- `transfer` - 1-nearest-neighbor label transfer from Gaussians to the original points
- `match` - Greedy IoU-based label mapping to align predicted labels with ground truth
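The `transfer` sub-step is conceptually simple: after registration, every original point takes the label of its nearest Gaussian center. A minimal sketch using SciPy's `cKDTree` (an illustration under that assumption, not the project's exact code):

```python
import numpy as np
from scipy.spatial import cKDTree

def transfer_labels(gaussian_xyz: np.ndarray,
                    gaussian_labels: np.ndarray,
                    original_xyz: np.ndarray) -> np.ndarray:
    """Assign each original point the label of its nearest
    labeled Gaussian center (1-nearest-neighbor in Euclidean space)."""
    tree = cKDTree(gaussian_xyz)          # index the Gaussian centers
    _, idx = tree.query(original_xyz, k=1)  # nearest center per point
    return gaussian_labels[idx]
```

This is why the preceding ICP registration matters: 1-NN transfer is only meaningful once the Gaussian cloud and the original cloud share a common coordinate frame.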
## Step 3: Evaluation

Compute semantic segmentation metrics:

```bash
python step3_evaluation.py --config configs/default.yaml \
    --pred_dir output/results/label_mapped \
    --gt_dir data/reference
```

Metrics: mIoU, AP (Average Precision), mAcc (mean Accuracy), and oAcc (overall Accuracy) over the 13 S3DIS classes.
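As a reference for what the evaluation computes, mIoU and oAcc can be derived from a class confusion matrix. A minimal sketch (illustrative only; the pipeline's `utils/metrics.py` may differ in detail):

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, gt: np.ndarray, num_classes: int = 13):
    """Compute mIoU and overall accuracy from flat integer label arrays."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(conf, (gt, pred), 1)           # conf[g, p] = count of gt g predicted p
    tp = np.diag(conf).astype(float)
    union = conf.sum(0) + conf.sum(1) - tp   # per-class |pred ∪ gt|
    iou = np.divide(tp, union, out=np.zeros_like(tp), where=union > 0)
    miou = iou[union > 0].mean()             # average only over classes present
    oacc = tp.sum() / conf.sum()
    return miou, oacc
```

Averaging IoU only over classes that actually occur avoids penalizing scenes that lack some of the 13 S3DIS classes.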
## Visualization

Convert labeled point clouds to colored PLY files:

```bash
python tools/visualize.py --input_dir output/results/label_mapped --output_dir output/visualization
```

## Project Structure

```text
PointGS/
├── configs/default.yaml           # Configuration (paths, parameters)
├── step1_data_preparation.py      # Data conversion & COLMAP
├── step2_postprocessing.py        # Denoising, registration, label transfer
├── step3_evaluation.py            # mIoU evaluation
├── saga_batch.py                  # SAGA batch processing helper
├── utils/
│   ├── io_utils.py                # Point cloud I/O
│   ├── data_conversion.py         # PTH conversion, rendering, COLMAP
│   ├── point_cloud_ops.py         # Z-split, FPS, scaling
│   ├── denoising.py               # Voxel/KDTree denoising, label filtering
│   ├── registration.py            # ICP registration (CloudCompare)
│   ├── label_transfer.py          # 1NN transfer & label mapping
│   ├── metrics.py                 # mIoU, AP, mAcc, oAcc
│   └── visualization.py           # Label-to-PLY conversion
└── tools/visualize.py             # Visualization CLI
```
## Parameters

All parameters are configurable via `configs/default.yaml`:

| Parameter | Default | Description |
|---|---|---|
| `voxel_size` | 0.15 | Voxel size for grid denoising |
| `voxel_cube_size` | 5 | Cube size for densest-region search |
| `kdtree_radius` | 0.007 | Radius for KDTree denoising |
| `kdtree_min_neighbors` | 35 | Minimum neighbors within radius |
| `kdtree_iterations` | 2 | Number of denoising passes |
| `fps_sample_size` | 1024 | FPS sample count for diameter estimation |
| `k_consistency` | 20 | K for label consistency check |
| `k_connected` | 50 | K for connectivity analysis |
| `min_consistency_ratio` | 0.8 | Minimum label agreement ratio |
| `min_connected_threshold` | 30 | Minimum connected component size |
| `num_classes` | 13 | Number of semantic classes (S3DIS) |
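To illustrate how `kdtree_radius`, `kdtree_min_neighbors`, and `kdtree_iterations` interact, a radius-based outlier filter might look like the sketch below (an assumption-laden illustration using SciPy's `cKDTree`, not the pipeline's actual `utils/denoising.py`):

```python
import numpy as np
from scipy.spatial import cKDTree

def radius_denoise(xyz: np.ndarray, radius: float = 0.007,
                   min_neighbors: int = 35, iterations: int = 2) -> np.ndarray:
    """Iteratively drop points that have too few neighbors within `radius`.
    query_ball_point counts the query point itself, so subtract one."""
    for _ in range(iterations):
        tree = cKDTree(xyz)
        counts = tree.query_ball_point(xyz, r=radius, return_length=True)
        xyz = xyz[counts - 1 >= min_neighbors]
    return xyz
```

Running several passes (as `kdtree_iterations: 2` suggests) helps because removing one wave of outliers can expose points whose neighborhoods were only propped up by those outliers.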
## Acknowledgements & License

This project builds upon SegAnyGAussians (SAGA) by Cen et al. for 3D Gaussian segmentation. Please cite their work if you use this pipeline:

```bibtex
@article{cen2023saga,
  title={Segment Any 3D Gaussians},
  author={Cen, Jiazhong and Fang, Jiemin and Yang, Chen and Xie, Lingxi and Zhang, Xiaopeng and Shen, Wei and Tian, Qi},
  journal={arXiv preprint arXiv:2312.00860},
  year={2023}
}
```

This project is licensed under the MIT License. See LICENSE for details.