SuP: Sub-cloud Driven Point Cloud Registration
Sheldon Fung¹, Wei Pan², Ling Cao², Fei Hou³,⁴, Ling Chen⁵, Shasha Mao⁶, Hongdong Li⁷, Xuequan Lu¹⁎
¹University of Western Australia · ²OPT Machine Vision · ³Institute of Software, CAS · ⁴University of Chinese Academy of Sciences · ⁵University of Technology Sydney · ⁶Xidian University · ⁷Australian National University
⁎ Corresponding author.
📄 Paper: coming soon | 📦 Code: this repository | 🗂️ Data (C3DM / C3DLM): Google Drive
Existing learning-based point-cloud-registration methods handle high-overlap pairs well but struggle when the overlap is low, because geometric or semantic similarities in the non-overlapping regions inevitably produce ambiguous matches.
SuP reformulates low-overlap registration as a high-overlap sub-cloud anchor pair mining problem. The core component is the Dual-phase Sub-cloud Anchor Mining (DSAM) module:
- Subdivide the source and target point clouds into multiple sub-clouds.
- Phase 1 — OPS (Overlap-guided Prior-weighting Scheme): leverages feature salience to cheaply pre-score candidate sub-cloud anchor pairs.
- Phase 2 — MPN (Multi-scale Post-weighting Network): refines the ranking by exploiting neighborhood feature consensus at multiple scales.
- Merge-to-match: the top-ranked anchor pairs are merged to produce final dense correspondences, from which the transformation is recovered via either LGR (RANSAC-free) or RANSAC.
DSAM is supervised end-to-end by an alignment-aware weighting loss (AWL) that uses on-the-fly anchor-pair alignment errors as the ranking target.
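The last step of the pipeline above recovers a rigid transformation from dense correspondences. Estimators of this kind typically rest on a closed-form weighted least-squares (Kabsch/SVD) solve, and the sketch below shows only that building block. It is not SuP's implementation: the function name `weighted_kabsch` and its arguments are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def weighted_kabsch(src, tgt, weights=None):
    """Least-squares rigid transform (R, t) such that R @ src[i] + t ~= tgt[i].

    Illustrative helper only; not taken from the SuP code base.
    """
    src = np.asarray(src, dtype=float)
    tgt = np.asarray(tgt, dtype=float)
    if weights is None:
        weights = np.ones(len(src))
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the weights sum to one

    # Weighted centroids of both point sets.
    src_c = (w[:, None] * src).sum(axis=0)
    tgt_c = (w[:, None] * tgt).sum(axis=0)

    # 3x3 weighted cross-covariance of the centred points.
    H = (w[:, None] * (src - src_c)).T @ (tgt - tgt_c)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_c - R @ src_c
    return R, t
```

A RANSAC-style estimator would call such a solver on many small random subsets of correspondences and keep the hypothesis with the most inliers, whereas a RANSAC-free estimator would call it on weighted groups of correspondences.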
Registration Recall (%) on Color3DMatch (C3DM) and Color3DLoMatch (C3DLM):
| Method | Estimator | C3DM RR ↑ | C3DLM RR ↑ |
|---|---|---|---|
| CoFiNet | RANSAC-50k | 89.3 | 67.5 |
| GeoTransformer | RANSAC-50k | 92.0 | 75.0 |
| PEAL | RANSAC-50k | 94.6 | 81.7 |
| ColorPCR | RANSAC-50k | 96.7 | 88.9 |
| SuP (ours) | RANSAC-50k | 98.1 | 90.4 |
| CoFiNet | LGR | 87.6 | 64.8 |
| GeoTransformer | LGR | 91.5 | 74.0 |
| PEAL | LGR | 94.3 | 81.2 |
| ColorPCR | LGR | 96.5 | 88.3 |
| SuP (ours) | LGR | 97.8 | 90.2 |
SuP achieves the highest registration recall on both benchmarks under both estimators. Its LGR (RANSAC-free) results, 97.8 on C3DM and 90.2 on C3DLM, also exceed every prior RANSAC-50k baseline in the table (best prior: ColorPCR at 96.7 and 88.9).
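For readers unfamiliar with the metric, the sketch below shows how a registration recall figure is usually derived: the percentage of test pairs whose alignment RMSE falls below a threshold. The 0.2 m default follows the customary 3DMatch protocol and is an assumption for C3DM / C3DLM; the function name is hypothetical, and the repository's own evaluation script may aggregate differently (for example per scene).

```python
def registration_recall(rmse_per_pair, threshold=0.2):
    """Percentage of pairs whose alignment RMSE (metres) is below `threshold`.

    Illustrative only; the 0.2 m default is the usual 3DMatch convention.
    """
    if not rmse_per_pair:
        raise ValueError("need at least one registered pair")
    hits = sum(1 for err in rmse_per_pair if err < threshold)
    return 100.0 * hits / len(rmse_per_pair)


if __name__ == "__main__":
    # Three of the four hypothetical pairs are below 0.2 m.
    print(registration_recall([0.05, 0.10, 0.30, 0.19]))
```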
SuP is tested with PyTorch ≥ 1.13 + CUDA ≥ 11.7 and builds two custom C++/CUDA extensions for point-cloud subsampling and radius neighbor search.
git clone https://github.com/SheldonFung98/SuP.git
cd SuP
# (optional) create a fresh environment
conda create -n sup python=3.10 -y
conda activate sup
# install PyTorch matching your CUDA toolkit (1.13+ recommended)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
# project dependencies
pip install -r requirements.txt
# build the C++/CUDA extensions
python setup.py build_ext --inplace
The Docker workflow (./config.sh + ./post_config.sh) used during development is also supported on Linux.
On Windows, the repo now compiles cleanly under MSVC (Visual Studio 2019 / 2022). Prerequisites:
- Visual Studio Build Tools 2019 or 2022 with the "Desktop development with C++" workload.
- CUDA Toolkit matching your PyTorch build (e.g. CUDA 11.8 → PyTorch 2.0+cu118).
- Python 3.10 (Anaconda recommended).
From an "x64 Native Tools Command Prompt for VS 2022", or from a PowerShell session with the vcvarsall.bat environment loaded:
git clone https://github.com/SheldonFung98/SuP.git
cd SuP
conda create -n sup python=3.10 -y
conda activate sup
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
# build C++/CUDA extensions (uses MSVC + nvcc; setup.py auto-selects /O2 /std:c++17 /EHsc on Windows)
python setup.py build_ext --inplace
If you see LNK2019 unresolved external symbol from torch_python.lib, make sure the PyTorch wheel matches your Python version and that cl.exe is on PATH (i.e. you launched from the VS Native Tools shell).
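Before rebuilding, it can help to confirm that the compilers are actually visible from the current shell. The snippet below is a small, hypothetical helper (not part of the repository) that looks up `cl` and `nvcc` on PATH using only the standard library.

```python
import shutil


def find_build_tools(names=("cl", "nvcc")):
    """Return {tool: full path or None} for each compiler expected on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, path in find_build_tools().items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

If `cl` is reported as not found, the shell was most likely not started from the VS Native Tools prompt.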
Pre-trained weights: coming soon. They will be published in this repository as a GitHub Release.
Download the dataset here and arrange it as:
dataset/
├── data/
│   ├── train/7-scenes-chess/cloud_bin_0.npy
│   │   └── ...
│   └── test/7-scenes-redkitchen/cloud_bin_0.npy
│       └── ...
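A quick way to check that the files landed in the expected places is to walk the tree. The helper below is a hypothetical sketch that assumes exactly the layout shown above (`dataset/data/<split>/<scene>/cloud_bin_*.npy`); it is not part of the repository and does not open the `.npy` files.

```python
from pathlib import Path


def list_clouds(root, split):
    """Map each scene name to its cloud_bin_*.npy files (sorted by name)."""
    split_dir = Path(root) / "data" / split
    if not split_dir.is_dir():
        raise FileNotFoundError(f"missing split directory: {split_dir}")
    scenes = {}
    for scene_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        files = sorted(scene_dir.glob("cloud_bin_*.npy"))
        if files:  # skip scene folders that contain no point clouds
            scenes[scene_dir.name] = files
    return scenes


if __name__ == "__main__":
    for split in ("train", "test"):
        found = list_clouds("dataset", split)
        total = sum(len(files) for files in found.values())
        print(f"{split}: {len(found)} scenes, {total} clouds")
```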
The training entry point lives in experiments/SOAR:
# single GPU
CUDA_VISIBLE_DEVICES=0 python trainval.py
# 3DMatch
CUDA_VISIBLE_DEVICES=0 ./eval.sh <epoch> <benchmark>
<epoch> is the checkpoint epoch id; <benchmark> is one of 3DMatch, 3DLoMatch, first, second, third, forth. The first two cover overlap >0.3 (C3DM) and 0.1–0.3 (C3DLM), respectively; the latter four correspond to the overlap bins 0.1–0.15, 0.15–0.2, 0.2–0.25, and 0.25–0.3.
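The benchmark names and their overlap ranges can be summarised as a lookup table. The sketch below is illustrative only: the half-open `(low, high]` boundary handling is an assumption chosen to agree with ">0.3" for 3DMatch, and the helper name is hypothetical.

```python
# Overlap range (low, high] for each <benchmark> value accepted by eval.sh.
OVERLAP_RANGES = {
    "3DMatch": (0.30, 1.00),
    "3DLoMatch": (0.10, 0.30),
    "first": (0.10, 0.15),
    "second": (0.15, 0.20),
    "third": (0.20, 0.25),
    "forth": (0.25, 0.30),
}


def benchmarks_for(overlap):
    """Return every benchmark whose (low, high] range contains `overlap`."""
    return [
        name
        for name, (low, high) in OVERLAP_RANGES.items()
        if low < overlap <= high
    ]


if __name__ == "__main__":
    for ratio in (0.12, 0.27, 0.50):
        print(ratio, benchmarks_for(ratio))
```

Under these assumptions, a pair with overlap 0.12 belongs to both 3DLoMatch and the finer-grained first bin.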
To evaluate a released checkpoint directly:
CUDA_VISIBLE_DEVICES=0 python test.py --snapshot=../../weights/ckpts.pth.tar --benchmark=3DMatch
CUDA_VISIBLE_DEVICES=0 python eval.py --benchmark=3DMatch --method=lgr
For multi-GPU training:
CUDA_VISIBLE_DEVICES=<GPUS> python -m torch.distributed.launch \
--nproc_per_node=<N_GPU> --master_port=<PORT> trainval.py
# example: 2 GPUs
CUDA_VISIBLE_DEVICES=0,5 python -m torch.distributed.launch \
--nproc_per_node=2 --master_port=29501 trainval.py
If you find SuP useful, please cite:
@inproceedings{fung2026sup,
title = {SuP: Sub-cloud Driven Point Cloud Registration},
author = {Fung, Sheldon and Pan, Wei and Cao, Ling and Hou, Fei and
Chen, Ling and Mao, Shasha and Li, Hongdong and Lu, Xuequan},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2026},
note = {Highlight}
}
SuP builds on the excellent work of: