A PyTorch implementation and fair comparison of three transformer architectures for point clouds:
- Point Transformer — Hengshuang Zhao et al.
- PCT: Point Cloud Transformer — Meng-Hao Guo et al.
- Point Transformer — Nico Engel et al.
All three models are implemented behind a common training pipeline (same data, same augmentation, same schedule), so their results can be compared under one consistent setting. Configuration is managed with Hydra, so switching models or hyperparameters is a one-flag change.
config/
├── cls.yaml # classification hyperparameters
├── partseg.yaml # part segmentation hyperparameters
└── model/ # per-model configs: Hengshuang / Menghao / Nico
models/
├── Hengshuang/ # Point Transformer (vector attention + transition down/up)
├── Menghao/ # PCT: Point Cloud Transformer
└── Nico/ # Point Transformer (SortNet + local-global attention)
train_cls.py # ModelNet40 classification training/eval
train_partseg.py # ShapeNet part segmentation training/eval
dataset.py # ModelNet40 / ShapeNetPart data loaders
provider.py # point cloud augmentations
pip install -r requirements.txt

Requires PyTorch with CUDA; the training scripts assume a GPU.
Download the resampled, aligned ModelNet40 (modelnet40_normal_resampled.zip) and extract it to modelnet40_normal_resampled/ at the repo root.
# default model is set in config/cls.yaml
python train_cls.py
# or pick a model explicitly
python train_cls.py model=Hengshuang
python train_cls.py model=Menghao
python train_cls.py model=Nico
# sweep all three with Hydra multirun
python train_cls.py model=Hengshuang,Menghao,Nico -m

Logs and the best checkpoint (best_model.pth) are written to log/cls/<model>/.
Training uses Adam with the learning rate decayed by a factor of 0.3 every 50 epochs, for 200 epochs in total; data augmentation follows Pointnet_Pointnet2_pytorch. The initial learning rate is 1e-3 for Hengshuang and Nico (these hyperparameters could likely be tuned further) and 1e-4 for Menghao, as suggested by the author.
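
The schedule described above can be written out as a small pure-Python sketch. It assumes that "decay 0.3" means the learning rate is multiplied by 0.3 at each 50-epoch boundary; the names `step_lr`, `schedule`, and `INITIAL_LR` are illustrative and not taken from the training scripts. In PyTorch the same behavior would typically come from `torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.3)`, though whether the scripts use exactly that is also an assumption.

```python
from typing import List

# Initial learning rates quoted in this README; the dict itself is illustrative.
INITIAL_LR = {"Hengshuang": 1e-3, "Menghao": 1e-4, "Nico": 1e-3}


def step_lr(initial_lr: float, epoch: int, step_size: int = 50, gamma: float = 0.3) -> float:
    """Learning rate in effect during `epoch` (0-indexed) under step decay.

    Assumption: the rate is multiplied by `gamma` once every `step_size` epochs.
    """
    return initial_lr * gamma ** (epoch // step_size)


def schedule(model: str, total_epochs: int = 200) -> List[float]:
    """Per-epoch learning rates for one model over the whole run."""
    return [step_lr(INITIAL_LR[model], epoch) for epoch in range(total_epochs)]
```

Calling `schedule("Menghao")` returns 200 values that start at 1e-4 and drop three times, at epochs 50, 100, and 150.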
ModelNet40 classification accuracy (instance average):
| Model | Accuracy (%) |
|---|---|
| Hengshuang | 91.7 |
| Menghao | 92.6 |
| Nico | 85.5 |
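
For readers unfamiliar with the distinction, the sketch below contrasts instance-average accuracy (the metric in the table) with class-average accuracy. It is a minimal illustration in plain Python, not the evaluation code from `train_cls.py`; the function name and its inputs are hypothetical.

```python
from collections import defaultdict
from typing import Sequence, Tuple


def instance_and_class_accuracy(preds: Sequence[int], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (instance-average accuracy, class-average accuracy)."""
    if len(preds) != len(labels) or len(labels) == 0:
        raise ValueError("preds and labels must be non-empty and of equal length")

    correct = defaultdict(int)  # correct predictions per ground-truth class
    total = defaultdict(int)    # number of samples per ground-truth class
    for pred, label in zip(preds, labels):
        total[label] += 1
        correct[label] += int(pred == label)

    # Instance average: every sample counts equally.
    instance_acc = sum(correct.values()) / len(labels)
    # Class average: every class counts equally, regardless of its size.
    class_acc = sum(correct[c] / total[c] for c in total) / len(total)
    return instance_acc, class_acc
```

With four samples of class 0 and one of class 1, a model that always predicts class 0 scores 0.8 on instance average but only 0.5 on class average, which is why the two numbers should not be compared directly.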
Download the aligned ShapeNetPart benchmark (shapenetcore_partanno_segmentation_benchmark_v0_normal.zip) and extract it to data/shapenetcore_partanno_segmentation_benchmark_v0_normal/.
python train_partseg.py model=Hengshuang

Logs and checkpoints are written to log/partseg/<model>/. Currently only Hengshuang's architecture has a segmentation head implemented.
After training, evaluate the saved checkpoint and export colored point clouds:
python test_partseg.py model=Hengshuang # evaluate + export 20 shapes
python test_partseg.py model=Hengshuang num_visual=50

This reports accuracy / class-avg mIoU / instance-avg mIoU on the test split and, for each exported shape, writes two ASCII .ply files to log/partseg/<model>/visual/:
- <idx>_<category>_pred.ply: predicted part labels (colored)
- <idx>_<category>_gt.ply: ground-truth part labels (colored)
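
To make the export format concrete, here is a minimal sketch of how a colored ASCII .ply file can be laid out. It follows the standard PLY header for per-vertex RGB colors, but the palette, the function name `colored_ply`, and the number formatting are assumptions and may differ from what `test_partseg.py` actually writes.

```python
from typing import Sequence, Tuple

# Hypothetical palette: one RGB color per part label (reused cyclically).
PALETTE = [(230, 25, 75), (60, 180, 75), (0, 130, 200), (245, 130, 48)]


def colored_ply(points: Sequence[Tuple[float, float, float]], labels: Sequence[int]) -> str:
    """Build the text of an ASCII PLY file with one colored vertex per point."""
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")

    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    rows = []
    for (x, y, z), label in zip(points, labels):
        r, g, b = PALETTE[label % len(PALETTE)]  # color chosen by part label
        rows.append(f"{x:.6f} {y:.6f} {z:.6f} {r} {g} {b}")
    return "\n".join(header + rows) + "\n"
```

Writing the returned string to a file opened in text mode, for example `open("example_pred.ply", "w")`, produces a file that common point cloud viewers can load.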
Open them in any point cloud viewer. For example, with Open3D:
import open3d as o3d
o3d.visualization.draw_geometries([o3d.io.read_point_cloud("log/partseg/Hengshuang/visual/0_Airplane_pred.ply")])

The evaluation script runs on GPU if available and otherwise falls back to CPU.
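
The class-average and instance-average mIoU figures reported by the evaluation step differ only in how per-shape IoUs are averaged. The sketch below illustrates this in plain Python. It assumes the common ShapeNetPart convention that a part absent from both prediction and ground truth counts as IoU 1; the function names and input layout are hypothetical and not taken from `test_partseg.py`.

```python
from typing import Dict, List, Sequence, Tuple


def shape_iou(pred: Sequence[int], gt: Sequence[int], parts: Sequence[int]) -> float:
    """Mean IoU over the part labels that belong to one shape's category."""
    ious = []
    for part in parts:
        inter = sum(1 for p, g in zip(pred, gt) if p == part and g == part)
        union = sum(1 for p, g in zip(pred, gt) if p == part or g == part)
        # Assumed convention: a part missing from both pred and gt scores 1.
        ious.append(1.0 if union == 0 else inter / union)
    return sum(ious) / len(ious)


def mean_ious(
    shapes: List[Tuple[str, Sequence[int], Sequence[int]]],
    category_parts: Dict[str, Sequence[int]],
) -> Tuple[float, float]:
    """Return (class-average mIoU, instance-average mIoU).

    `shapes` holds (category, predicted labels, ground-truth labels) triples.
    """
    per_category: Dict[str, List[float]] = {}
    for category, pred, gt in shapes:
        iou = shape_iou(pred, gt, category_parts[category])
        per_category.setdefault(category, []).append(iou)

    all_ious = [v for values in per_category.values() for v in values]
    instance_avg = sum(all_ious) / len(all_ious)  # every shape counts equally
    class_avg = sum(sum(v) / len(v) for v in per_category.values()) / len(per_category)
    return class_avg, instance_avg
```

Categories with few test shapes weigh as much as large ones in the class average, so the two numbers can diverge noticeably on an imbalanced benchmark.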
MIT — see LICENSE.
- Training pipeline and data augmentation adapted from Pointnet_Pointnet2_pytorch.
- The Menghao (PCT) implementation is adapted from the author's Jittor version: MenghaoGuo/PCT.