Rethinking Multimodal Point Cloud Completion:
A Completion-by-Correction Perspective


Overview

This repository contains the official implementation for "Rethinking Multimodal Point Cloud Completion: A Completion-by-Correction Perspective" (AAAI 2026), which introduces PGNet — a multimodal point cloud completion framework that shifts from the traditional Completion-by-Inpainting paradigm to a more robust Completion-by-Correction strategy. Instead of synthesizing missing geometry from fused features, PGNet starts with a topologically complete generative prior (via an image-to-3D model) and corrects it using partial point cloud observations. By grounding a complete scaffold with reliable geometric cues, PGNet achieves state-of-the-art performance on ShapeNet-ViPC with significantly improved structural consistency and geometric fidelity.

The main components of this repo include:

  • generate_point_cloud.py: generate high-quality prior point clouds from rendered views using TRELLIS;
  • train.py / train.sh: train multimodal point cloud completion (MMPC) models on ShapeNetViPC;
  • inference.py: perform category-level evaluation on the test set (Chamfer-L2 / F-Score / EMD);
  • utils, metrics, models, extensions: data loading, evaluation metrics, network architectures, and CUDA extensions.

Environment

We test our code on Ubuntu 24.04 LTS with an NVIDIA RTX 4090 GPU.

  • Python 3.10
  • PyTorch 2.4.0 (with CUDA)
  • CUDA 12.1 (via pytorch-cuda=12.1)

Recommended: create from environment.yml

We provide a Conda environment file:

conda env create -f environment.yml
conda activate pgnet
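
After activation, a quick sanity check that the installed versions match the list above:

# environment sanity check; versions should match those listed above
import torch

print(torch.__version__)          # expected 2.4.0
print(torch.version.cuda)         # expected 12.1
print(torch.cuda.is_available())  # expected True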

Build CUDA Extensions

This project depends on several CUDA extensions.
Build and install them as follows, from the repo root (each step runs in a subshell, so the working directory is preserved between blocks):

# PointNet++ operators
(cd extensions/pointnet2_ops_lib && pip install .)

# Vox2Seq operators
(cd extensions/vox2seq && pip install .)

# Chamfer Distance
(cd metrics/chamfer_dist && pip install .)

# EMD
(cd metrics/EMD && pip install .)
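
After building, an import smoke test can confirm the extensions are visible to Python. The module names below are assumptions inferred from the directory names; adjust them if the extensions' setup files use different package names:

# smoke test for the built extensions; module names are assumptions, not verified
import importlib

for name in ["pointnet2_ops", "vox2seq", "chamfer_dist", "emd"]:
    try:
        importlib.import_module(name)
        print(f"{name}: OK")
    except ImportError as err:
        print(f"{name}: FAILED ({err})")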

Data Preparation

We train and evaluate on the ShapeNetViPC dataset.
Assume the dataset root is:

./data/ShapeNetViPC-Dataset
├── ShapeNetViPC-Gen              # generated point clouds from image-to-3D model (.pt)
├── ShapeNetViPC-Partial          # partial point clouds (.dat)
├── ShapeNetViPC-GT               # complete GT point clouds (.dat)
├── ShapeNetViPC-View             # rendered views (png + metadata)
├── train_list.txt                # train split file list
└── test_list.txt                 # test  split file list
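
A minimal check that this layout is in place (assuming the ./data/ShapeNetViPC-Dataset root shown above):

# verify the expected dataset layout
import os

root = "./data/ShapeNetViPC-Dataset"
expected = ["ShapeNetViPC-Gen", "ShapeNetViPC-Partial", "ShapeNetViPC-GT",
            "ShapeNetViPC-View", "train_list.txt", "test_list.txt"]
for entry in expected:
    path = os.path.join(root, entry)
    print(f"{path}: {'found' if os.path.exists(path) else 'MISSING'}")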

ShapeNetViPC-Gen contains the prior point clouds generated in this work by applying an image-to-3D model to the rendered views in ShapeNetViPC-View. We also provide our TRELLIS-generated prior point clouds in the Hugging Face dataset Wang131/ShapeNetViPC-Gen.

ShapeNetViPC-Partial, ShapeNetViPC-GT, ShapeNetViPC-View, as well as train_list.txt and test_list.txt, are taken directly from the official ShapeNetViPC dataset. Please refer to the official ShapeNetViPC repo (Hydrogenion/ViPC) to obtain the original data.

Generate Prior Point Clouds (TRELLIS)

Prior point clouds generated with Microsoft TRELLIS are saved under the following directory structure:

/path/to/ShapeNetViPC-Dataset/ShapeNetViPC-Gen/trellis/<sampling_method>/num_points_<N>/

You can generate them with generate_point_cloud.py:

python generate_point_cloud.py \
  --data_path /path/to/ShapeNetViPC-Dataset \
  --output_dir /path/to/ShapeNetViPC-Dataset/ShapeNetViPC-Gen/trellis \
  --categories plane,chair,table \
  --gpu_ids 0,1 \
  --num_workers_per_gpu 1 \
  --prefetch_size 20 \
  --loader_threads 5 \
  --sampling_threads 10 \
  --num_points 2048 \
  --sampling_method poisson_disk

The script will:

  • automatically load the microsoft/TRELLIS-image-large model (from Hugging Face);
  • iterate over rendered images in ShapeNetViPC-View to generate meshes;
  • sample meshes into point clouds and save them as .pt files, optionally writing visualizations under .output.

Once generation is done, utils/dataloader.PCDataLoader will automatically look for generated point clouds under:

ShapeNetViPC-Gen/trellis/<sampling_method>/num_points_<gen_points>/
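
For orientation, the sketch below composes that directory for the example command above and smoke-tests a few generated files; the contents of the .pt files (e.g. tensor shape) are assumptions, so treat this as illustrative rather than the loader's actual code:

# illustrative: compose the generated-prior directory and smoke-test a few .pt files
import glob
import os

import torch

sampling_method = "poisson_disk"  # assumed config value
gen_points = 2048                 # assumed config value
gen_root = os.path.join("/path/to/ShapeNetViPC-Dataset", "ShapeNetViPC-Gen",
                        "trellis", sampling_method, f"num_points_{gen_points}")

for f in sorted(glob.glob(os.path.join(gen_root, "**", "*.pt"), recursive=True))[:3]:
    prior = torch.load(f, map_location="cpu")
    print(f, getattr(prior, "shape", type(prior)))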

Training

The main training entry is train.py, and we recommend using train.sh for single- or multi-GPU training on a single node.

Single-GPU Training Example

bash train.sh \
  --config configs/ShapeNet-ViPC/PGNet/airplane.yaml \
  --gpu_num 1 \
  --gpu_ids 0

Multi-GPU Training Example (single node)

bash train.sh \
  --config configs/ShapeNet-ViPC/PGNet/airplane.yaml \
  --gpu_num 4 \
  --gpu_ids 0,1,2,3

train.sh will set CUDA_VISIBLE_DEVICES for you, and for multi-GPU it will call:

  • torchrun --standalone --nnodes=1 --nproc_per_node=<GPU_NUM> train.py

Important training-related configs are all in the YAML file, e.g.:

  • training.global_batch_size: logical global batch size;
  • training.gradient_accumulation_steps: gradient accumulation steps;
  • training.max_steps / training.eval_steps: max training steps & eval interval;
  • output.base_path: root directory for all experiment outputs (default output).

GPU memory and gradient accumulation

  • If you run into OOM (out of memory), increase training.gradient_accumulation_steps first.
  • Keep training.global_batch_size unchanged to preserve the same optimization dynamics; changing it alters the effective batch size and can lead to different training results.
  • The per-GPU physical batch size is computed as global_batch_size // (gradient_accumulation_steps * WORLD_SIZE) (see train.py). Here, WORLD_SIZE is the total number of participating GPUs/processes; in our single-node train.sh examples, WORLD_SIZE == --gpu_num. A worked example follows this list.
  • Ensure global_batch_size % (gradient_accumulation_steps * WORLD_SIZE) == 0.
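
A worked example of this arithmetic (illustrative values, mirroring the formula above rather than the repo's defaults):

# batch-size arithmetic from train.py, with illustrative values
global_batch_size = 32            # training.global_batch_size
gradient_accumulation_steps = 4   # training.gradient_accumulation_steps
world_size = 4                    # total GPUs; == --gpu_num on a single node

assert global_batch_size % (gradient_accumulation_steps * world_size) == 0
per_gpu_batch_size = global_batch_size // (gradient_accumulation_steps * world_size)
print(per_gpu_batch_size)  # -> 2 samples per GPU per forward pass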

After training, the default output structure looks like:

output/
  PGNet_plane_poisson_disk_2048_YYYYMMDD_HHMMSS/
    ├── checkpoints/   # latest_step_*.pth, best_step_*.pth
    ├── configs/       # a snapshot of the config used for this run
    └── logs/          # TensorBoard logs

Pretrained Checkpoints

We provide pretrained PGNet checkpoints on Hugging Face:

PGNet checkpoints dataset: Wang131/PGNet_ckpt
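
A downloaded checkpoint can be inspected on CPU before running evaluation; this is a minimal sketch, and the internal key layout is an assumption rather than documented behavior:

# inspect a downloaded checkpoint without a GPU; key names are not guaranteed
import torch

ckpt = torch.load("best_step_xxxx_xxx.pth", map_location="cpu")
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))  # e.g. model weights, optimizer state, step counter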

Evaluation / Inference

inference.py performs category-level evaluation on the full test set and reports three metrics (a reference sketch follows the list):

  • L2 Chamfer Distance (using the fine output);
  • F-Score (threshold = 0.001);
  • Earth Mover’s Distance (EMD).
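
For intuition, here is a reference sketch of the first two metrics in NumPy; the repository computes all three with the CUDA extensions built earlier (the EMD solver is omitted here), so this is illustrative only:

# slow reference implementations of Chamfer-L2 and F-Score
import numpy as np

def chamfer_l2(p, q):
    # p: (N, 3), q: (M, 3); symmetric mean of squared nearest-neighbor distances
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)  # (N, M) pairwise
    return (d.min(axis=1) ** 2).mean() + (d.min(axis=0) ** 2).mean()

def f_score(p, q, threshold=0.001):
    # harmonic mean of precision and recall at the given distance threshold
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
    precision = (d.min(axis=1) < threshold).mean()
    recall = (d.min(axis=0) < threshold).mean()
    return 2 * precision * recall / (precision + recall + 1e-8)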

Example usage:

python inference.py \
  -C configs/ShapeNet-ViPC/PGNet/airplane.yaml \
  -M /path/to/checkpoints/best_step_xxxx_xxx.pth \
  --device cuda:0

The script will iterate over all samples listed in test_list.txt, compute per-sample metrics, and print the averaged results.

Citation

If you find this repository or our paper helpful for your research, please consider citing us:

@article{luo2025rethinking,
  title={Rethinking Multimodal Point Cloud Completion: A Completion-by-Correction Perspective},
  author={Luo, Wang and Wu, Di and Na, Hengyuan and Zhu, Yinlin and Hu, Miao and Quan, Guocong},
  journal={arXiv preprint arXiv:2511.12170},
  year={2025}
}

License

This project is licensed under the Apache License, Version 2.0.

  • Copyright (c) 2025 SYSU/Wang Luo
  • See LICENSE for the full license text and NOTICE for attributions.

Acknowledgements

This project builds on code from several open-source projects, including:

  • Microsoft TRELLIS (the image-to-3D model used to generate prior point clouds);
  • PointNet++ and related CUDA extensions.

If you encounter issues or find bugs, feel free to open an Issue or submit a Pull Request.
