Uni-3D: A Universal Model for Panoptic 3D Scene Reconstruction

Xiang Zhang*, Zeyuan Chen*, Fangyin Wei, and Zhuowen Tu (*Equal contribution)

This is the repository for the paper Uni-3D: A Universal Model for Panoptic 3D Scene Reconstruction (ICCV 2023).

[Paper]

Getting Started

Environment Setup

(Recommended) Docker Image

All dependencies are pre-packaged in a Docker image, which is built on top of PyTorch 2.1.2 with CUDA 11.8.

Pull the image with:

docker pull zx1239856/uni-3d:0.1.0

Manual Approach

This approach assumes you already have a working installation of PyTorch (>=1.10.1) and CUDA (>=11.3).
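Before installing anything else, it can help to confirm that the installed versions meet these minimums. The following is a minimal sketch, not part of this repository: the helper names `parse_version` and `meets_minimum` are hypothetical, and the version strings are examples. In practice they would come from `torch.__version__` and `torch.version.cuda`.

```python
import re


def parse_version(version: str) -> tuple:
    """Turn a string such as '2.1.2+cu118' into (2, 1, 2), dropping any suffix."""
    numeric = re.match(r"\d+(?:\.\d+)*", version)
    if numeric is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in numeric.group(0).split("."))


def meets_minimum(version: str, minimum: str) -> bool:
    """Compare versions numerically, so that 1.10.1 ranks above 1.9.0."""
    have, need = parse_version(version), parse_version(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that '11.3' and '11.3.0' compare as equal.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


if __name__ == "__main__":
    # Example strings only; substitute torch.__version__ and torch.version.cuda.
    print("PyTorch OK:", meets_minimum("2.1.2+cu118", "1.10.1"))
    print("CUDA OK:", meets_minimum("11.8", "11.3"))
```

Comparing numeric tuples rather than raw strings matters here because a plain string comparison would rank "1.9.0" above "1.10.1".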

Install the following system dependencies:

apt-get install ninja-build libopenblas-dev libopenexr-dev

Uncomment line 9 of requirements.txt, then install the required Python packages via

pip install -r requirements.txt

Dataset Preparation

3D-FRONT

Please download 3D-FRONT from Dahnert et al. (Panoptic 3D Scene Reconstruction from a Single RGB Image) and extract it under datasets/front3d/data:

unzip front3d.zip -d datasets/front3d/data

Matterport3D

Please request the dataset from the authors of Pano-Re. Extract it under datasets/matterport/data.

Also download the room masks and depth maps from BUOL. Extract them under datasets/matterport/room_mask and datasets/matterport/depth_gen, respectively.

Folder Structure

matterport/
    meta/
        train_3d.json                                         # Training set metadata
        ...
    data/
        <scene_id>/            
            ├── <image_id>_i<frame_id>.png                    # Color image: 320x240x3
            ├── <image_id>_segmap<frame_id>.mapped.npz        # 2D Segmentation: 320x240x2, with 0: pre-mapped semantics, 1: instances
            ├── <image_id>_intrinsics_<camera_id>.png         # Intrinsics matrix: 4x4
            ├── <image_id>_geometry<frame_id>.npz             # 3D Geometry: 256x256x256x1, truncated (unsigned) distance field at 3cm voxel resolution with 12-voxel truncation
            ├── <image_id>_segmentation<frame_id>.mapped.npz  # 3D Segmentation: 256x256x256x2, with 0: pre-mapped semantics & instances
            ├── <image_id>_weighting<frame_id>.npz            # 3D Weighting mask: 256x256x256x1
    depth_gen/
        <scene_id>/     
            ├── <position_id>_d<frame_id>.png                 # Depth image: 320x240x1
    room_mask/
        <scene_id>/   
            ├── <position_id>_rm<frame_id>.png                # Room mask: 320x240x1
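
The naming scheme above can be turned into a quick completeness check for a single sample. This is a sketch only and not part of the repository: `sample_files` and `missing_files` are hypothetical helpers, the ids in the example are placeholders, and it assumes that the depth and room-mask files use the same id prefix as the color image.

```python
from pathlib import Path


def sample_files(root, scene_id, image_id, frame_id, camera_id):
    """Return the files expected for one sample, keyed by role."""
    root = Path(root)
    scene = root / "data" / scene_id
    return {
        "color": scene / f"{image_id}_i{frame_id}.png",
        "segmap_2d": scene / f"{image_id}_segmap{frame_id}.mapped.npz",
        "intrinsics": scene / f"{image_id}_intrinsics_{camera_id}.png",
        "geometry": scene / f"{image_id}_geometry{frame_id}.npz",
        "segmentation_3d": scene / f"{image_id}_segmentation{frame_id}.mapped.npz",
        "weighting": scene / f"{image_id}_weighting{frame_id}.npz",
        # Assumption: depth and room-mask files reuse the image id as prefix.
        "depth": root / "depth_gen" / scene_id / f"{image_id}_d{frame_id}.png",
        "room_mask": root / "room_mask" / scene_id / f"{image_id}_rm{frame_id}.png",
    }


def missing_files(root, scene_id, image_id, frame_id, camera_id):
    """List the roles whose files are absent on disk."""
    files = sample_files(root, scene_id, image_id, frame_id, camera_id)
    return sorted(role for role, path in files.items() if not path.exists())


if __name__ == "__main__":
    # Placeholder ids; replace them with values from meta/train_3d.json.
    print(missing_files("datasets/matterport", "scene0", "img0", "0", "1"))
```

For reference, at 3 cm per voxel the 12-voxel truncation corresponds to 0.36 m, and the 256-voxel grid spans 7.68 m along each axis.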

Pre-trained Weights

Model	PRQ	RSQ	RRQ	Download
3D-FRONT Pretrained 2D	--	--	--	front3d_dps_160k.pth
3D-FRONT Single-scale	52.51	60.89	83.97	front3d_full_single_scale.pth
3D-FRONT Multi-scale	53.53	61.69	84.69	front3d_full_multi_scale.pth
Matterport Pretrained 2D	--	--	--	matterport_dps_120k.pth
Matterport Single-scale	16.58	44.26	36.68	matterport_full_single_scale.pth

Run

If you are using Docker, you can define the following prefix for convenience.

export DOCKER_PREFIX="docker run -it --gpus all --shm-size 128G -v "$(pwd)":/workspace zx1239856/uni-3d:0.1.0"
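
When launching from a script instead of an interactive shell, the same prefix can be built as an argument list. The sketch below is an illustration under assumptions, not part of the repository: `docker_prefix` is a hypothetical helper, and it only reuses the flags shown in the export line above.

```python
import shlex
from pathlib import Path

IMAGE = "zx1239856/uni-3d:0.1.0"


def docker_prefix(workdir=None, image=IMAGE, shm_size="128G"):
    """Build the `docker run` prefix as a list of arguments."""
    workdir = Path(workdir) if workdir is not None else Path.cwd()
    return [
        "docker", "run", "-it",
        "--gpus", "all",
        "--shm-size", shm_size,
        # Mount the working directory at /workspace inside the container.
        "-v", f"{workdir}:/workspace",
        image,
    ]


if __name__ == "__main__":
    # shlex.join quotes any argument containing spaces or shell metacharacters.
    print(shlex.join(docker_prefix()))
```

One reason to prefer a list: an unquoted `$DOCKER_PREFIX` is split on whitespace by the shell, so a working directory whose path contains spaces would be broken into several arguments. A list passed to `subprocess.run` keeps each argument intact.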

Training 2D (Panoptic Segmentation/Depth) Model

$DOCKER_PREFIX OMP_NUM_THREADS=16 torchrun --nproc_per_node=8 train_net.py --config-file configs/front3d/mask2former_R50_bs16_160k.yaml OUTPUT_DIR <path-to-output-dir>

Training 3D Reconstruction Model

$DOCKER_PREFIX OMP_NUM_THREADS=16 torchrun --nproc_per_node=8 train_net.py --config-file configs/front3d/uni_3d_R50.yaml MODEL.WEIGHTS <path-to-pretrained-2d-model> OUTPUT_DIR <path-to-output-dir>

Use uni_3d_R50_ms.yaml for multi-scale feature reprojection.

Please adjust --nproc_per_node, OMP_NUM_THREADS and SOLVER.IMS_PER_BATCH based on your environment.

Evaluate

To evaluate a model, add the --eval-only flag to the training commands above.
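
The training and evaluation invocations above differ only in a few arguments, so they can be assembled by one small function. This is a hedged sketch, not the repository's own tooling: `train_command` is a hypothetical helper, and it rests on two assumptions. First, that train_net.py follows the Detectron2/Mask2Former convention of collecting trailing KEY VALUE overrides last, which is why the flag is placed before them. Second, that SOLVER.IMS_PER_BATCH is the total batch size and is split evenly across processes.

```python
def train_command(config_file, output_dir, num_gpus=8, omp_threads=16,
                  ims_per_batch=None, weights=None, eval_only=False):
    """Return (env, argv) for a torchrun invocation of train_net.py."""
    if ims_per_batch is not None and ims_per_batch % num_gpus != 0:
        # Assumption: the total batch is divided evenly among the processes.
        raise ValueError(
            "SOLVER.IMS_PER_BATCH should be divisible by the number of GPUs"
        )
    env = {"OMP_NUM_THREADS": str(omp_threads)}
    argv = [
        "torchrun", f"--nproc_per_node={num_gpus}", "train_net.py",
        "--config-file", config_file,
    ]
    if eval_only:
        # Flags go before the KEY VALUE overrides (see the assumption above).
        argv.append("--eval-only")
    if weights is not None:
        argv += ["MODEL.WEIGHTS", weights]
    if ims_per_batch is not None:
        argv += ["SOLVER.IMS_PER_BATCH", str(ims_per_batch)]
    argv += ["OUTPUT_DIR", output_dir]
    return env, argv


if __name__ == "__main__":
    # Example: evaluate on 2 GPUs; the paths are placeholders.
    env, argv = train_command(
        "configs/front3d/uni_3d_R50.yaml", "output/eval",
        num_gpus=2, ims_per_batch=4,
        weights="front3d_full_single_scale.pth", eval_only=True,
    )
    print(env, " ".join(argv))
```

To run the result, merge `env` into a copy of `os.environ` and pass `argv` to `subprocess.run`.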

Demo

To generate meshes for visualizing 3D-FRONT images, run the following command.

python demo_front3d.py -i <path-to-3d-front-image> -o <path-to-output-dir> -m <path-to-pretrained-model>
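
To process a whole folder of images, the command above can be generated once per file. The sketch below is an assumption-laden illustration rather than a feature of the repository: `demo_commands` is a hypothetical helper, the `*.png` pattern may need narrowing if the folder also holds depth or mask images, and writing each result to its own subfolder is a precaution, since the demo's overwrite behavior is not documented here.

```python
import sys
from pathlib import Path


def demo_commands(image_dir, output_root, model_path, pattern="*.png"):
    """Build one demo_front3d.py invocation per matching image."""
    commands = []
    for image in sorted(Path(image_dir).glob(pattern)):
        # Assumption: one output folder per image avoids overwriting results.
        out_dir = Path(output_root) / image.stem
        commands.append([
            sys.executable, "demo_front3d.py",  # same interpreter as this script
            "-i", str(image),
            "-o", str(out_dir),
            "-m", str(model_path),
        ])
    return commands


if __name__ == "__main__":
    # Placeholder paths; print the commands instead of running them.
    for command in demo_commands("images", "meshes", "front3d_full_single_scale.pth"):
        print(" ".join(command))
```

Each returned list can be passed to `subprocess.run(command, check=True)` to execute the demo for that image.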

Citation

Please consider citing Uni-3D if you find the work helpful.

@InProceedings{Zhang_2023_ICCV,
    author    = {Zhang, Xiang and Chen, Zeyuan and Wei, Fangyin and Tu, Zhuowen},
    title     = {Uni-3D: A Universal Model for Panoptic 3D Scene Reconstruction},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {9256-9266}
}

License

This repository is released under the Apache License 2.0. The full license text can be found in the LICENSE file.

Acknowledgement

Mask2Former for the framework.
panoptic-reconstruction for the pre-processed 3D-FRONT and Matterport datasets, and the evaluation code.
BUOL for the generated depth and room masks on the Matterport dataset.
