VAST-AI-Research/TriplaneGaussian

Triplane Meets Gaussian Splatting:
Fast and Generalizable Single-View 3D Reconstruction with Transformers

TGS enables fast reconstruction from single-view images in a few seconds based on a hybrid Triplane-Gaussian 3D representation.


Official implementation of Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers.

⭐️ Key Features

  • A new hybrid Triplane-Gaussian 3D representation that leverages both explicit and implicit representations.
  • High-quality 3D reconstruction from single-view images within a second.

🚩 News

  • [01/17/2024] We release the inference code and a pretrained model.
  • [01/09/2024] We release a Gradio demo on HuggingFace Spaces.

💻 Examples

Please try our model online in the Gradio demo on Hugging Face Spaces.

[Video: TGS-gradio-demo.mp4]

Results on Images Generated by Midjourney

[Video: TGS_MJ_results.mov]

Results on Captured Real-world Images

[Video: TGS_real_results.mov]

🏁 Quick Start

Colab Demo

Run TGS in Google Colab: Open In Colab

Installation

  • Python >= 3.8
  • Install PyTorch >= 1.12. We have tested with torch 1.12.1+cu113, but other versions should also work.
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
  • Install pointnet2_ops
cd tgs/models/snowflake/pointnet2_ops_lib && python setup.py install && cd -
  • Install pytorch_scatter
pip install git+https://github.com/rusty1s/pytorch_scatter.git
  • Install diff-gaussian-rasterization
pip install git+https://github.com/graphdeco-inria/diff-gaussian-rasterization.git
  • Install dependencies:
pip install -r requirements.txt
  • Install PyTorch3D following its official installation instructions.
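
After installation, a quick import check can confirm that the packages above are visible to the Python interpreter. The sketch below is not part of this repository: `check_modules` is a hypothetical helper, and the import names (`pointnet2_ops`, `torch_scatter`, `diff_gaussian_rasterization`, `pytorch3d`) are assumed to be the module names of the packages listed above.

```python
import importlib.util

# Import names assumed to correspond to the packages installed above.
REQUIRED_MODULES = [
    "torch",
    "torchvision",
    "pointnet2_ops",
    "torch_scatter",
    "diff_gaussian_rasterization",
    "pytorch3d",
]


def check_modules(names):
    """Return {module_name: bool} telling whether each module can be located."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat modules with broken or missing specs as not installed.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, found in check_modules(REQUIRED_MODULES).items():
        print(f"{name:32s} {'ok' if found else 'MISSING'}")
```

This only checks that each module can be found; it does not verify that the CUDA extensions were built against the installed PyTorch version.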

Download the Pretrained Model

A pretrained checkpoint is available on Hugging Face. Download it and place it in the checkpoints folder:

from huggingface_hub import hf_hub_download
MODEL_CKPT_PATH = hf_hub_download(repo_id="VAST-AI/TriplaneGaussian", local_dir="./checkpoints", filename="model_lvis_rel.ckpt", repo_type="model")

Please note that this model is trained only on the Objaverse-LVIS dataset (~45K 3D models). Models with more parameters (e.g., deeper layers, more feature channels) trained on larger datasets (e.g., the full Objaverse dataset) should achieve stronger performance, and we will explore this in the future.

Inference

Use the following command to reconstruct a 3DGS model from a single image. Set data.image_list to the list of image paths you want to process.

python infer.py --config config.yaml data.image_list=[path/to/image1,] --image_preprocess --cam_dist ${cam_dist}
# e.g. python infer.py --config config.yaml data.image_list=[example_images/a_pikachu_with_smily_face.webp,] --image_preprocess

If you wish to remove the background from the input image, add the --image_preprocess argument to the command. Before doing so, please download the SAM checkpoint and place it in the checkpoints folder as well.

--cam_dist sets the camera distance, i.e., the distance between the camera center and the scene center. It defaults to 1.9.
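
When calling infer.py from another script, it can help to assemble the arguments programmatically. The following is a minimal sketch that uses only the flags shown above; `build_infer_command` is a hypothetical helper, not part of this repository, and the comma-separated form for several images (`[a,b,]`) is an assumption extrapolated from the single-image example.

```python
import subprocess
import sys


def build_infer_command(image_paths, cam_dist=1.9, preprocess=False, config="config.yaml"):
    """Assemble the argument list for infer.py using only the flags shown above."""
    if not image_paths:
        raise ValueError("at least one image path is required")
    # Same bracketed form as the example command, trailing comma included.
    image_list = "[" + ",".join(image_paths) + ",]"
    cmd = [sys.executable, "infer.py", "--config", config, f"data.image_list={image_list}"]
    if preprocess:
        # Background removal; requires the SAM checkpoint in checkpoints/.
        cmd.append("--image_preprocess")
    cmd += ["--cam_dist", str(cam_dist)]
    return cmd


if __name__ == "__main__":
    cmd = build_infer_command(
        ["example_images/a_pikachu_with_smily_face.webp"], preprocess=True
    )
    print(" ".join(cmd))
    # Uncomment to launch from the repository root:
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list to `subprocess.run` avoids shell interpretation of the square brackets.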

Finally, the script saves a video (.mp4) and a 3DGS (.ply) file. The .ply format is consistent with graphdeco-inria/gaussian-splatting, making it compatible with other visualization tools such as gsplat.js.
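
To sanity-check an exported .ply file without extra dependencies, you can read its header. The sketch below is a minimal, standard-library-only PLY header reader; `read_ply_header` is a hypothetical helper, not part of this repository. It assumes the usual layout of gaussian-splatting files, where each `vertex` element is one Gaussian and the per-vertex properties typically include position, opacity, scale, and rotation fields.

```python
def read_ply_header(path):
    """Return (format, vertex_count, vertex_property_names) from a PLY header."""
    fmt, vertex_count, properties = None, 0, []
    current_element = None
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} is not a PLY file")
        for raw in f:
            tokens = raw.decode("ascii", errors="replace").split()
            if not tokens:
                continue
            if tokens[0] == "end_header":
                break  # binary or ASCII payload starts after this line
            if tokens[0] == "format":
                fmt = " ".join(tokens[1:])
            elif tokens[0] == "element":
                current_element = tokens[1]
                if current_element == "vertex":
                    vertex_count = int(tokens[2])
            elif tokens[0] == "property" and current_element == "vertex":
                properties.append(tokens[-1])  # last token is the property name
    return fmt, vertex_count, properties
```

For example, `fmt, n, props = read_ply_header("path/to/output.ply")` (the path is a placeholder) gives the number of Gaussians in `n` and the stored attribute names in `props`.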

Local Gradio Demo

Our Gradio demo depends on a custom Gradio component for 3DGS rendering. Please clone this component first:

git clone https://github.com/dylanebert/gradio-splatting.git gradio_splatting

Then launch the Gradio demo locally:

python gradio_app.py

📝 Some Tips

  • If you find the result unsatisfactory, try changing the camera distance parameter. For example, if the reconstructed 3D model appears "flattened", consider increasing the camera distance, e.g., set --cam_dist 2.1. Conversely, if the 3D model appears too thick, decrease it. This may improve the results.
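
One way to apply this tip is to try a few camera distances around the default and compare the results. The sketch below only generates the commands. `cam_dist_candidates` and `sweep_commands` are hypothetical helpers, and the step size of 0.2 is an arbitrary choice that matches the 1.9 to 2.1 example above.

```python
def cam_dist_candidates(default=1.9, step=0.2, n_each_side=1):
    """Return camera distances centered on the default, rounded to avoid float noise."""
    return [round(default + k * step, 3) for k in range(-n_each_side, n_each_side + 1)]


def sweep_commands(image_path, distances):
    """Build one infer.py command per distance, using only flags shown in this README."""
    return [
        f"python infer.py --config config.yaml data.image_list=[{image_path},] --cam_dist {d}"
        for d in distances
    ]


if __name__ == "__main__":
    image = "example_images/a_pikachu_with_smily_face.webp"
    for command in sweep_commands(image, cam_dist_candidates()):
        print(command)
```

This README does not say whether successive runs overwrite each other's outputs, so check where infer.py writes its results before running the commands back to back.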

Acknowledgements

  • This project is supported by Tsinghua University and VAST.
  • We would like to thank @totoro97 for helpful discussions.
  • Our point cloud upsampling module is modified from SnowflakeNet.

Citation

If you find this work helpful, please consider citing our paper:

@article{zou2023triplane,
  title={Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers},
  author={Zou, Zi-Xin and Yu, Zhipeng and Guo, Yuan-Chen and Li, Yangguang and Liang, Ding and Cao, Yan-Pei and Zhang, Song-Hai},
  journal={arXiv preprint arXiv:2312.09147},
  year={2023}
}