
V3D: Video Diffusion Models are Effective 3D Generators

Zilong Chen¹·², Yikai Wang¹, Feng Wang¹, Zhengyi Wang¹·², Huaping Liu¹

¹Tsinghua University, ²ShengShu

This repository contains the official implementation of V3D: Video Diffusion Models are Effective 3D Generators.

What's New

[2024.3.14] Our demo is now available [here](https://huggingface.co/spaces/heheyas/V3D). We will add more checkpoints and examples soon.

[Work in Progress]

We are working on making everything publicly available (refactoring code, uploading weights, etc.); please be patient.

Video results

Single Image to 3D

Generated Multi-views

000413.mp4, 000183.mp4

Reconstructed 3D Gaussian Splats

1a47fe68-a.mp4, 2b391cd1-2.mp4, 5f24e598-7.mp4, 182a7b56-9.mp4, d8181e41-e.mp4, e4c98179-6.mp4

Sparse view scene generation (on the CO3D hydrant category)

hydrant_1.mp4, hydrant_2.mp4, hydrant_3.mp4, hydrant_4.mp4, hydrant_5.mp4

Instructions:

1. Install the requirements:

   ```bash
   pip install -r requirements.txt
   ```

2. Download the V3D weights and the Stable Video Diffusion base weights (a Python alternative using `huggingface_hub` is sketched after this list):

   ```bash
   wget https://huggingface.co/heheyas/V3D/resolve/main/V3D.ckpt -O ckpts/V3D_512.ckpt
   wget https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt/resolve/main/svd_xt.safetensors -O ckpts/svd_xt.safetensors
   ```

3. Run the V3D video diffusion model to generate dense multi-views:

   ```bash
   PYTHONPATH="." python scripts/pub/V3D_512.py --input_path <image file or dir> --save --border_ratio 0.3 --min_guidance_scale 4.5 --max_guidance_scale 4.5 --output-folder <output-dest>
   ```

4. Reconstruct 3D assets from the generated multi-views using 3D Gaussian Splatting:

   ```bash
   PYTHONPATH="." python recon/train_from_vid.py -w --sh_degree 0 --iterations 4000 --lambda_dssim 1.0 --lambda_lpips 2.0 --save_iterations 4000 --num_pts 100_000 --video <your generated video>
   ```
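As mentioned in step 2, the two checkpoints can also be fetched programmatically. Below is a minimal sketch using the `huggingface_hub` client; the repo IDs and filenames are taken from the wget URLs above, and the rename simply mirrors the `-O ckpts/V3D_512.ckpt` flag:

```python
# Sketch: fetch both checkpoints with huggingface_hub instead of wget.
# Repo IDs and filenames come from the wget URLs in step 2.
from pathlib import Path

from huggingface_hub import hf_hub_download

ckpt_dir = Path("ckpts")
ckpt_dir.mkdir(exist_ok=True)

# V3D checkpoint; renamed to match what the scripts expect (V3D_512.ckpt).
v3d_path = hf_hub_download(repo_id="heheyas/V3D", filename="V3D.ckpt", local_dir=ckpt_dir)
Path(v3d_path).rename(ckpt_dir / "V3D_512.ckpt")

# Stable Video Diffusion img2vid-xt base weights.
hf_hub_download(
    repo_id="stabilityai/stable-video-diffusion-img2vid-xt",
    filename="svd_xt.safetensors",
    local_dir=ckpt_dir,
)
```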

Alternatively, reconstruct a mesh with NeuS via instant-nsr-pl (instead of 3D Gaussian Splatting in step 4):

```bash
cd mesh_recon
PYTHONPATH="." python launch.py --config configs/videonvs.yaml --gpu <gpu> --train system.loss.lambda_normal=0.1 dataset.scene=<scene_name> dataset.root_dir=<output_dir> dataset.img_wh='[512, 512]'
```
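The NeuS route consumes the generated views as individual images rather than a video. A hedged sketch for splitting a generated .mp4 into per-frame PNGs with `imageio` follows; the `dataset_root/<scene_name>/images` layout is an assumption, so check `configs/videonvs.yaml` for the layout it actually expects:

```python
# Sketch: split a generated multi-view video into per-frame PNGs.
# The dataset_root/<scene_name>/images layout is an assumption, not
# the confirmed layout read by configs/videonvs.yaml.
from pathlib import Path

import imageio.v2 as imageio  # reading .mp4 also needs imageio-ffmpeg

video = Path("outputs/000413.mp4")           # your generated video
out_dir = Path("dataset_root/scene/images")  # hypothetical layout
out_dir.mkdir(parents=True, exist_ok=True)

reader = imageio.get_reader(video)
for i, frame in enumerate(reader):
    imageio.imwrite(out_dir / f"{i:03d}.png", frame)
reader.close()
```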

Refine the texture:

```bash
python refine.py --mesh <your obj mesh file> --scene <your video> --num-opt 16 --lpips 1.0 --iters 500
```
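If you want to run the whole single-image-to-3D pipeline in one go, the commands above can be chained from a small Python driver. This is a sketch, not an official script: every concrete path here (`inputs/chair.png`, `outputs/`, `mesh.obj`, the generated video name) is an illustrative assumption.

```python
# Sketch: chain the pipeline steps by shelling out to the commands
# above. All concrete paths are illustrative assumptions.
import os
import subprocess

env = {**os.environ, "PYTHONPATH": "."}

def run(*cmd: str) -> None:
    """Run one pipeline step, failing loudly on a non-zero exit."""
    subprocess.run(list(cmd), check=True, env=env)

# 1. Generate dense multi-views from a single image.
run("python", "scripts/pub/V3D_512.py",
    "--input_path", "inputs/chair.png", "--save",
    "--border_ratio", "0.3",
    "--min_guidance_scale", "4.5", "--max_guidance_scale", "4.5",
    "--output-folder", "outputs")

# 2. Reconstruct 3D Gaussian splats from the generated video
#    (the video filename is an assumption; use your actual output).
run("python", "recon/train_from_vid.py", "-w",
    "--sh_degree", "0", "--iterations", "4000",
    "--lambda_dssim", "1.0", "--lambda_lpips", "2.0",
    "--save_iterations", "4000", "--num_pts", "100_000",
    "--video", "outputs/chair.mp4")

# 3. Optionally refine the texture of a mesh from the NeuS route.
run("python", "refine.py",
    "--mesh", "mesh.obj", "--scene", "outputs/chair.mp4",
    "--num-opt", "16", "--lpips", "1.0", "--iters", "500")
```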

Acknowledgement

This code base is built upon the following awesome open-source projects:

- Stable Video Diffusion
- 3D Gaussian Splatting
- instant-nsr-pl (NeuS)

Thanks to the authors for their remarkable work!
