init code
justimyhxu committed Apr 2, 2024
0 parents commit 4cd9c39
Showing 1,755 changed files with 464,618 additions and 0 deletions.
107 changes: 107 additions & 0 deletions Readme.md
@@ -0,0 +1,107 @@
# GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation



> **GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation** <br>
> Yinghao Xu*, Zifan Shi*, Wang Yifan, Hansheng Chen, Ceyuan Yang, Sida Peng, Yujun Shen, Gordon Wetzstein<br>
## [[Paper](https://arxiv.org/abs/2403.14621)] [[Project Page](https://justimyhxu.github.io/projects/grm)] [[Blender Demo](https://github.com/justimyhxu/GRM/assets/29980330/0cf713aa-ba87-4a15-a8ee-1b0da643cb3c)] [[HF Demo](https://huggingface.co/spaces/GRM-demo/GRM)] [[Weights](https://huggingface.co/justimyhxu/GRM/tree/main)]

https://github.com/justimyhxu/GRM/assets/29980330/32f41f04-5ebe-4aa4-b1b7-bf4f78e5f197

## Todo List
- [x] Release gradio demo code.
- [x] Release inference code.
- [x] Release pretrained models.
- [ ] Release training code.

## GRM Demo
* [Huggingface Demo](https://huggingface.co/spaces/GRM-demo/GRM)
* [Replicate Demo](https://replicate.com/camenduru/grm). Thanks to [@camenduru](https://github.com/camenduru) for the [Jupyter code](https://github.com/camenduru/GRM-jupyter)!

## Requirements
* 64-bit Python 3.10 and PyTorch 2.0.1 or higher.
* CUDA 11.8
* Install the packages with the following commands:
```bash
conda create -n grm python=3.10
conda activate grm
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu118
cd third_party/diff-gaussian-rasterization && pip install -e .
```
## Pretrained weights
Pretrained weights can be downloaded from [Hugging Face](https://huggingface.co/justimyhxu/GRM/tree/main).
```bash
# Example
mkdir checkpoints && cd checkpoints
wget https://huggingface.co/justimyhxu/GRM/resolve/main/grm_u.pth && cd ..
```
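Hugging Face serves a rendered file page under `/blob/` and the raw file under `/resolve/`, so a download tool such as `wget` needs the `/resolve/` form. The following is a minimal sketch of that rewrite; the helper name `to_resolve_url` is hypothetical and not part of this repository.

```python
from urllib.parse import urlsplit, urlunsplit


def to_resolve_url(url):
    """Turn a Hugging Face file-page URL (/blob/) into a direct-download URL (/resolve/).

    Hypothetical helper for illustration; it assumes the path layout
    /<user>/<repo>/blob/<revision>/<file...> used by the links in this README.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # segments[0] is the empty string before the leading slash.
    if len(segments) > 3 and segments[3] == "blob":
        segments[3] = "resolve"
    return urlunsplit(parts._replace(path="/".join(segments)))


print(to_resolve_url("https://huggingface.co/justimyhxu/GRM/blob/main/grm_u.pth"))
```

URLs that already use `/resolve/` pass through unchanged.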

We provide three GRM checkpoints, plus a reproduced Instant3D checkpoint for multi-view image generation. We use the OpenCV coordinate system.

| Checkpoint | Training settings |
| ---------- | ----------------- |
| [grm_u.pth](https://huggingface.co/justimyhxu/GRM/blob/main/grm_u.pth) | All views have an elevation of 20 degrees, and the azimuths uniformly cover the full 360 degrees. |
| [grm_r.pth](https://huggingface.co/justimyhxu/GRM/blob/main/grm_r.pth) | The azimuths roughly cover the full 360 degrees. |
| [grm_zero123plus.pth](https://huggingface.co/justimyhxu/GRM/blob/main/grm_zero123plus.pth) | Three views have an elevation of 30 degrees, with azimuths evenly spaced at 120-degree intervals. A fourth view has an elevation of -20 degrees and an azimuth offset by 60 degrees from one of the three. |
| [instant3d.pth](https://huggingface.co/justimyhxu/GRM/resolve/main/instant3d.pth) | Our reproduction of the first-stage diffusion model of [Instant3D](https://arxiv.org/pdf/2311.06214.pdf), which produces consistent multi-view images. |
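
To make the camera settings in the table concrete, here is a rough sketch of camera-to-world matrices in the OpenCV convention (x right, y down, z forward) for a `grm_u`-style ring of views. The number of views, the camera radius, the z-up world axis, and the function names are all assumptions made for illustration; none of them are taken from this repository.

```python
import numpy as np


def opencv_look_at(cam_pos, target=(0.0, 0.0, 0.0), world_up=(0.0, 0.0, 1.0)):
    """Camera-to-world matrix with OpenCV axes: +x right, +y down, +z forward."""
    cam_pos = np.asarray(cam_pos, dtype=np.float64)
    forward = np.asarray(target, dtype=np.float64) - cam_pos
    forward /= np.linalg.norm(forward)          # +z looks at the target
    right = np.cross(forward, np.asarray(world_up, dtype=np.float64))
    right /= np.linalg.norm(right)              # +x
    down = np.cross(forward, right)             # +y, completes a right-handed frame
    c2w = np.eye(4)
    c2w[:3, 0] = right
    c2w[:3, 1] = down
    c2w[:3, 2] = forward
    c2w[:3, 3] = cam_pos
    return c2w


def ring_of_cameras(num_views=4, elevation_deg=20.0, radius=2.0):
    """Cameras at a fixed elevation with azimuths spread uniformly over 360 degrees."""
    elev = np.deg2rad(elevation_deg)
    azimuths = np.deg2rad(np.arange(num_views) * 360.0 / num_views)
    positions = radius * np.stack([
        np.cos(elev) * np.cos(azimuths),
        np.cos(elev) * np.sin(azimuths),
        np.full(num_views, np.sin(elev)),
    ], axis=-1)
    return np.stack([opencv_look_at(p) for p in positions])


c2ws = ring_of_cameras()  # shape: (num_views, 4, 4)
```

Poses exported from an OpenGL-style tool (y up, z backward) differ from this convention by a sign flip of the camera's y and z axes.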


You also need to download the checkpoint for [SV3D](https://huggingface.co/stabilityai/sv3d/tree/main).
```bash
cd checkpoints
wget https://huggingface.co/stabilityai/sv3d/resolve/main/sv3d_p.safetensors && cd ..
```


## Inference
```bash
# text-to-3D
python test.py --prompt 'a car made out of cheese'
# image-to-3D with zero123plus-v1.1
python test.py --image_path examples/dragon2.png --model zero123plus-v1.1
# image-to-3D with zero123plus-v1.2
python test.py --image_path examples/dragon2.png --model zero123plus-v1.2
# image-to-3D with SV3D
python test.py --image_path examples/dragon2.png --model sv3d
```

Add ```--fuse_mesh True``` to extract a textured mesh.
Add ```--optimize_texture True``` to optimize the texture of the extracted mesh.
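
When scripting several runs, the commands above can be assembled programmatically. The sketch below only builds the argument list from the flags shown in this README (`--prompt`, `--image_path`, `--model`, `--fuse_mesh`, `--optimize_texture`); the helper name `build_test_command` is hypothetical, and whether `test.py` accepts other flag combinations is not verified here.

```python
import shlex


def build_test_command(prompt=None, image_path=None, model=None,
                       fuse_mesh=False, optimize_texture=False):
    """Build the argv list for test.py from the flags documented above."""
    if (prompt is None) == (image_path is None):
        raise ValueError("pass exactly one of prompt or image_path")
    cmd = ["python", "test.py"]
    if prompt is not None:
        cmd += ["--prompt", prompt]
    else:
        cmd += ["--image_path", image_path]
    if model is not None:
        cmd += ["--model", model]
    if fuse_mesh:
        cmd += ["--fuse_mesh", "True"]
    if optimize_texture:
        cmd += ["--optimize_texture", "True"]
    return cmd


cmd = build_test_command(image_path="examples/dragon2.png", model="sv3d", fuse_mesh=True)
print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to execute it
```

Passing the list form to `subprocess.run` avoids shell quoting problems with prompts that contain spaces or quotes.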

## Gradio Demo
We provide an offline gradio demo, which can be run with the following command:
```bash
python app.py
```

## Results

### Blender Demo
https://github.com/justimyhxu/GRM/assets/29980330/0cf713aa-ba87-4a15-a8ee-1b0da643cb3c

### Sparse-view Reconstruction
https://github.com/justimyhxu/GRM/assets/29980330/d436bca9-ddf9-4507-aed3-828fd6508ec3


## Acknowledgement
We thank the authors of the following amazing codebases:
- [gaussian-splatting](https://github.com/graphdeco-inria/gaussian-splatting), and [diff-gaussian-rasterization](https://github.com/ashawkey/diff-gaussian-rasterization) for depth rendering
- [ARF](https://github.com/Kai-46/ARF-svox2)
- [zero123++](https://github.com/SUDO-AI-3D/zero123plus)
- [Instant3D](https://instant-3d.github.io/)
- [SV3D](https://github.com/Stability-AI/generative-models)
- [V3D](https://github.com/heheyas/V3D)
- [nvdiffrast](https://github.com/NVlabs/nvdiffrast)
- [MVEdit](https://github.com/Lakonik/MVEdit)

## BibTeX

```bibtex
@article{xu2024grm,
author = {Xu, Yinghao and Shi, Zifan and Yifan, Wang and Chen, Hansheng and Yang, Ceyuan and Peng, Sida and Shen, Yujun and Wetzstein, Gordon},
title = {GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation},
journal = {arXiv preprint arXiv:2403.14621},
year = {2024},
}
```
84 changes: 84 additions & 0 deletions app.py
@@ -0,0 +1,84 @@
import os
import sys

sys.path.append(os.path.abspath(os.path.join(__file__, '../')))
if 'OMP_NUM_THREADS' not in os.environ:
    os.environ['OMP_NUM_THREADS'] = '16'

import shutil
import os.path as osp
import argparse
import torch
import gradio as gr
from functools import partial
from webui.tab_text_to_img_to_3d import create_interface_text_to_img_to_3d
from webui.tab_img_to_3d import create_interface_img_to_3d
from webui.tab_instant3d import create_interface_instant3d
from webui.runner import GRMRunner
from webui.shared_opts import send_to_click


def parse_args():
    parser = argparse.ArgumentParser(description='GRM Live Demo')
    parser.add_argument('--advanced', action='store_true', help='Show advanced settings')
    return parser.parse_args()


def main():
    args = parse_args()

    # Inference only: disable autograd globally.
    torch.set_grad_enabled(False)
    device = torch.device('cuda')
    runner = GRMRunner(device)

    with gr.Blocks(analytics_enabled=False,
                   title='GRM Live Demo',
                   css='webui/style.css'
                   ) as demo:
        md_txt = '# GRM Live Demo' \
                 '\n\nOfficial demo of the paper [GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation](https://justimyhxu.github.io/projects/grm/). ' \
                 'Part of this demo is based on [MVEdit Web UI](https://huggingface.co/spaces/Lakonik/MVEdit).' \
                 '<br>GRM can reconstruct 3D Gaussians and meshes from various sources, including **Zero123++**, **Instant3D**, **V3D**, **SV3D**. To save VRAM, this demo only supports **Zero123++** and **Instant3D**, while full support will be available in the official [code release](https://github.com/justimyhxu/grm).'
        gr.Markdown(md_txt)

        with gr.Tabs() as main_tabs:

            with gr.TabItem('Image-to-3D', id='tab_img_to_3d'):
                _, var_img_to_3d = create_interface_img_to_3d(
                    runner.run_segmentation,
                    runner.run_img_to_3d)

            with gr.TabItem('Text-to-3D', id='tab_text_to_3d'):
                with gr.Tabs() as sub_tabs_text_to_3d:
                    with gr.TabItem('Instant3D', id='tab_instant3d'):
                        _, var_instant3d = create_interface_instant3d(
                            runner.run_instant3d,
                            examples=[
                                'a wooden carving of a wise old turtle',
                                'a glowing robotic unicorn, full body',
                                'a ceramic mug shaped like a smiling cat',
                                'a car made out of cheese',
                                'a beagle in a detective’s outfit',
                            ])
                    with gr.TabItem('Text-to-Image-to-3D', id='tab_text_to_img_to_3d'):
                        _, var_text_to_img_to_3d = create_interface_text_to_img_to_3d(
                            runner.run_text_to_img,
                            examples=[
                                [768, 512, 'a wooden carving of a wise old turtle', ''],
                                [512, 512, 'a glowing robotic unicorn, full body', ''],
                                [512, 512, 'a ceramic mug shaped like a smiling cat', ''],
                            ],
                            advanced=args.advanced)

        # Send the generated image to the Image-to-3D tab and switch to it.
        var_text_to_img_to_3d['to_img_to_3d'].click(
            fn=partial(send_to_click, target_tab_ids=['tab_img_to_3d']),
            inputs=[var_text_to_img_to_3d['output_image']],
            outputs=[var_img_to_3d['in_image'], main_tabs],
            api_name=False
        )

        demo.queue().launch(share=False)


if __name__ == "__main__":
    main()
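
The click handler in `app.py` uses `functools.partial` to fix the `target_tab_ids` keyword ahead of time, so the UI framework only has to supply the event inputs. A minimal sketch of that pattern follows; `send_to_tab` is a hypothetical stand-in, since the real signature and return value of `webui.shared_opts.send_to_click` are not shown in this commit excerpt.

```python
from functools import partial


def send_to_tab(image, target_tab_ids):
    """Hypothetical stand-in for send_to_click: forward the image and name the tab to select."""
    return image, {"selected": target_tab_ids[0]}


# Bind the keyword argument now; the callback is later invoked with the image only.
handler = partial(send_to_tab, target_tab_ids=["tab_img_to_3d"])

forwarded_image, tab_update = handler("generated.png")
print(tab_update)
```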
Binary file added docs/assets/blender_demo.mp4
Binary file not shown.
Binary file added docs/assets/image-to-3d.mp4
Binary file not shown.
Binary file added docs/assets/pipeline.png
Binary file added docs/assets/sparse-view.mp4
Binary file not shown.
Binary file added docs/assets/text-to-3d.mp4
Binary file not shown.
Binary file added examples/1.jpg
Binary file added examples/17_dalle3_rockingchair1.png
Binary file added examples/19_dalle3_stump1.png
Binary file added examples/astronaut.webp
Binary file not shown.
Binary file added examples/bag.jpg
Binary file added examples/bowl.png
Binary file added examples/cdog.webp
Binary file not shown.
Binary file added examples/coat.webp
Binary file not shown.
Binary file added examples/david.jpg
Binary file added examples/david.png
Binary file added examples/dragon2.png
Binary file added examples/dreamcraft3d_00.png
Binary file added examples/dreamcraft3d_01.png
Binary file added examples/dreamcraft3d_02.png
Binary file added examples/frog.png
Binary file added examples/girl.jpg
Binary file added examples/girl1_padded.png
Binary file added examples/girl2_copy.png
Binary file added examples/horse.jpg
Binary file added examples/horsing.webp
Binary file not shown.
Binary file added examples/image.png
Binary file added examples/ironman_helmet.png
Binary file added examples/kunkun.webp
Binary file not shown.
Binary file added examples/panda.png
Binary file added examples/porsche.png
Binary file added examples/pumpkin.png
Binary file added examples/sculpture_0.png
Binary file added examples/sdog.webp
Binary file not shown.
Binary file added examples/turtle.png
Binary file added examples/unicorn.png
Binary file added examples/yann-lecun.jpg
Binary file added examples/zebra.png
Empty file added model/__init__.py
Empty file.
65 changes: 65 additions & 0 deletions model/model.py
@@ -0,0 +1,65 @@
import torch
from torch import nn

from model.visual_encoder.vit_gs import ViTGSEncoder
from model.render.gaussian_renderer import GaussianRenderer


class GRM(nn.Module):
    def __init__(self, config):
        super().__init__()

        self.gs_renderer = GaussianRenderer(
            renderer_config=config.render.params
        )

        self.visual_encoder = ViTGSEncoder(
            **config.visual.params,
        )

        self.num_input_views = config.visual.params.get("num_input_views", 1)

    def forward_visual(self, x, camera=None, input_c2ws=None, input_fxfycxcy=None):
        features = self.visual_encoder(x, camera, input_c2ws=input_c2ws, input_fxfycxcy=input_fxfycxcy)
        latent, img_features = features
        # No posterior is produced; None is returned in its place.
        return latent, img_features, None

    def forward(
        self,
        imgs,
        camera: torch.Tensor = None,
        num_input_views=None,
        input_c2ws=None,
        input_fxfycxcy=None,
        output_c2ws=None,
        output_fxfycxcy=None
    ):
        # Fall back to the configured view count, capped by the views available in imgs.
        num_input_views = num_input_views or self.num_input_views
        num_input_views = min(num_input_views, imgs.shape[1])

        if num_input_views == 1:
            # Single view: take the first view and drop the view dimension.
            imgs = imgs[:, 0]
            camera = camera[:, 0]
        else:
            # Multiple views: keep only the first num_input_views views.
            imgs = imgs[:, :num_input_views]
            camera = camera[:, :num_input_views]
            input_c2ws = input_c2ws[:, :num_input_views]
            input_fxfycxcy = input_fxfycxcy[:, :num_input_views]

        latent, _, posterior = self.forward_visual(imgs, camera, input_c2ws=input_c2ws, input_fxfycxcy=input_fxfycxcy)

        result = {"latent": latent, "posterior": posterior}

        # Render the latent from the requested output cameras.
        gs_result = self.gs_renderer.render(latent=latent,
                                            output_c2ws=output_c2ws,
                                            output_fxfycxcy=output_fxfycxcy)
        result.update(gs_result)

        return result
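
The view-selection logic in `GRM.forward` can be checked in isolation. The sketch below mirrors the same indexing with NumPy arrays standing in for torch tensors (the slicing semantics are the same); the `[batch, views, channels, height, width]` layout and the helper name `select_input_views` are assumptions for illustration.

```python
import numpy as np


def select_input_views(imgs, num_input_views):
    """Mirror the slicing in GRM.forward for an array shaped [B, V, ...]."""
    num_input_views = min(num_input_views, imgs.shape[1])
    if num_input_views == 1:
        return imgs[:, 0]                 # view axis is dropped
    return imgs[:, :num_input_views]      # view axis is kept


imgs = np.zeros((2, 4, 3, 8, 8))          # assumed layout: [B, V, C, H, W]
print(select_input_views(imgs, 1).shape)  # single view
print(select_input_views(imgs, 3).shape)  # first three views
print(select_input_views(imgs, 9).shape)  # request is capped at the 4 available views
```

Note that the single-view branch changes the tensor rank, so downstream code has to handle both `[B, C, H, W]` and `[B, V, C, H, W]` inputs.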




Empty file added model/render/__init__.py
Empty file.