CGGS: Consistency-Augmented Geometric Gaussian Splatting for Ego-centric 3D Scene Generation (TIP 2026)
Zhenyu Sun, Xiaohan Zhang, Qi Liu
[Project Page] [Paper]
This repo contains the implementation of CGGS, a new framework for ego-centric 3D scene generation from textual descriptions. Building on novel insights into MV-LDM and 3D Gaussian optimization, our method surpasses previous approaches in semantic alignment, perceptual quality, and rendering fidelity when producing realistic, domain-free 3D scenes.
Generation quality is measured by CLIP Score, Sharp, Color, Resolution, and Q-Align; reconstruction quality is measured by PSNR, SSIM, and LPIPS.

| Method | 3D Representation | CLIP Score ↑ | Sharp ↑ | Color ↑ | Resolution ↑ | Q-Align ↑ | PSNR ↑ | SSIM ↑ | LPIPS ↓ |
|---|---|---|---|---|---|---|---|---|---|
| Text2Room | Mesh | 24.732 | 0.215 | 0.210 | 0.231 | 0.697 | 20.915 | 0.844 | 0.169 |
| LucidDreamer | 3DGS | 25.736 | 0.216 | 0.211 | 0.224 | 0.764 | 25.667 | 0.824 | 0.163 |
| Director3D | 3DGS | 24.996 | 0.221 | 0.225 | 0.232 | 0.754 | - | - | - |
| DreamScene360 | 3DGS | 25.022 | 0.219 | 0.204 | 0.239 | 0.828 | 32.587 | 0.969 | 0.0477 |
| CGGS | 3DGS | 26.253 | 0.218 | 0.211 | 0.231 | 0.839 | 37.345 | 0.977 | 0.0193 |
Quantitative comparison of generation and reconstruction quality. We compare our method with representative text-to-3D and 3D scene generation methods. As shown, CGGS achieves the best performance on CLIP Score, Q-Align, PSNR, SSIM, and LPIPS, demonstrating superior generation and reconstruction quality. The best results are highlighted in bold, and the second-best results are shown in italic.
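The ranking stated in the caption can be checked directly from the table. The snippet below is a small sketch, not part of the repository: the names `METRICS`, `RESULTS`, and `best_per_metric` are hypothetical, and the values are copied from the table above, with `None` standing for entries reported as "-".

```python
# Values copied from the comparison table; None marks entries reported as "-".
METRICS = ["CLIP", "Sharp", "Color", "Resolution", "Q-Align", "PSNR", "SSIM", "LPIPS"]
LOWER_IS_BETTER = {"LPIPS"}  # every other metric is "higher is better"
RESULTS = {
    "Text2Room":     [24.732, 0.215, 0.210, 0.231, 0.697, 20.915, 0.844, 0.169],
    "LucidDreamer":  [25.736, 0.216, 0.211, 0.224, 0.764, 25.667, 0.824, 0.163],
    "Director3D":    [24.996, 0.221, 0.225, 0.232, 0.754, None, None, None],
    "DreamScene360": [25.022, 0.219, 0.204, 0.239, 0.828, 32.587, 0.969, 0.0477],
    "CGGS":          [26.253, 0.218, 0.211, 0.231, 0.839, 37.345, 0.977, 0.0193],
}


def best_per_metric(results=RESULTS):
    """Return {metric: method with the best reported value}."""
    best = {}
    for i, metric in enumerate(METRICS):
        # Skip methods that do not report this metric.
        scored = {m: v[i] for m, v in results.items() if v[i] is not None}
        pick = min if metric in LOWER_IS_BETTER else max
        best[metric] = pick(scored, key=scored.get)
    return best


if __name__ == "__main__":
    for metric, method in best_per_metric().items():
        print(f"{metric:>10}: {method}")
```

Under these values, CGGS leads on CLIP Score, Q-Align, PSNR, SSIM, and LPIPS, while Director3D leads on Sharp and Color and DreamScene360 leads on Resolution.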
Qualitative comparison between CGGS and other baselines. CGGS produces multi-view images with rich detail and superior semantic coherence, and it generalizes across domains. Our results follow the text description more accurately and remain consistent in 3D. Specifically, DreamScene360 places little of the main content in the horizontal field of view; Director3D can depict the content described in the text prompts but is constrained by a limited field of view; LucidDreamer introduces undesirable style transfer, incorrect stitching between concepts, and inconsistent content, as highlighted in the red box.
- Clone this repo:
git clone https://github.com/CGGS-26/CGGS.git
cd CGGS
- Create the environment and install dependencies.
conda create -n CGGS python=3.10.14 -y
conda activate CGGS
pip install -r requirements.txt
pip install MVRec/submodules/diff-gaussian-rasterization
pip install MVRec/submodules/simple-knn
- Prepare the dependencies for LayoutDecorator.
cd LayoutDecorator
pip install -r requirements.txt
The ego-centric generator can be invoked with the following commands:
cd MVGen # make sure you are in CGGS/MVGen
# Default Example
python generate.py --gen_video --save_frames [Other options]
python select_range.py --source ./outputs/$results --target ../generate_mvimages
[Other options] include:
- --fov: Horizontal field of view of the camera, in degrees (default: 90).
- --deg: Rotation angle around the vertical axis, in degrees (default: 45).
- --prompt_folder: Path to the text file containing the prompts for the different scenes.
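To get a feel for how the two angular options interact, the sketch below works through the arithmetic for the default values. It rests on assumptions that are not confirmed here: that --deg is the yaw step between consecutive views, that the views cover a full 360-degree turn, and that the camera is an ideal pinhole. The function names are hypothetical and do not come from generate.py.

```python
import math


def num_views(deg: float) -> int:
    """Number of yaw steps of `deg` degrees needed to complete a full turn."""
    return math.ceil(360.0 / deg)


def overlap_deg(fov: float, deg: float) -> float:
    """Horizontal overlap between adjacent views; a negative value means a gap."""
    return fov - deg


def focal_length_px(fov: float, width_px: int) -> float:
    """Pinhole focal length in pixels for a horizontal field of view in degrees."""
    return width_px / (2.0 * math.tan(math.radians(fov) / 2.0))


if __name__ == "__main__":
    fov, deg = 90.0, 45.0  # the documented defaults
    print("views per turn:", num_views(deg))
    print("overlap (deg):", overlap_deg(fov, deg))
    print("focal length for a 512 px wide image:", focal_length_px(fov, 512))
```

With the defaults, adjacent views would share half of their horizontal extent, which is the kind of redundancy that multi-view consistency typically relies on. The 512 px image width is only an illustrative value.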
If you are in the root directory of the project, simply run:
bash scripts/generate.sh
The project structure should now look like:
CGGS/
├── MVGen/
│   ├── generate.py
│   ├── select_range.py
│   ├── outputs/
│   │   └── <results_1>/
│   └── weights/
└── generate_mvimages/
    ├── <results_1>/
    │   └── <scene_1>/
    │       └── images/
    └── ...
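Before moving on, it can help to confirm that the selected frames ended up where the later steps expect them. The following is a minimal sketch, assuming the generate_mvimages/<results>/<scene>/images layout shown above; `list_scene_image_dirs` is a hypothetical helper, not a script shipped with the repo.

```python
from pathlib import Path


def list_scene_image_dirs(root):
    """Return (results, scene, images_dir) for every <results>/<scene>/images folder."""
    root = Path(root)
    found = []
    for images_dir in sorted(root.glob("*/*/images")):
        if images_dir.is_dir():
            scene = images_dir.parent
            found.append((scene.parent.name, scene.name, images_dir))
    return found


if __name__ == "__main__":
    # Run from the CGGS root directory.
    for results, scene, images_dir in list_scene_image_dirs("generate_mvimages"):
        n_files = sum(1 for p in images_dir.iterdir() if p.is_file())
        print(f"{results}/{scene}: {n_files} files")
```

Scenes without an images/ folder are simply skipped, so an empty listing suggests that select_range.py has not been run yet or that --target pointed elsewhere.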
Specifically, you need to download the checkpoint from here, and then put it under MVGen/weights/pano/last/.
To fine-tune the pano-generation model with our proposed consistency-augmented loss, please download the Matterport3D skybox data and labels.
Then you can follow MVDiffusion for detailed training steps.
To use your own data, please organize it as follows:
CGGS/
└── MVGen/
    └── data/
        └── mp3d_skybox/
            ├── train.npy
            ├── test.npy
            ├── 5q7pvUzZiYa/
            │   ├── blip3/
            │   └── matterport_skybox_images/
            ├── 1LXtFkjw3qL/
            └── ...
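A quick layout check can catch missing pieces before a long training run. The sketch below assumes that every scene folder contains both blip3/ and matterport_skybox_images/, as shown for the first scene in the tree; that is an inference from the listing rather than a documented requirement, and `check_mp3d_layout` is a hypothetical helper.

```python
from pathlib import Path

REQUIRED_FILES = ("train.npy", "test.npy")
# Assumed to be required for every scene, based on the first scene in the tree.
REQUIRED_SCENE_DIRS = ("blip3", "matterport_skybox_images")


def check_mp3d_layout(root):
    """Return a list of problems; an empty list means the layout matches."""
    root = Path(root)
    problems = [
        f"missing file: {name}"
        for name in REQUIRED_FILES
        if not (root / name).is_file()
    ]
    scenes = sorted(p for p in root.iterdir() if p.is_dir()) if root.is_dir() else []
    if not scenes:
        problems.append("no scene folders found")
    for scene in scenes:
        for sub in REQUIRED_SCENE_DIRS:
            if not (scene / sub).is_dir():
                problems.append(f"{scene.name}: missing folder {sub}/")
    return problems


if __name__ == "__main__":
    # Run from the CGGS root directory.
    for problem in check_mp3d_layout("MVGen/data/mp3d_skybox"):
        print(problem)
```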
cd ../flowmap
python3 -m flowmap.overfit dataset=images dataset.images.root=../generate_mvimages/$results/$scene/images
A pre-trained checkpoint for LayoutDecorator can be found here, and should be placed under CGGS/LayoutDecorator/checkpoints/.
To train your own, download the RealEstate10K and CO3Dv2 datasets, and then run the commands below, following flowmap:
cd LayoutDecorator
python3 -m flowmap.pretrain
cd ../MVRec
python train.py -s ../flowmap/outputs/local/colmap/ --name $scene
Then you can check the rendered images, metrics, and Gaussian point clouds in:
MVRec/rendered_images
MVRec/metrics
MVRec/output
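When several scenes have been generated, the layout-estimation and reconstruction commands can be batched. The sketch below only assembles the two commands quoted in this README for each scene and, by default, prints them instead of running them. It assumes the flowmap/ and MVRec/ folders sit directly under the CGGS root, as the relative paths in those commands suggest; `reconstruction_commands` and `run_all` are hypothetical names.

```python
import shlex
import subprocess
from pathlib import Path


def reconstruction_commands(results, scene):
    """Return [(working_dir, argv), ...] for one scene, in execution order."""
    images = f"../generate_mvimages/{results}/{scene}/images"
    return [
        # Step 1: estimate cameras and layout from the selected frames.
        ("flowmap", ["python3", "-m", "flowmap.overfit",
                     "dataset=images", f"dataset.images.root={images}"]),
        # Step 2: optimize the 3D Gaussians on the exported COLMAP data.
        ("MVRec", ["python", "train.py",
                   "-s", "../flowmap/outputs/local/colmap/", "--name", scene]),
    ]


def run_all(cggs_root, results, scenes, dry_run=True):
    """Plan (and optionally execute) both steps for each scene, one scene at a time."""
    planned = []
    for scene in scenes:
        for workdir, argv in reconstruction_commands(results, scene):
            cwd = Path(cggs_root) / workdir
            planned.append(f"(cd {cwd} && {shlex.join(argv)})")
            if not dry_run:
                subprocess.run(argv, cwd=cwd, check=True)
    return planned


if __name__ == "__main__":
    for line in run_all(".", "results_1", ["scene_1"]):
        print(line)
```

The two steps are kept strictly sequential per scene because the second command reads a fixed path, ../flowmap/outputs/local/colmap/, which the next scene's first step would presumably overwrite.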
If you find our work helpful, please consider citing:
@article{sun2026cggs,
title = {CGGS: Consistency-Augmented Geometric Gaussian Splatting for Ego-centric 3D Scene Generation},
author = {Zhenyu Sun and Xiaohan Zhang and Qi Liu and Huan Wang},
journal = {IEEE Transactions on Image Processing},
year = {2026},
}
