Sensen Gao*, Zhaoqing Wang*, Qihang Cao, Dongdong Yu, Changhu Wang, Tongliang Liu📧, Mingming Gong📧, Jiawang Bian📧
TL;DR: **We present OneWorld, the first diffusion framework for 3D scene generation that operates directly in the feature space of pretrained 3D foundation models**, enabling unified modeling of geometry, appearance, and semantics for superior cross-view consistency.
- Release inference code and checkpoints
- Release training code & dataset preparation
Please cite our paper if you find this repository useful:
@misc{gao2026oneworldtamingscenegeneration,
  title={OneWorld: Taming Scene Generation with 3D Unified Representation Autoencoder},
  author={Sensen Gao and Zhaoqing Wang and Qihang Cao and Dongdong Yu and Changhu Wang and Tongliang Liu and Mingming Gong and Jiawang Bian},
  year={2026},
  eprint={2603.16099},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2603.16099},
}