humansensinglab/MVCHead

Multi-view Consistent 3D Gaussian Head Avatars 'without' Multi-view Generation

Carnegie Mellon University 
CVPR 2026


MVCHead achieves state-of-the-art results for unconditional generation of high-fidelity, multi-view consistent 3D Gaussian head avatars in a minimal-resource setting, without requiring intermediate views or even 3D data. The generated Gaussian heads capture complex textures and fine facial micro-structure, including wrinkles, hair wisps, ear rims, lip contours, skin blemishes, eyes, and accessories.


🚀 Updates

  • Coming soon: MVCHead code, weights, and the FaceGS-10K dataset. Stay tuned!
  • June 7, 2026: We will be presenting our poster at CVPR 2026. Stop by to check it out. See everyone in Denver!
  • May 25, 2026: The MVCHead project page is now live!
  • May 25, 2026: We released the MVCHead paper on arXiv. Check out the preprint!

📖 Abstract

High-fidelity 3D Gaussian head avatar generation is critical for applications such as AR/VR, telepresence, and digital humans. Existing methods depend on multi-view datasets, 3D captures, or intermediate 2D view synthesis. In contrast, we learn both conditional and unconditional 3D head models from randomly sampled 2D images alone, without using multi-view data, 3D supervision, or intermediate view generation. We introduce MVCHead, a single-shot state space model that enforces multi-view consistency (MVC) directly in the 3D representation while regressing 3D Gaussians under these constraints. At its core, we propose a Hierarchical State Space (HiSS) block that progressively refines Gaussians from coarse to fine, while capturing long-range dependencies. Within each HiSS block, we modify Mamba's standard unidirectional scan with the proposed Hierarchical Bi-directional State Scan (HiBiSS) that aligns recurrence with the axes along which multi-view inconsistencies are strongest. Finally, we design an SE(3) Multi-view Critic that judges whether a set of self-renders arises from a single underlying 3D configuration, rewarding cross-view pixel alignment without observing real multi-view pairs. MVCHead achieves state-of-the-art perceptual quality, surpasses prior methods in both texture and geometric consistency, and maintains comparable shape consistency. To demonstrate scalability, we release FaceGS-10K, the first large-scale dataset of ready-to-use 3D Gaussian head assets for training and evaluation of 3D head models.
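
To make the idea of replacing a unidirectional scan with a bidirectional one more concrete, here is a minimal NumPy sketch of a generic diagonal linear state-space recurrence that is run forward and backward over a token sequence, with the two outputs summed. This is not the HiBiSS operator from the paper: the function names (`linear_scan`, `bidirectional_scan`), the fixed, non-selective parameters `a`, `b`, `c`, and the sum fusion are all assumptions chosen purely for illustration.

```python
import numpy as np


def linear_scan(x, a, b, c):
    """Unidirectional scan: h_t = a * h_{t-1} + b * x_t, y_t = c * h_t.

    x is a (T, D) sequence; a, b, c are (D,) per-channel parameters.
    All names here are hypothetical and not taken from MVCHead.
    """
    h = np.zeros(x.shape[1], dtype=x.dtype)  # hidden state, one value per channel
    y = np.empty_like(x)
    for t in range(x.shape[0]):
        h = a * h + b * x[t]  # recurrence carries information from earlier tokens
        y[t] = c * h
    return y


def bidirectional_scan(x, a, b, c):
    """Sum of a forward scan and a backward scan over the same sequence.

    Summation is an assumed fusion rule; the paper's fusion may differ.
    """
    forward = linear_scan(x, a, b, c)
    # Reverse the sequence, scan, then restore the original token order.
    backward = linear_scan(x[::-1], a, b, c)[::-1]
    return forward + backward


if __name__ == "__main__":
    # A single impulse in the middle of a 5-token, 1-channel sequence.
    x = np.zeros((5, 1))
    x[2, 0] = 1.0
    decay = np.array([0.5])
    one = np.array([1.0])
    print(bidirectional_scan(x, decay, one, one)[:, 0])
```

In this toy setting, an impulse in the middle of the sequence influences positions on both sides of it, whereas the forward-only `linear_scan` would leave all earlier positions at zero.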

License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). Permission is granted for non-commercial research use. For commercial use, please reach out to our lab.

Acknowledgements

Parts of the code have been adapted from other repositories. Please acknowledge and adhere to the licenses of each repository that MVCHead builds upon.

📑 Citation

If you find our work useful for your project, please consider adding a star to this repo and citing our paper:

    @inproceedings{chharia2026multiviewconsistent,
        title={Multi-view Consistent 3D Gaussian Head Avatars 'without' Multi-view Generation},
        author={Aviral Chharia and Fernando De la Torre},
        archivePrefix={arXiv},
        primaryClass={cs.CV},
        year={2026}
    } 
