Stars
[NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models
[CVPR 2025] Official implementation of "MangaNinja: Line Art Colorization with Precise Reference Following"
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images
A large-scale, high-quality synthetic facial depth dataset and detailed deep learning-based monocular depth estimation from a single input image.
[NeurIPS 2024] Generalizable and Animatable Gaussian Head Avatar
[ICLR 2024] Generalizable and Precise Head Avatar from Image(s)
Python tools for 3D faces: 3DMM, mesh processing (transform, camera, light, render), 3D face representations.
FaceScape (PAMI2023 & CVPR2020)
A generative world for general-purpose robotics & embodied AI learning.
Official code for paper: Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
[CVPR'24] Interactive3D: Create What You Want by Interactive 3D Generation
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Stable Video Diffusion Training Code and Extensions.
VideoSys: An easy and efficient system for video generation
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…
High-resolution models for human tasks.
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation