
RAM-Avatar: Real-time Photo-Realistic Avatar from Monocular Videos with Full-body Control

Xiang Deng1, Zerong Zheng2, Yuxiang Zhang1, Jingxiang Sun1, Chao Xu2, XiaoDong Yang3, Lizhen Wang1, Yebin Liu1

1Tsinghua University, 2NNKosmos Technology, 3Li Auto

Abstract: This paper focuses on advancing the applicability of human avatar learning methods by proposing RAM-Avatar, which learns a Real-time, photo-realistic Avatar that supports full-body control from Monocular videos. To achieve this goal, RAM-Avatar leverages two statistical templates responsible for modeling facial expression and hand gesture variations, while a sparsely computed dual attention module is introduced upon another body template to facilitate high-fidelity texture rendering for the torso and limbs. Building on this foundation, we deploy a lightweight yet powerful StyleUnet along with a temporal-aware discriminator to achieve real-time realistic rendering. To enable robust animation for out-of-distribution poses, we propose a Motion Distribution Align module to compensate for the discrepancies between the training and testing motion distributions. Extensive experiments conducted in various settings demonstrate the superiority of our proposed method, and a real-time live system is presented to further push research toward applications. The training and testing code will be released for research purposes.

Requirements

  • python 3.9.17
  • pytorch 2.0.0+cu118
  • torchvision 0.15.1+cu118
  • setuptools 68.0.0
  • scikit-image 0.22.0
  • numpy 1.25.2
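
To quickly confirm that an environment matches this list, a small sanity script such as the one below can be used (a convenience check only, not part of the repository):

import sys
import numpy
import skimage
import torch
import torchvision

print("python      :", sys.version.split()[0])    # expect 3.9.x
print("pytorch     :", torch.__version__)         # expect 2.0.0+cu118
print("torchvision :", torchvision.__version__)   # expect 0.15.1+cu118
print("numpy       :", numpy.__version__)         # expect 1.25.2
print("scikit-image:", skimage.__version__)       # expect 0.22.0
print("CUDA available:", torch.cuda.is_available())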

Datasets

  1. Fit the SMPL-X parameters using ProxyCapV2.
  2. Fit the FaceVerse parameters using FaceVerse.
  3. Render SMPL and face maps using PyTorch3D (see the sketch after the directory layout below).
  4. Construct the data directory as follows.

dataset/train:

|dataset/train
   |——keypoints_mmpose_hand
      |——00000001.json
      |——00000002.json
      |——...
   |——smpl_map
      |——00000001.png
      |——00000002.png
      |——...
   |——smpl_map_001
      |——00000001.png
      |——00000002.png
      |——...
   |——track2
      |——00000001.png
      |——00000002.png
      |——...
   |——00000001.png
   |——00000002.png
   |——...
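
For step 3 above, the exact content of the SMPL/face condition maps is not specified here; the sketch below only illustrates how a vertex-position map can be rasterized with PyTorch3D. The ico_sphere mesh is a stand-in for the posed SMPL-X mesh, and the 512x512 resolution is an assumption, not taken from the paper.

# Rough sketch only: rasterize a vertex-position "condition map" with PyTorch3D.
import torch
from pytorch3d.utils import ico_sphere
from pytorch3d.ops import interpolate_face_attributes
from pytorch3d.renderer import (
    FoVPerspectiveCameras, RasterizationSettings, MeshRasterizer,
    look_at_view_transform,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

mesh = ico_sphere(level=3, device=device)      # stand-in for the posed SMPL-X mesh
verts = mesh.verts_packed()                    # (V, 3)
faces = mesh.faces_packed()                    # (F, 3)

# Color every vertex by its normalized 3D position so the image encodes geometry.
colors = (verts - verts.min(0).values) / (verts.max(0).values - verts.min(0).values + 1e-8)

R, T = look_at_view_transform(dist=2.5, elev=0.0, azim=0.0)
cameras = FoVPerspectiveCameras(device=device, R=R, T=T)
rasterizer = MeshRasterizer(
    cameras=cameras,
    raster_settings=RasterizationSettings(image_size=512, blur_radius=0.0, faces_per_pixel=1),
)

fragments = rasterizer(mesh)
face_colors = colors[faces]                    # (F, 3, 3) per-face vertex colors
pixels = interpolate_face_attributes(
    fragments.pix_to_face, fragments.bary_coords, face_colors
)                                              # (1, H, W, 1, 3)
smpl_map = pixels[0, :, :, 0]                  # (512, 512, 3) condition map

In practice the posed SMPL-X mesh fitted in step 1 would replace the stand-in sphere, and each rendered map would be written to smpl_map/ with the corresponding frame index (analogously for the face maps).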

Train

CUDA_VISIBLE_DEVICES=0,1,2,3 python main_train.py --from_json configs/train.json --name train --nump 4

Test

CUDA_VISIBLE_DEVICES=0,1,2,3 python main_test.py --from_json configs/test.json --name train --nump 4
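
Both commands launch multi-GPU runs; --nump presumably sets the number of worker processes (one per GPU in CUDA_VISIBLE_DEVICES). As a rough illustration only, and not the repository's actual code, a --nump-style launch is typically wired up with torch.multiprocessing like this:

import argparse
import torch
import torch.distributed as dist
import torch.multiprocessing as mp

def worker(rank, world_size):
    # Each worker binds to one GPU and joins a NCCL process group.
    dist.init_process_group("nccl", init_method="tcp://127.0.0.1:29500",
                            rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    # ... build the dataset/model and run the train or test loop for this rank ...
    dist.destroy_process_group()

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--nump", type=int, default=4)  # number of processes/GPUs
    args = parser.parse_args()
    mp.spawn(worker, args=(args.nump,), nprocs=args.nump)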

Acknowledgement

This code is built upon StyleAvatar and CCNet. Thanks to the authors of these open-source projects.
