One-shot Implicit Animatable Avatars with Model-based Priors


ELICIT creates free-viewpoint motion videos from a single image by constructing an animatable NeRF representation in one-shot learning.

Official repository of "One-shot Implicit Animatable Avatars with Model-based Priors".

[Arxiv] [Website]

What Can You Learn from ELICIT?

A data-efficient pipeline for creating a 3D animatable avatar from a single image.
A CLIP-based semantic loss that infers the full 3D appearance of the human body with the help of a rough SMPL shape.
A segmentation-based sampling strategy that produces more realistic visual details and geometry for 3D avatars.
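
To give a feel for what a semantic loss on embeddings looks like, here is a minimal sketch of a cosine-distance loss between two feature vectors. It is an illustration only: the function name `cosine_semantic_loss` and the plain-list inputs are assumptions made for this example, and the loss actually used in ELICIT may be defined differently. In practice the two vectors would be CLIP image embeddings of a rendered view and of the input image.

```python
import math


def cosine_semantic_loss(rendered_emb, reference_emb):
    """Return 1 - cosine similarity between two embedding vectors.

    Hypothetical helper for illustration; not taken from the ELICIT code base.
    """
    if len(rendered_emb) != len(reference_emb):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(rendered_emb, reference_emb))
    norm_a = math.sqrt(sum(a * a for a in rendered_emb))
    norm_b = math.sqrt(sum(b * b for b in reference_emb))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    # 0 when the vectors point the same way, 2 when they point in opposite directions.
    return 1.0 - dot / (norm_a * norm_b)
```

A loss of this shape depends only on the direction of the embeddings, not their magnitude, which is why it is a common choice for comparing feature vectors.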

Installation

Please follow the Installation Instruction to set up all the required packages.

Data

Results of the experiments

We provide result videos on our webpage for the qualitative and quantitative evaluations in our paper. We also provide checkpoints for those experiments in Google Drive.

Training data for re-implementation

For the datasets used in our quantitative evaluations (ZJU-MoCap, Human 3.6M), please convert the original datasets into the same format as ZJU-MoCap. Then use our scripts in tools to preprocess the dataset and render SMPL meshes for training.

For customized single-image data, we provide examples from the DeepFashion dataset in dataset/fashion.

See more details in Data Instruction.

Getting Started

Training

python train.py --cfg configs/elicit/zju_mocap/377/smpl_init_texture.yaml # Run SMPL Meshes initialization.
python train.py --cfg configs/elicit/zju_mocap/377/finetune.yaml # Run training on the input subject.
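
The two commands above have to run in order, since the finetuning stage follows the SMPL mesh initialization. A small wrapper along these lines can chain them. This is a sketch, not part of the repository: the helper names `training_commands` and `run_training` are made up for illustration, and it assumes that other subjects follow the same `configs/elicit/<dataset>/<subject>/` layout as subject 377.

```python
import subprocess
import sys


def training_commands(dataset="zju_mocap", subject="377"):
    """Return the two training commands for one subject, in execution order."""
    # Assumed layout: configs/elicit/<dataset>/<subject>/<stage>.yaml
    cfg_dir = f"configs/elicit/{dataset}/{subject}"
    stages = ["smpl_init_texture.yaml", "finetune.yaml"]
    return [[sys.executable, "train.py", "--cfg", f"{cfg_dir}/{name}"] for name in stages]


def run_training(dataset="zju_mocap", subject="377"):
    """Run both stages from the repository root; stop if a stage fails."""
    for cmd in training_commands(dataset, subject):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the commands here; call run_training() to execute them.
    for cmd in training_commands():
        print(" ".join(cmd))
```

Call `run_training()` from the repository root to execute both stages; `check=True` ensures finetuning does not start if the initialization stage exits with an error.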

We also provide checkpoints for all the subjects in Google Drive. Please unzip the file into the following structure:

${ELICIT_ROOT}
    └── experiments
        └── elicit
            ├── zju_mocap
            ├── h36m
            └── fashion
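
After unzipping, a quick check like the following can confirm that the folders landed where the tree above expects them. It is a minimal sketch: the function name `missing_checkpoint_dirs` is hypothetical, and it only verifies that the three dataset folders exist, not what they contain.

```python
from pathlib import Path

# Folder names taken from the directory tree above.
EXPECTED_SUBDIRS = ("zju_mocap", "h36m", "fashion")


def missing_checkpoint_dirs(elicit_root):
    """Return the expected checkpoint folders that are absent under elicit_root."""
    base = Path(elicit_root) / "experiments" / "elicit"
    return [name for name in EXPECTED_SUBDIRS if not (base / name).is_dir()]


if __name__ == "__main__":
    missing = missing_checkpoint_dirs(".")
    if missing:
        print("Missing checkpoint folders:", ", ".join(missing))
    else:
        print("Checkpoint layout looks complete.")
```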

Evaluation / Rendering

Evaluate novel pose synthesis.

python run.py --type movement --cfg configs/elicit/zju_mocap/377/finetune.yaml

Evaluate novel view synthesis.

python run.py --type freeview --cfg configs/elicit/zju_mocap/377/finetune.yaml freeview.use_gt_camera True

Freeview rendering on arbitrary frames.

python run.py --type freeview --cfg configs/elicit/zju_mocap/377/finetune.yaml freeview.frame_idx $FRAME_INDEX_TO_RENDER

The rendered frames and video will be saved at experiments/zju_mocap/377/latest.
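
The three commands above differ only in the `--type` value and the trailing config override. The sketch below collects them in one helper so they can be scripted across subjects. The function `render_command` and the mode names `novel_pose`, `novel_view`, and `freeview_frame` are labels invented for this example; only the `run.py` arguments themselves come from the commands above.

```python
import sys


def render_command(mode, cfg, frame_idx=None):
    """Build the run.py argument list for one of the three documented modes."""
    base = [sys.executable, "run.py", "--type"]
    if mode == "novel_pose":
        return base + ["movement", "--cfg", cfg]
    if mode == "novel_view":
        return base + ["freeview", "--cfg", cfg, "freeview.use_gt_camera", "True"]
    if mode == "freeview_frame":
        if frame_idx is None:
            raise ValueError("frame_idx is required for freeview_frame")
        return base + ["freeview", "--cfg", cfg, "freeview.frame_idx", str(frame_idx)]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    cfg = "configs/elicit/zju_mocap/377/finetune.yaml"
    print(" ".join(render_command("novel_pose", cfg)))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root.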

Citation

@article{huang2022one,
  title={One-shot Implicit Animatable Avatars with Model-based Priors},
  author={Huang, Yangyi and Yi, Hongwei and Liu, Weiyang and Wang, Haofan and Wu, Boxi and Wang, Wenxiao and Lin, Binbin and Zhang, Debing and Cai, Deng},
  journal={arXiv preprint arXiv:2212.02469},
  year={2022}
}

Acknowledgments

Our implementation is mainly based on HumanNeRF and also draws on Animatable NeRF and AvatarCLIP. We thank the authors for their open-source contributions. In addition, we thank the authors of Animatable NeRF for their help with the data preprocessing of Human 3.6M.
