DINAR: Diffusion Inpainting of Neural Textures for One-Shot Human Avatars

David Svitov, Dmitrii Gudkov, Renat Bashirov, Victor Lempitsky

Paper: https://arxiv.org/abs/2303.09375

Abstract: We present DINAR, an approach for creating realistic rigged fullbody avatars from single RGB images. Similarly to previous works, our method uses neural textures combined with the SMPL-X body model to achieve photo-realistic quality of avatars while keeping them easy to animate and fast to infer. To restore the texture, we use a latent diffusion model and show how such model can be trained in the neural texture space. The use of the diffusion model allows us to realistically reconstruct large unseen regions such as the back of a person given the frontal view. The models in our pipeline are trained using 2D images and videos only. In the experiments, our approach achieves state-of-the-art rendering quality and good generalization to new poses and viewpoints. In particular, the approach improves state-of-the-art on the SnapshotPeople public benchmark.

Installation

The easiest way to set up an environment for this repository is to use the Docker image. To build and run it, follow these steps:

Build the image with the following command:

bash docker/build.sh

Start a container:

bash docker/run.sh

The script mounts the root directory of the host system to /mounted/ inside the container and sets the cloned repository path as the starting directory.

Inside the container, install minimal_pytorch_rasterizer (unfortunately, Docker fails to install it while building the image):

pip install git+https://github.com/rmbashirov/minimal_pytorch_rasterizer

(Optional) You can then commit the changes to the image so that you don't need to install minimal_pytorch_rasterizer in every new container. See the Docker documentation.

Inference

To create a one-shot human avatar from your own images:

Prepare data:

Dataset folder structure:

.
├── rgb                   # *.png images of humans
├── segm                  # *.png segmentation masks generated by https://github.com/Gaoyiminggithub/Graphonomy
├── openpose              # *.json files with keypoints generated by https://github.com/CMU-Perceptual-Computing-Lab/openpose
└── smplx                 # body parameters fitted with a modified version of https://github.com/vchoutas/smplify-x

See the prepared SnapshotPeople data for an example.
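Before running inference, it can help to confirm that a data folder follows the layout above. The following is a minimal sketch, not part of the repository: the function name check_dataset_layout is hypothetical, and the check that every image in rgb has a mask with the same file stem in segm is an assumption about naming, not something this README states. The openpose and smplx folders are only checked for existence (and, for openpose, for at least one *.json file), since their file naming is not described here.

```python
from pathlib import Path

# Subfolder names are taken from the dataset layout described in this README.
REQUIRED_SUBDIRS = ("rgb", "segm", "openpose", "smplx")


def check_dataset_layout(data_root):
    """Return a list of problems found; an empty list means the layout looks fine.

    Hypothetical helper, not part of the DINAR code base.
    """
    root = Path(data_root)
    problems = [
        f"missing folder: {name}"
        for name in REQUIRED_SUBDIRS
        if not (root / name).is_dir()
    ]
    if problems:
        return problems

    rgb = {p.stem for p in (root / "rgb").glob("*.png")}
    segm = {p.stem for p in (root / "segm").glob("*.png")}
    if not rgb:
        problems.append("no *.png images in rgb")

    # Assumption: an image and its segmentation mask share the same file stem.
    for stem in sorted(rgb - segm):
        problems.append(f"no segmentation mask for image: {stem}")

    if not any((root / "openpose").glob("*.json")):
        problems.append("no *.json keypoint files in openpose")
    return problems
```

Calling check_dataset_layout("path/to/your/data") and printing the returned list gives a quick report of what is missing before the inference script is launched.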

Rendered examples of SnapshotPeople avatars for front and back views.

Download:
- Checkpoint
- SMPL-X models; put them in ./smplx_data/smplx_models
- Animation sequence; put it in ./smplx_data/

Launch the script:

python inference.py \
 --ckpt_path=checkpoint/path/filename.ckpt \
 --log_dir=path/to/logs \
 --data_root=path/to/your/data

Example:

python inference.py \
 --ckpt_path=./checkpoints/ddpm-epoch=24.ckpt \
 --log_dir=./logs \
 --data_root=./Dataset/SnapshotPeople
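If the script needs to be launched from Python, for instance to process several data folders in a row, the command can be assembled programmatically. This is a minimal sketch: build_inference_command is a hypothetical helper, and it uses only the three flags shown in this README (--ckpt_path, --log_dir, --data_root). It assumes the working directory is the repository root, where inference.py is located.

```python
import sys


def build_inference_command(ckpt_path, log_dir, data_root):
    """Build the argument list for inference.py (hypothetical helper).

    Only the flags documented in this README are used.
    """
    return [
        sys.executable,  # run with the same Python interpreter as the caller
        "inference.py",
        f"--ckpt_path={ckpt_path}",
        f"--log_dir={log_dir}",
        f"--data_root={data_root}",
    ]
```

The resulting list can be passed to subprocess.run(cmd, check=True), which raises an error if the script exits with a non-zero status.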

The resulting video is saved in <log_dir>/eval/<exp_name>/textures/video
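Since <exp_name> may not be known in advance, the output location can be searched with a glob pattern. The sketch below assumes only the directory pattern given above; the function name find_result_files is hypothetical, and no assumption is made about the video file extension, so every file in the video folder is returned.

```python
from pathlib import Path


def find_result_files(log_dir):
    """List files under <log_dir>/eval/<exp_name>/textures/video, newest first.

    Hypothetical helper; the directory pattern comes from this README.
    """
    pattern = "eval/*/textures/video/*"
    files = [p for p in Path(log_dir).glob(pattern) if p.is_file()]
    # Sort by modification time so the most recent result comes first.
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)
```

For example, find_result_files("./logs") returns an empty list if inference has not produced any output yet.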

Citation

@article{svitov2023dinar,
  title={DINAR: Diffusion Inpainting of Neural Textures for One-Shot Human Avatars},
  author={Svitov, David and Gudkov, Dmitrii and Bashirov, Renat and Lempitsky, Victor},
  journal={arXiv preprint arXiv:2303.09375},
  year={2023}
}
