
DPE: Disentanglement of Pose and Expression for General Video Portrait Editing

Open In Colab


1 MAIS & NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China; 2 School of Artificial Intelligence, University of Chinese Academy of Sciences; 3 Tencent AI Lab, Shenzhen, China

CVPR 2023


🔥 Demo

  • 🔥 Video editing: a single source video, a driving video, and a piece of audio. We transfer pose from the driving video and expression from the audio with the help of SadTalker.

Source video	Result
full_s.mp4	dpe.mp4
  • 🔥 Video editing: a single source image, a driving video, and a piece of audio. We transfer pose from the driving video and expression from the audio with the help of SadTalker.

demo4_1.mp4
demo5_1.mp4

  • 🔥 Video editing: a single source image and two driving videos. We transfer pose from the first video and expression from the second. Some videos are selected from here.


📋 Changelog

  • 2023.07.21 Release code for one-shot driving.
  • 2023.05.26 Release code for training.
  • 2023.05.06 Support enhancement.
  • 2023.05.05 Support video editing.
  • 2023.04.30 Add some demos.
  • 2023.03.18 Support pose driving, expression driving, and joint pose-and-expression driving.
  • 2023.03.18 Upload the pre-trained model, which is fine-tuned for the expression generator.
  • 2023.03.03 Release the test code!
  • 2023.02.28 DPE has been accepted by CVPR 2023!

🚧 TODO

  • Test code for video driving.
  • Some demos.
  • Gradio/Colab demo.
  • Training code for each component.
  • Test code for video editing.
  • Test code for one-shot driving.
  • Integrate audio-driven methods for video editing.
  • Integrate GFPGAN for face enhancement.

🔮 Inference

Dependency Installation

git clone https://github.com/Carlyx/DPE
cd DPE
conda create -n dpe python=3.8
conda activate dpe
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt
# install GFPGAN for the enhancer
pip install git+https://github.com/TencentARC/GFPGAN
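
A quick way to sanity-check the environment before running the demos (a minimal Python sketch; it only assumes the packages installed above):

# Minimal environment check; run inside the dpe conda env.
import torch
import torchvision

print("torch:", torch.__version__)              # expect 1.12.1+cu113
print("torchvision:", torchvision.__version__)  # expect 0.13.1+cu113
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))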

Trained Models

Please download our pre-trained model and put it in ./checkpoints.

Model                 Description
checkpoints/dpe.pt    Pre-trained model (V1).
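
To verify the download, the checkpoint can be opened with plain PyTorch (a minimal sketch; the internal key layout of dpe.pt is an assumption, so the script only prints what it finds):

# Quick integrity check for the downloaded checkpoint.
import torch

ckpt = torch.load("./checkpoints/dpe.pt", map_location="cpu")
print(type(ckpt))
if isinstance(ckpt, dict):
    # Top-level keys typically name the sub-networks (assumption).
    print(list(ckpt.keys()))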

Expression driving

python run_demo.py --s_path ./data/s.mp4 \
    --d_path ./data/d.mp4 \
    --model_path ./checkpoints/dpe.pt \
    --face exp \
    --output_folder ./res

Pose driving

python run_demo.py --s_path ./data/s.mp4 \
    --d_path ./data/d.mp4 \
    --model_path ./checkpoints/dpe.pt \
    --face pose \
    --output_folder ./res

Expression and pose driving

Video driving:

python run_demo.py --s_path ./data/s.mp4 \
    --d_path ./data/d.mp4 \
    --model_path ./checkpoints/dpe.pt \
    --face both \
    --output_folder ./res

One-shot driving:

python run_demo_single.py --s_path ./data/s.jpg \
    --pose_path ./data/pose.mp4 \
    --exp_path ./data/exp.mp4 \
    --model_path ./checkpoints/dpe.pt \
    --face both \
    --output_folder ./res
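
To drive one source image with several pose/expression clip pairs, the same CLI can be batched from Python (a hedged sketch; the ./data/poses and ./data/exps folders are hypothetical, and only the flags shown above are used):

# Hypothetical batch driver around run_demo_single.py.
import subprocess
from pathlib import Path

pose_clips = sorted(Path("./data/poses").glob("*.mp4"))  # hypothetical folder
exp_clips = sorted(Path("./data/exps").glob("*.mp4"))    # hypothetical folder

for pose, exp in zip(pose_clips, exp_clips):
    subprocess.run([
        "python", "run_demo_single.py",
        "--s_path", "./data/s.jpg",
        "--pose_path", str(pose),
        "--exp_path", str(exp),
        "--model_path", "./checkpoints/dpe.pt",
        "--face", "both",
        "--output_folder", "./res",
    ], check=True)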

Crop full video

python crop_video.py

Video editing

Before video editing, first run python crop_video.py to crop the input full video. Download the pre-trained segmentation model from here and put it in ./checkpoints.

(Optional) You can run git clone https://github.com/TencentARC/GFPGAN, download the pre-trained enhancement model from here, and put it in ./checkpoints. Then pass --EN to improve the result.

python run_demo_paste.py --s_path <cropped source video> \
  --d_path <driving video> \
  --box_path <txt after running crop_video.py> \
  --model_path ./checkpoints/dpe.pt \
  --face exp \
  --output_folder ./res \
  --EN 

Video editing for audio driving

  TODO

🔮 Training

  • Data preprocessing.

To train DPE, please follow video-preprocessing to download and pre-process the VoxCeleb dataset. We use lmdb to improve I/O efficiency. (Alternatively, you can rewrite the class VoxDataset in dataset.py to load the .mp4 files directly, as sketched below.)
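
If you would rather skip lmdb, a dataset along these lines could read the .mp4 files directly (a minimal sketch, not the repository's VoxDataset; the frame size, [-1, 1] normalization, and the random (source, driving) frame sampling are assumptions):

# Hypothetical .mp4-backed replacement for VoxDataset (see dataset.py).
import os
import random

import cv2
import torch
from torch.utils.data import Dataset

class Mp4VoxDataset(Dataset):
    def __init__(self, data_root, size=256):
        self.paths = [os.path.join(data_root, f)
                      for f in os.listdir(data_root) if f.endswith(".mp4")]
        self.size = size

    def __len__(self):
        return len(self.paths)

    def _read_frame(self, cap, idx):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if not ok:
            raise IOError(f"could not read frame {idx}")
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        frame = cv2.resize(frame, (self.size, self.size))
        # HWC uint8 -> CHW float in [-1, 1]
        return torch.from_numpy(frame).permute(2, 0, 1).float() / 127.5 - 1.0

    def __getitem__(self, i):
        cap = cv2.VideoCapture(self.paths[i])
        n = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        # Sample a source frame and a driving frame from the same video.
        src_idx, drv_idx = random.sample(range(n), 2) if n > 1 else (0, 0)
        src, drv = self._read_frame(cap, src_idx), self._read_frame(cap, drv_idx)
        cap.release()
        return src, drv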

  • Train DPE from scratch:

python train.py --data_root <DATA_PATH>

  • (Optional) To accelerate convergence, you can download the pre-trained model of LIA and rename it to vox.pt.

python train.py --data_root <DATA_PATH> --resume_ckpt <model_path for vox.pt>

🛎 Citation

If you find our work useful in your research, please consider citing:

@InProceedings{Pang_2023_CVPR,
    author    = {Pang, Youxin and Zhang, Yong and Quan, Weize and Fan, Yanbo and Cun, Xiaodong and Shan, Ying and Yan, Dong-Ming},
    title     = {DPE: Disentanglement of Pose and Expression for General Video Portrait Editing},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {427-436}
}

💗 Acknowledgements

Part of the code is adapted from LIA, PIRenderer, and STIT. We thank the authors for their contributions to the community.

🥂 Related Works

📢 Disclaimer

This is not an official product of Tencent. This repository can only be used for personal/research/non-commercial purposes.
