Face-Expression-Transfer-by-Audio

Introduction

Audio-driven facial animation is the process of automatically synthesizing a talking-head video from a speech signal.

This project presents an end-to-end system that takes a single image and an audio clip and generates a talking-head video. The system can simplify the film animation process by generating animation automatically from the voice acting. It can also be applied in post-production to achieve better lip synchronization in movie dubbing.

This repository uses the model described in the paper Hierarchical Cross-modal Talking Face Generation with Dynamic Pixel-wise Loss.
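At a high level, the model runs in two stages: ATnet first maps the audio to a sequence of facial landmarks, and VGnet then renders video frames from those landmarks, conditioned on the input image. The sketch below shows only this data flow; the function names, shapes, and placeholder bodies are illustrative assumptions, not the actual ATVGnet API:

    import numpy as np

    def audio_to_landmarks(audio_features):
        # ATnet stage (placeholder): one 68-point landmark set per audio frame.
        return np.zeros((len(audio_features), 68, 2))

    def landmarks_to_frames(reference_image, landmark_sequence):
        # VGnet stage (placeholder): one synthesized video frame per landmark
        # set, conditioned on the reference image.
        return [reference_image.copy() for _ in landmark_sequence]

    reference_image = np.zeros((256, 256, 3), dtype=np.uint8)  # example face image
    audio_features = np.zeros((100, 12))                       # example audio feature frames
    landmarks = audio_to_landmarks(audio_features)
    frames = landmarks_to_frames(reference_image, landmarks)
    print(len(frames), landmarks.shape)                        # 100 frames, (100, 68, 2)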

Setup and Running

  1. PyTorch environment: PyTorch 0.4.1 (conda install pytorch=0.4.1 torchvision cuda90 -c pytorch). A quick environment check is sketched after this list.
  2. Install the Python dependencies (pip install -r requirement.txt).
  3. Download the pretrained ATnet and VGnet weights from Google Drive and put them under the model folder.
  4. Run the demo code: python demo.py (an example invocation is sketched after this list)
    • -device_ids: GPU id
    • -cuda: whether to use CUDA
    • -vg_model: path to the pretrained VGnet weights
    • -at_model: path to the pretrained ATnet weights
    • -lstm: whether to use the LSTM
    • -p: input example image
    • -i: input audio file
    • -sample_dir: folder to save the outputs
    • ...
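To confirm the environment from step 1, a minimal check using standard PyTorch calls:

    import torch

    print(torch.__version__)          # expect 0.4.1
    print(torch.cuda.is_available())  # True if the CUDA build can see a GPU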
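An example demo invocation combining the flags above. The weight filenames, input paths, and boolean flag syntax here are placeholders (the exact argument forms are defined by demo.py's argument parser), so substitute the paths of your downloaded weights and your own inputs:

    python demo.py -cuda True -lstm True -device_ids 0 \
        -vg_model model/vgnet.pth -at_model model/atnet.pth \
        -p image/sample.jpg -i audio/sample.wav -sample_dir results/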

Reference

This repository is based on the ATVGnet repository.

License

MIT

About

Artificial Intelligence project at Insight Data Science
