AIOZNetwork/AudioDrivenStyleTransfer

 
 

git_style_transfer

Dependencies

First, create the conda environment and install the dependencies:

conda create --prefix env/ python=3.8 -y
conda activate env/
pip install -r requirements.txt

Then download the files from this link and place them as follows:

  • Obama2.zip and APC_epoch_160.model go in src/face_generator/data; extract Obama2.zip there.
  • GPEN-BFR-512_trace.pt, RealESRGAN_x2plus_trace.pt, and RetinaFace-R50_trace.pt go in src/face_res/models.
  • wiki.zip goes in src/face_res; extract it there.
  • 00000189-checkpoint.pth.tar goes in src/face_reenactment/config.
  • shape_predictor_68_face_landmarks.dat goes in src/style_metrics.
  • RAVDESS.zip goes in the repository root (.); extract it there.
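Because a misplaced file usually only shows up as an error partway through a run, it can help to check the layout first. The following is a minimal sketch, not part of the repository: the function name missing_files is hypothetical, and the paths are copied from the list above. It assumes the archives are kept next to their extracted contents; if you delete the .zip files after extraction, remove them from the list.

```python
from pathlib import Path

# Paths taken from the download list above, relative to the repository root.
EXPECTED_FILES = [
    "src/face_generator/data/Obama2.zip",
    "src/face_generator/data/APC_epoch_160.model",
    "src/face_res/models/GPEN-BFR-512_trace.pt",
    "src/face_res/models/RealESRGAN_x2plus_trace.pt",
    "src/face_res/models/RetinaFace-R50_trace.pt",
    "src/face_res/wiki.zip",
    "src/face_reenactment/config/00000189-checkpoint.pth.tar",
    "src/style_metrics/shape_predictor_68_face_landmarks.dat",
    "RAVDESS.zip",
]


def missing_files(repo_root="."):
    """Return the expected files that are not present under repo_root.

    Hypothetical helper: it only checks that each file exists, not that
    the archives were extracted or that the contents are valid.
    """
    root = Path(repo_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected files are in place.")
```

Run it from the repository root after downloading; an empty result means every file in the list was found.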

Quick start

Simply run:

bash main.sh -i <image path> -a <audio path> -o <output path>

The model accepts only audio files with the extension .wav or .mp3, and the input image must be square (equal width and height).
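These two input constraints can be checked before launching the pipeline. The sketch below is illustrative only: the helper names are hypothetical, the extension match is exact (whether main.sh also accepts upper-case extensions such as .WAV is not stated here), and reading the image size assumes Pillow is installed, which this README does not confirm.

```python
from pathlib import Path

# Extensions named in the README; matched exactly (case handling is an assumption).
ALLOWED_AUDIO_EXTENSIONS = {".wav", ".mp3"}


def audio_extension_ok(audio_path):
    """Return True if the audio file name ends in .wav or .mp3."""
    return Path(audio_path).suffix in ALLOWED_AUDIO_EXTENSIONS


def is_square(width, height):
    """Return True if the given pixel dimensions describe a square image."""
    return width == height


def image_is_square(image_path):
    """Open the image and check that its width equals its height.

    Assumption: Pillow is available in the environment. It is imported
    here so the other helpers work without it.
    """
    from PIL import Image

    with Image.open(image_path) as img:
        width, height = img.size
    return is_square(width, height)
```

For example, audio_extension_ok("inputs/sample.wav") returns True, while a .flac file would be rejected.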

For example, with the provided inputs folder, run:

bash main.sh -i inputs/image.jpg -a inputs/sample.wav -o ./output
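To call the same command from Python, for instance to process several image and audio pairs in a loop, a thin wrapper around subprocess is enough. This is a sketch under assumptions: build_command and run_pipeline are hypothetical names, and only the -i, -a, and -o flags shown above are used.

```python
import subprocess


def build_command(image_path, audio_path, output_path):
    """Build the argument list for the main.sh invocation shown above."""
    return [
        "bash", "main.sh",
        "-i", str(image_path),
        "-a", str(audio_path),
        "-o", str(output_path),
    ]


def run_pipeline(image_path, audio_path, output_path, repo_root="."):
    """Run main.sh from repo_root.

    check=True makes subprocess raise CalledProcessError if the script
    exits with a non-zero status.
    """
    subprocess.run(
        build_command(image_path, audio_path, output_path),
        cwd=repo_root,
        check=True,
    )
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with paths that contain spaces.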

TODO

  • Training code (coming soon)
