Speech-Driven Animation

This library implements the end-to-end facial synthesis model described in this paper.

Prerequisites

The models provided are checked out using git LFS. You can install git LFS by following these instructions.

Downloading the models

The models were originally hosted on git LFS. However, demand was so high that I reached the quota for free git LFS storage, so I have moved the models to Google Drive. The models can be found here.

Installing

To install the library, run:

$ pip install .

Running the example

To create animations, instantiate the VideoAnimator class, then provide an image and an audio clip (or the paths to the files). A video will be produced.

Choosing the model

Pretrained models are available for the GRID, TCD-TIMIT, CREMA-D and LRW datasets. The default model is GRID. To load another pretrained model, pass its name as model_path when instantiating the VideoAnimator:

import sda
va = sda.VideoAnimator(gpu=0, model_path="crema")  # Instantiate the animator

The models that are currently uploaded are:

grid
timit
crema
lrw
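
The dataset names in the text (GRID, TCD-TIMIT, CREMA-D, LRW) differ slightly from the model names listed above. A minimal sketch of a lookup between the two follows; the helper `model_name_for` is hypothetical (not part of sda), and the dataset-to-model pairing is inferred from the two lists rather than stated by the library.

```python
# Assumed pairing of dataset names to the model names listed in this README.
DATASET_TO_MODEL = {
    "GRID": "grid",
    "TCD-TIMIT": "timit",
    "CREMA-D": "crema",
    "LRW": "lrw",
}


def model_name_for(dataset):
    """Return the model name for a dataset name (hypothetical helper)."""
    key = dataset.strip().upper()
    try:
        return DATASET_TO_MODEL[key]
    except KeyError:
        known = ", ".join(sorted(DATASET_TO_MODEL))
        raise ValueError(
            f"no pretrained model for {dataset!r}; known datasets: {known}"
        ) from None
```

The result can then be passed as the model_path argument, for example `sda.VideoAnimator(gpu=0, model_path=model_name_for("CREMA-D"))`.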

Example with Image and Audio Paths

import sda
va = sda.VideoAnimator(gpu=0)  # Instantiate the animator
vid, aud = va("example/image.bmp", "example/audio.wav")

Example with Numpy Arrays

import numpy as np
import scipy.io.wavfile as wav
import sda
from PIL import Image

va = sda.VideoAnimator(gpu=0)  # Instantiate the animator
fs, audio_clip = wav.read("example/audio.wav")  # Sample rate and samples (numpy array)
still_frame = np.array(Image.open("example/image.bmp"))  # Image as a numpy array
vid, aud = va(still_frame, audio_clip, fs=fs)

Saving video with audio

va.save_video(vid, aud, "generated.mp4")
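
The steps above (instantiate the animator, generate the video, save it) could be combined into a small command-line script. This is only a sketch: the argument names `--model`, `--gpu` and `--output` and the functions `build_parser` and `main` are hypothetical, and only `VideoAnimator`, its `gpu` and `model_path` arguments, and `save_video` are taken from the examples in this README.

```python
import argparse

MODELS = ("grid", "timit", "crema", "lrw")  # model names listed in this README


def build_parser():
    """Build the command-line parser (argument names are illustrative)."""
    parser = argparse.ArgumentParser(description="Animate a still image from speech.")
    parser.add_argument("image", help="path to the still image")
    parser.add_argument("audio", help="path to the audio clip")
    parser.add_argument("--model", choices=MODELS, default="grid")
    parser.add_argument("--gpu", type=int, default=0)
    parser.add_argument("--output", default="generated.mp4")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Imported here so the parser can be built without sda installed.
    import sda

    va = sda.VideoAnimator(gpu=args.gpu, model_path=args.model)
    vid, aud = va(args.image, args.audio)
    va.save_video(vid, aud, args.output)
```

Calling `main(["example/image.bmp", "example/audio.wav", "--model", "crema"])` would then write generated.mp4, assuming the library and the chosen model are installed.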

Using the encodings

The encoders for audio and video are made available so that they can be used to produce features for classification tasks.

Audio Encoder

The Audio Encoder (which consists of an audio-frame encoder and an RNN) is provided along with a dictionary containing information such as the feature length (in seconds) required by the audio-frame encoder and the overlap between audio frames.

import sda
encoder, info = sda.get_audio_feature_extractor(gpu=0)
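
To illustrate how a feature length and an overlap translate into audio frames, here is a minimal sketch that computes the sample ranges of overlapping windows. The function `frame_bounds` is hypothetical, and the numbers in the usage note are illustrative only. The real values should be read from the info dictionary, whose key names are not listed here.

```python
def frame_bounds(num_samples, fs, feature_length_s, overlap_s):
    """Return (start, end) sample indices of overlapping audio frames.

    feature_length_s and overlap_s are in seconds; fs is the sample rate in Hz.
    Trailing samples that do not fill a whole window are dropped.
    """
    window = int(round(feature_length_s * fs))
    stride = window - int(round(overlap_s * fs))
    if window <= 0 or stride <= 0:
        raise ValueError("feature length must be positive and larger than the overlap")
    return [
        (start, start + window)
        for start in range(0, num_samples - window + 1, stride)
    ]
```

For example, one second of 16 kHz audio split into 0.2 s windows with 0.16 s of overlap gives windows of 3200 samples that advance by 640 samples each, so `frame_bounds(16000, 16000, 0.2, 0.16)` returns 21 windows.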
