LipReader-Transformer

A transformer model that generates audio from a speaker's lips and face.

A collaboration with my classmate for our final year project.

This is a transformer implementation of Lip2Wav, modified to work with lip-only input.

The dataset is a single speaker from Lip2Wav. We wrote our own preprocessor to extract just the speaker's lips from the video.
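The preprocessor's method is not described here, so the following is only a minimal sketch of one way a lip crop could be derived, not the code in this repository. It assumes a face bounding box is already available from some face detector and takes the lower-centre part of that box as the mouth region. The names `lip_box_from_face` and `crop`, and the default fractions, are hypothetical.

```python
from typing import List, Sequence, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def _clamp(value: int, low: int, high: int) -> int:
    """Limit value to the closed range [low, high]."""
    return max(low, min(high, value))


def lip_box_from_face(face: Box, frame_w: int, frame_h: int,
                      height_frac: float = 1 / 3,
                      width_frac: float = 0.5) -> Box:
    """Estimate a mouth region from a face box (assumed heuristic).

    The region spans the bottom `height_frac` of the face box and the
    central `width_frac` of its width, clamped to the frame.
    """
    left, top, right, bottom = face
    face_w, face_h = right - left, bottom - top
    lip_w = round(face_w * width_frac)
    lip_h = round(face_h * height_frac)
    lip_left = left + (face_w - lip_w) // 2  # centre horizontally
    lip_top = bottom - lip_h                 # anchor to the bottom edge
    return (
        _clamp(lip_left, 0, frame_w),
        _clamp(lip_top, 0, frame_h),
        _clamp(lip_left + lip_w, 0, frame_w),
        _clamp(bottom, 0, frame_h),
    )


def crop(frame: Sequence[Sequence], box: Box) -> List[list]:
    """Cut the box out of a frame stored as rows of pixels."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in frame[y0:y1]]


if __name__ == "__main__":
    # Synthetic 100x120 frame whose pixel value encodes its position.
    frame = [[y * 100 + x for x in range(100)] for y in range(120)]
    box = lip_box_from_face((10, 20, 70, 110), frame_w=100, frame_h=120)
    patch = crop(frame, box)
    print(box, len(patch), len(patch[0]))
```

Applying the same box computation to every frame of a video would give a lip-only sequence to feed the model in place of full-face frames. A landmark-based crop would likely be tighter than this fixed-fraction heuristic.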

Working

Many lip-reading implementations assume that using the speaker's full face leads to better output. We believe this is not necessarily the case, and that the lips or mouth alone may be sufficient for a good prediction. To test this, we ported an existing lip-reading algorithm to a newer architecture and then ran similar tests using a variety of datasets.

Results

Loss Graph

Scores

Full-face input tends to do better than lips-only input on all objective tests, but lips beat full face on the subjective test. We assume this is because the lips provide fewer features, which limits how much the model can learn over a short training period and so leads to less accurate results. However, since lip shapes are less unique and the same shapes are used for different sounds, the model was able to improve voice quality instead of accuracy, which led to a higher PESQ score. This also shows that full face is better than lips in terms of learning, as it is able to quickly predict audio from the frames provided.
