Skip to content

The aim of this project is to predict words or phrases from videos.

Notifications You must be signed in to change notification settings

davidebassan/Lips-Reading

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 

Repository files navigation

Project Of VCS

The aim of this project is to predict words or phrases from videos. This task can be solved with both machine learning and deep learning techniques (for example, you can use a CNN to extract features from video frames and process extracted features with an RNN). Pre-trained CNNs on human faces could improve classification results. Data augmentation techniques should be applied to increase the number of samples.

Architectures

  1. CNN (pre-trained VGG16) + Fine-Tuning + RNN (with LSTM)

  2. CNN (pre-trained VGG16) + Fine-Tuning + RNN (with GRU)

  3. CNN (pre-trained VGG16) + Fine-Tuning + RNN

Goals

Implement the architectures and make some benchmarks on them

Roadmap

  • know how works VGG16
  • how to use Keras
  • how to transfer-learining
  • how to do fine-tuning
  • how to create a RNN (with custom layers)
  • how to make benchmarks

About

The aim of this project is to predict words or phrases from videos.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published