Project Of VCS

The aim of this project is to predict words or phrases from videos. This task can be solved with both machine learning and deep learning techniques (for example, you can use a CNN to extract features from video frames and process extracted features with an RNN). Pre-trained CNNs on human faces could improve classification results. Data augmentation techniques should be applied to increase the number of samples.

Architectures

CNN (pre-trained VGG16) + Fine-Tuning + RNN (with LSTM)
CNN (pre-trained VGG16) + Fine-Tuning + RNN (with GRU)
CNN (pre-trained VGG16) + Fine-Tuning + RNN

Goals

Implement the architectures and make some benchmarks on them

Roadmap

know how works VGG16
how to use Keras
how to transfer-learining
how to do fine-tuning
how to create a RNN (with custom layers)
how to make benchmarks

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.gitignore		.gitignore
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Project Of VCS

Architectures

Goals

Roadmap

About

Releases

Packages

Contributors 2

davidebassan/Lips-Reading

Folders and files

Latest commit

History

Repository files navigation

Project Of VCS

Architectures

Goals

Roadmap

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Packages