AudioVisual Speaker Localization

A framework for localizing speakers in video streams by means of deep learning topologies.

This repo includes implementations for processing video streams and localizing faces and mouths, along with a fast visual voice activity detector. For implementations of the ETi voice activity detector and the Conv2DLSTM-based visual voice activity detector, please visit the following repos:
http://github.com/lvrysis/Audio-DNN-Classification
http://github.com/lvrysis/Audio-Feature-Integration

The implementations are written in Python.
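
To give a rough idea of what a fast visual voice activity check on mouth regions can look like, here is a minimal sketch. It is not the code from this repo: it assumes the mouth has already been localized and cropped to equally sized grayscale patches, and it swaps in a simple frame-differencing heuristic in place of a learned model. The names `motion_energy` and `visual_vad` and the `threshold` value are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

# A grayscale mouth crop: rows of pixel intensities (assumed input format).
Frame = Sequence[Sequence[float]]


def motion_energy(prev: Frame, curr: Frame) -> float:
    """Mean absolute pixel difference between two equally sized crops."""
    total = 0.0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


def visual_vad(mouth_crops: Sequence[Frame], threshold: float = 5.0) -> List[bool]:
    """Flag each frame transition as 'speaking' when mouth motion exceeds
    the (hypothetical) threshold. Returns one flag per consecutive pair."""
    return [
        motion_energy(prev, curr) > threshold
        for prev, curr in zip(mouth_crops, mouth_crops[1:])
    ]


if __name__ == "__main__":
    still = [[10, 10], [10, 10]]
    moved = [[10, 40], [40, 10]]
    # First transition has no motion, second has a large change.
    print(visual_vad([still, still, moved]))
```

A heuristic like this is cheap to compute per frame, but it reacts to any mouth-region motion (head turns, lighting changes), which is one reason learned detectors are typically preferred for the final decision.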
