lvrysis/AudioVisual-Speaker-Localization

AudioVisual Speaker Localization

A framework for localizing speakers on video streams by means of deep learning topologies.

This repo includes implementations for processing video streams and localizing faces and mouths, together with a fast visual voice activity detector. For the ETi voice activity detector and the Conv2DLSTM-based visual voice activity detector, please visit the following repos:
http://github.com/lvrysis/Audio-DNN-Classification
http://github.com/lvrysis/Audio-Feature-Integration
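
To give a rough idea of what a fast visual voice activity detector can look like once a mouth region has been localized, here is a minimal sketch based on plain frame differencing. This is an illustrative assumption, not the method implemented in this repo: the function names (`crop`, `motion_energy`, `visual_vad`), the `(top, left, height, width)` box format, and the default threshold are all hypothetical. Frames are represented as nested lists of grayscale intensities to keep the example free of dependencies.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Sequence[float]]  # grayscale image as rows of pixel intensities
Box = Tuple[int, int, int, int]    # (top, left, height, width); hypothetical format


def crop(frame: Frame, box: Box) -> List[List[float]]:
    """Cut the mouth region out of a frame."""
    top, left, height, width = box
    return [list(row[left:left + width]) for row in frame[top:top + height]]


def motion_energy(prev: Frame, curr: Frame) -> float:
    """Mean absolute pixel difference between two equally sized regions."""
    total = 0.0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += abs(b - a)
            count += 1
    return total / count if count else 0.0


def visual_vad(frames: Sequence[Frame], mouth_box: Box,
               threshold: float = 5.0) -> List[bool]:
    """Flag a frame as 'speaking' when mouth-region motion exceeds the threshold.

    The first frame has no predecessor, so it is always flagged False.
    The threshold is an assumed value and would need tuning on real data.
    """
    regions = [crop(frame, mouth_box) for frame in frames]
    decisions = [False]
    for prev, curr in zip(regions, regions[1:]):
        decisions.append(motion_energy(prev, curr) > threshold)
    return decisions[:len(frames)]
```

Calling `visual_vad(frames, mouth_box)` returns one boolean per frame. In practice the mouth box would come from a face and mouth localizer and would change from frame to frame; a fixed box is used here only to keep the sketch short.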

The implementations are written in Python.

You can experiment using the M3C Speaker Localization datasets:
http://research.playcompass.com/files/M3C-Speaker-Localization-1.zip
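
As a convenience, the archive above could be fetched and unpacked with the Python standard library along these lines. This is only a sketch: the helper names (`download`, `extract`) and the local paths are hypothetical, and nothing is assumed about the layout of files inside the archive.

```python
import urllib.request
import zipfile
from pathlib import Path
from typing import List

# URL as given above.
DATASET_URL = "http://research.playcompass.com/files/M3C-Speaker-Localization-1.zip"


def download(url: str, dest: Path) -> Path:
    """Download url to dest, skipping the transfer if dest already exists."""
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive: Path, out_dir: Path) -> List[str]:
    """Unpack a zip archive into out_dir and return the names of its members."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(out_dir)
        return zf.namelist()
```

For example, `extract(download(DATASET_URL, Path("data/m3c.zip")), Path("data/m3c"))` would place the files under `data/m3c` (an arbitrary location chosen for this example) and return the list of archive members for inspection.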
