Quan Wang, Carlton Downey et al.
Speaker diarization is the process of partitioning an input audio stream into homogeneous segments according to speaker identity. It answers the question “who spoke when” in a multi-speaker environment, and has a wide variety of applications, including multimedia information retrieval, speaker turn analysis, and audio processing. In particular, the speaker boundaries produced by diarization systems have the potential to significantly improve automatic speech recognition (ASR) accuracy.
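To make “who spoke when” concrete, the output of a diarization system can be represented as time-stamped speaker segments. The sketch below is illustrative only; the segment list, labels, and the `speaker_at` helper are hypothetical, not part of any specific toolkit.

```python
# Hypothetical diarization output: (start_sec, end_sec, speaker_label)
# segments partitioning the speech regions of a recording.
segments = [
    (0.0, 3.2, "spk_0"),
    (3.2, 7.5, "spk_1"),
    (7.5, 9.0, "spk_0"),
]

def speaker_at(t, segments):
    """Answer "who spoke at time t": return the active speaker label,
    or None if t falls in a non-speech gap."""
    for start, end, spk in segments:
        if start <= t < end:
            return spk
    return None

print(speaker_at(4.0, segments))
```

Gaps between segments (filtered-out non-speech) simply map to no speaker, which is why the lookup returns `None` outside every interval.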
A typical speaker diarization system consists of four components: (1) Speech segmentation, where the input audio is segmented into short sections assumed to contain a single speaker, and non-speech sections are filtered out; (2) Audio embedding extraction, where features such as MFCCs [1], speaker factors [2], or i-vectors [3, 4, 5] are extracted from the segmented sections; (3) Clustering, where the number of speakers is determined and the extracted audio embeddings are clustered into those speakers; and optionally (4) Resegmentation [6], where the clustering results are further refined to produce the final diarization output.
Author: Bappy Ahmed
Data Scientist
Email: entbappy73@gmail.com