Speaker diarization is the problem of identifying the different speakers in a conversation and converting their speech to text. Here it is done with the affinity clustering method from sklearn.cluster.
Dependencies:
- librosa (sound processing and DSP)
- NumPy (matrix calculations)
- scikit-learn (ML algorithms)
- SciPy (statistical calculations)

Python version: 3.5

Audio format: wav