Python re-implementation of the spectral clustering algorithm in the paper "Speaker Diarization with LSTM"
Type Name Latest commit message Commit time
.vscode
docs Regenerate docs, and upload run_pdoc3.sh Jan 27, 2019
resources Add tutorial and image to README.md Jan 18, 2019
spectralcluster Add the _get_refinement_operator method to make code cleaner Jan 20, 2019
tests Rename cluster() method to predict() to align with sklearn and tensor… Jan 19, 2019
.coveragerc Activate codecov Jan 29, 2019
.gitignore Activate codecov Jan 29, 2019
.travis.yml Activate codecov Jan 29, 2019
LICENSE Update LICENSE Jan 19, 2019
README.md Add codecov badge to README.md Jan 29, 2019
publish.sh Prepare to PyPI release Jan 18, 2019
requirements.txt Rename sklearn to scikit-learn in requirements.txt Jan 19, 2019
run_pdoc3.sh Rename run_pdoc3.sh Jan 27, 2019
run_tests.sh Correct the way to run coverage to append to result Jan 29, 2019
setup.py New release 0.0.7 Jan 27, 2019


Spectral Clustering

Overview

This is a Python re-implementation of the spectral clustering algorithm in the paper Speaker Diarization with LSTM.

[Figure: refinement]

Disclaimer

This is not the original implementation used by the paper.

Specifically, this implementation uses K-Means from scikit-learn, which does NOT support custom distance measures such as cosine distance.
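To illustrate why the missing cosine distance matters, and one common way to approximate it, here is a minimal sketch in plain Python. It checks the identity that, for L2-normalized vectors, the squared Euclidean distance equals twice the cosine distance. The helper names (l2_normalize, cosine_distance, squared_euclidean) are hypothetical and are not part of this package, and normalizing inputs before K-Means is an assumption about a possible workaround, not something this implementation or the paper is stated to do. It is also only an approximation, since standard K-Means does not renormalize its centroids.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length (hypothetical helper)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine similarity of a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def squared_euclidean(a, b):
    """Squared Euclidean distance, the quantity K-Means minimizes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


a = [3.0, 4.0, 0.0]
b = [1.0, 0.0, 2.0]

# After normalization, ||a - b||^2 = 2 - 2 * cos(a, b) = 2 * cosine_distance.
lhs = squared_euclidean(l2_normalize(a), l2_normalize(b))
rhs = 2.0 * cosine_distance(a, b)
print(math.isclose(lhs, rhs, rel_tol=1e-9))
```

Because the two quantities differ only by a constant factor on unit-length vectors, ranking points by Euclidean distance after normalization matches ranking them by cosine distance.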

Dependencies

  • numpy
  • scipy
  • scikit-learn

Installation

Install the package by:

pip3 install spectralcluster

or

python3 -m pip install spectralcluster

Tutorial

Simply use the predict() method of class SpectralClusterer to perform spectral clustering:

from spectralcluster import SpectralClusterer

clusterer = SpectralClusterer(
    min_clusters=2,
    max_clusters=100,
    p_percentile=0.95,
    gaussian_blur_sigma=1)

labels = clusterer.predict(X)

The input X is a numpy array of shape (n_samples, n_features), and the returned labels is a numpy array of shape (n_samples,).
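As a sketch of the shape contract described above, the following builds a synthetic X with numpy and checks that a label array has the matching shape. The helper check_labels and the stand-in labels are hypothetical and exist only for illustration. The actual call to predict() is left as a comment so that the snippet runs without the package installed.

```python
import numpy as np

# Synthetic input: two well-separated groups of 8-dimensional vectors.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.1, size=(20, 8)) + np.eye(8)[0]
group_b = rng.normal(loc=0.0, scale=0.1, size=(30, 8)) + np.eye(8)[1]
X = np.vstack([group_a, group_b])  # shape (n_samples, n_features) = (50, 8)


def check_labels(X, labels):
    """Hypothetical helper: verify that labels has shape (n_samples,)."""
    labels = np.asarray(labels)
    if X.ndim != 2:
        raise ValueError("X must have shape (n_samples, n_features)")
    if labels.shape != (X.shape[0],):
        raise ValueError("labels must have shape (n_samples,)")
    return labels


# With the package installed, the labels would come from:
#   labels = clusterer.predict(X)
# A stand-in labeling is used here so the sketch is self-contained.
stand_in_labels = np.array([0] * 20 + [1] * 30)
checked = check_labels(X, stand_in_labels)
print(X.shape, checked.shape)
```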

For the complete list of parameters of the clusterer, see spectralcluster/spectral_clusterer.py.

Citations

Our paper is cited as:

@inproceedings{wang2018speaker,
  title={Speaker diarization with {LSTM}},
  author={Wang, Quan and Downey, Carlton and Wan, Li and Mansfield, Philip Andrew and Moreno, Ignacio Lopez},
  booktitle={2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={5239--5243},
  year={2018},
  organization={IEEE}
}

Misc

Our new speaker diarization systems are now fully supervised, powered by uis-rnn. See this Google AI Blog post.

To learn more about speaker diarization, here is a curated list of resources: awesome-diarization.