Keras implementation of Deep Clustering paper

This is a Keras implementation of the Deep Clustering algorithm described in https://arxiv.org/abs/1508.04306. It is not yet finished. Most of the code was written by Valter Akira Miasato Filho.

Requirements

System library:

libsndfile1 (installed via apt-get on Ubuntu 16.04)

Python packages (I used Anaconda and Python 3.5):

Theano (pip install git+https://github.com/Theano/Theano.git)
keras (pip install keras)
pysoundfile (pip install pysoundfile)
numpy (conda install numpy)
scikit-learn (conda install scikit-learn)
matplotlib (conda install matplotlib) (only used for visualization)

Training the network

First, create two text files, train_list and valid_list, which list your training and validation data respectively. Each line of these files must follow this pattern:

path/to/audioFile1 spk1
path/to/audioFile2 spk2
path/to/audioFile3 spk1

spk1 and spk2 are labels identifying the speaker who uttered the recorded sentence.
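
As a sanity check before training, the list format above can be parsed with a few lines of Python. This is only a minimal sketch: the function names read_list and utterances_per_speaker are hypothetical and are not part of this project, and the parser in main.py may treat edge cases differently. It assumes the speaker label is the last whitespace-separated field on each line.

```python
from collections import Counter


def read_list(list_path):
    """Parse a train_list/valid_list file into (audio_path, speaker) pairs."""
    entries = []
    with open(list_path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            # Split on the last run of whitespace, so the speaker label is
            # always taken from the final field (an assumption of this sketch).
            parts = line.rsplit(None, 1)
            if len(parts) != 2:
                raise ValueError(
                    "%s:%d: expected 'path speaker', got %r"
                    % (list_path, line_number, line)
                )
            entries.append((parts[0], parts[1]))
    return entries


def utterances_per_speaker(entries):
    """Count how many utterances each speaker label has."""
    return Counter(speaker for _, speaker in entries)
```

Calling utterances_per_speaker(read_list("train_list")) gives a quick view of how many recordings each speaker contributes, which helps spot mistyped labels.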

The current implementation should work with any sample rate, but experiments were conducted only with 8 kHz audio. It has been tested with FLAC and WAV files, and it should work with every format supported by pysoundfile/libsndfile.
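
Since the experiments used only 8 kHz audio, it may be worth checking the sample rate of your files before training. The sketch below is an illustration and not part of this project: it uses Python's standard-library wave module instead of pysoundfile, so it handles PCM WAV files only (not FLAC), and the helper names and the expected_rate default of 8000 are assumptions.

```python
import wave


def wav_sample_rate(wav_path):
    """Return the sample rate, in Hz, stored in a WAV file header."""
    with wave.open(wav_path, "rb") as wav_file:
        return wav_file.getframerate()


def files_not_at_rate(wav_paths, expected_rate=8000):
    """Return (path, rate) pairs for WAV files whose rate differs."""
    mismatches = []
    for wav_path in wav_paths:
        rate = wav_sample_rate(wav_path)
        if rate != expected_rate:
            mismatches.append((wav_path, rate))
    return mismatches
```

An empty result from files_not_at_rate means every listed WAV file matches the expected rate. Files at other rates are not necessarily a problem, but that configuration was not covered by the experiments.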

After creating train_list and valid_list, you may start training the network with the command:

python main.py

Please check the main script if you wish to use other features of this project, such as output visualization and prediction.

As of February 2017, this project is halted, but we are still open to feedback and questions.

References

https://arxiv.org/abs/1508.04306
