Unsupervised Two Speaker Separation

These programs implement the algorithm described in "An unsupervised approach to cochannel speech separation" by K. Hu and D. L. Wang (IEEE Trans. Audio, Speech, and Lang. Process., 2013), an unsupervised algorithm for two-speaker separation.

Requirements:

The input mixture (a WAV file) must have a sampling frequency of 16 kHz.
The algorithm was developed in MATLAB under Linux.
The main MATLAB program calls external executables that were compiled under Linux kernel 2.6.32-358.2.1.el6.x86_64 on an RHEL 6.4 distribution. On other systems, you have to compile the tandem algorithm (source code in folder "tandem") and the segmentation algorithm (source code in folder "segmentation") and generate your own executables.
The tandem algorithm used here is not the most up-to-date version: it does not generate and group T-segments. The newest version can be found at github.com/mrhuke/tandem
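
The 16 kHz requirement can be checked before running the separation. The following is a minimal sketch and not part of the original package: the file name example_mixture.wav is hypothetical, a synthetic tone stands in for a real recording so that the script runs on its own, and resample requires the Signal Processing Toolbox. Whether twoSpk_unsupervised accepts the resulting sample vector directly is an assumption that this README does not confirm.

```matlab
% Sketch: bring a mixture to 16 kHz before separation.
targetFs = 16000;

% Write a stand-in 8 kHz recording (hypothetical file name) to a temp folder.
srcFs = 8000;
t = (0:srcFs-1)' / srcFs;                    % one second of sample times
wavPath = fullfile(tempdir, 'example_mixture.wav');
audiowrite(wavPath, 0.5 * sin(2*pi*440*t), srcFs);

% Read the file back and inspect its sampling frequency.
[x, fs] = audioread(wavPath);
if size(x, 2) > 1
    x = mean(x, 2);                          % assumption: a mono signal is expected
end
if fs ~= targetFs
    x = resample(x, targetFs, fs);           % Signal Processing Toolbox
    fs = targetFs;
end
fprintf('Mixture has %d samples at %d Hz\n', numel(x), fs);

delete(wavPath);                             % remove the temporary file
```

For a real recording, replace the synthetic tone and audiowrite call with the path to your own WAV file.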

Run an example:

On a Linux system, go to the "run" folder.
Start MATLAB and enter:
mixture = load('mixture');
mask = twoSpk_unsupervised(mixture);

A description of the main steps performed:

Run a tandem algorithm (Hu & Wang'11) to generate simultaneous streams (SS)
Order SS by time
Extract gammatone frequency cepstral coefficients (GFCCs) for each SS
Group voiced SS by beam search
Generate unvoiced speech segments by onset/offset based segmentation
Group unvoiced-voiced and unvoiced-unvoiced segments
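
The ordering and beam-search steps above can be illustrated with a toy sketch. This is not the code shipped in this package: the onset frames, the two-dimensional feature vectors (stand-ins for per-stream GFCC summaries), the beam width, and the within-group scatter score are all illustrative assumptions, and the objective used in the paper is not reproduced here. The sketch only shows the general idea of labelling time-ordered streams one at a time while keeping a small number of the best partial labellings.

```matlab
% Toy data: one row per simultaneous stream (hypothetical values).
onsetFrame = [30; 1; 12; 45; 20];
feats = [1.9 2.0; 0.0 0.1; 2.0 2.1; 0.1 0.2; 0.2 0.0];
beamWidth = 3;                               % hypothetical beam width

% Order streams by onset time.
[~, idx] = sort(onsetFrame);
X = feats(idx, :);
N = size(X, 1);

% Each row of hyps is a partial labelling; the first stream is speaker 1.
hyps = 1;
for n = 2:N
    nh = size(hyps, 1);
    % Extend every surviving hypothesis with both speaker labels.
    cand = [hyps, ones(nh, 1); hyps, 2 * ones(nh, 1)];
    scores = zeros(size(cand, 1), 1);
    for k = 1:size(cand, 1)
        s = 0;
        for g = 1:2
            Xg = X(cand(k, :) == g, :);      % streams assigned to speaker g
            if ~isempty(Xg)
                d = bsxfun(@minus, Xg, mean(Xg, 1));
                s = s + sum(d(:).^2);        % within-group scatter (stand-in score)
            end
        end
        scores(k) = s;
    end
    % Keep the beamWidth candidates with the lowest scatter.
    [~, order] = sort(scores, 'ascend');
    keep = order(1:min(beamWidth, numel(order)));
    hyps = cand(keep, :);
end

labels = hyps(1, :);                         % best labelling, in time order
disp(labels);
```

A wider beam explores more partial labellings at higher cost; with the beam width equal to the number of all possible labellings, the search becomes exhaustive.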
