Embedded Segmental K-Means Applied to Buckeye English and NCHLT Xitsonga

License: MIT

Overview

This recipe performs unsupervised acoustic word segmentation and clustering of Buckeye English and NCHLT Xitsonga data using the embedded segmental K-means (ES-KMeans) algorithm. The experiments are described in:

H. Kamper, K. Livescu, and S. J. Goldwater, "An embedded segmental K-means model for unsupervised segmentation and clustering of speech," in Proc. ASRU, 2017. [arXiv]

Please cite this paper if you use the code.

This recipe relies on the separate ES-KMeans package, which performs the actual unsupervised segmentation and clustering.

Download datasets

The Buckeye English corpus and portions of the NCHLT Xitsonga corpus are used.

From the complete Buckeye corpus we split off several subsets. The most important are the sets labelled devpart1 and zs, which correspond to English1 and English2, respectively, in (Kamper et al., 2016).

Install dependencies

Dependencies can be installed in a conda environment:

conda env create -f environment.yml
conda activate eskmeans

Install the ES-KMeans package:

mkdir -p ../src/
git clone https://github.com/kamperh/eskmeans.git ../src/eskmeans/

Extract speech features

Extract MFCCs in features/ as follows:

cd features/
./extract_features_buckeye.py
./extract_features_xitsonga.py

More details on the feature file formats are given in features/readme.md.

Unsupervised syllable boundary detection

As a preprocessing step, we constrain the allowed word boundary positions to boundaries detected by an unsupervised syllable boundary detection algorithm. We specifically use the algorithm described in:

O. J. Räsänen, G. Doyle, and M. C. Frank, "Pre-linguistic segmentation of speech into syllable-like units," Cognition, 2018.

Extract the syllable boundaries in syllables/ as follows:

cd syllables/
./get_syl_landmarks.py buckeye
./get_syl_landmarks.py xitsonga
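
To illustrate why the landmarks matter, here is a minimal sketch of how restricting word boundaries to detected syllable boundaries shrinks the set of candidate word segments. It is not taken from `get_syl_landmarks.py`; the function name `candidate_segments`, the `max_landmarks_per_segment` parameter, and the landmark values are all hypothetical.

```python
def candidate_segments(landmarks, max_landmarks_per_segment=4):
    """Enumerate (start, end) frame spans whose edges fall on landmarks.

    `landmarks` holds the end frame of each syllable-like unit in
    increasing order; frame 0 is treated as an implicit start boundary.
    """
    boundaries = [0] + list(landmarks)
    segments = []
    for j in range(1, len(boundaries)):
        # A segment ending at boundary j may span at most
        # `max_landmarks_per_segment` syllable-like units.
        for i in range(max(0, j - max_landmarks_per_segment), j):
            segments.append((boundaries[i], boundaries[j]))
    return segments


landmarks = [12, 30, 47, 60]  # hypothetical syllable boundaries, in frames
segments = candidate_segments(landmarks, max_landmarks_per_segment=2)
print(len(segments), segments)
```

For this 60-frame toy utterance, only 7 candidate segments remain, whereas allowing a boundary at every frame would give 60 * 61 / 2 = 1830 possible spans.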

Downsampled acoustic word embeddings

Extract and evaluate downsampled acoustic word embeddings by running the steps in downsample/readme.md.
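
As a rough illustration of the idea, downsampling maps a variable-length sequence of feature frames to a fixed-dimensional vector. The sketch below simply keeps `n` evenly spaced frames and concatenates them. The function name `downsample`, the default `n=10`, and the frame-picking strategy are assumptions made for illustration; the scripts in downsample/ may interpolate or resample differently.

```python
def downsample(frames, n=10):
    """Map a variable-length list of feature frames to a fixed-size vector.

    Keeps `n` frames at evenly spaced positions and concatenates them, so
    the result always has n * len(frames[0]) dimensions.
    """
    if not frames:
        raise ValueError("segment has no frames")
    last = len(frames) - 1
    if n == 1:
        indices = [last // 2]
    else:
        # Segments shorter than n frames repeat some frames.
        indices = [round(k * last / (n - 1)) for k in range(n)]
    embedding = []
    for idx in indices:
        embedding.extend(frames[idx])
    return embedding


# Two toy segments of different lengths, each with 2-dimensional frames.
short = [[float(t), float(-t)] for t in range(7)]
long = [[float(t), float(-t)] for t in range(25)]
print(len(downsample(short, n=5)), len(downsample(long, n=5)))
```

Both segments end up with the same dimensionality, which is what allows segments of different durations to be compared and clustered directly.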

ES-KMeans: Segmentation and clustering

Segmentation and clustering are performed using the ES-KMeans package. Run the steps in segmentation/readme.md.
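
For intuition only, the toy sketch below alternates between the two steps that give the algorithm its name: a dynamic-programming segmentation of each utterance under fixed cluster means, and a K-means-style update of the means under the fixed segmentation. It does not use the ES-KMeans package or its API. All names here (`embed`, `segment`, `es_kmeans`, `max_units`, `init_means`) are hypothetical, the embedding is a plain frame average standing in for the downsampled embeddings, and the duration-weighted mean update is a choice made for this sketch.

```python
import random


def embed(frames):
    """Stand-in embedding: per-dimension mean of the segment's frames."""
    return [sum(col) / len(frames) for col in zip(*frames)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def segment(frames, landmarks, means, max_units=2):
    """Best landmark-constrained segmentation of one utterance for fixed means."""
    bounds = [0] + list(landmarks)  # last landmark is assumed to equal len(frames)
    best = [0.0] + [float("inf")] * len(landmarks)
    back = [None] * len(bounds)
    for j in range(1, len(bounds)):
        for i in range(max(0, j - max_units), j):
            x = embed(frames[bounds[i]:bounds[j]])
            k = min(range(len(means)), key=lambda c: sq_dist(x, means[c]))
            # Weight by duration so long and short segments compete fairly.
            cost = best[i] + (bounds[j] - bounds[i]) * sq_dist(x, means[k])
            if cost < best[j]:
                best[j], back[j] = cost, (i, x, k)
    # Trace back from the final boundary to recover the chosen segments.
    segments, j = [], len(bounds) - 1
    while j > 0:
        i, x, k = back[j]
        segments.append((bounds[i], bounds[j], x, k))
        j = i
    return segments[::-1]


def es_kmeans(utterances, n_clusters, n_iter=10, init_means=None, seed=0):
    """Alternate segmentation and mean updates over (frames, landmarks) pairs."""
    if init_means is None:
        # Initialise from randomly chosen single-unit segments.
        pool = []
        for frames, landmarks in utterances:
            bounds = [0] + list(landmarks)
            pool += [embed(frames[a:b]) for a, b in zip(bounds, bounds[1:])]
        init_means = random.Random(seed).sample(pool, n_clusters)
    means = [list(m) for m in init_means]
    segmentations = []
    for _ in range(n_iter):
        segmentations = [segment(f, l, means) for f, l in utterances]
        for k in range(n_clusters):
            members = [(b - a, x) for segs in segmentations
                       for a, b, x, kk in segs if kk == k]
            if members:  # an empty cluster keeps its previous mean
                total = sum(w for w, _ in members)
                means[k] = [sum(w * x[d] for w, x in members) / total
                            for d in range(len(means[k]))]
    return means, segmentations


# Toy data: two utterances of 1-D "frames" with hypothetical landmarks.
utterances = [
    ([[0.0]] * 4 + [[10.0]] * 4, [2, 4, 6, 8]),
    ([[10.0]] * 4 + [[0.0]] * 4, [2, 4, 6, 8]),
]
means, segmentations = es_kmeans(
    utterances, n_clusters=2, init_means=[[0.0], [10.0]])
for segs in segmentations:
    print([(start, end, cluster) for start, end, _, cluster in segs])
```

With these hand-picked initial means, each toy utterance is split at frame 4 and the two halves land in different clusters. With random initialisation the result depends on the seed, and the procedure can settle in a poorer local optimum.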
