Recipe: Acoustic Word Embeddings for Switchboard

Overview

This is a recipe for extracting acoustic word embeddings for a subset of the Switchboard corpus. The models are described in detail in Kamper et al., 2016:

H. Kamper, W. Wang, and K. Livescu, "Deep convolutional acoustic word embeddings using word-pair side information," in Proc. ICASSP, 2016.

Please cite this paper if you use this code. All the neural networks are implemented in the package couscous.

Steps

Install all dependencies (below).

Clone couscous into the appropriate directory:

```
mkdir ../src
git clone https://github.com/kamperh/couscous.git ../src/couscous
```

Run the steps in kaldi_features/run.sh.
Run the steps in cnn_wordembeds/readme.md.
If the steps above completed successfully, run the following:
```
cd cnn_wordembeds/
./apply_layers.py models/siamese_triplets_cnn.1/ test
./eval_samediff.py \
    models/siamese_triplets_cnn.1/swbd.test.layer_-1.npz
```
Then the evaluation should show the following output:
```
Average precision: 0.537404372048
Precision-recall breakeven: 0.542724052097
```
The average precision (AP) of 0.537 corresponds to the number reported in Table 1, row 9 of Kamper et al., 2016.
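
As a rough illustration of what an AP figure like the one above measures, here is a minimal sketch of a same-different style evaluation: every pair of embeddings is scored by a distance, the pairs are ranked from closest to furthest, and average precision is computed over the pairs that share a word label. This is not the code in eval_samediff.py. The function names, the toy vectors, and the choice of cosine distance are assumptions made for illustration, and the sketch is written in dependency-free Python 3 rather than against the recipe's .npz output.

```python
from itertools import combinations
import math


def cosine_distance(u, v):
    """Cosine distance 1 - cos(u, v) between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def samediff_average_precision(embeddings, labels):
    """AP for ranking same-word pairs ahead of different-word pairs.

    Hypothetical helper for illustration; not part of the recipe.
    """
    # Score every unordered pair as (distance, is_same_word).
    pairs = [
        (cosine_distance(embeddings[i], embeddings[j]), labels[i] == labels[j])
        for i, j in combinations(range(len(labels)), 2)
    ]
    # Rank pairs from most to least similar (smallest distance first).
    pairs.sort(key=lambda pair: pair[0])

    n_same = sum(1 for _, same in pairs if same)
    if n_same == 0:
        raise ValueError("need at least one same-word pair")

    # Average the precision measured at the rank of each same-word pair.
    hits = 0
    precision_sum = 0.0
    for rank, (_, same) in enumerate(pairs, start=1):
        if same:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / n_same


# Toy data (made up): two tokens each of two word types.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = ["the", "the", "cat", "cat"]
# Prints 1.0 here, since both same-word pairs rank ahead of all others.
print(samediff_average_precision(toy_embeddings, toy_labels))
```

The explicit loop over all pairs is quadratic in the number of tokens, so for a test set of realistic size a vectorised pairwise distance (for example `scipy.spatial.distance.pdist`) would be the more practical choice. The precision-recall breakeven reported alongside AP is not computed in this sketch.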

Dependencies

Kaldi
Theano and all its dependencies.
couscous: should be cloned into the directory ../src/couscous.
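
Before starting the steps, it may help to confirm that the dependencies above are visible from the working directory. The following is a small sketch of such a check, not something shipped with the recipe. It assumes Python 3, and it assumes that Kaldi can be detected by finding one of its binaries (`compute-mfcc-feats`) on `PATH`; the recipe itself may locate Kaldi differently, so treat a missing Kaldi result as a hint rather than a verdict. The function name `check_dependencies` is hypothetical.

```python
import importlib.util
import os
import shutil


def check_dependencies(couscous_dir="../src/couscous",
                       kaldi_binary="compute-mfcc-feats"):
    """Return a dict mapping each dependency name to True (found) or False."""
    return {
        # Theano counts as present if Python can locate the package.
        "theano": importlib.util.find_spec("theano") is not None,
        # couscous is expected as a clone in the given directory.
        "couscous": os.path.isdir(couscous_dir),
        # Assumption: Kaldi is usable if one of its binaries is on PATH.
        "kaldi": shutil.which(kaldi_binary) is not None,
    }


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print("{}: {}".format(name, "found" if found else "MISSING"))
```

Run it from the directory in which the clone command above was executed, so that the relative path `../src/couscous` resolves as intended.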

Collaborators

Herman Kamper
Weiran Wang
Karen Livescu
