agwe-recipe

This recipe trains acoustic word embeddings (AWEs) and acoustically grounded word embeddings (AGWEs) on paired data consisting of word labels (given by their character sequences) and spoken word segments.

The training objective is based on the multiview triplet loss functions of Wanjia et al., 2016. Settle et al., 2019 added hard negative sampling to speed up training (an approach similar to the one in src/multiview_triplet_loss_old.py). The current version (see src/multiview_triplet_loss.py) replaces hard negative sampling with semi-hard negative sampling (Schroff et al.) and also includes obj1 from Wanjia et al. in the loss.
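
To make the idea of semi-hard negative sampling concrete, here is a minimal, dependency-free sketch of a single cross-view triplet term. It is not the code in src/multiview_triplet_loss.py: the function names, the cosine distance, the margin value, and the choice to return zero when no semi-hard negative exists are all assumptions made for illustration, and the sketch does not show how the recipe combines this term with obj1. Here the anchor stands for an acoustic embedding, the positive for the embedding of its word label, and the negatives for embeddings of other words.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pick_semi_hard_negative(d_pos, d_negs, margin):
    """Return the index of the closest negative that is farther from the
    anchor than the positive but still inside the margin, or None."""
    candidates = [
        (d, i) for i, d in enumerate(d_negs) if d_pos < d < d_pos + margin
    ]
    if not candidates:
        return None
    return min(candidates)[1]


def triplet_term(anchor, positive, negatives, margin=0.5):
    """Hinge loss for one anchor using a semi-hard negative.

    Assumption: when no semi-hard negative exists, the term contributes 0.
    """
    d_pos = cosine_distance(anchor, positive)
    d_negs = [cosine_distance(anchor, n) for n in negatives]
    idx = pick_semi_hard_negative(d_pos, d_negs, margin)
    if idx is None:
        return 0.0
    return max(0.0, margin + d_pos - d_negs[idx])


# Toy 2-D example: the second negative is the only semi-hard one.
anchor = [1.0, 0.0]
positive = [1.0, 0.0]
negatives = [[0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
loss = triplet_term(anchor, positive, negatives, margin=0.5)
```

Hard negative sampling would instead always take the negative closest to the anchor. The semi-hard variant ignores negatives that are already closer than the positive, which Schroff et al. motivate as a way to avoid poor local minima early in training.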

Dependencies

python 3, pytorch 1.4, h5py, numpy, scipy

Training

Edit train_config.json, then run train.sh:

./train.sh

Evaluate

Edit eval_config.json, then run eval.sh:

./eval.sh

Results

With the default train_config.json you should obtain the following results:

acoustic_ap= 0.79

crossview_ap= 0.75
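
As a reading aid for these numbers, the sketch below shows one common way to compute average precision over ranked pairs. It assumes that acoustic_ap and crossview_ap are average precision scores obtained by ranking embedding pairs by distance and checking whether each pair shares a word label. The function name and the toy data are hypothetical, and the recipe's own evaluation code may differ in its details.

```python
def average_precision(distances, is_match):
    """Average precision when pairs are ranked by ascending distance.

    distances: one distance per pair (smaller means more similar).
    is_match:  one boolean per pair, True if the pair shares a word label.
    """
    ranked = sorted(zip(distances, is_match), key=lambda pair: pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, match) in enumerate(ranked, start=1):
        if match:
            hits += 1
            # Precision at this rank: matches seen so far / pairs seen so far.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Toy example: one non-matching pair is ranked above a matching pair.
toy_distances = [0.1, 0.4, 0.2, 0.9]
toy_matches = [True, True, False, False]
toy_ap = average_precision(toy_distances, toy_matches)
```

Under this reading, the acoustic score would rank pairs of acoustic embeddings, and the cross-view score would rank pairs made of one acoustic embedding and one character-sequence embedding.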

shane-settle/agwe-recipe

Folders and files

Latest commit

History

Repository files navigation

agwe-recipe

Dependencies

Training

Evaluate

Results

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages