GitHub - FieldDB/lex4all: pronunciation LEXicons for Any Low-resource Language

lex4all: pronunciation LEXicons for Any Low-resource Language

http://lex4all.github.io/lex4all/

Anjana Vakil & Max Paulus

Advisors: Alexis Palmer & Michaela Regneri

Department of Computational Linguistics, University of Saarland

Developers trying to incorporate speech recognition interfaces in a low-resource language (LRL) into their applications currently face the hurdle of not finding recognition engines trained on their target language. Although tools such as Carnegie Mellon University's Sphinx simplify the creation of new acoustic models for recognition, they require large amounts of training data (audio recordings) in the target language. However, for small-vocabulary applications, an existing recognizer for a high-resource language (HRL) can be used to perform recognition in the target language. This requires a pronunciation lexicon mapping the relevant words in the target language into sequences of sounds in the HRL.

lex4all is an easy-to-use desktop application for Windows that lets even non-expert users automatically create a pronunciation lexicon for words in any language, using a small number of audio recordings and a pre-existing recognition engine in an HRL such as English. The resulting lexicon can then be used to add small-vocabulary speech recognition functionality to applications in the LRL.

How it works

A simple user interface allows the user to specify one written form (text string) and one or more audio samples (.wav files) for each word in the target vocabulary, and to set other options (e.g. number of pronunciations per word, name/save location of lexicon file). The audio is then passed to a speech recognition engine for an HRL (English). An automatic pronunciation generation algorithm (the Salaam method, [2–3]) finds the best pronunciation(s) for each word in the LRL vocabulary. The program outputs a pronunciation lexicon (.pls XML file). This lexicon file follows the standard pronunciation lexicon format (http://www.w3.org/TR/pronunciation-lexicon/), so it can be directly included in a speech recognition application, e.g. one built using the Microsoft Speech Platform API.
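To make the output format concrete, here is a minimal Python sketch of writing a lexicon in the W3C PLS layout (`lexicon` > `lexeme` > `grapheme`/`phoneme`). It is not lex4all's own code (the tool itself is written in C#). The function name `build_lexicon`, the example word, its phoneme strings, and the default `alphabet` value are all assumptions made for illustration; the alphabet that lex4all actually writes is not stated here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C Pronunciation Lexicon Specification (PLS) 1.0.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_lexicon(entries, lang="en-US", alphabet="ipa"):
    """Return PLS XML text for `entries`.

    `entries` maps each written form to a list of pronunciation strings.
    `lang` is the language of the recognizer (the HRL), not the LRL.
    The default alphabet is an assumption; adjust it to match the engine.
    """
    ET.register_namespace("", PLS_NS)  # serialize PLS as the default namespace
    root = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": alphabet, f"{{{XML_NS}}}lang": lang},
    )
    for word, pronunciations in entries.items():
        lexeme = ET.SubElement(root, f"{{{PLS_NS}}}lexeme")
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = word
        # One <phoneme> element per alternative pronunciation of the word.
        for pron in pronunciations:
            ET.SubElement(lexeme, f"{{{PLS_NS}}}phoneme").text = pron
    return ET.tostring(root, encoding="unicode")


# Hypothetical entry: the word and both phoneme strings are made up.
xml_text = build_lexicon({"habari": ["hɑbɑri", "həbɑri"]})
```

Saving `xml_text` to a file with a `.pls` extension yields a lexicon with one lexeme and two alternative pronunciations, which mirrors the "number of pronunciations per word" option described above.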

For a guided step-by-step walkthrough with screenshots, see: http://lex4all.github.io/lex4all/walkthrough.html

Features

Simple graphical interface
Use existing .wav audio files, or use the built-in audio recorder
Advanced options (number of pronunciations per word, discriminative training [3])
Evaluation module for testing/research

Requirements & Installation

Requirements:

Windows 7 or 8, 64-bit
Microsoft Speech Platform (MSP) runtime (version 11.0). Available here: http://www.microsoft.com/en-us/download/details.aspx?id=27225
MSP speech recognition engine(s) for US English (and optionally other languages). Available here: http://www.microsoft.com/en-us/download/details.aspx?id=27224 (From the download page, select the Speech Recognition (SR) engines for the languages you want to use, e.g. MSSpeech_SR_en-US_TELE.msi for US English)

Installation:

Download the project from GitHub & unzip the archive.
Double-click the shortcut run-lex4all.exe in the folder you just unzipped.
Enjoy using lex4all!

For troubleshooting help, please see our wiki page: https://github.com/lex4all/lex4all/wiki/Installation-&-troubleshooting

Backend & resources

This approach to language-independent recognition requires an existing high-quality speech recognition engine with a usable API; we chose to use the English recognition engine of the Microsoft Speech Platform, so lex4all is written in C#. The audio recording feature was built using the NAudio API.

To automatically discover the pronunciation mappings, we implement the Salaam algorithm as presented in [2–3], with a slight modification to reduce the algorithm's running time. In addition to the basic discovery algorithm [2], users can choose to apply the discriminative training algorithm [3].
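The Salaam algorithm itself is described in [2–3] and is not reproduced here. As a rough illustration of only the final selection step, the Python sketch below picks the n best pronunciations for one word from scored candidates. Everything in it is an assumption made for illustration: the function name `top_pronunciations`, the representation of candidates as (pronunciation, score) pairs, and the choice to aggregate scores across audio samples by summing.

```python
from collections import defaultdict


def top_pronunciations(scored_candidates, n=1):
    """Return the n candidate pronunciations with the highest total score.

    `scored_candidates` is an iterable of (pronunciation, score) pairs, for
    example one pair per (audio sample, recognizer hypothesis). Summing the
    scores per pronunciation is an assumed aggregation rule, not something
    taken from the Salaam papers.
    """
    totals = defaultdict(float)
    for pronunciation, score in scored_candidates:
        totals[pronunciation] += score
    # Highest total first; ties are broken alphabetically for determinism.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [pronunciation for pronunciation, _ in ranked[:n]]


# Hypothetical scores for two audio samples of the same word.
candidates = [("a b", 0.5), ("a p", 0.4), ("a b", 0.25), ("a p", 0.5)]
best_one = top_pronunciations(candidates, n=1)
best_two = top_pronunciations(candidates, n=2)
```

Raising `n` corresponds to keeping more than one pronunciation per word in the resulting lexicon.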

References

[1] Jahanzeb Sherwani. 2009. "Speech interfaces for information access by low literate users." PhD thesis. Pittsburgh, PA, USA: Carnegie Mellon University.

[2] Fang Qiao, Jahanzeb Sherwani, and Roni Rosenfeld. 2010. "Small-vocabulary speech recognition for resource-scarce languages." In: Proceedings of the First ACM Symposium on Computing for Development (ACM DEV '10).

[3] Hao Yee Chan and Roni Rosenfeld. 2012. "Discriminative pronunciation learning for speech recognition for resource scarce languages." In: Proceedings of the 2nd ACM Symposium on Computing for Development (ACM DEV '12).

[4] Anjana Vakil, Max Paulus, Alexis Palmer, and Michaela Regneri. 2014. "lex4all: A language-independent tool for building and evaluating pronunciation lexicons for small-vocabulary speech recognition." In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014): System Demonstrations.

[5] Anjana Vakil and Alexis Palmer. 2014. "Cross-language mapping for small-vocabulary ASR in under-resourced languages: investigating the impact of source language choice." In: Proceedings of the 4th Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU'14).

