This speech recognizer uses almost no Swahili resources. Its audio FSTs are trained on English speech from Voxforge.
- The shell commands flac, gawk, swig, and wget.
  On Ubuntu, you might need to run: sudo apt install flac gawk swig wget
- The Kaldi toolkit for automatic speech recognition.
  To install it, run: git clone https://www.github.com/kaldi-asr/kaldi
- The SRI Language Modeling Toolkit.
  To add this to Kaldi, download the file srilm.tgz into kaldi/tools, and then (from kaldi/tools) run: ./install_srilm.sh
- The Sequitur grapheme-to-phoneme converter.
  To add this to Kaldi, run: cd kaldi/tools && extras/install_sequitur.sh
  (You might first need to run, for Python 2.7: sudo pip install numpy)
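Before continuing, it may help to confirm that the four shell commands listed above are actually on your PATH. The following is a minimal sketch, not part of this repository; the function name missing_commands is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of pseudo-swahili or Kaldi):
# print each named command that cannot be found on PATH, one per line.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Check the four prerequisites named in the list above.
missing=$(missing_commands flac gawk swig wget)
if [ -n "$missing" ]; then
    echo "Missing commands:" $missing
    echo "On Ubuntu, try: sudo apt install flac gawk swig wget"
else
    echo "All required shell commands are installed."
fi
```

This only checks that the commands exist. It does not check their versions, and it does not cover Kaldi, SRILM, or Sequitur.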
Add these pseudo-Swahili scripts to Kaldi.
cd kaldi/egs
git clone https://www.github.com/uiuc-sst/pseudo-swahili
cd pseudo-swahili/s5
ln -s ../../wsj/s5/steps steps
ln -s ../../wsj/s5/utils utils
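A relative symlink like the two above silently dangles if it is created from the wrong directory. As a rough check, assuming you are still in pseudo-swahili/s5, the sketch below tests that steps and utils both resolve to directories. The function name links_ok is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if steps/ and utils/ inside
# directory $1 both resolve to real directories (following symlinks).
links_ok() {
    dir=$1
    for name in steps utils; do
        if [ ! -d "$dir/$name" ]; then
            echo "broken or missing: $dir/$name" >&2
            return 1
        fi
    done
    return 0
}

# Run from pseudo-swahili/s5 after creating the symlinks.
if links_ok .; then
    echo "steps and utils are in place."
fi
```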
Get the Voxforge corpus of spoken English (this takes 45 minutes, and uses 25 GB of disk space).
./getdata.sh
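Because the download needs 25 GB, you may want to check free disk space before starting it. This sketch assumes the corpus is stored on the same filesystem as the current directory, which this README does not state. The function name free_gb is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the free space, in whole GB, on the
# filesystem that holds directory $1. Uses POSIX "df -Pk", whose
# fourth column is the available space in 1024-byte blocks.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

need_gb=25   # figure quoted above for the Voxforge download
have_gb=$(free_gb .)
if [ "$have_gb" -lt "$need_gb" ]; then
    echo "Only ${have_gb} GB free here; the download needs about ${need_gb} GB." >&2
else
    echo "${have_gb} GB free; enough for the download."
fi
```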
Build the low-resource language model, vocabulary, etc. for Swahili.
cd pseudo-swahili/pseudo && ./a.sh
Build and test the speech recognizer.
cd pseudo-swahili/s5 && ./run.sh
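The three stages above each take a while, so it can be convenient to chain them and stop at the first failure. The wrapper below is only a sketch. The function name run_stage is hypothetical, and the commented usage assumes that you start in kaldi/egs and that getdata.sh lives in pseudo-swahili/s5.

```shell
#!/bin/sh
# Hypothetical helper: run script $2 from inside directory $1 in a
# subshell, so the caller's working directory is left unchanged.
# Returns non-zero and prints a message if the stage fails.
run_stage() {
    stage_dir=$1
    stage_script=$2
    echo "== $stage_dir/$stage_script"
    ( cd "$stage_dir" && "./$stage_script" ) || {
        echo "stage failed: $stage_dir/$stage_script" >&2
        return 1
    }
}

# Usage, from kaldi/egs (assumed layout; uncomment to run):
# run_stage pseudo-swahili/s5 getdata.sh &&
# run_stage pseudo-swahili/pseudo a.sh &&
# run_stage pseudo-swahili/s5 run.sh
```

Because each stage runs in a subshell, every path is given relative to the starting directory, so no cd back is needed between stages.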