It can therefore be used to evaluate multi-speaker Automated Speech Recognition (ASR) systems. In particular, the evaluation script targets the CMU Sphinx3 ASR system, making it possible to compare how different acoustic and language models perform.
The evaluation script requires SoX and the Sphinx3 Python bindings to be installed. The latter can be built using the install-sphinx.sh script, once its build dependencies are in place; on Ubuntu 12.04, for example:
# apt-get install python python-dev liblapack3gf liblapack-dev libblas3gf libblas-dev bison make gcc g++ autoconf automake libtool unzip libsndfile1 sox libsox-fmt-mp3 python-numpy
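With those dependencies in place, run the install script itself (adjust the path to wherever it sits in your checkout):

$ ./install-sphinx.sh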
Start by downloading the dataset using the provided script.
$ cd reith-lectures
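The download step itself is handled by the script mentioned above; as a sketch (download.sh is an assumed name here, so substitute whichever script the repository actually ships):

$ ./download.sh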
Create a configuration file pointing at your acoustic and language models. An example, hub4_and_lm_giga_64k_vp_3gram.ini.example, uses the HUB4 acoustic model bundled with Sphinx and a language model derived from the English Gigaword corpus.
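The exact option names are defined by evaluate.py, so treat the bundled .example file as authoritative; purely as an illustrative sketch, with hypothetical key names, such a configuration might look like:

$ cat > sphinx-config.ini <<'EOF'
; Hypothetical key names for illustration only -- copy the real ones
; from hub4_and_lm_giga_64k_vp_3gram.ini.example.
[sphinx]
acoustic_model = /path/to/hub4_cd_continuous_8gau_1s_c_d_dd
language_model = /path/to/lm_giga_64k_vp_3gram.arpa.DMP
dictionary = /path/to/cmudict.dic
EOF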
Run the evaluation.
$ ./evaluate.py --directory reith-lectures --config sphinx-config.ini
If you want to run the evaluation using only pre-computed transcriptions, use the --lazy flag.
For example, to run the evaluation on the transcriptions produced with the example configuration file:
$ ./evaluate.py --directory reith-lectures-hub4-and-lm-giga-64k-vp-3gram --lazy true
Average WER: 0.556791
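For reference, assuming the standard definition, WER is the number of word substitutions, deletions, and insertions divided by the number of words in the reference transcript, so an average WER of 0.556791 means roughly 56 errors for every 100 reference words.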
The full results of the above command are available in reith-lectures-hub4-and-lm-giga-64k-vp-3gram/evaluation-results.txt.