Third party:SpeechRecognition:Models:create

Bertrand Benoit edited this page Dec 7, 2019 · 2 revisions




You can use SphinxTrain provided by CMU Sphinx. See SphinxTrain documentation.

=Hemera Speech Recognition Tool=

The Hemera project provides a small speech recognition tool for creating lexical and language models. It currently supports only French, but contributions adding support for other languages are welcome.

==Get it==

You can get it from the source code.

==Third-party tools==


The following tools are required:

-> you must install them (or create symbolic links to them) in [HEMERA_TP_PATH](/Appendix#HEMERA_TP_PATH)/_fromSource, a directory created to help you keep track of the third-party tools you have installed for Hemera.

The script checks that these tools are available.
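As an illustration of this kind of availability check, here is a minimal sketch. It is not the Hemera script itself: the function name `check_tools` and the directory name `srilm` are assumptions made for the example, and only `lia_phon` appears as a directory name on this page.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hemera): report which tool directories
# exist under a third-party root. A symbolic link to an existing directory
# counts as found, since [ -e ] follows links.
check_tools() {
    root="$1"
    shift
    missing=0
    for tool in "$@"; do
        if [ -e "$root/$tool" ]; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Exit status 0 when every tool is present, 1 otherwise.
    return "$missing"
}

# Example call; the root path and the "srilm" name are assumptions:
# check_tools "/path/to/thirdParty/_fromSource" srilm lia_phon
```

Running such a check before launching the tool makes a missing installation or a broken symbolic link visible early.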

===Requirements===

To begin, you need to prepare your [[computer for compiling source code|Third-party:Prepare_to_compile_Source#Needed_packages]].



  • download the 1.5.11 version from
  • uncompress it in [HEMERA_TP_PATH](/Appendix#HEMERA_TP_PATH)/_fromSource
  • follow the INSTALL file (particularly the part about MACHINE_TYPE)
  • for the 32-bit version, run the following command:

make -s SRILM=$PWD World

  • for the 64-bit version, run the following command:

make -s SRILM=$PWD MACHINE_TYPE=i686-m64 World
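The choice between the two build commands above can be sketched as a small helper. This is only an illustration: the function name `srilm_make_command` is hypothetical, and treating `x86_64` as the only 64-bit machine name is an assumption. The INSTALL file remains the reference for MACHINE_TYPE.

```shell
#!/bin/sh
# Hypothetical helper: print the build command matching a machine name
# as reported by `uname -m`. Only x86_64 is mapped to the 64-bit build
# here (an assumption); every other value falls back to the 32-bit command.
srilm_make_command() {
    case "$1" in
        x86_64) echo 'make -s SRILM=$PWD MACHINE_TYPE=i686-m64 World' ;;
        *)      echo 'make -s SRILM=$PWD World' ;;
    esac
}

# Show the command for the current machine without running it.
srilm_make_command "$(uname -m)"
```

The helper only prints the command, so you can review it and then run it yourself from the uncompressed source directory.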


WARNING: this tool does not support the x86_64 architecture; it must be compiled as ix86 even on an x86_64 OS.

If this is your case, you need [[additional packages|Third-party:Prepare_to_compile_Source#Additional_packages_for_compiling_i686_version_on_x86_64_Operating_System]]. Then use the provided patch to update the Makefile, forcing 32-bit compilation:

patch -N -p1 -s [HEMERA_TP_PATH](/Appendix#HEMERA_TP_PATH)/_fromSource/lia_phon/Makefile < misc/lia_phon_32bits_compile.patch

  • run the following commands (they build the tools, the resources, and the 80k lexicon):

cd [HEMERA_TP_PATH](/Appendix#HEMERA_TP_PATH)/_fromSource/lia_phon

make -s LIA_PHON_REP=$PWD all ressource lex80k


  • follow the [[install instructions|Third-party:SpeechRecognition#cmusphinx3_installation]]

==Instructions==

Create your own corpus, updating the file to fit your needs:


Then, launch the script


You can use the --copy option to automatically copy the created models into the corresponding directory of [HEMERA_TP_PATH](/Appendix#HEMERA_TP_PATH).

If a tool is not available or an error occurs, a message is printed on standard output. Otherwise, the lexical and language models are created under the data/ sub-directory.
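The behaviour described above can be wrapped in a small helper that separates the two outcomes. This is a sketch under assumptions: the function name `run_and_report` is hypothetical, and it assumes the launched command returns a non-zero exit status on failure, which this page does not confirm.

```shell
#!/bin/sh
# Hypothetical wrapper: run the command given as arguments and say where
# to look next. Assumes (unconfirmed) that a failure yields a non-zero
# exit status in addition to the messages printed on standard output.
run_and_report() {
    if "$@"; then
        echo "done: look for the lexical and language models under data/"
    else
        status=$?
        echo "failed with exit status $status: read the messages printed above"
        return "$status"
    fi
}

# Usage: pass whatever command you use to launch the tool, for example
# run_and_report <your launch command> --copy
```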

Category:HemeraBook/en Category:advanced