This is a Malayalam speech recognition model developed for CMUSphinx. It is now being used for Google Summer of Code 2016.


Instructions for using this model for demo purposes (I recommend a Unix-like environment).

First, download the latest libraries needed to run the recognition:

  1. SphinxBase

  2. PocketSphinx

  3. SphinxTrain

  4. Sphinx4

For more details, head over to CMUSphinx Download.

Once you have downloaded the libraries and extracted each one to its own folder, install them as follows.

In a Unix-like environment (such as Linux, Solaris, etc.):

  • if you downloaded directly from the CVS repository, you need to do this at least once to generate the "configure" file:
$ ./
  • if you downloaded the release version, or ran "" at least once, then compile and install:
 $ ./configure
 $ make clean all
 $ make check
 $ sudo make install
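
The compile-and-install steps above are the same for each library built this way, so they can be wrapped in a small helper. The following is only a sketch: the function names `run` and `build_component`, the `DRY_RUN` switch, and the directory names in the usage comment are assumptions made for illustration, not part of this repository. SphinxBase should be installed first, since PocketSphinx and SphinxTrain build against it. Sphinx4 is a Java library and is not built with these steps.

```shell
#!/bin/sh
# Sketch: build one extracted source tree with the steps listed above.
# Set DRY_RUN=1 to print the commands instead of running them.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Configure, compile, test and install the source tree in directory $1.
build_component() {
    dir=$1
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ cd $dir"
    fi
    (
        # The subshell keeps the caller's working directory unchanged.
        if [ "${DRY_RUN:-0}" != "1" ]; then
            cd "$dir" || exit 1
        fi
        run ./configure &&
        run make clean all &&
        run make check &&
        run sudo make install
    )
}

# Usage (hypothetical directory names; adjust to where you extracted them):
# for d in sphinxbase pocketsphinx sphinxtrain; do
#     build_component "$d" || break
# done
```

Running it once with `DRY_RUN=1` shows the exact commands before anything is installed.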

Now download the zip of this repository, extract it, and open a terminal inside the root folder.

Connect the microphone and use the command below to run the recognition. I cannot assure accuracy yet, as this is a trial attempt toward building a model with broader coverage.

pocketsphinx_continuous -hmm ./ -lm -dict samsaaram.dic -inmic yes | tee ml_terminal_output_export.txt
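
Before launching the command, it can help to confirm that the two things it names explicitly are in place. The sketch below checks only that `pocketsphinx_continuous` is on the `PATH` and that `samsaaram.dic` exists in the current directory. The function name `preflight` is hypothetical, and the acoustic and language model arguments are deliberately not checked.

```shell
#!/bin/sh
# Sketch: report anything the recognition command needs that is not in place.
# Run it from the root folder of the extracted repository.
preflight() {
    status=0
    # The recognizer binary is installed by PocketSphinx.
    if ! command -v pocketsphinx_continuous >/dev/null 2>&1; then
        echo "missing: pocketsphinx_continuous is not on PATH"
        status=1
    fi
    # The dictionary file is passed to the command with -dict.
    if [ ! -f samsaaram.dic ]; then
        echo "missing: samsaaram.dic not found in $(pwd)"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "ready"
    fi
    return "$status"
}

# Usage:
# preflight && echo "go ahead and run the recognition command"
```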


Installing the libraries can throw many errors, depending on the dependencies that autogen, configure, and make require. Resolve these patiently to get a successful installation. Also make sure to set the path variables in your environment.
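
As an illustration of the path variables, the sketch below assumes the libraries were installed under `/usr/local`, which is the usual default when `./configure` is run without a prefix. The variable name `SPHINX_PREFIX` is made up for this example. If you installed somewhere else, change it accordingly.

```shell
#!/bin/sh
# Assumption: the libraries were installed under the default /usr/local prefix.
SPHINX_PREFIX=/usr/local

# Let the shell find the installed programs.
export PATH="$SPHINX_PREFIX/bin:$PATH"

# Let the dynamic linker find the installed shared libraries.
export LD_LIBRARY_PATH="$SPHINX_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Let pkg-config find the installed .pc files when building the next library.
export PKG_CONFIG_PATH="$SPHINX_PREFIX/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"
```

Adding these lines to your shell startup file makes them persist across terminal sessions.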

Your system's audio driver package(s) (generally osspd) might need updating before the command will launch.

Try the following, after which everything should probably run fine.

sudo apt-get update
sudo apt-get install osspd

To contribute

  1. Fork this repository.
  2. Record* the sentences^.
  3. Commit and make a Pull Request.


Record* - To record, use Audacity. Set Project Rate to 16000 Hz and Default Sample Format to 16-bit, and when saving, choose the WAV, PCM 16-bit option.

sentences^ - The sentences can be found in the file "hugu+interstellar+queen - sentences.txt" inside the "Further development files" folder.
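
To double-check a recording against the settings above before committing it, the WAV header can be inspected from the shell. This is a rough sketch with hypothetical helper names (`le_uint`, `check_wav`). It assumes a canonical 44-byte RIFF header in which the `fmt ` chunk directly follows `WAVE`, so a file with a different chunk layout will be reported as a mismatch even if its audio is fine.

```shell
#!/bin/sh
# Sketch: check that a recording is 16000 Hz, 16-bit PCM WAV.
# Assumes the canonical header layout (fmt chunk right after "WAVE").

# Read $3 bytes at offset $2 of file $1 as a little-endian unsigned integer.
le_uint() {
    file=$1; offset=$2; count=$3
    value=0; mult=1
    for byte in $(od -An -t u1 -j "$offset" -N "$count" "$file"); do
        value=$((value + byte * mult))
        mult=$((mult * 256))
    done
    echo "$value"
}

# Print "ok: FILE" and return 0 if the header matches, else return 1.
check_wav() {
    file=$1
    format=$(le_uint "$file" 20 2)   # audio format tag, 1 means PCM
    rate=$(le_uint "$file" 24 4)     # sample rate in Hz
    bits=$(le_uint "$file" 34 2)     # bits per sample
    if [ "$format" -eq 1 ] && [ "$rate" -eq 16000 ] && [ "$bits" -eq 16 ]; then
        echo "ok: $file"
    else
        echo "mismatch: $file (format=$format rate=$rate bits=$bits)"
        return 1
    fi
}

# Usage (hypothetical file name):
# check_wav my_recording.wav
```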

Please contact me before you start recording.