Tools to create acoustic models #26

pdtwonotes · 2016-04-14T20:40:08Z

I need to create an acoustic model for Julius in English. The README says that Julius will accept AMs either in HTK format or in ARPA format.

The HTK toolkit looks very old; the last website update was in 2009. I cannot get it to build properly on my x86 64-bit Linux system.

What tools are there for creating models in ARPA format?

While VoxForge claims to have English models for Julius, that website cannot actually deliver the files: the download speed slows to zero.

colbec · 2016-04-14T21:03:12Z

Make sure you have the right site for HTK (http://htk.eng.cam.ac.uk/) - there is a note there about the 3.5 beta version released in December 2015, so your source seems to be out of date.

I run openSUSE Leap 42.1 64-bit and have no problem compiling HTK. You might have to specify what error you see.

There are several tools for building models in ARPA format and converting between formats. Their output tends to differ in minor ways, so the choice depends on what you are looking for. Try a Google search for "language model generator".

VoxForge is occasionally slow, but right now it is OK for me. I selected a 4 MB file from the downloads section and it completed in 16 seconds on my slow connection.

palles77 · 2016-04-14T22:06:38Z

I agree with colbec. HTK is old in some places, but there is a beta version available. I have been using HTK for years now for both language and acoustic modelling. The best way is to follow the HTK tutorials provided on VoxForge for acoustic modelling and the HTK tutorials for language modelling.

You need to be aware that creating a decent acoustic model is a nontrivial process, and you need to consider how much effort you are prepared to put into it. I myself have a few UK English models from my own experiments in the past, but their quality is not the best (around 25% WER).

pdtwonotes · 2016-04-21T11:56:27Z

Since colbec reported that VoxForge downloads worked, on a hunch I created a VPN tunnel out of my local area and tried again. I was able to download the English model in just a few seconds. So something is wrong with my local ISP.

At first glance the VoxForge model appeared to work, and a quite large pronunciation dictionary was included. Unfortunately, the hmmdef file is missing many of the triphones that the dictionary uses.
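
A rough way to list which dictionary entries are affected is sketched below in Python. It is only an illustration under several assumptions that are not confirmed in this thread: that each dictionary line looks like `WORD [OUTPUT] phone phone ...`, that the model file names its HMMs with HTK-style `~h "name"` macro lines, and that only word-internal triphones matter. Julius also expands cross-word context and can map unseen triphones through a tied list, so a real check would need to account for both. The function names and the sample data are hypothetical.

```python
import re

def hmm_names(hmmdefs_text):
    """Collect HMM names from HTK-style '~h "name"' macro lines."""
    return set(re.findall(r'^~h\s+"([^"]+)"', hmmdefs_text, flags=re.MULTILINE))

def word_internal_triphones(phones):
    """Expand a phone sequence into word-internal names (a+b, a-b+c, b-c)."""
    names = []
    for i, phone in enumerate(phones):
        name = phone
        if i > 0:
            name = f"{phones[i - 1]}-{name}"   # left context
        if i < len(phones) - 1:
            name = f"{name}+{phones[i + 1]}"   # right context
        names.append(name)
    return names

def missing_triphones(dict_text, hmmdefs_text):
    """Map each word to the triphones it needs that the model does not define."""
    known = hmm_names(hmmdefs_text)
    missing = {}
    for line in dict_text.splitlines():
        # Drop the optional [OUTPUT] field (assumed format), then split.
        fields = re.sub(r"\[[^\]]*\]", " ", line, count=1).split()
        if len(fields) < 2:
            continue
        word, phones = fields[0], fields[1:]
        absent = [t for t in word_internal_triphones(phones) if t not in known]
        if absent:
            missing[word] = absent
    return missing

# Made-up sample data, not taken from the VoxForge files.
sample_dict = "ONE [ONE] w ah n\nTWO [TWO] t uw\n"
sample_hmmdefs = '~h "w+ah"\n~h "w-ah+n"\n~h "ah-n"\n~h "t+uw"\n'
print(missing_triphones(sample_dict, sample_hmmdefs))  # {'TWO': ['t-uw']}
```

To run it against real files, read the dictionary and model definitions into strings first and pass them in place of the sample data.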

colbec · 2016-04-21T12:22:09Z

One of the downsides of a phone-based model is that the number of possible triphones is of the order of N^3; in English, with about 40 phones, this might mean 40^3 or 64000 triphone candidates. It is really hard to exercise them all, or even just the most commonly used ones, unless you are working with a very large audio database. Trying to get a wide variety of voices makes this harder still. Sometimes you can bend your requirements for additional words by substituting phones that the model is aware of.
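
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the `left-centre+right` naming follows the usual HTK triphone convention, and the four-phone inventory and the "seen" set are made-up toy values, not data from any real model.

```python
from itertools import product

def triphone_candidates(phones):
    """Every left-centre+right combination of a phone inventory (N^3 names)."""
    return {f"{l}-{c}+{r}" for l, c, r in product(phones, repeat=3)}

def coverage(observed, phones):
    """Fraction of candidate triphones that appear in the observed set."""
    candidates = triphone_candidates(phones)
    return len(candidates & set(observed)) / len(candidates)

print(40 ** 3)  # 64000 candidates for a 40-phone inventory

# Toy inventory (hypothetical) to keep the example small.
toy_phones = ["ah", "n", "t", "w"]
seen = {"w-ah+n", "t-ah+n", "n-ah+t"}
print(len(triphone_candidates(toy_phones)))  # 64
print(coverage(seen, toy_phones))            # 0.046875
```

Even with this tiny inventory, three observed triphones cover under 5% of the candidates, which shows why a small audio database leaves so many of them untrained.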
