To automatically generate language models, dictionaries and grammar, the following dependencies need to be met. The tool has the capability to generate them without these dependencies, but the accuracy in such cases is not guaranteed. It is highly recommended to work with the dependencies installed.
-
cmuclmtk : to generate vocab and LM (https://github.com/saurabhshri/mirror/raw/master/cmuclmtk-0.7.tar.gz)
-
g2p-seq2seq : to generate dictionary (https://github.com/saurabhshri/mirror/raw/master/g2p-seq2seq-master.zip)
The above links are from a mirror Github repository.
Steps :
Linux/MacOS
To install cmuclmtk :
-
Download and uncompress
cmuclmtk-0.7.tar.gz
while preserving the permissions :wget https://github.com/saurabhshri/mirror/raw/master/cmuclmtk-0.7.tar.gz tar xvpzf cmuclmtk-0.7.tar.gz
-
Navigate to
cmumltk-0.7
directory :cd cmuclmtk-0.7
-
Install :
./configure make sudo make install/
You may have to run sudo ldconfig
to fix errors such as missing shared library.
To install g2p-seq2seq :
-
First, install Tensorflow by your preferred choice of method. If you are on Linux (x86_64), you may directly run the following :
sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.0.0-cp27-none-linux_x86_64.whl
-
Download and uncompress
g2p-seq2seq-master.zip
while preserving the permissions :wget https://github.com/saurabhshri/mirror/raw/master/g2p-seq2seq-master.zip unzip g2p-seq2seq-master.zip
-
Navigate to
g2p-seq2seq-master
directory and run :sudo python setup.py install
I have incorporated the steps above into a bash script located inside install/
directory. You may call it simply by using :
`./install_grammar_tools.sh`
Alternative ways of generating language models, dictionaries and grammar are covered later in the docs.
Windows
To install cmuclmtk :
-
Download cmuclmtk from the link mentioned above and uncompress
cmuclmtk-0.7.tar.gz
. Build it with thecmuclmtk.sln
it provides. -
Copy the compiled files (wfreq2vocab.exe, text2wfreq.exe, text2idngram.exe, idngram2lm.exe) to the same directory as the CCAligner or set a directory containing these files in the environment variable (PATH)
To install g2p-seq2seq :
-
First, install Python 3.5 (64-bit) and Tensorflow 1.0.0 by your preferred choice of method.
-
Download g2p-seq2seq from the link mentioned above and uncompress
g2p-seq2seq-master.zip
. -
Navigate to
g2p-seq2seq-master
directory and run :python setup.py install
You will also need to install Perl and move install/quick_lm.pl to the same directory as the CCAligner or a directory that is set in the environment variable PATH