TRF toolkit based on TensorFlow

This toolkit includes the source code of trans-dimensional random field (TRF) language models (LMs), developed on top of TensorFlow. It also includes baseline LMs, such as n-gram LMs and LSTM LMs. The LMs are evaluated by rescoring n-best lists to measure their performance in speech recognition.
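To make the evaluation setup concrete, below is a minimal sketch of n-best rescoring. The function name, argument layout, and score-combination weights (LM scale and word insertion penalty) are assumptions for illustration, not the toolkit's actual interface.

  # Minimal n-best rescoring sketch (names and weights are illustrative
  # assumptions, not the toolkit's actual interface).
  def rescore_nbest(nbest, lm_logprob, lm_scale=1.0, penalty=0.0):
      """Pick the best hypothesis from an n-best list.

      nbest: list of (words, acoustic_score) pairs; words is a list of strings.
      lm_logprob: function mapping a word list to its LM log-probability.
      lm_scale: weight of the LM score relative to the acoustic score.
      penalty: word insertion penalty per word.
      """
      best_words, best_score = None, float("-inf")
      for words, am_score in nbest:
          # Combine the acoustic score with the (rescored) LM score.
          score = am_score + lm_scale * lm_logprob(words) + penalty * len(words)
          if score > best_score:
              best_words, best_score = words, score
      return best_words

In the experiments under egs/, the scores produced by the TRF and baseline LMs play the role of lm_logprob here.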

For details of the TRF LMs, see:

[1] Bin Wang, Zhijian Ou, Zhiqiang Tan, "Learning Trans-dimensional Random Fields with Applications to Language Modeling", IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2017.
[2] Bin Wang, Zhijian Ou, "Language Modeling with Neural Trans-dimensional Random Fields", ASRU, 2017.
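As a rough orientation (our paraphrase of the formulation in [1]; the notation here may differ from the papers), a TRF LM assigns a joint probability to a sentence length l and a word sequence x^l of that length:

  p(l, x^l; \lambda) = \frac{\pi_l}{Z_l(\lambda)} \exp\left( \lambda^\top f(x^l) \right)

where \pi_l is a prior over sentence lengths, f(x^l) is a feature vector, \lambda are the model parameters, and Z_l(\lambda) is the normalizing constant for length l. In the neural TRF of [2], the linear potential \lambda^\top f(x^l) is replaced by a neural-network potential.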

Usage:

  1. Python 3.0+ is needed. We suggest the Anaconda 3.6 distribution from https://www.anaconda.com/download/
  2. TensorFlow is needed. See https://www.tensorflow.org/install
  3. Install the SRILM toolkit to enable the n-gram LMs:

     cd tools
     ./install_srilm.sh

  4. Install the word2vec tools:

     cd tools/word2vec
     make

  5. Compile the Cython code:

     cd tfcode/base
     python setup.py build_ext --inplace
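To verify that steps 1 and 2 succeeded, one quick sanity check (our suggestion, not part of the toolkit itself):

  python -c "import tensorflow as tf; print(tf.__version__)"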

The experiments are in the folder egs/, which is organized as follows:

  • 'egs/word': the word morphology experiment
  • 'egs/ptb_wsj0': train on PTB and rescore the WSJ'92 n-best lists
  • 'egs/ptb_chime4test': train on PTB and rescore on the CHiME4 development and test sets
  • 'egs/CHiME4': experiments for the CHiME4 challenge
  • 'egs/ptb_fake_nbest': the same setup as 'egs/ptb_wsj0', comparing DNCE with NCE
  • 'egs/google1B': training on the Google 1-billion-word dataset
  • 'egs/hkust': experiments on the HKUST dataset

Typical Experiments:

  • For "Language modeling with neural trans-dimensional random fields." ASRU, 2017,
    • see egs/ptb_wsj0/run_trf_neural_sa.py
  • For "Learning neural trans-dimensional random field language models with noise-contrastive estimation." ICASSP, 2018,
    • see egs/CHiME4/
  • For "Improved training of neural trans-dimensional random field language models with dynamic noise-contrastive estimation".
    • Section 5.1, in egs/ptb_fake_nbest/.
    • Section 5.2: HKUST Chinese dataset, in egs/hkust/.
    • Section 5.3: Google one-billion benchmark, in egs/google1B.
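For example, a typical way to launch the ASRU 2017 experiment (assuming the setup steps above are complete; any script options are documented in the script itself):

  cd egs/ptb_wsj0
  python run_trf_neural_sa.py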
