TensorFlow implementation of Google i18n Transliteration Model

kolloldas/tf-transliteration

tf-transliteration

TensorFlow implementation of the Google transliteration model described in the paper "Sequence-to-sequence neural network models for transliteration". Specifically, it implements the BiLSTM-CTC model with epsilon insertion.
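CTC emits at most one output symbol per input time step, so a plain CTC model cannot produce a transliteration longer than its input. Epsilon insertion works around this by padding the input with placeholder symbols. The snippet below is a minimal sketch of that idea only, not the repository's code: the `EPSILON` token, the `insert_epsilons` helper, and the choice of inserting a fixed number of epsilons after every character are assumptions made for illustration.

```python
# Hypothetical placeholder symbol; the repository may use a different token.
EPSILON = "<eps>"


def insert_epsilons(chars, num_eps=1):
    """Return a new list with `num_eps` epsilon tokens after each input symbol.

    Padding the input this way gives a CTC decoder more time steps, so the
    output can be up to (1 + num_eps) times as long as the original input.
    """
    padded = []
    for ch in chars:
        padded.append(ch)
        padded.extend([EPSILON] * num_eps)
    return padded


source = list("ab")
padded = insert_epsilons(source, num_eps=2)
print(padded)       # ['a', '<eps>', '<eps>', 'b', '<eps>', '<eps>']
print(len(padded))  # 6
```

With `num_eps=2`, a two-character input becomes six time steps, which leaves room for an output of up to six symbols.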

Usage

To run the code you need Python 3.5+ and TensorFlow 1.5+.

Training: The dataset must be provided as tab-separated files. You can get an English-to-Hindi transliteration dataset here. Train the model for 10,000 steps, evaluating every 1,000 steps:

python transliterate.py --data_file=<filename> --train_steps=10000 --eval_steps=100 --min_eval_frequency=1000
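To check a data file before training, something like the following can help. It is a sketch under assumptions: it treats each line as a source word and a target word separated by one tab, which is one reading of "tab-separated" and may not match the column layout the script actually expects. The `read_pairs` helper and the two sample rows are made up for illustration.

```python
import csv
import io


def read_pairs(handle):
    """Read (source, target) pairs from a tab-separated text stream.

    Assumes column 0 is the source word and column 1 is the target word.
    """
    pairs = []
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        pairs.append((row[0], row[1]))
    return pairs


# Illustrative sample rows, not taken from any real dataset.
sample = io.StringIO("namaste\tनमस्ते\ndilli\tदिल्ली\n")
for source, target in read_pairs(sample):
    print(source, "->", target)
```

For a real file, open it with `open(path, encoding="utf-8", newline="")` and pass the handle to `read_pairs`.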

During evaluation, the character error rate (CER) is displayed.
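For reference, CER is commonly defined as the character-level edit distance between the prediction and the reference, divided by the reference length. The sketch below implements that common definition. The repository may compute or aggregate the metric differently, and the function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate of `hyp` against a non-empty reference `ref`."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


print(cer("kitten", "sitting"))  # 0.5
```

A CER of 0.0 means the prediction matches the reference exactly. Values above 1.0 are possible when the prediction is much longer than the reference.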

Predicting: For predictions the inputs should be provided as a text file with one example per line.

python transliterate.py --decode_input_file=<filename>

The predictions are written to a corresponding output file named filename.out.ext.
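One plausible reading of the filename.out.ext pattern is that ".out" is inserted before the input file's extension. The helper below sketches that reading so downstream scripts can locate the predictions. It is an assumption rather than the script's confirmed behavior, and `predicted_output_path` is a hypothetical name.

```python
import os


def predicted_output_path(input_path):
    """Guess the predictions path by inserting '.out' before the extension."""
    root, ext = os.path.splitext(input_path)
    return root + ".out" + ext


print(predicted_output_path("names.txt"))  # names.out.txt
```

If the script names its output differently, check the directory of the input file after a run and adjust the helper accordingly.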
