d-ataman/Char-NMT

Character-level Neural Machine Translation based on the OpenNMT-py Toolkit

This software implements Neural Machine Translation based on Hierarchical Character-to-Word Level Representations and Hierarchical Character-based Decoding.

Options

Compositional Encoder based on Character Input

To activate the character-level encoder composing source word representations from a character trigram vocabulary, select

-src_data_type text-trigram in the settings of preprocess.py and translate.py

and

-encoder_type trigramrnn in train.py
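To give a rough idea of what a character trigram vocabulary looks like, here is a minimal sketch. It is not taken from this repository: the function names `char_trigrams` and `build_trigram_vocab` are hypothetical, and the segmentation scheme (overlapping trigrams, no boundary markers, short words kept whole) is an assumption that may differ from what `preprocess.py` actually does.

```python
def char_trigrams(word, n=3):
    """Split a word into overlapping character n-grams.

    Assumed scheme for illustration only: no padding or boundary
    symbols, and words no longer than n are returned as a single unit.
    """
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def build_trigram_vocab(sentences):
    """Map each distinct trigram to an integer index, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.split():
            for trigram in char_trigrams(word):
                vocab.setdefault(trigram, len(vocab))
    return vocab


if __name__ == "__main__":
    print(char_trigrams("translation"))
    print(build_trigram_vocab(["the cats sat", "the cat"]))
```

Under this assumed scheme, each source word becomes a short sequence of trigram indices, which an encoder can then compose into a single word representation.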

Hierarchical Decoder with Compositional Word Embeddings and Character Output

To activate the character-level decoder, select

-tgt_data_type characters in the settings of preprocess.py and translate.py

and

-decoder_type charrnn in train.py
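The options above can be collected into command lines with a small helper. This is only a sketch: `build_commands` is a hypothetical function, not part of this repository, and it lists only the flags named in this README. The data, model, and output path arguments that each script also needs are omitted and should be added following the OpenNMT-py documentation. Whether the encoder and decoder options are meant to be combined in one run is an assumption here.

```python
def build_commands(char_encoder=False, char_decoder=False):
    """Return argument lists for the three scripts (hypothetical helper).

    Only the flags quoted in this README are included; path arguments
    for data, models, and output are deliberately left out.
    """
    data_flags = []   # passed to both preprocess.py and translate.py
    train_flags = []  # passed to train.py only
    if char_encoder:
        data_flags += ["-src_data_type", "text-trigram"]
        train_flags += ["-encoder_type", "trigramrnn"]
    if char_decoder:
        data_flags += ["-tgt_data_type", "characters"]
        train_flags += ["-decoder_type", "charrnn"]
    return {
        "preprocess": ["python", "preprocess.py", *data_flags],
        "train": ["python", "train.py", *train_flags],
        "translate": ["python", "translate.py", *data_flags],
    }


if __name__ == "__main__":
    # Print the commands for the character-level decoder only.
    for argv in build_commands(char_decoder=True).values():
        print(" ".join(argv))
```

Returning argument lists rather than strings keeps each flag and its value as separate items, so the lists can be extended with path arguments and passed to `subprocess.run` without shell quoting.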

Further information

For information about how to install and use OpenNMT-py, see the OpenNMT-py Full Documentation.

Contact

For further questions, you can contact ataman@fbk.eu.

Citation

If you use this software, please cite:

Ataman, D., Firat, O., Di Gangi, M., Federico, M. and Birch, A. (2019) On the Importance of Word Boundaries in Character-level Neural Machine Translation. (To appear at the WNGT Workshop at EMNLP).
