Skip to content
/ lmm Public

A Latent Morphology Model for Open-Vocabulary Neural Machine Translation

License

Notifications You must be signed in to change notification settings

d-ataman/lmm

Repository files navigation

A Latent Morphology Model for Open-Vocabulary Neural Machine Translation

This software implements the Neural Machine Translation model based on Hierchical Character-based Decoding using Variational Inference.

Options

Hiearchical Decoder with Compositional Word Embeddings and Character-level Generation with Variational Inference

To activate the character-level decoder, select

-tgt_data_type characters in the settings of preprocess.py and translate.py

and

-decoder_type charrnn and -tgt_data_type characters in train.py

The feature dimensions are hardcoded to 100 for the lemma and 10 for inflectional feature vectors, you can change this depending on your language or data size.

Further information

For information about how to install and use OpenNMT-py: Full Documentation

Citation

If you use this software, please cite:

@article{lmm, author = {Duygu Ataman and Wilker Aziz and Alexandra Birch}, title = {A Latent Morphology Model for Open-Vocabulary Neural Machine Translation}, booktitle = {ICLR}, year = {2020} }

About

A Latent Morphology Model for Open-Vocabulary Neural Machine Translation

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages