Mowgli

Mowgli is a lightweight neural machine translation framework based on the Transformer network.

Installation

  • Python version >= 3.8
  • To install mowgli and develop locally:
# download code
git clone git@github.com:ltl-uva/mowgli.git
cd mowgli/

# create a virtual environment, for example using conda
conda create --name mowgli python=3.8
conda activate mowgli

# install
pip install --editable ./

Getting started

Mowgli is configured through YAML files. See the configuration folder for a list of all options.

How to train a model (mowgli train)

  • Before training, pre-process the data (e.g. using Moses) and create a vocabulary. See scripts/build_vocab.py for details on vocabulary creation.
  • To train, pass a YAML configuration file: python -m mowgli train configs/${YOUR_CONFIG}.yaml
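
To make the vocabulary step more concrete, here is a minimal sketch of a frequency-based vocabulary builder over whitespace-tokenized text. It is not the code in scripts/build_vocab.py, which remains the authoritative tool: the function name `build_vocab`, the `min_freq` and `max_size` parameters, and the special tokens are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical special tokens; the symbols mowgli actually uses may differ.
SPECIALS = ["<pad>", "<unk>", "<s>", "</s>"]


def build_vocab(lines, min_freq=1, max_size=None):
    """Return special tokens followed by corpus tokens, most frequent first."""
    # Count whitespace-separated tokens (the input is assumed pre-tokenized).
    counts = Counter(tok for line in lines for tok in line.split())
    # Sort by descending frequency; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    tokens = [tok for tok, freq in ranked if freq >= min_freq]
    if max_size is not None:
        tokens = tokens[:max_size]
    return SPECIALS + tokens


vocab = build_vocab(["a b a", "b a c"], min_freq=2)
print(vocab)
```

Tokens below `min_freq` are dropped, so at training time they would map to the unknown symbol. Check the real script for the file format that mowgli expects.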

How to do inference (mowgli test)

  • To run inference, pass a YAML configuration file: python -m mowgli test configs/${YOUR_CONFIG}.yaml
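
The two commands above can be chained from a small driver script. The sketch below is only a convenience wrapper around the documented `python -m mowgli train` and `python -m mowgli test` invocations. The helper names `mowgli_command` and `train_then_test` are hypothetical, and it assumes that the same configuration file is used for both steps.

```python
import subprocess
import sys


def mowgli_command(mode, config_path):
    """Build the argument list for `python -m mowgli <mode> <config>`."""
    if mode not in ("train", "test"):
        raise ValueError(f"unsupported mode: {mode!r}")
    # sys.executable keeps the call inside the active (e.g. conda) environment.
    return [sys.executable, "-m", "mowgli", mode, config_path]


def train_then_test(config_path, run=subprocess.run):
    """Train with the given config, then run inference with the same config."""
    for mode in ("train", "test"):
        # check=True raises CalledProcessError on a non-zero exit status,
        # so inference is skipped if training fails.
        run(mowgli_command(mode, config_path), check=True)


# Example (hypothetical config path):
# train_then_test("configs/my_experiment.yaml")
```

The `run` parameter exists so the wrapper can be exercised without launching a real training job, for example by passing a function that only records the commands.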

Developers

Mowgli is developed by David Stap (University of Amsterdam).

We take inspiration from other sequence-to-sequence frameworks such as Tensor2Tensor, fairseq, OpenNMT and JoeyNMT.

Reference

If you use mowgli, please cite the following paper:

@inproceedings{stap-etal-2023-viewing,
    title = "Viewing Knowledge Transfer in Multilingual Machine Translation Through a Representational Lens",
    author = "Stap, David  and
      Niculae, Vlad  and
      Monz, Christof",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-emnlp.998",
    pages = "14973--14987",
}
